Help support

Should you have any questions, please check the Gaia FAQ section or contact the Gaia Helpdesk.

ICRF2 sources (DR1)

Author: Alcione Mora

This tutorial was developed for Data Release 1 and is kept for legacy purposes (the old DR1 data are still available).


Gaia DR1 contains information on ICRF reference sources and variable stars, in addition to the main gaia_source table, which provides the astrometry and average photometry for 1.14 billion sources in the sky, and the selection of pulsating variables in the Large Magellanic Cloud. The following sections provide hints on how to work with these data using the Archive.

This is an intermediate level tutorial that assumes a basic knowledge of the general interface and workflow. The introductory tutorials White dwarfs exploration and Cluster analysis are recommended in case of difficulties following this exercise.

  1. ICRF sources

    Mignard et al. 2016 A&A 595A, 5M presents the Gaia astrometric solution created to align DR1 with the ICRF. The abstract context and aims are reproduced below.

    Context. As part of the data processing for Gaia Data Release 1 (Gaia DR1) a special astrometric solution was computed, the so-called auxiliary quasar solution. This gives positions for selected extragalactic objects, including radio sources in the second realisation of the International Celestial Reference Frame (ICRF2) that have optical counterparts bright enough to be observed with Gaia. A subset of these positions was used to align the positional reference frame of Gaia DR1 with the ICRF2. Although the auxiliary quasar solution was important for internal validation and calibration purposes, the resulting positions are in general not published in Gaia DR1.

    Aims. We describe the properties of the Gaia auxiliary quasar solution for a subset of sources matched to ICRF2, and compare their optical and radio positions at the sub-mas level.

    Table gaiadr1.aux_qso_icrf2_match in the archive contains the data used in that paper. It is relatively small (2191 entries), so a full download can be carried out using the following ADQL query within the Search→ADQL Form tab.

    select * from gaiadr1.aux_qso_icrf2_match
    

    Nevertheless, the table is fully functional within the archive. The following sections show how to reproduce Figures 5, 6 and 7 of Mignard et al. 2016 using the Archive and Topcat.
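    The same download can also be scripted. A minimal sketch using the astroquery.gaia module (the module used in the 'Cluster Analysis Python' tutorial below; assumes astroquery is installed and the Archive is reachable):

    ```python
    from astroquery.gaia import Gaia

    # The table is small (2191 rows), so a synchronous job is enough
    job = Gaia.launch_job("SELECT * FROM gaiadr1.aux_qso_icrf2_match")
    table = job.get_results()
    print(len(table))
    ```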

  2. Get ICRF catalogue from CDS and upload as user table 'icrf2'

    Follow steps 1-7 from the 'White dwarfs exploration' tutorial, but using the ICRF2 catalogue (Fey et al. 2015 AJ 150, 58F) and 'icrf2' as the user table name. The CDS catalogue is J/AJ/150/58/icrf2.

  3. Gaia QSO to ICRF2 cross-match on source name

    Execute the following ADQL query in the Search→ADQL Form tab, replacing <username> by the appropriate Archive user name.

    SELECT gaia.ra AS gaia_ra, gaia.dec AS gaia_dec,
      icrf2.icrf2, icrf2.source,
      icrf2.raj2000 AS icrf2_ra,
      icrf2.dej2000 AS icrf2_dec
    FROM gaiadr1.aux_qso_icrf2_match AS gaia
    JOIN user_<username>.icrf2 AS icrf2
      ON gaia.icrf2_match = icrf2.icrf2
    

    A quick look at the results ('White dwarfs exploration' step 10) shows the combined output.

  4. Gaia to ICRF2 positional differences computation

    The position difference between Gaia and ICRF2 can also be computed within the archive by adding two new columns to the output: ra_diff and dec_diff (units: mas).

    SELECT gaia.ra AS gaia_ra, gaia.dec AS gaia_dec,
      icrf2.icrf2, icrf2.source,
      icrf2.raj2000 AS icrf2_ra, icrf2.dej2000 AS icrf2_dec,
      (gaia.ra - icrf2.raj2000) * cos(radians(icrf2.dej2000)) * 3600000 AS ra_diff,
      (gaia.dec - icrf2.dej2000) * 3600000 AS dec_diff
    FROM gaiadr1.aux_qso_icrf2_match AS gaia
    JOIN user_<username>.icrf2 AS icrf2
      ON gaia.icrf2_match = icrf2.icrf2
    
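    The arithmetic of the two new columns can be sanity-checked outside the archive. A small numpy sketch with a made-up coordinate pair (the values are purely illustrative, not real catalogue entries):

    ```python
    import numpy as np

    # Hypothetical Gaia and ICRF2 positions (degrees), offset by roughly 1 and 3 mas
    gaia_ra, gaia_dec = 187.70593000, 12.39112000
    icrf_ra, icrf_dec = 187.70593028, 12.39112083

    # Same formula as the ADQL columns: deg -> mas (factor 3600000),
    # with the RA difference scaled by cos(dec) to get a true angular offset
    ra_diff = (gaia_ra - icrf_ra) * np.cos(np.radians(icrf_dec)) * 3600000
    dec_diff = (gaia_dec - icrf_dec) * 3600000
    ```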
  5. Data export to Topcat and representation

    The table can be exported to the Topcat visualization tool via SAMP following 'White dwarfs exploration' tutorial steps 15-17. The data will be divided into three subsets according to the contents of the column 'source'.


    We will now represent dec_diff vs ra_diff as a 2D plot.


    Initially, only the points with large discrepancies are apparent. Changing the plot range to ±10 mas in both axes yields a much more informative plot.


    Finally, each subset can be plotted using different colours to reveal the behaviour of the different subsamples. This plot is equivalent to Fig.7 top right panel in Mignard et al. 2016.


  6. Gaia to ICRF2 positional differences histogram

    The 2D data studied in the previous sections can also be displayed as a collection of 1D histograms. This can be carried out directly within most plotting programs, such as Topcat. However, this approach may become impractical when the number of data points reaches the billions (far from the case for the ICRF sources). The following query shows how to compute a histogram directly within the Archive:

    SELECT 0.2 * index AS dec_diff, n FROM (
      SELECT floor(5 * (gaia.dec - icrf2.dej2000) * 3600000 + 0.1) AS index,
        count(*) AS n
      FROM gaiadr1.aux_qso_icrf2_match AS gaia
      JOIN user_<username>.icrf2 AS icrf2
        ON gaia.icrf2_match = icrf2.icrf2
      GROUP BY index
    ) AS subquery
    ORDER BY dec_diff
    

    where an intermediate integer index is defined such that each unit increment corresponds to a declination difference of 0.2 mas. The counts are accumulated in a subquery, and the outer query reverses the multiplying factor to recover the original mas scale. The output data can then be exported to Topcat and represented as a 1D histogram, weighting the 'dec_diff' column by 'n', the number of objects in each bin.
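    The binning trick can be emulated locally to see exactly what the query computes. A numpy sketch over a few made-up declination differences (mas):

    ```python
    import numpy as np

    # Hypothetical dec differences (mas), standing in for the query results
    dec_diff = np.array([-0.35, -0.10, 0.00, 0.05, 0.21, 0.80, 0.80])

    # Inner query: integer index, one unit per 0.2 mas
    index = np.floor(5 * dec_diff + 0.1).astype(int)

    # GROUP BY index / COUNT(*): unique bins and their counts
    bins, n = np.unique(index, return_counts=True)

    # Outer query: undo the multiplying factor to recover the mas scale
    centres = 0.2 * bins
    ```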


    After adjusting the x-axis range to ±10 mas


    and the bin size to 0.2 mas (the value used in the Archive query), the plot is now equivalent to Fig. 6 left panel in Mignard et al. 2016.


Cluster Analysis GUI

Authors: Raúl Gutiérrez, Alcione Mora, José Hernández

This tutorial is focused on a possible scientific exploration exercise using the Gaia Archive. Realistic science use cases created by users are very welcome and can be shared in this section with the proper reference/contact point.

We are going to explore a known cluster, the Pleiades (M45), using Gaia data. First, we are going to retrieve all the available data in the region of interest:

  1. As we are going to use the private storage area, we have to log in to the archive.

  2. Go to Search → Simple Form. In positional search, enter "pleiades" in the field "Name". Once it is resolved, select a 2 degrees search radius and make sure "Gaia source" is selected. Click on "Show Query" button.

  3. Enter "m45" in the Job name field. Edit the ADQL query and remove the TOP 500 restriction. The query should be:

     
    SELECT *
    FROM gaiadr1.gaia_source
    WHERE CONTAINS(POINT('ICRS',gaiadr1.gaia_source.ra,gaiadr1.gaia_source.dec),CIRCLE('ICRS',56.75,24.1167,2))=1						

    Execute the query. Around 1e5 results are found.

  4. The number of results is small enough to be represented by a local application. Open Topcat and click the Send to SAMP button. As the job is private, Topcat will ask for your credentials.
    Once the data have been loaded, you can show the results on the sphere, or create a proper motion plot to identify the cluster.

  5. Go to the archive and filter the data by quality. For that, create a new DB table in your user area from the job results by clicking on the corresponding upload icon (see image):

    In this upload operation, no heavy network traffic occurs between the user machine and the server. Job results are stored on the server, so data ingested into the user area from job results never leave the server during the upload process.

    Enter "m45PmFilter" as job name and perform the next query (do not forget to replace <username> with your user name):

     
    SELECT * FROM user_<username>.m45
    WHERE abs(pmra_error/pmra)<0.10
    AND  abs(pmdec_error/pmdec)<0.10
    AND pmra IS NOT NULL AND abs(pmra)>0
    AND pmdec IS NOT NULL AND abs(pmdec)>0;						

    Execute the query. You can send the results to Topcat via SAMP and plot the new results over the proper motion plot to see the sources with sufficient proper motion quality.

  6. Create the table m45pmfilter from the m45PmFilter job results. Now we select the candidate members of the cluster. Based on the proper motion plot, we apply the following filter:

    SELECT * FROM user_<username>.m45pmfilter
    WHERE pmra BETWEEN 15 AND 25
    AND pmdec BETWEEN -55 AND -40;					

    Name the job m45cluster and execute it. You can send the results to Topcat and plot them over the previous proper motion plots.

  7. Using the ADQL interface we can perform analysis queries on the results. Create the table m45cluster from the m45cluster job results. Name the job m45clusterParallaxAvg and run the next query:

    SELECT avg(parallax) as avg_parallax FROM user_<username>.m45cluster				

    Execute the query and show the results using the corresponding button from the job results.

  8. Now, we want to add information from other catalogues. To do so, we will make use of the crossmatch functionality of the archive. We need to identify which columns of our table contain the geometrical information (ra and dec in our case). Select the m45cluster table by checking its check box and click on the Edit table button. Find the column ra and set the flag Ra. Find the column dec and set the flag Dec. Click on the Update button.

    This action will create a positional index on Ra and Dec, and the table will be identified as a geometrical table. This allows the use of the crossmatch functionality.

  9. Click on the crossmatch button. Select user_<username>.m45cluster as Table A and gaiadr1.tmass_original_valid as Table B. Click on Execute.
    This will create a new job of crossmatch type and a new table called xmatch_m45cluster_tmass_original_valid. This table is a join between the m45cluster and tmass_original_valid tables. A helper function is available in the crossmatch job to create a join query between the two tables.

     
    Only positional crossmatch is available for the time being. All counterparts falling within the search radius are considered matches. The distance is provided for further filtering.
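    The same kind of positional match can be reproduced locally with astropy (also used in the Python tutorial below). A sketch with invented coordinates, using a 1 arcsec radius for illustration:

    ```python
    import astropy.units as u
    from astropy.coordinates import SkyCoord

    # Toy source lists (illustrative coordinates, not real catalogue entries)
    table_a = SkyCoord(ra=[56.75, 56.80, 57.10] * u.deg,
                       dec=[24.10, 24.12, 24.30] * u.deg)
    table_b = SkyCoord(ra=[56.7501, 57.00] * u.deg,
                       dec=[24.1001, 24.00] * u.deg)

    # Nearest counterpart in B for every source in A, with the angular distance
    idx, d2d, _ = table_a.match_to_catalog_sky(table_b)

    # Keep only counterparts inside the search radius; the distance
    # allows further filtering, as in the archive crossmatch
    matched = d2d < 1 * u.arcsec
    ```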
  10. To create a table with all the information from the two tables, click on the Show join query button of the crossmatch job. A query like this is automatically loaded in the ADQL query panel:

     
    SELECT c."dist", a."dec", a."m45cluster_oid", a."ra", a."astrometric_chi2_ac", a."astrometric_chi2_al", a."astrometric_delta_q", a."astrometric_excess_noise",
    		a."astrometric_excess_noise_sig", a."astrometric_go_f", a."astrometric_n_obs_ac", a."astrometric_n_obs_al", a."astrometric_n_outliers_ac",
    		a."astrometric_n_outliers_al", a."astrometric_params_solved", a."astrometric_primary_flag", a."astrometric_priors_used", a."astrometric_rank_defect",
    		a."astrometric_relegation_factor", a."astrometric_weight_ac", a."astrometric_weight_al", a."dec_error", a."dec_parallax_corr", a."dec_pmdec_corr",
    		a."dec_pmra_corr", a."dec_pmradial_corr", a."m45_oid", a."m45pmfilter_oid", a."matched_observations", a."parallax", a."parallax_error",
    		a."parallax_pmdec_corr", a."parallax_pmra_corr", a."parallax_pmradial_corr", a."phot_bp_mean_flux", a."phot_bp_mean_flux_error", a."phot_bp_mean_mag",
    		a."phot_bp_n_obs", a."phot_g_mean_flux", a."phot_g_mean_flux_error", a."phot_g_mean_mag", a."phot_g_n_obs", a."phot_rp_mean_flux",
    		a."phot_rp_mean_flux_error", a."phot_rp_mean_mag", a."phot_rp_n_obs", a."phot_variable_flag", a."pmdec", a."pmdec_error", a."pmdec_pmradial_corr",
    		a."pmra", a."pmradial", a."pmradial_error", a."pmra_error", a."pmra_pmdec_corr", a."pmra_pmradial_corr", a."ra_dec_corr", a."radial_velocity",
    		a."radial_velocity_constancy_probability", a."radial_velocity_error", a."ra_error", a."random_index", a."ra_parallax_corr", a."ra_pmdec_corr",
    		a."ra_pmra_corr", a."ra_pmradial_corr", a."ref_epoch", a."scan_direction_mean_k1", a."scan_direction_mean_k2", a."scan_direction_mean_k3",
    		a."scan_direction_mean_k4", a."scan_direction_strength_k1", a."scan_direction_strength_k2", a."scan_direction_strength_k3", a."scan_direction_strength_k4",
    		a."solution_id", a."source_id",
    		b."dec", b."ra", b."designation", b."err_ang", b."err_maj", b."err_min", b."ext_key", b."h_m", b."h_msigcom", b."j_date", b."j_m", b."j_msigcom",
    		b."k_m", b."k_msigcom", b."ph_qual", b."tmass_oid"
    FROM user_<username>.m45cluster AS a, public.tmass_original_valid AS b, user_<username>.xmatch_m45cluster_tmass_original_valid AS c
    WHERE (c.m45cluster_m45cluster_oid = a.m45cluster_oid AND c.tmass_original_valid_tmass_oid = b.tmass_oid)	
    All the columns of the joined tables are shown for convenience. This way, the user only has to remove columns to get the desired output.

    Name the job xmatch and execute it. Show the results and verify that both Gaia and 2MASS data are present.

  11. Now, we are going to use the sharing functionality of the archive to share these results with some colleagues. First, create a new table cluster_2mass from the previous job. Go to the tab SHARE > Groups and create a group called cluster. The new group appears in the tree. Select the group and click on Edit. Use the User to include field to search for your colleague and click on Add. Repeat this search to add any other colleagues you want. Then click on Update. The group has been updated with the new members.
    Return to the ADQL search page, check the cluster_2mass table and click on the Share button. Select the group cluster, click Add and then Update. You will see that a little Share icon is added to the table icon. From this moment, the users in the cluster group will be notified and will have access to this table.

     
    The size of shared tables is only counted against the quota of the table owner.

Cluster Analysis Python

Author: Deborah Baines

This tutorial takes the Cluster analysis tutorial and adapts it to Python. It uses the Gaia TAP+ (astroquery.gaia) module.

 

This tutorial is focused on a possible scientific exploration exercise for a known cluster, the Pleiades (M45), using data from the Gaia Archive.

You can import and run this tutorial in your own Jupyter Notebook using this file: Download

First, we import all the required python modules:

 

In [1]:
import astropy.units as u
from astropy.coordinates.sky_coordinate import SkyCoord
from astropy.units import Quantity
from astroquery.gaia import Gaia
 
Created TAP+ (v1.0) - Connection:
	Host: gea.esac.esa.int
	Use HTTPS: True
	Port: 80
	SSL Port: 443
In [2]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

# Suppress warnings. Comment this out if you wish to see the warning messages
import warnings
warnings.filterwarnings('ignore')
 

Do the following to load and look at the available Gaia table names:

In [3]:
from astroquery.gaia import Gaia
tables = Gaia.load_tables(only_names=True)
for table in (tables):
    print (table.get_qualified_name())
 
Retrieving tables...
Parsing tables...
Done.
public.dual
public.tycho2
public.igsl_source
public.hipparcos
public.hipparcos_newreduction
public.hubble_sc
public.igsl_source_catalog_ids
tap_schema.tables
tap_schema.keys
tap_schema.columns
tap_schema.schemas
tap_schema.key_columns
gaiadr1.phot_variable_time_series_gfov
gaiadr1.ppmxl_neighbourhood
gaiadr1.gsc23_neighbourhood
gaiadr1.ppmxl_best_neighbour
gaiadr1.sdss_dr9_neighbourhood
gaiadr1.rrlyrae
gaiadr1.allwise_neighbourhood
gaiadr1.gsc23_original_valid
gaiadr1.tmass_original_valid
gaiadr1.allwise_best_neighbour
gaiadr1.cepheid
gaiadr1.urat1_neighbourhood
gaiadr1.ppmxl_original_valid
gaiadr1.tmass_neighbourhood
gaiadr1.ucac4_best_neighbour
gaiadr1.ucac4_neighbourhood
gaiadr1.aux_qso_icrf2_match
gaiadr1.phot_variable_time_series_gfov_statistical_parameters
gaiadr1.sdssdr9_original_valid
gaiadr1.urat1_best_neighbour
gaiadr1.variable_summary
gaiadr1.ucac4_original_valid
gaiadr1.tmass_best_neighbour
gaiadr1.gsc23_best_neighbour
gaiadr1.gaia_source
gaiadr1.ext_phot_zero_point
gaiadr1.sdss_dr9_best_neighbour
gaiadr1.tgas_source
gaiadr1.urat1_original_valid
gaiadr1.allwise_original_valid
 

Next, we retrieve all the available data in the region of interest.

To do this we perform an asynchronous query (asynchronous rather than synchronous queries should be performed when retrieving more than 2000 rows) centred on the Pleiades (coordinates: 56.75, +24.1167) with a search radius of 2 degrees and save the results to a file.

Note: The query to the archive is written in ADQL (Astronomical Data Query Language). For a description of ADQL and more examples see the Gaia DR1 ADQL cookbook.

In [4]:
job = Gaia.launch_job_async("SELECT * \
FROM gaiadr1.gaia_source \
WHERE CONTAINS(POINT('ICRS',gaiadr1.gaia_source.ra,gaiadr1.gaia_source.dec),CIRCLE('ICRS',56.75,24.1167,2))=1;" \
, dump_to_file=True)

print (job)
 
Launched query: 'SELECT * FROM gaiadr1.gaia_source WHERE CONTAINS(POINT('ICRS',gaiadr1.gaia_source.ra,gaiadr1.gaia_source.dec),CIRCLE('ICRS',56.75,24.1167,2))=1;'
Retrieving async. results...
Jobid: 1495017686224O
Phase: None
Owner: None
Output file: async_20170517124518.vot
Results: None
 

Inspect the output table and number of rows (around 1e5 results are found):

In [5]:
r = job.get_results()
print (r['source_id'])
 
    source_id
-----------------
66926207631181184
66818318054203520
66917823855519360
66830859358837888
66809423175240448
66944761890240000
66980191076373760
66781621852927232
66827805636652928
66947545031024640
              ...
66649989694512256
65666785781176576
64014803920669568
64137880504644992
66542306274948224
64005909043397504
66689881351473664
66436615718993792
65645757620918912
66718434293042560
64103559419447296
Length = 98538 rows
 

To identify the cluster, create a proper motion plot of proper motion in RA (pmra) versus proper motion in DEC (pmdec) in the range pmra [-60,80] and pmdec [-120,30]:

In [6]:
plt.scatter(r['pmra'], r['pmdec'], color='r', alpha=0.3)
plt.xlim(-60,80)
plt.ylim(-120,30)

plt.show()
 
 
 

Perform another asynchronous query to filter the results by quality:

In [7]:
job2 = Gaia.launch_job_async("SELECT * \
FROM gaiadr1.gaia_source \
WHERE CONTAINS(POINT('ICRS',gaiadr1.gaia_source.ra,gaiadr1.gaia_source.dec),CIRCLE('ICRS',56.75,24.1167,2))=1 \
AND abs(pmra_error/pmra)<0.10 \
AND abs(pmdec_error/pmdec)<0.10 \
AND pmra IS NOT NULL AND abs(pmra)>0 \
AND pmdec IS NOT NULL AND abs(pmdec)>0;", dump_to_file=True)
 
Launched query: 'SELECT * FROM gaiadr1.gaia_source WHERE CONTAINS(POINT('ICRS',gaiadr1.gaia_source.ra,gaiadr1.gaia_source.dec),CIRCLE('ICRS',56.75,24.1167,2))=1 AND abs(pmra_error/pmra)<0.10 AND abs(pmdec_error/pmdec)<0.10 AND pmra IS NOT NULL AND abs(pmra)>0 AND pmdec IS NOT NULL AND abs(pmdec)>0;'
Retrieving async. results...
 

Again, inspect the output table and number of rows:

In [8]:
j = job2.get_results()
print (j['source_id'])
 
    source_id
-----------------
66623395256627712
66581957412169728
65614730777165824
64053561704835584
65828207832006400
66858862543722112
66863054431798272
66588520122169728
66657823714473856
66592437132341248
              ...
66912360656318208
66471215975411200
66729257611496704
64013704408496896
64013807487711360
66506331628024832
66715101399291392
66724447247218048
66610957031353856
66857281995760000
66570549979009280
Length = 218 rows
 

Plot these new filtered results on the same plot as the previous search:

In [9]:
plt.scatter(r['pmra'], r['pmdec'], color='r', alpha=0.3)
plt.scatter(j['pmra'], j['pmdec'], color='b', alpha=0.3)
plt.xlim(-60,80)
plt.ylim(-120,30)

plt.show()
 
 
 

Now we select the candidate members of the cluster. Based on the proper motion plot, we execute the same job with the following constraints on the proper motions in RA and DEC: pmra between 15 and 25, pmdec between -55 and -40:

In [10]:
job3 = Gaia.launch_job_async("SELECT * \
FROM gaiadr1.gaia_source \
WHERE CONTAINS(POINT('ICRS',gaiadr1.gaia_source.ra,gaiadr1.gaia_source.dec),CIRCLE('ICRS',56.75,24.1167,2))=1 \
AND abs(pmra_error/pmra)<0.10 \
AND abs(pmdec_error/pmdec)<0.10 \
AND pmra IS NOT NULL AND abs(pmra)>0 \
AND pmdec IS NOT NULL AND abs(pmdec)>0 \
AND pmra BETWEEN 15 AND 25 \
AND pmdec BETWEEN -55 AND -40;", dump_to_file=True)
 
Launched query: 'SELECT * FROM gaiadr1.gaia_source WHERE CONTAINS(POINT('ICRS',gaiadr1.gaia_source.ra,gaiadr1.gaia_source.dec),CIRCLE('ICRS',56.75,24.1167,2))=1 AND abs(pmra_error/pmra)<0.10 AND abs(pmdec_error/pmdec)<0.10 AND pmra IS NOT NULL AND abs(pmra)>0 AND pmdec IS NOT NULL AND abs(pmdec)>0 AND pmra BETWEEN 15 AND 25 AND pmdec BETWEEN -55 AND -40;'
Retrieving async. results...
 

Again, inspect the output table and number of rows, and call the job 'm45cluster':

In [11]:
m45cluster = job3.get_results()
print (m45cluster['parallax'])
 
     parallax
    Angle[mas]
------------------
7.4545648282310184
7.5239065408350996
6.9301093717323772
7.5788921365825894
 7.328286272889863
7.0233646883680159
8.1389671643305235
8.1445555234101779
7.3648468313130566
7.1694418609395765
               ...
7.2826513541337299
7.4681735932280162
 7.658985698496882
7.2023395286122085
8.2920957945484393
7.1461398221089798
7.8361028050064272
7.8762955633015945
 7.066701274479442
7.3330438493994405
Length = 106 rows
 

Plot these new filtered results on the same plot as the previous search:

In [12]:
plt.scatter(r['pmra'], r['pmdec'], color='r', alpha=0.3)
plt.scatter(j['pmra'], j['pmdec'], color='b', alpha=0.3)
plt.scatter(m45cluster['pmra'], m45cluster['pmdec'], color='g', alpha=0.3)
plt.xlim(-60,80)
plt.ylim(-120,30)

plt.show()
 
 
 

Calculate the average parallax and standard deviation of the parallax for the M45 cluster candidates:

In [13]:
avg_parallax = np.mean(m45cluster['parallax'])
stddev_parallax = np.std(m45cluster['parallax'])
print (avg_parallax, stddev_parallax)
 
7.4686695575 0.834822732559
 

Now, we want to add information from other catalogues, in this example from 2MASS. To do this we make use of the pre-computed cross-matched tables provided in the Gaia archive.

We obtain the 2MASS photometric data by using the Gaia - 2MASS cross-matched best neighbour table (gaiadr1.tmass_best_neighbour) to identify the sources and the 2MASS original table (gaiadr1.tmass_original_valid) to retrieve the photometry:

In [14]:
job4 = Gaia.launch_job_async("SELECT * \
FROM gaiadr1.gaia_source AS g, gaiadr1.tmass_best_neighbour AS tbest, gaiadr1.tmass_original_valid AS tmass \
WHERE g.source_id = tbest.source_id AND tbest.tmass_oid = tmass.tmass_oid \
AND CONTAINS(POINT('ICRS',g.ra,g.dec),CIRCLE('ICRS',56.75,24.1167,2))=1 \
AND abs(pmra_error/pmra)<0.10 \
AND abs(pmdec_error/pmdec)<0.10 \
AND pmra IS NOT NULL AND abs(pmra)>0 \
AND pmdec IS NOT NULL AND abs(pmdec)>0 \
AND pmra BETWEEN 15 AND 25 \
AND pmdec BETWEEN -55 AND -40;", dump_to_file=False)
 
Launched query: 'SELECT * FROM gaiadr1.gaia_source AS g, gaiadr1.tmass_best_neighbour AS tbest, gaiadr1.tmass_original_valid AS tmass WHERE g.source_id = tbest.source_id AND tbest.tmass_oid = tmass.tmass_oid AND CONTAINS(POINT('ICRS',g.ra,g.dec),CIRCLE('ICRS',56.75,24.1167,2))=1 AND abs(pmra_error/pmra)<0.10 AND abs(pmdec_error/pmdec)<0.10 AND pmra IS NOT NULL AND abs(pmra)>0 AND pmdec IS NOT NULL AND abs(pmdec)>0 AND pmra BETWEEN 15 AND 25 AND pmdec BETWEEN -55 AND -40;'
Retrieving async. results...
Query finished.
 

Finally, confirm the output table has Gaia and 2MASS photometry and check the number of rows in the table:

In [15]:
p = job4.get_results()
print (p['phot_g_mean_mag', 'j_m', 'h_m', 'ks_m'])
 
 phot_g_mean_mag        j_m            h_m            ks_m
  Magnitude[mag]   Magnitude[mag] Magnitude[mag] Magnitude[mag]
------------------ -------------- -------------- --------------
10.757800849005008      9.6029997      9.2019997      9.0939999
6.0733019438444877      5.9679999      6.0510001      5.9759998
10.643169127592188      9.5349998      9.2189999      9.1370001
6.8275023878511796      6.6989999      6.7329998      6.6919999
 10.15654586177056      9.1169996          8.868      8.7580004
7.5371472521505787      7.2800002           7.29          7.257
8.1328652600027205      7.5879998      7.5279999      7.4699998
8.5268119167756886      7.8920002      7.7670002      7.7379999
9.0234235063721506      8.2360001          8.033      8.0019999
7.0246160054011426          6.848      6.9200001          6.895
               ...            ...            ...            ...
10.511033521483427      9.4770002      9.1400003      9.0719995
11.597470392962894         10.261          9.835      9.7299995
10.591447284586406      9.5310001      9.1920004      9.1230001
10.325731863675678      9.3500004          9.092      8.9960003
9.9103711452615997      8.9829998      8.7110004          8.632
9.3335117945827335          8.533      8.3290005      8.2819996
8.1126755868950937      7.6729999          7.599      7.5760002
6.3275477760376688      6.2280002      6.2480001          6.257
8.1878498252772722          7.526      7.3930001      7.3520002
11.149659997470195      9.9490004      9.5640001      9.4390001
Length = 106 rows
 

All of the above has been performed as an anonymous user to the Gaia archive. To log in to the archive, keep and share your results, see the following instructions: http://astroquery.readthedocs.io/en/latest/gaia/gaia.html#authenticated-access

 

Additional information

The above query to obtain the 2MASS catalogue data can also be performed by using an 'INNER JOIN' in the ADQL query. For example:

In [16]:
job5 = Gaia.launch_job_async("SELECT * \
FROM gaiadr1.gaia_source \
INNER JOIN gaiadr1.tmass_best_neighbour ON gaiadr1.gaia_source.source_id = gaiadr1.tmass_best_neighbour.source_id \
INNER JOIN gaiadr1.tmass_original_valid ON gaiadr1.tmass_original_valid.tmass_oid = gaiadr1.tmass_best_neighbour.tmass_oid \
WHERE CONTAINS(POINT('ICRS',gaiadr1.gaia_source.ra,gaiadr1.gaia_source.dec),CIRCLE('ICRS',56.75,24.1167,2))=1 \
AND abs(pmra_error/pmra)<0.10 \
AND abs(pmdec_error/pmdec)<0.10 \
AND pmra IS NOT NULL AND abs(pmra)>0 \
AND pmdec IS NOT NULL AND abs(pmdec)>0 \
AND pmra BETWEEN 15 AND 25 \
AND pmdec BETWEEN -55 AND -40;", dump_to_file=True)
 
Launched query: 'SELECT * FROM gaiadr1.gaia_source INNER JOIN gaiadr1.tmass_best_neighbour ON gaiadr1.gaia_source.source_id = gaiadr1.tmass_best_neighbour.source_id INNER JOIN gaiadr1.tmass_original_valid ON gaiadr1.tmass_original_valid.tmass_oid = gaiadr1.tmass_best_neighbour.tmass_oid WHERE CONTAINS(POINT('ICRS',gaiadr1.gaia_source.ra,gaiadr1.gaia_source.dec),CIRCLE('ICRS',56.75,24.1167,2))=1 AND abs(pmra_error/pmra)<0.10 AND abs(pmdec_error/pmdec)<0.10 AND pmra IS NOT NULL AND abs(pmra)>0 AND pmdec IS NOT NULL AND abs(pmdec)>0 AND pmra BETWEEN 15 AND 25 AND pmdec BETWEEN -55 AND -40;'
Retrieving async. results...
 

Confirm the output table has Gaia and 2MASS photometry and check the number of rows in the table is the same as above (106 rows):

In [17]:
test = job5.get_results()
print (test['phot_g_mean_mag', 'j_m', 'h_m', 'ks_m'])
 
 phot_g_mean_mag        j_m            h_m            ks_m
  Magnitude[mag]   Magnitude[mag] Magnitude[mag] Magnitude[mag]
------------------ -------------- -------------- --------------
10.757800849005008      9.6029997      9.2019997      9.0939999
6.0733019438444877      5.9679999      6.0510001      5.9759998
10.643169127592188      9.5349998      9.2189999      9.1370001
6.8275023878511796      6.6989999      6.7329998      6.6919999
 10.15654586177056      9.1169996          8.868      8.7580004
7.5371472521505787      7.2800002           7.29          7.257
8.1328652600027205      7.5879998      7.5279999      7.4699998
8.5268119167756886      7.8920002      7.7670002      7.7379999
9.0234235063721506      8.2360001          8.033      8.0019999
7.0246160054011426          6.848      6.9200001          6.895
               ...            ...            ...            ...
10.511033521483427      9.4770002      9.1400003      9.0719995
11.597470392962894         10.261          9.835      9.7299995
10.591447284586406      9.5310001      9.1920004      9.1230001
10.325731863675678      9.3500004          9.092      8.9960003
9.9103711452615997      8.9829998      8.7110004          8.632
9.3335117945827335          8.533      8.3290005      8.2819996
8.1126755868950937      7.6729999          7.599      7.5760002
6.3275477760376688      6.2280002      6.2480001          6.257
8.1878498252772722          7.526      7.3930001      7.3520002
11.149659997470195      9.9490004      9.5640001      9.4390001
Length = 106 rows
 

Visually check that the results are the same by creating the same plot as above:

In [18]:
plt.scatter(r['pmra'], r['pmdec'], color='r', alpha=0.3)
plt.scatter(j['pmra'], j['pmdec'], color='b', alpha=0.3)
plt.scatter(m45cluster['pmra'], m45cluster['pmdec'], color='g', alpha=0.3)
plt.scatter(test['pmra'], test['pmdec'], color='y', alpha=0.3)
plt.xlim(-60,80)
plt.ylim(-120,30)

plt.show()
 

 

White Dwarfs Exploration

Authors: Jesús Salgado, Juan-Carlos Segovia

This tutorial is focused on a possible scientific exploration exercise using the Gaia Archive. Realistic science use cases created by users are very welcome and can be shared in this section with the proper reference/contact point.

We are going to explore white dwarfs observed by Gaia, starting from a well-known white dwarf catalogue available at VizieR:

  1. Go to Vizier to:
    IR photometry of 2MASS/Spitzer white dwarfs
    http://vizier.u-strasbg.fr/viz-bin/VizieR-3?-source=J/ApJ/657/1013&-out.max=50&-out.form=HTML%20Table&-out.add=_r&-out.add=_RAJ,_DEJ&-sort=_r&-oc.form=sexa

  2. Download the catalogue by:
    • Select in Preferences: max=unlimited, output format = VOTable.
    • Click on Submit.
    • A file called vizier_votable.vot will be created.
  3. Open in a different tab the Gaia Archive:
    https://archives.esac.esa.int/gaia/

  4. Log in to the Gaia Archive.

  5. Click on SEARCH Tab and, inside it, the Advanced ADQL Form.

  6. Click on the upload table button

    and select the vizier_votable.vot VOTable.

  7. Use "dwarfs" as the table name and click on the Upload button. As some of the column names of the downloaded VOTable are not compatible with TAP/ADQL, the system will automatically rename them, showing the following notice:

  8. Select the uploaded table user_<your_login_name>.dwarfs (under 'User tables') and click on the edit table button:


  9. For column col_raj2000 select the flag Ra, and for column col_dej2000 select the flag Dec. Then click UPDATE. The table icon in the tree will change to a "Positional indexed table".


  10. Inspect the table content inside Gaia Archive. Type in the form at the top of the page:

    select top 100 * from user_<your_user_name>.dwarfs

    where <your_user_name> is your own username, and click 'Submit Query'. A new job will be executed and, when finished, the resulting table can be inspected by clicking on the "Display top 2000 results" button:


    Now, we want to obtain the Gaia catalogue data for these sources. In order to do that, counterparts must be found by executing a crossmatch operation. A positional crossmatch (identification by position) will be performed. More complex algorithms will be offered in future versions of the Gaia Archive.

  11. Click on the crossmatch button:


  12. Select as Table A: public.igsl_source and as Table B: user_<your_user_name>.dwarfs with a radius of 1 arcsecond

    Note: IGSL is a compilation catalogue built from other external catalogues. It has the size of the expected final Gaia catalogue and includes synthetic photometry in the Gaia G band. The calculation of this photometry can fail for peculiar objects. See more info on IGSL at: http://www.cosmos.esa.int/web/gaia/iow_20131008

  13. Execute the crossmatch. A new job will start. At the end of the execution, a new join table (called xmatch_igsl_source_dwarfs by default) will be created linking the IGSL and "dwarfs" catalogues.
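The 1 arcsecond radius above is a plain angular-separation criterion. The following Python sketch is purely illustrative (it is not the Archive's actual implementation) and shows how such a positional match test can be written:

```python
import math

def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation between two sky positions (degrees in, arcseconds
    out), using the haversine formula, which is stable for small angles."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600

def is_match(ra1, dec1, ra2, dec2, radius_arcsec=1.0):
    """Positional match criterion: separation within the search radius."""
    return angular_separation_arcsec(ra1, dec1, ra2, dec2) <= radius_arcsec

# Two positions 0.5 arcsec apart in declination match at a 1 arcsec radius
print(is_match(150.0, -30.0, 150.0, -30.0 + 0.5 / 3600))  # True
```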

  14. When finished, click on the "Show join query" button


    This query is an example of how to retrieve all the metadata of the two catalogues. Reduce the number of columns returned by replacing the SELECT part of the ADQL statement as follows

    SELECT a."source_id", b."dwarfs_oid", b."name", a."ra", a."dec", b."col_raj2000",  b."col_dej2000",
    a."mag_g", b."f_hmagc", b."f_jmagc", b."f_kmagc", b."hmag2", b."hmagc", b."jmag2",
    b."jmagc", b."kmagc", b."ksmag2"

    Note: the FROM and WHERE clauses must be preserved.
    This query returns the identifiers of the sources in both catalogues (to explore possible duplications), the names of the dwarf stars and magnitudes from both catalogues.

    Click on "Submit Query" to launch a new job.

  15. Open the VO application Topcat:
    http://www.star.bris.ac.uk/~mbt/topcat/topcat-lite.jnlp

  16. Download results and open them in Topcat.

    Click on Download button


    Open results with Topcat.

  17. Click on the "Plane Plotting Window" button


  18. Click on the "Add a new positional control to the stack" button to create three plots:

    • hmagc versus kmagc
    • jmagc versus kmagc
    • jmagc versus mag_g

    The result should be as follows


    Most of the points fall on a clear line, except for a few above it and two clearly below it. This could imply that these sources emit more in the G band than expected (as the G band is synthetic for IGSL, this is not fully clear) or that the crossmatch is not fully correct for these sources.
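One way to make this visual outlier selection more objective is to fit a straight line to a magnitude pair and flag points with large residuals. The sketch below is purely illustrative (toy data, simple least squares); in practice the magnitudes would come from the downloaded crossmatch result:

```python
# Purely illustrative outlier flagging: fit a straight line by least squares
# and mark points whose residual exceeds a few times the RMS scatter.
# The data below are made up; real values come from the crossmatch result.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def flag_outliers(xs, ys, nsigma=2.5):
    """True for points deviating more than nsigma times the RMS residual."""
    a, b = fit_line(xs, ys)
    residuals = [y - (a * x + b) for x, y in zip(xs, ys)]
    rms = (sum(r * r for r in residuals) / len(residuals)) ** 0.5
    return [abs(r) > nsigma * rms for r in residuals]

xs = list(range(10))
ys = [2.0 * x for x in xs]  # points on the line y = 2x
ys[5] += 10.0               # one point well off the trend
print([i for i, out in enumerate(flag_outliers(xs, ys)) if out])  # [5]
```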

    Click on the "Display table cell data" button on the main Topcat window.


  19. Click, one at a time, on the two strange sources in the Plane Plot. Once you click on a source, the two plots will be synchronized


  20. Checking the name column in the Table browser, the two sources are:

    • LTT 4816
    • L745-46A

     

    The first object has been identified as a pulsating white dwarf and the second as a simple white dwarf.

    The analysis of the result could suggest a failure in the calculation of the synthetic G magnitude for these objects or some peculiarity in their emission.

    http://simbad.u-strasbg.fr/simbad/sim-id?Ident=LTT%204816

Variable sources (DR1)

Author: Alcione Mora

This tutorial was developed for Data Release 1 and is kept for legacy (DR1 old data still available). The variable stars data model has evolved substantially, and the light curves are not accessible via TAP anymore, but through the DataLink and Massive data services. Please take a look at the DataLink and light curves tutorial if interested in epoch photometry.


Gaia DR1 contains information on a selection of pulsating variables, mostly in the Large Magellanic Cloud, in addition to the other deliverables, mainly the gaia_source table, which contains the astrometry and average photometry for 1.14 billion sources in the sky.

This is an intermediate level tutorial that assumes a basic knowledge of the general interface and workflow. The introductory tutorials White dwarfs exploration and Cluster analysis are recommended in case of difficulties following this exercise.

Variable sources

The main references dealing with the analysis of variable star light curves in DR1 are Eyer et al. 2017 and Clementini et al. 2016 A&A 595A, 133C. The former provides an overview of all the variability analyses carried out, while the latter focuses on the Cepheids and RR Lyrae pipeline and the results finally published in DR1. The Clementini et al. 2016 abstract is reproduced below.

Context. The European Space Agency spacecraft Gaia is expected to observe about 10,000 Galactic Cepheids and over 100,000 Milky Way RR Lyrae stars (a large fraction of which will be new discoveries), during the five-year nominal lifetime spent scanning the whole sky to a faint limit of G = 20.7 mag, sampling their light variation on average about 70 times.

Aims. We present an overview of the Specific Objects Study (SOS) pipeline developed within the Coordination Unit 7 (CU7) of the Data Processing and Analysis Consortium (DPAC), the coordination unit charged with the processing and analysis of variable sources observed by Gaia, to validate and fully characterise Cepheids and RR Lyrae stars observed by the spacecraft. The algorithms developed to classify and extract information such as the pulsation period, mode of pulsation, mean magnitude, peak-to-peak amplitude of the light variation, subclassification in type, multiplicity, secondary periodicities, and light curve Fourier decomposition parameters, as well as physical parameters such as mass, metallicity, reddening, and age (for classical Cepheids) are briefly described.

Methods. The full chain of the CU7 pipeline was run on the time series photometry collected by Gaia during 28 days of ecliptic pole scanning law (EPSL) and over a year of nominal scanning law (NSL), starting from the general Variability Detection, general Characterization, proceeding through the global Classification and ending with the detailed checks and typecasting of the SOS for Cepheids and RR Lyrae stars (SOS Cep&RRL). We describe in more detail how the SOS Cep&RRL pipeline was specifically tailored to analyse Gaia's G-band photometric time series with a south ecliptic pole (SEP) footprint, which covers an external region of the Large Magellanic Cloud (LMC), and to produce results for confirmed RR Lyrae stars and Cepheids to be published in Gaia Data Release 1 (Gaia DR1).

Results. G-band time series photometry and characterisation by the SOS Cep&RRL pipeline (mean magnitude and pulsation characteristics) are published in Gaia DR1 for a total sample of 3194 variable stars (599 Cepheids and 2595 RR Lyrae stars), of which 386 (43 Cepheids and 343 RR Lyrae stars) are new discoveries by Gaia. All 3194 stars are distributed over an area extending 38 degrees on either side from a point offset from the centre of the LMC by about 3 degrees to the north and 4 degrees to the east. The vast majority are located within the LMC. The published sample also includes a few bright RR Lyrae stars that trace the outer halo of the Milky Way in front of the LMC.

Gaia Archive tables

The variability data are distributed among the following tables:

  • variable_summary. It contains a list of the stars classified in gaia_source as variables, together with the first fundamental frequency and best classification.
  • cepheid. Additional fit parameters for Cepheids, including best sub-class and light curve Fourier analysis (period, peak-to-peak amplitude, first to second harmonic ratio, ...)
  • rrlyrae. Similar results, but for RR Lyrae.
  • phot_variable_time_series_gfov. The G-band light curves: fluxes, errors and magnitudes as a function of time.
  • phot_variable_time_series_gfov_statistical_parameters. Basic statistical analysis of each light curve: number of points, first to fourth moments, minimum, maximum, median, ...

In addition, gaia_source has the specific field phot_variable_flag set to 'VARIABLE' for all variable stars in this data release.

Getting summary data

The following queries show how to retrieve a basic summary of the first 10 Cepheids and RR Lyrae stars in the archive. Note that the contents of four different tables are joined to provide a full overview.

Cepheids
SELECT TOP 10
  gaia.source_id, gaia.ra, gaia.dec, gaia.parallax,
  variable.classification, variable.phot_variable_fundam_freq1,
  phot_stats.mean,
  cepheid.peak_to_peak_g, cepheid.num_harmonics_for_p1, cepheid.r21_g, cepheid.phi21_g,
  cepheid.type_best_classification, cepheid.type2_best_sub_classification, cepheid.mode_best_classification
FROM gaiadr1.gaia_source AS gaia
INNER JOIN gaiadr1.variable_summary AS variable
  ON gaia.source_id = variable.source_id
INNER JOIN gaiadr1.phot_variable_time_series_gfov_statistical_parameters AS phot_stats
  ON gaia.source_id = phot_stats.source_id
INNER JOIN gaiadr1.cepheid AS cepheid
  ON gaia.source_id = cepheid.source_id
RR Lyrae
SELECT TOP 10
  gaia.source_id, gaia.ra, gaia.dec, gaia.parallax,
  variable.classification, variable.phot_variable_fundam_freq1,
  phot_stats.mean,
  rrlyrae.peak_to_peak_g, rrlyrae.num_harmonics_for_p1, rrlyrae.r21_g, rrlyrae.phi21_g,
  rrlyrae.best_classification
FROM gaiadr1.gaia_source AS gaia
INNER JOIN gaiadr1.variable_summary AS variable
  ON gaia.source_id = variable.source_id
INNER JOIN gaiadr1.phot_variable_time_series_gfov_statistical_parameters AS phot_stats
  ON gaia.source_id = phot_stats.source_id
INNER JOIN gaiadr1.rrlyrae AS rrlyrae
  ON gaia.source_id = rrlyrae.source_id

The results for the RR Lyrae query are summarised below:

 

 

 

Light curve reconstruction and folding

G-band light curves are only provided for objects classified as variable stars in DR1. They are included in table phot_variable_time_series_gfov, which contains one row per star and observation time. The main fields are:

  • source_id. The source identifier.
  • observation_time. The time scale is TCB, measured in Julian days. The zero point is 2010-01-01T00:00:00.
  • g_flux, g_error. Units are electrons per second.
  • g_magnitude. Vega scale. Converted from g_flux using the zero points in table ext_phot_zero_point.
  • rejected_by_variability_processing. Identified outliers.

Magnitude errors are not provided, because they cannot be easily quantified with a single number for low signal-to-noise fluxes. In the high signal-to-noise regime, the following approximate relation can be used.

ΔG ≈ 2.5/log(10) * Δf/f

where G is the magnitude and f the flux. Note that using fluxes instead of magnitudes is recommended whenever precise analyses are required (e.g. photometric system cross-calibration, spectral energy distribution construction, ...).
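The approximate relation above can be coded directly. A minimal Python sketch, valid only in the high signal-to-noise regime:

```python
import math

def g_magnitude_error(g_flux, g_flux_error):
    """High signal-to-noise approximation: dG = (2.5 / ln 10) * (df / f)."""
    return 2.5 / math.log(10) * g_flux_error / g_flux

# A 1% relative flux error corresponds to about 0.011 mag
print(round(g_magnitude_error(1000.0, 10.0), 4))  # 0.0109
```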

The following query illustrates how to retrieve the light curve in magnitudes of a given variable: the RR Lyrae with source_id = 5284240582308398080 in this example.

SELECT
  curves.observation_time,
  mod(curves.observation_time - rrlyrae.epoch_g, rrlyrae.p1)/ rrlyrae.p1 as phase,
  curves.g_magnitude,
  2.5/log(10)* curves.g_flux_error/ curves.g_flux
     AS g_magnitude_error,
  rejected_by_variability_processing AS rejected
FROM gaiadr1.phot_variable_time_series_gfov AS curves
INNER JOIN gaiadr1.rrlyrae AS rrlyrae
  ON rrlyrae.source_id = curves.source_id
WHERE rrlyrae.source_id = 5284240582308398080

The output contains the time (TCB), the phase, the G-band magnitude with its estimated error, and a flag indicating whether the point was rejected by the variability processing. The phase is estimated by folding the time using the best-fit period. The origin is taken at the epoch of maximum flux in the fitted harmonic model.
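The phase computation in the query above can be mirrored in Python. A minimal sketch (note that Python's % operator always wraps into [0, p1), whereas the behaviour of the ADQL mod for times before the reference epoch may depend on the implementation):

```python
def phase(observation_time, epoch_g, p1):
    """Fold a TCB observation time into pulsation phase [0, 1), mirroring
    mod(observation_time - epoch_g, p1) / p1 from the ADQL query."""
    return ((observation_time - epoch_g) % p1) / p1

print(phase(2.5, 2.0, 1.0))  # 0.5 (half a period after the epoch)
print(phase(1.5, 2.0, 1.0))  # 0.5 (half a period before the epoch)
```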

The data can then be exported for further use. Topcat plots of the unfolded and folded light curves with error bars are provided below.

 

Miscellaneous plots from Clementini et al. 2016

Additional ADQL queries are provided to reproduce some of the plots (using Topcat) originally included in Clementini et al. (2016).

Fig. 28. Histogram of RR Lyrae periods (all Gaia sources)
SELECT floor(p1 * 50) / 50 AS period, count(*) AS n
FROM gaiadr1.rrlyrae
GROUP BY period
ORDER BY period
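The floor(p1 * 50) / 50 expression bins each period into a 0.02-day bin, labelled by its lower edge. A Python sketch of the same binning, using made-up periods rather than real catalogue values:

```python
import math
from collections import Counter

def period_bin(p1, bins_per_day=50):
    """Lower edge of the 0.02-day bin containing period p1, replicating the
    ADQL expression floor(p1 * 50) / 50."""
    return math.floor(p1 * bins_per_day) / bins_per_day

# Toy periods in days; real values come from gaiadr1.rrlyrae.p1
periods = [0.512, 0.517, 0.533, 0.561, 0.568]
histogram = Counter(period_bin(p) for p in periods)
print(sorted(histogram.items()))  # [(0.5, 2), (0.52, 1), (0.56, 2)]
```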

 

Fig. 30 top panel. RR Lyrae period – G-band amplitude diagram
SELECT p1, peak_to_peak_g, best_classification
FROM gaiadr1.rrlyrae

Two exclusive subsets are created based on the best_classification column.
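In Topcat these subsets are defined interactively. The same partition can be sketched in Python; the classification labels below are illustrative, not necessarily the exact strings stored in the best_classification column:

```python
# Split query rows into exclusive subsets by classification label.
# The rows and the labels 'RRab' / 'RRc' are made-up examples.
rows = [
    {"p1": 0.55, "peak_to_peak_g": 0.9, "best_classification": "RRab"},
    {"p1": 0.31, "peak_to_peak_g": 0.4, "best_classification": "RRc"},
    {"p1": 0.60, "peak_to_peak_g": 1.1, "best_classification": "RRab"},
]

subsets = {}
for row in rows:
    subsets.setdefault(row["best_classification"], []).append(row)

print({label: len(members) for label, members in subsets.items()})
# {'RRab': 2, 'RRc': 1}
```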

 

Fig. 34 top panel. Cepheids period-luminosity diagram
SELECT p1, int_average_g,
type_best_classification, type2_best_sub_classification, mode_best_classification
FROM gaiadr1.cepheid

 

Topcat views of the query result shown as a table and the corresponding subset definition used in the plot above are included below.
