Demos & Tutorials - Gaia Users
Help support: Should you have any questions, please check the Gaia FAQ section or contact the Gaia Helpdesk.
Graphical User Interface
The Graphical User Interface (GUI) of the ESA Gaia Archive offers the possibility to carry out basic and advanced queries using ADQL (Astronomical Data Query Language). The outputs of these queries can be easily downloaded in a variety of formats, including VOTable, FITS and CSV. Below you can find a brief description of the main features offered by the Archive, as well as two video tutorials explaining how to use it.
Graphical User Interface main page.
Basic query form. This form allows you to easily search for data in all the catalogues hosted by the Archive. Restrictions can be added to the query using the 'Extra conditions' wizard. The output fields can be selected by means of the 'Display columns' option panel.
Advanced (ADQL) query form. This form allows you to execute ADQL queries. Each query generates a job on the server side. The jobs executed by the user can be inspected in the list provided on this page. All the public tables and the user-uploaded tables are visible on the left side of the page.
Query Results. The output of the queries is displayed in this window. The ADQL query that generated these results can be inspected by clicking on the 'Show query in ADQL form' link.
Video tutorial: How to use the Archive. Author: Deborah Baines
Video tutorial: How to use the simple form of the Archive. Author: Alcione Mora
Tutorial: Basic queries
Authors: Héctor Cánovas & Alcione Mora
This tutorial was developed for Early Data Release 3. A new version might be available in the future. Even if the interface has evolved from EDR3, many changes may be cosmetic and this introduction could still be valuable for newcomers.
The main function of the Gaia Archive is to provide data to astronomers. The Search tab in the GUI landing page provides two different ways of accessing the Archive: Basic (default option) and Advanced (ADQL) queries. The main objectives of the Basic tab are:
- To ease the exploration of the Archive catalogues for simple use cases, and
- To help users in the transition towards the Advanced (ADQL) tab for complex use cases.
To that end, the Basic tab allows you to perform, in a very simple and intuitive manner, two of the most common operations executed when exploring an astronomy archive. These operations are the ADQL cone search, which searches by coordinates for one or more sources in a given catalogue, and an ADQL query that retrieves all the sources encompassed by a circular or box-like region in the projected sky.
All Basic queries are synchronous, which means that they will time out after 60 and 90 seconds for non-registered and registered users, respectively (see this FAQ). Furthermore, the output of these queries is limited to 2000 sources. Therefore, we recommend using the Advanced (ADQL) tab to execute complex and/or long queries. Through this tutorial you will learn to use the Basic tab to:
- Retrieve data for a single source.
- Retrieve data for multiple sources.
- Search for data inside a sky region.
- Understand the query results.
1. Single source data retrieval
The most basic use case could be formulated as "I want the most relevant Gaia results for a single object". It can be accomplished using the Basic > Position subtab. The first step is to fill in the "Name" box with either the object identifier (e.g. "UX Ori") or its ICRS coordinates. The accepted input formats are described in the pop-up window that appears when clicking on the "Name" tooltip (see Fig. 1). The single object search launches an ADQL cone search around the celestial coordinates of the input object, which are provided by the Sesame name resolver. The drop-down menu highlighted by the thin solid circle in Fig. 1 allows you to choose the service that will be queried to obtain the object coordinates (by default the system tries Simbad, then NED, and then Vizier). Once the name or coordinates are successfully resolved, the object box turns green and a confirmation message is shown. The cone search radius can be tuned using the "Radius" box, and its units can be adjusted using its associated drop-down menu (highlighted by the dashed circle in Fig. 1).
Figure 1: Content of the Basic > Position subtab (single source resolver). The vertical arrows highlight the tooltips with explanatory text, while the circles and the horizontal arrow highlight the drop-down menus and extra options available to customize the query, respectively.
The cone search is centred around the coordinates provided by the name resolver. These coordinates are propagated to the target catalogue epoch if the proper motions of the object are known.
The next step is to choose the catalogue that is going to be explored. The latest Gaia data release is the default one, but all the catalogues hosted by the Archive (e.g., previous Gaia data releases, external catalogues) containing geometric information in the form of celestial coordinates can be explored by clicking on the drop-down menu highlighted by the thick circle in Fig. 1. Registered users can also access their user-uploaded tables, provided that these tables contain indexed celestial coordinates (see this tutorial). By default, only a few pre-selected columns of the chosen catalogue are shown in the query outputs. Therefore, you may want to verify that the output of your search will contain the columns that you are interested in. To do so, simply click on the "Display columns" menu (indicated by the dashed horizontal arrow in Fig. 1) and mark the columns that you want to retrieve.
Now you are ready to hit the "Submit Query" button. If you are interested in learning how your query is expressed in ADQL, you can hit the "Show Query" button. The query results are shown in the Query Results tab, whose contents are explained in the Understanding the query results section below.
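As a rough illustration of what a cone search looks like once written in ADQL, the Python sketch below only builds the query text; it does not contact the Archive. It follows the CONTAINS/POINT/CIRCLE pattern shown in the ADQL syntax part of this page. The helper names (radius_in_degrees, cone_search_adql) are hypothetical, the table name gaiaedr3.gaia_source and the default column list are assumptions, and the query actually generated by the Basic form may differ (use "Show Query" to inspect it).

```python
# Sketch only: builds the text of an ADQL cone search, nothing is sent to the Archive.
# Helper names, default table and default columns are assumptions for illustration.

UNITS_PER_DEGREE = {"deg": 1.0, "arcmin": 60.0, "arcsec": 3600.0}


def radius_in_degrees(value, unit):
    """Convert a radius given in 'deg', 'arcmin' or 'arcsec' to degrees."""
    return value / UNITS_PER_DEGREE[unit]


def cone_search_adql(ra_deg, dec_deg, radius, unit="arcsec",
                     table="gaiaedr3.gaia_source",
                     columns=("source_id", "ra", "dec")):
    """Return an ADQL cone search string centred on (ra_deg, dec_deg)."""
    radius_deg = radius_in_degrees(radius, unit)  # CIRCLE expects degrees
    return (
        f"SELECT {', '.join(columns)} FROM {table} "
        f"WHERE 1=CONTAINS(POINT('ICRS', ra, dec), "
        f"CIRCLE('ICRS', {ra_deg}, {dec_deg}, {radius_deg:.8f}))"
    )


if __name__ == "__main__":
    # 5 arc minutes around the example position used in the ADQL syntax section
    print(cone_search_adql(266.41683, -29.00781, 5, "arcmin"))
```

The unit conversion mirrors the role of the "Radius" units drop-down: the radius is always passed to CIRCLE in degrees, whatever unit was chosen in the form.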
2. Multiple source data retrieval
Three common use cases are:
- I want to look for the Gaia counterparts of my list of known objects,
- I want to look for the neighbours of my favourite source, and
- I want to look for the neighbours of my list of favourite sources.
2.1 Search for the Gaia counterparts of my source list
This popular use case can be accomplished using the Basic > File subtab of the GUI. First of all, you must prepare a single-column ASCII file (without header) with the names or ICRS coordinates of the objects you are interested in. The input file must be formatted as described in the pop-up window that appears when clicking on the "Select a file with Target names" tooltip (see Fig. 2). Second, select the file that you want to upload using the wizard that appears when clicking on the "Choose file" button. Be aware that the query output is limited to 2000 sources. If you aim to retrieve a larger dataset, you should upload your target list to your user space as explained in this tutorial, and then use the Advanced (ADQL) tab to perform a cross-match as explained in this other tutorial.
Figure 2: Content of the Basic > File subtab (multi source resolver). The arrow highlights the tooltip with explanatory text.
2.2 Search for the neighbours of my favourite source
Imagine that we want to retrieve all the sources from the Gaia EDR3 catalogue that, on-sky, are separated by less than 5 arc minutes from UX Ori. This can easily be achieved by simply updating the cone search radius units (using the drop-down menu highlighted by the red dashed circle in Fig. 1). This query outputs 195 sources, including several sources without parallax data. The Basic > Position subtab allows you to apply additional selection criteria, e.g. to filter out the targets with a parallax signal-to-noise ratio below a given threshold. To do so, simply click on the "Extra conditions" drop-down menu, as highlighted by the horizontal solid arrow in Fig. 1. A menu that allows you to apply and combine different filters will show up, as illustrated by Fig. 3. Applying the condition exemplified in Fig. 3 reduces the output of the previous query to 30 sources.
Figure 3: Content of the "Extra conditions" menu of the Basic > Position subtab. The arrow indicates how to select the operators that will be included in the query. The solid and dashed circles highlight the drop-down menus used to select the column to which the operator is applied and the filter combination, respectively.
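In ADQL terms, extra conditions amount to additional terms in the WHERE clause. The sketch below shows one possible way of appending such terms to an existing cone search string. The function name with_extra_conditions is hypothetical, and the threshold on parallax_over_error is an arbitrary illustrative value; it is not the condition shown in Fig. 3.

```python
# Sketch only: appends filter terms to an ADQL query that already has a WHERE clause.
# The function name and the example threshold are assumptions for illustration.

def with_extra_conditions(base_query, conditions, combine="AND"):
    """Return base_query restricted by the given conditions.

    conditions: list of ADQL boolean expressions (strings).
    combine:    how the conditions are combined with each other ('AND' or 'OR').
    """
    if combine not in ("AND", "OR"):
        raise ValueError("combine must be 'AND' or 'OR'")
    if not conditions:
        return base_query
    joined = f" {combine} ".join(conditions)
    # The combined filters always restrict the cone search itself, hence the outer AND.
    return f"{base_query} AND ({joined})"


if __name__ == "__main__":
    base = (
        "SELECT source_id, ra, dec, parallax FROM gaiaedr3.gaia_source "
        "WHERE 1=CONTAINS(POINT('ICRS', ra, dec), "
        "CIRCLE('ICRS', 266.41683, -29.00781, 0.08333333))"
    )
    # Hypothetical filter: keep sources with parallax signal-to-noise above 10
    print(with_extra_conditions(base, ["parallax_over_error > 10"]))
```

Wrapping the combined filters in parentheses keeps an OR combination from overriding the positional constraint.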
2.3 Search for the neighbours of my source list
This use case is a combination of the previous two cases. If you are interested in retrieving the neighbours of a pre-computed source list, you should 1) upload the target list as explained in Sect. 2.1 and 2) adjust the cone search radius as described in Sect. 2.2. As before, you should keep in mind that the output catalogue of the query is limited to 2000 sources, so you may want to apply selection criteria using the "Extra conditions" menu as described in Sect. 2.2.
3. Retrieve sources in a region of the sky
A cone search with an arbitrary radius can be generated around any point on the celestial sphere by entering the target coordinates in the "Name" box and adjusting the cone search radius as explained in the Single source data retrieval section. Alternatively, it is also possible to use the "Equatorial" button under the Basic > Position subtab (see Fig. 1) and choose the "Circle" option (see Fig. 4). The Basic > Position subtab also allows you to retrieve the sources encompassed by a box region in the projected sky. To do so, simply select the "Box" option, as illustrated by Fig. 4. The accepted formats of the input coordinates are described in the RA and Dec tooltips of the coordinate boxes. As with the previous options, it is also possible to add different selection criteria to filter the query output by means of the "Extra conditions" menu.
Figure 4: Content of the "Box" menu of the Basic > Position (Equatorial) subtab. The arrows indicate the tooltips with explanatory text.
A box is defined as a spherical quadrilateral delimited by great circle segments. This can produce counter-intuitive results, and requires extra care, when the area analysed is large or close to the celestial poles. The reason is that parallels (loci of constant declination) are not great circles, and are therefore not supported by ADQL.
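The following Python sketch illustrates why a great circle segment differs from a parallel. It takes two points at the same declination and computes the declination of the midpoint of the great circle arc joining them; away from the equator, the arc bulges towards the nearest pole. The function names are hypothetical and the code is purely geometric; it makes no claim about how the Archive evaluates boxes internally.

```python
import math


def unit_vector(ra_deg, dec_deg):
    """Cartesian unit vector for a position given in degrees."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra),
            math.cos(dec) * math.sin(ra),
            math.sin(dec))


def great_circle_midpoint_dec(ra1_deg, ra2_deg, dec_deg):
    """Declination (deg) of the midpoint of the great circle arc that joins
    two points sharing the same declination dec_deg."""
    a = unit_vector(ra1_deg, dec_deg)
    b = unit_vector(ra2_deg, dec_deg)
    # The normalised sum of two unit vectors points to the arc midpoint;
    # normalisation is not needed to obtain the angle.
    sx, sy, sz = a[0] + b[0], a[1] + b[1], a[2] + b[2]
    return math.degrees(math.atan2(sz, math.hypot(sx, sy)))


if __name__ == "__main__":
    for dec in (0.0, 30.0, 60.0, 85.0):
        mid = great_circle_midpoint_dec(-20.0, 20.0, dec)
        print(f"end points at dec={dec:5.1f} deg, arc midpoint at dec={mid:.3f} deg")
```

The offset between the two declinations grows with the RA extent of the box and with proximity to the poles, which is where the box edges depart most from lines of constant declination.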
4. Query Results
The query results are presented in tabular format in the Query Results subtab, which is automatically opened once the query is finished, as shown by Fig. 5. The columns provide units when available. Further details on the meaning of each field can be found in the "<Target Catalogue> Data Model" link accessible in the bottom part. Figure 5 shows the results of searching for the Gaia EDR3 counterparts of a list of sources (Sect. 2.1), and therefore the link points to the authoritative reference on the Gaia EDR3 contents. For this example only five columns ("source_id", "ra", "dec", "parallax", and "phot_g_mean_mag") were selected for display (using the "Display columns" menu as explained in Sect. 1). When the query consists of finding counterparts of an input source list in a given catalogue (as in this example), the output table contains additional columns that are automatically added by the Archive to help the user match the input targets to the query results. These columns are:
- "target_id": contains the input target name as provided by the user in the input target list.
- "target_id, target_ra, target_dec, target_parallax, target_pm_ra, target_pm_dec, target_radial_velocity": contain data provided by the Name resolver (see Sect. 1).
- "target_distance": contains the on-sky angular separation (in units of degrees) between the target coordinates provided by the Name resolver and the target coordinates of the selected catalogue (Gaia EDR3 in this example).
Figure 5: Query Results subtab (see text for detailed explanation). The arrows indicate the tooltips with explanatory text.
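For readers who want to cross-check an on-sky angular separation such as the one reported in "target_distance", the sketch below computes the separation in degrees between two positions with the haversine formula. This is a generic formula, not necessarily the one used by the Archive, and the function name is hypothetical.

```python
import math


def angular_separation_deg(ra1_deg, dec1_deg, ra2_deg, dec2_deg):
    """On-sky angular separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1_deg, dec1_deg, ra2_deg, dec2_deg))
    h = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    # min() guards against rounding slightly above 1 for antipodal points
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h))))


if __name__ == "__main__":
    sep = angular_separation_deg(10.0, 0.0, 10.0, 0.001)
    print(f"{sep:.6f} deg = {sep * 3600.0:.3f} arcsec")
```

Multiplying the result by 3600 converts it to arc seconds, which is often a more convenient unit for judging counterpart matches.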
Clicking on the "Show query in ADQL form" button will open the Advanced (ADQL) tab to show the ADQL query that has been launched. This utility can be helpful for non-expert users aiming to learn the ADQL query language. Finally, the query output can be downloaded using the "Download results" menu (indicated by a vertical arrow in Fig. 5). The format of the output file can be set by means of the drop-down menu highlighted by the red circle in Fig. 5.
ADQL syntax
ADQL (Astronomical Data Query Language) is based on SQL (Structured Query Language), which is a language to retrieve information from tables stored in a database.
(References: ADQL 2.0 and SQL 92)
A very concise introduction to ADQL is provided in the next sections. There are a number of tutorials and resources providing a gentler learning curve for newcomers, especially those without previous experience with SQL or relational databases. A small selection is provided below:
- The DPAC ADQL Cookbook
- GAVO ADQL reference card
- GAVO ADQL short course for Gaia
- The Gaia Archive White dwarfs exploration and Cluster analysis tutorials
1. Basic syntax
In ADQL/SQL you write queries to the database. A query is composed of the table columns you want to retrieve (the SELECT part), the table or tables that store the data (the FROM part) and the conditions to restrict the data you obtain (the WHERE part). E.g.
SELECT <columns> FROM <tables> WHERE <conditions>
So, if you want to obtain the Right Ascension and Declination of all items from the table gaia_source, you may write:
SELECT ra, dec FROM gaiadr1.gaia_source
- ra is the column name of Right Ascension in the gaia_source table.
- dec is the column name of Declination in the gaia_source table.
- gaiadr1 is the name of the database schema to which the gaia_source table belongs. It is good practice to add schema names to tables to avoid name clashes.
You probably also want to obtain the object identifier, so you can modify the query as follows:
SELECT source_id, ra, dec FROM gaiadr1.gaia_source
If you want to know all the column names associated with a table, you may use the GACS GUI, clicking on the plus sign next to a table name.
Or, in TAP+, you may obtain all the columns and descriptions of a table using the following syntax:
curl "https://gea.esac.esa.int/tap-server/tap/tables?tables=gaiadr1.gaia_source"
Now, suppose you are interested in a specific region (e.g. ra=266.41683, dec=-29.00781, radius=0.083333 degrees, i.e. 5 arc minutes) and want to restrict the results to that region. To do so, you may execute a 'cone search' to obtain all the objects whose ra, dec lie inside a cone (note that the CIRCLE radius is expressed in degrees):
SELECT source_id, ra, dec FROM gaiadr1.gaia_source
WHERE 1=CONTAINS(POINT('ICRS',ra,dec),
CIRCLE('ICRS',266.41683,-29.00781, 0.08333333))
(You may read the ADQL recommendation to obtain the list of functions that can be used.)
One way to create a complex query is to use the 'Simple Form' page to create the basic query graphically. Then press the 'Show query' button to display the query as ADQL in the 'ADQL Form', where you can modify it.
2. Selecting complex data
You are not restricted to retrieving plain columns only. You can obtain computed values too.
For instance, you may want to obtain the distance of each source to the centre of a specific region. Then, you may type:
SELECT source_id, ra, dec, DISTANCE(POINT('ICRS',ra,dec),
POINT('ICRS',266.41683,-29.00781)) AS dist
FROM gaiadr1.gaia_source
WHERE 1=CONTAINS(POINT('ICRS',ra,dec),CIRCLE('ICRS',266.41683,-29.00781, 0.08333333))
source_id, ra and dec are the Source Identifier, Right Ascension and Declination of each item of the gaia_source table.
'DISTANCE(POINT('ICRS',ra,dec), POINT('ICRS',266.41683,-29.00781)) AS dist' is a column created on the fly, named 'dist', that contains the distance of the item to the specified point.
3. Additional functions available
Apart from the standard ADQL functions, the ESA Gaia TAP+ service offers the following functions:
Table: Gaia TAP+ ADQL functions
Function | Return Type | Description | Example | Result
STDDEV(expression) | Numeric | Standard deviation function | STDDEV(column) |
GAIA_HEALPIX_INDEX(norder, source_id) | bigint | Returns the HEALPix index of the given source_id at the given norder | GAIA_HEALPIX_INDEX(4, 2060294888487267584) | 914
GREATEST(v1,v2[,v3,..,vn]) | Same as input | Greatest value among the given arguments | GREATEST(10.55, 9.12323, 11.2, 7.8) | 11.2
LEAST(v1,v2[,v3,..,vn]) | Same as input | Least value among the given arguments | LEAST(10.55, 9.12323, 11.2, 7.8) | 7.8
SIGN(x) | Integer | Sign of the argument (-1, 0, +1) | SIGN(-10.55) | -1
COALESCE(v1[,v2,v3,..,vn]) | Same as input | Returns the first argument that is not null. If all arguments are null, it returns null. | COALESCE(NULL, 1, 2) | 1
NULLIF(v1,v2) | Same as input | Returns a null value if v1 equals v2, otherwise it returns v1. | NULLIF(1, 1) | NULL
WIDTH_BUCKET(operand,min,max,buckets) | Integer | Returns the bucket number to which operand would be assigned in a histogram having 'buckets' equal-width buckets spanning the range min to max; returns 0 or buckets+1 for an input outside the range | WIDTH_BUCKET(5.35, 0.024, 10.06, 5) | 3
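To make the WIDTH_BUCKET description concrete, the following Python sketch emulates the behaviour described above for the case min < max. It is an emulation written for illustration, not the service's implementation, and it assumes the PostgreSQL convention that an operand equal to max falls outside the range.

```python
import math


def width_bucket(operand, low, high, buckets):
    """Emulate WIDTH_BUCKET(operand, min, max, buckets) for low < high.

    Buckets are numbered 1..buckets; values below the range give 0 and
    values at or above the upper bound give buckets + 1.
    """
    if operand < low:
        return 0
    if operand >= high:
        return buckets + 1
    fraction = (operand - low) / (high - low)  # position inside the range, in [0, 1)
    return int(math.floor(fraction * buckets)) + 1


if __name__ == "__main__":
    # Same arguments as the example row in the table above
    print(width_bucket(5.35, 0.024, 10.06, 5))
```

A typical use in a query is to build a histogram by grouping on the bucket number, e.g. binning magnitudes into equal-width intervals.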
Also, the ESA Gaia TAP+ service offers functions for array handling:
Table: Gaia TAP+ array handling functions
Function | Description | Example | Result
ARRAY_ELEMENT(array_column, index1 [,index2,...,indexN]) | Returns the requested element inside the array. Indexes begin at 1 | Given the array [4,5,6]: SELECT array_element(array,2) | 5
ARRAY_NDIMS(array_column) | Returns the number of dimensions of the array (integer) | Given the array [[1,2,3], [4,5,6]]: SELECT array_ndims(array) | 2
ARRAY_DIMS(array_column) | Returns a text representation of the array's dimensions. Initial and end index for each dimension are given | Given the array [[1,2,3], [4,5,6]]: SELECT array_dims(array) | [1:2][1:3]
ARRAY_LENGTH(array_column, index) | Returns the length of the requested array dimension (integer), 1 being the first index value | Given the array [1,2,3]: SELECT array_length(array,1) | 3
CARDINALITY(array_column) | Returns the total number of elements in the array (integer), or 0 if the array is empty | Given the array [[1,2],[3,4]]: SELECT cardinality(array) | 4
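The semantics of the array functions, in particular the 1-based indexing, can be mimicked with nested Python lists. The sketch below is only an emulation for rectangular arrays, written to reproduce the examples in the table; it is not how the service implements these functions.

```python
# Emulation of the array helpers on nested Python lists (rectangular arrays assumed).

def _shape(arr):
    """Length of each dimension, outermost first."""
    shape = []
    while isinstance(arr, list):
        shape.append(len(arr))
        arr = arr[0] if arr else None
    return shape


def array_element(arr, *indexes):
    """Element at the given 1-based indexes, one index per dimension."""
    for i in indexes:
        arr = arr[i - 1]  # convert 1-based (ADQL) to 0-based (Python)
    return arr


def array_ndims(arr):
    return len(_shape(arr))


def array_dims(arr):
    return "".join(f"[1:{n}]" for n in _shape(arr))


def array_length(arr, dimension):
    """Length of the requested dimension, 1 being the first dimension."""
    return _shape(arr)[dimension - 1]


def cardinality(arr):
    total = 1
    for n in _shape(arr):
        total *= n
    return total


if __name__ == "__main__":
    matrix = [[1, 2, 3], [4, 5, 6]]
    print(array_ndims(matrix), array_dims(matrix), cardinality(matrix))
    print(array_element(matrix, 2, 1))  # second row, first column
```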
4. ADQL mathematical functions
ADQL defines some mathematical functions. In addition, other mathematical functions have been defined in our service to support scientific queries.
Table: Mathematical ADQL functions
Function | Return Type | Description | Example | Result
ABS(x) | Same as input | Absolute value | ABS(-19.4) | 19.4
CBRT(x) | numeric | Cube root | CBRT(27.0) | 3
DEGREES(x) | numeric | Radians to degrees | DEGREES(0.5) | 28.64788975654116
DIV(y,x) | numeric | Integer quotient of y/x | DIV(9,4) | 2
EXP(x) | same as input | Exponential | EXP(1.0) | 2.718281828459045
FLOOR(x) | same as input | Nearest integer less than or equal to argument | FLOOR(-42.8) | -43
LOG(x) | same as input | Natural logarithm | LOG(2.0) | 0.6931471805599453
LOG(b,x) | numeric | Logarithm to base b | LOG(2.0, 64.0) | 6.0000000000
LOG10(x) | numeric | Base 10 logarithm | LOG10(100.0) | 2
MOD(y,x) | same as arguments | Remainder of y/x | MOD(9, 4) | 1
PI() | numeric | Pi constant | PI() | 3.141592653589793
POWER(x,y) | numeric | x raised to the power of y | POWER(9.0, 3.0) | 729
RADIANS(x) | numeric | Degrees to radians | RADIANS(45.0) | 0.7853981633974483
RAND(x) | numeric | Random number r in the range 0 <= r < 1 | RAND() |
ROUND(x,s) | numeric | Round x to s decimal places, where s is an integer | ROUND(45.2191,2) | 45.22
SQRT(x) | numeric | Square root | SQRT(2.0) | 1.414213562373095
TRUNCATE(x) | numeric | Truncate toward zero | TRUNCATE(48.8) | 48
TRUNCATE(x, s) | numeric | Truncate to s decimal places | TRUNCATE(48.8328, 3) | 48.832
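FLOOR and TRUNCATE give the same result for positive arguments but differ for negative ones, which is easy to overlook. The Python sketch below emulates the two behaviours described in the table; the truncate helper is a hypothetical stand-in written for illustration, and floating-point rounding may affect the last digits for some inputs.

```python
import math


def truncate(x, s=0):
    """Emulate TRUNCATE(x[, s]): drop digits beyond s decimal places, toward zero."""
    factor = 10 ** s
    return math.trunc(x * factor) / factor


if __name__ == "__main__":
    for value in (48.8, -42.8):
        print(f"x={value}: floor={math.floor(value)}, truncate={truncate(value)}")
    print(truncate(48.8328, 3))
```

For -42.8, rounding down moves away from zero while truncation moves toward zero, so the two functions differ by one.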
5. ADQL trigonometric functions
ADQL defines some trigonometric functions considered important for astronomical queries.
Table: Trigonometric ADQL functions
Function | Return Type | Description | Example | Result
ACOS(x) | Numeric | Inverse cosine or arc cosine | ACOS(0.12582) | 1.4446419701843678
ASIN(x) | Numeric | Inverse sine or arc sine | ASIN(0.12582) | 0.1261543566105288
ATAN(x) | Numeric | Inverse tangent or arc tangent | ATAN(10.28527) | 1.4738745386849255
ATAN2(x,y) | Numeric | Inverse tangent of x/y | ATAN2(10.28527,3.1) | 1.2780538751678443
COS(x) | Numeric | Cosine of x | COS(10.28527) | -0.6520645009291157
SIN(x) | Numeric | Sine of x | SIN(10.28527) | -0.7581634959743598
TAN(x) | Numeric | Tangent of x | TAN(10.28527) | 1.1627124232251034
COT(x) | Numeric | Cotangent of x | COT(0.785) | 1.000796644031489
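The tabulated values correspond to arguments and results expressed in radians, and ATAN2 takes the numerator as its first argument. As a quick cross-check, the sketch below recomputes the examples with Python's math module, which follows the same conventions; it is an independent check, not output from the Archive.

```python
import math

# (expression label, value recomputed in Python, value listed in the table)
examples = [
    ("ACOS(0.12582)", math.acos(0.12582), 1.4446419701843678),
    ("ASIN(0.12582)", math.asin(0.12582), 0.1261543566105288),
    ("ATAN(10.28527)", math.atan(10.28527), 1.4738745386849255),
    # atan2(first, second) is the inverse tangent of first/second
    ("ATAN2(10.28527,3.1)", math.atan2(10.28527, 3.1), 1.2780538751678443),
    ("COS(10.28527)", math.cos(10.28527), -0.6520645009291157),
    ("SIN(10.28527)", math.sin(10.28527), -0.7581634959743598),
    ("TAN(10.28527)", math.tan(10.28527), 1.1627124232251034),
    ("COT(0.785)", 1.0 / math.tan(0.785), 1.000796644031489),
]

if __name__ == "__main__":
    for label, computed, listed in examples:
        print(f"{label:22s} computed={computed:.10f} listed={listed:.10f}")
```

When working with catalogue angles in degrees, wrap them in RADIANS() before passing them to these functions, and convert results back with DEGREES() where needed.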
6. Data type casting ADQL functions
Some User Defined Functions have been implemented to allow values to be cast between different data types. The casting functions rely on the underlying PostgreSQL functions, so errors received during casting will be, in most cases, PostgreSQL errors.
Table: Data type casting ADQL functions
Function | Return Type | Description | Example | Result
TO_SMALLINT(x) | int2 | Conversion of valid values into smallint. From -2^15 to 2^15-1 | TO_SMALLINT(17.4) | 17
TO_INTEGER(x) | int4 | Conversion of valid values into integer. From -2^31 to 2^31-1 | TO_INTEGER(1713112213.4123) | 1713112213
TO_BIGINT(x) | int8 | Conversion of valid values into bigint. From -2^63 to 2^63-1 | TO_BIGINT(1713112213.4123) | 1713112213
TO_REAL(x) | float4 | Conversion of valid values into real. 8 decimal digits precision | TO_REAL(91323.1231) | 91323.125
TO_DOUBLE(x) | float8 | Conversion of valid values into double precision. 16 decimal digits precision | TO_DOUBLE(91321213.112212231) | 91321213.11221223
TO_CHAR(x) | char | Convert valid values into char data type | TO_CHAR(1123) | '1123'
TO_CHAR(v1, v2) | char | Convert valid values into char data type, following the format defined in v2. For a full list of valid formats, check Formats | TO_CHAR(-125.8, '999D99S') | '125.80-'
TO_BOOLEAN(v1) | boolean | Convert valid values into boolean data type | TO_BOOLEAN(1) | true
7. ADQL extension: conditional expressions
Some conditional expressions have been implemented as User Defined Functions.
Table: Conditional Expressions ADQL functions
Function | PostgreSQL expression replicated | Example
CASE_CONDITION(default_value, condition1, value1, condition2, value2, ...) | CASE WHEN condition THEN result [WHEN ...] [ELSE result] END | case_condition(astrometric_n_obs_al, dec < -40, -astrometric_n_obs_al, dec > 40, astrometric_n_obs_al / 2)
CASE_EXPRESSION(input_column, default_value, target_value1, value1, target_value2, value2, ...) | CASE a WHEN 1 THEN 'one' WHEN 2 THEN 'two' ELSE 'other' END | case_expression(astrometric_n_obs_al, 'unknown', 85, 'eighty five', 78, 'seventy eight', 228, 'two hundred and twenty eight', 3, 'three', 4, 'four')
IF_THEN_ELSE(condition, value, [default_value]) | CASE WHEN condition THEN result [ELSE result] END | if_then_else(dec < 0, astrometric_n_obs_al, -astrometric_n_obs_al)
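The argument order of these functions (default value first, then condition/value pairs) is easy to get wrong. The Python sketch below emulates the semantics implied by the signatures and the PostgreSQL expressions they replicate, applied to a single row. It is an illustration under those assumptions, not the service's code; unlike SQL CASE, Python evaluates all arguments before the call.

```python
def case_condition(default_value, *pairs):
    """pairs = condition1, value1, condition2, value2, ...
    Returns the value paired with the first true condition, else default_value."""
    if len(pairs) % 2 != 0:
        raise ValueError("conditions and values must come in pairs")
    for condition, value in zip(pairs[0::2], pairs[1::2]):
        if condition:
            return value
    return default_value


def case_expression(input_value, default_value, *pairs):
    """pairs = target_value1, value1, target_value2, value2, ...
    Returns the value paired with the first target equal to input_value."""
    if len(pairs) % 2 != 0:
        raise ValueError("targets and values must come in pairs")
    for target, value in zip(pairs[0::2], pairs[1::2]):
        if input_value == target:
            return value
    return default_value


def if_then_else(condition, value, default_value=None):
    return value if condition else default_value


if __name__ == "__main__":
    # One hypothetical row: dec = -50 deg, astrometric_n_obs_al = 100
    dec, n_obs = -50.0, 100
    print(case_condition(n_obs, dec < -40, -n_obs, dec > 40, n_obs / 2))
    print(case_expression(85, "unknown", 85, "eighty five", 78, "seventy eight"))
    print(if_then_else(dec < 0, n_obs, -n_obs))
```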
8. Deviations of ADQL functions implementation from standard
The ADQL standard is implemented in compliance with IVOA ADQL 2.0 plus the ADQL Errata available as of 16 April 2018. Some implementation limitations apply to the following functions:
- BOX(coordsys, longitudeCentre, latitudeCentre, longitudeExtent, latitudeExtent): it is interpreted as follows:
  - As defined in the standard when the arguments are fixed values (a cross at the central position with arms extending, parallel to the coordinate axes at the centre position, for half the respective sizes on either side).
  - When the arguments are variable (e.g. table columns), the sides of the box are line segments or great circles intersecting the arms of the cross at their end points, at right angles to the arms.