Image of the Week
Interactive and statistical visualisation of Gaia DR1 with VAEX
The size of the Gaia DR1 dataset calls for sophisticated analysis and plotting tools. For example, a simple scatter plot of the positions of the stars on the sky in Gaia DR1 fails badly, even when only 10 000 or 1 000 000 points are shown (top left and right panels, respectively). A density plot such as the one in the bottom panel, however, reveals the rich structure of the data, includes all 1 142 679 769 sources in Gaia DR1, and can be generated in less than a second with VAEX. This density plot reveals, for example, structure in the galactic disk and artefacts due to the scanning nature of the observations performed with Gaia.
With the arrival of large catalogues such as Gaia DR1, which contains more than a billion objects, new methods of handling and visualising these data volumes are needed. For many science cases, as well as for quality checks of the data, one needs to visualise all or large parts of the data. While scatter plots suffice for small catalogues, they do not work for the full Gaia catalogue. Apart from the long time it takes to render each individual point as a glyph, overplotting makes the plot useless, as demonstrated in the figure above, presented by Maarten Breddels at the Astro-informatics IAU Symposium in 2016 at Sorrento, Italy.
This figure demonstrates how plotting a random subset of 10 000 stars (0.001% of the data, top left panel) shows some structure in the galactic disk, while plotting just a million stars (0.1% of the data, top right panel) already starts to hide many of the structures present in the data, because the points overlap. In contrast, the bottom panel shows all the data (more than a billion stars) as a density plot, which reveals much more structure: dust lanes are clearly visible in the disk, our neighbouring galaxies (the Large and Small Magellanic Clouds) stand out clearly against the background, and artefacts in the data due to the scanning law of the satellite become visible. In this plot, low densities correspond to black and high densities to white, with logarithmic scaling.
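The idea behind such a density plot can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the VAEX implementation: the function names `density_grid` and `log_scale`, the synthetic coordinates and the choice of `log1p` (so that empty pixels map to black) are all assumptions made for this example.

```python
import math
import random


def density_grid(xs, ys, x_range, y_range, shape):
    """Count the points falling in each cell of a regular (ny, nx) grid."""
    ny, nx = shape
    x_min, x_max = x_range
    y_min, y_max = y_range
    counts = [[0] * nx for _ in range(ny)]
    for x, y in zip(xs, ys):
        # Skip points that fall outside the plotted region.
        if not (x_min <= x < x_max and y_min <= y < y_max):
            continue
        # Clamp the index to guard against floating-point rounding at the edge.
        col = min(int((x - x_min) / (x_max - x_min) * nx), nx - 1)
        row = min(int((y - y_min) / (y_max - y_min) * ny), ny - 1)
        counts[row][col] += 1
    return counts


def log_scale(counts):
    """Map counts to [0, 1] logarithmically: 0 is black, 1 is white."""
    peak = max(max(row) for row in counts)
    if peak == 0:
        return [[0.0 for _ in row] for row in counts]
    norm = math.log1p(peak)
    return [[math.log1p(c) / norm for c in row] for row in counts]


# Synthetic stand-in for a sky catalogue (NOT Gaia data): points spread
# uniformly in longitude and concentrated towards latitude zero.
random.seed(0)
n_points = 100_000
lon = [random.uniform(0.0, 360.0) for _ in range(n_points)]
lat = [random.gauss(0.0, 15.0) for _ in range(n_points)]

counts = density_grid(lon, lat, (0.0, 360.0), (-90.0, 90.0), shape=(64, 128))
image = log_scale(counts)  # grey levels, ready to hand to any image display
```

Because every star only increments one counter, the size of the resulting image depends on the grid shape and not on the number of stars, which is why overplotting does not occur. A pure Python loop like this one would be far too slow for a billion sources; it only shows the principle.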
To visualise and explore large catalogues such as Gaia DR1, Maarten Breddels from the Kapteyn Astronomical Institute (University of Groningen) developed a software package to perform the calculations needed for these visualisations efficiently. The calculations to compute the number of stars in each pixel take only about a second on a high-end desktop machine. Statistics, such as minimum, maximum, mean, moments, etc., can also be calculated efficiently in any number of dimensions.
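Statistics on a grid can be accumulated in a single pass over the data, in the same way as the counts. The following sketch shows this for count, mean, minimum and maximum on a grid with an arbitrary number of dimensions. It is an illustration under stated assumptions and not how VAEX is implemented: the function name `binned_stats` and its arguments are hypothetical.

```python
def binned_stats(coords, values, limits, shape):
    """Accumulate count, mean, min and max of `values` per grid cell.

    coords : iterable of coordinate tuples, one entry per dimension
    values : the quantity to summarise, one entry per point
    limits : list of (low, high) pairs, one per dimension
    shape  : number of bins along each dimension
    """
    cells = {}  # maps a cell index tuple to [count, total, minimum, maximum]
    for point, value in zip(coords, values):
        index = []
        for c, (low, high), n in zip(point, limits, shape):
            if not (low <= c < high):
                break  # point lies outside the grid in this dimension
            index.append(min(int((c - low) / (high - low) * n), n - 1))
        else:
            # Only reached when the point is inside the grid in every dimension.
            acc = cells.get(tuple(index))
            if acc is None:
                cells[tuple(index)] = [1, value, value, value]
            else:
                acc[0] += 1
                acc[1] += value
                acc[2] = min(acc[2], value)
                acc[3] = max(acc[3], value)
    return {
        key: {"count": n, "mean": total / n, "min": lo, "max": hi}
        for key, (n, total, lo, hi) in cells.items()
    }


# Four points in two dimensions; the last one falls outside the grid.
coords = [(0.1, 0.1), (0.2, 0.3), (0.9, 0.9), (1.5, 0.5)]
values = [1.0, 3.0, 10.0, 99.0]
stats = binned_stats(coords, values, limits=[(0.0, 1.0), (0.0, 1.0)], shape=(2, 2))
print(stats[(0, 0)])  # {'count': 2, 'mean': 2.0, 'min': 1.0, 'max': 3.0}
```

Only a running count, sum, minimum and maximum are kept per cell, so memory use depends on the number of occupied cells rather than on the number of rows in the catalogue.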
The software package, called VAEX, consists of two parts. The first is a Python package that allows fast calculation of statistics for any property of the data (or any mathematical operation on them) and their visualisation in, for instance, the Jupyter notebook. Built on top of this is a graphical program for Linux and Mac OS X that enables interactive exploration of the data, including zooming, panning, and on-screen selections. VAEX is open source, available under an MIT License.
This work has been carried out in collaboration with Amina Helmi. It has been funded by a grant from the Netherlands Research School for Astronomy (NOVA), and a Vici grant from the Netherlands Organisation for Scientific Research (NWO).
Credits: ESA/Gaia/DPAC/CU9, Maarten Breddels, Amina Helmi