Machine Learning and Astronomy


The amount of data in astronomy is constantly growing, and we need efficient ways to process and analyse it automatically. Machine Learning provides a wide spectrum of algorithms that can be used in different data analysis tasks.

The development of new ground-based and sky surveys is bringing us not just new knowledge about the universe but also the question of how to deal with such large amounts of data. Astronomy is now in the big data era. Big data brings both opportunities and challenges, and we need to change the methods we have been using for scientific research.

Some of these challenges resulted in the creation of citizen science projects (such as Zooniverse), where the public takes part in tasks such as classifying galaxies or asteroid trails, providing very valuable training sets for further Machine Learning projects. Citizen science projects are also very beneficial for public outreach and education.

Big data can be characterized by:

  • Volume
  • Variety
  • Velocity
  • Veracity
  • Exhaustive
  • Fine-grained and uniquely lexical
  • Relational
  • Extensional
  • Scalability
  • Value
  • Variability



Image 1: Long-term MAST Data Growth. From presentation

You can read more about Big Data here and here.