Identification and Classification of Young Star Clusters
Gaia DR3 presents scientists with new opportunities to investigate star clusters, owing to the increased number of sources (~1.6 billion) and the improved precision of the data. A natural approach is to attempt to create a comprehensive catalogue of star clusters using Machine Learning tools, specifically clustering algorithms such as DBSCAN, HDBSCAN, and OPTICS. These algorithms partition a set of unlabelled data points in a metric space (e.g., 3D Euclidean position space or 6D position-velocity space) into mathematical clusters based on their proximity in that space, a process called cluster analysis (or simply clustering). Ideally, each mathematical cluster found by such an algorithm corresponds to a physical star cluster; achieving this reliably is the central aim. A major problem is that, owing to the sheer size of the data, the results delivered by such algorithms when applied to Gaia DR3 cannot be verified manually. Moreover, clustering is inherently an unsupervised Machine Learning task, so there are no ground-truth labels against which the results could be checked. The reliability of the cluster analysis must therefore be established by other means. The intended solution is to use synthetic data, generated by simulations of the evolution of young star clusters, to calibrate the algorithms, which effectively turns the problem into a supervised task. Different algorithms with different parameters may be suitable for finding different types of young star clusters, distinguished by physical properties such as age, size, and star formation efficiency (SFE). If the synthetic data represent actual star clusters reasonably well, and if a selection of algorithms with corresponding parameters reliably recovers the synthetic clusters, it is reasonable to expect that these algorithms will also perform well at finding real star clusters.
When such a selection of clustering algorithms and parameters is applied to Gaia DR3, the results can be compared with known and suspected young star clusters, and a catalogue of young star clusters, ideally including many previously unknown ones, can be created. I will discuss my plans, as well as the challenges and opportunities, regarding this endeavour.
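
The calibration idea can be illustrated with a minimal sketch, which is not the pipeline to be presented. It assumes a toy mock field (two Gaussian clumps plus uniform field stars in 3D) in place of a real cluster simulation, a brute-force DBSCAN written out for self-containment (in practice a library implementation such as scikit-learn's `DBSCAN` would be the natural choice), and the adjusted Rand index as one possible agreement score. All function names and parameter grids are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def make_mock_field(n_cluster=60, n_field=120, seed=0):
    """Toy stand-in for simulated data: two compact clumps plus field stars."""
    rng = random.Random(seed)
    points, truth = [], []
    centres = [(20.0, 20.0, 20.0), (70.0, 60.0, 40.0)]  # arbitrary positions
    for k, centre in enumerate(centres):
        for _ in range(n_cluster):
            points.append(tuple(rng.gauss(c, 1.5) for c in centre))
            truth.append(k)
    for _ in range(n_field):
        points.append(tuple(rng.uniform(0.0, 100.0) for _ in range(3)))
        truth.append(-1)  # -1 marks field stars (no cluster membership)
    return points, truth


def dbscan(points, eps, min_samples):
    """Brute-force DBSCAN; returns one label per point, -1 meaning noise."""
    n = len(points)
    # Neighbour lists include the point itself, so min_samples counts it too.
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n  # None = not yet visited
    cluster_id = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster_id  # noise reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # j is a core point: keep growing
    return labels


def adjusted_rand_index(a, b):
    """Pair-counting agreement of two labelings (noise treated as one class)."""
    sum_ab = sum(math.comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(math.comb(c, 2) for c in Counter(a).values())
    sum_b = sum(math.comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / math.comb(len(a), 2)
    maximum = (sum_a + sum_b) / 2
    if maximum == expected:
        return 1.0
    return (sum_ab - expected) / (maximum - expected)


def calibrate(points, truth, eps_grid, min_samples_grid):
    """Return (score, eps, min_samples) of the best-scoring grid point."""
    best = None
    for eps in eps_grid:
        for min_samples in min_samples_grid:
            found = dbscan(points, eps, min_samples)
            score = adjusted_rand_index(truth, found)
            if best is None or score > best[0]:
                best = (score, eps, min_samples)
    return best


points, truth = make_mock_field()
score, eps, min_samples = calibrate(points, truth, [1.0, 2.0, 3.0, 5.0, 8.0], [5, 10])
print(f"best eps={eps}, min_samples={min_samples}, ARI={score:.3f}")
```

Because the mock data carry known membership labels, every parameter choice can be scored objectively; the parameters selected in this way are the ones that would then be carried over to the unlabelled survey data. Treating all field stars as a single class in the score is a simplification made here for brevity.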