Event Date and Time
Emille Ishida, Laboratoire de Physique de Clermont
The era of big data brought astronomy and computer science together through the adoption of machine learning techniques applied to large and complex astronomical data sets. However, the nature of astronomical data poses important constraints on how far traditional learning techniques can go. In this scenario, one of the main challenges is the time-consuming and expensive labeling process required to build training samples in astronomy. Moreover, the observational requirements for the sample we are able to label (spectroscopic) and the one we wish to classify (photometric) are very different, making representativeness between training and test samples impossible. In this talk I will discuss how we can optimize the construction of training (spectroscopic) samples for classification purposes while still taking into account complexities derived from the observational process. I will present results showing how designing training samples from the beginning of a survey can achieve optimal classification results with far fewer labels (spectra), and show how this strategy is being applied to the current alert stream of the Zwicky Transient Facility survey by the Fink broker. I will also describe how such strategies have also proven effective in the search for scientifically interesting anomalies within the efforts of the SNAD collaboration.