Volume 606, October 2017
Number of page(s): 13
Section: Catalogs and data
Published online: 05 October 2017
Automated novelty detection in the WISE survey with one-class support vector machines⋆
1 National Center for Nuclear Research, ul. Hoża 69, 00-681 Warsaw, Poland
e-mail: firstname.lastname@example.org; email@example.com
2 Leiden Observatory, Leiden University, 2333 CA Leiden, The Netherlands
3 Janusz Gil Institute of Astronomy, University of Zielona Góra, 65-417 Zielona Góra, Poland
4 Warsaw University Astronomical Observatory, 00-001 Warszawa, Poland
5 The Astronomical Observatory of the Jagiellonian University, 31-007 Kraków, Poland
Received: 10 April 2017
Accepted: 15 June 2017
Wide-angle photometric surveys of previously uncharted sky areas or wavelength regimes will always bring in unexpected sources – novelties or even anomalies – whose existence and properties cannot be easily predicted from earlier observations. Such objects can be efficiently located with novelty detection algorithms. Here we present an application of such a method, called one-class support vector machines (OCSVM), to search for anomalous patterns among sources preselected from the mid-infrared AllWISE catalogue covering the whole sky. To create a model of expected data we train the algorithm on a set of objects with spectroscopic identifications from the SDSS DR13 database that are also present in AllWISE. The OCSVM method detects as anomalous those sources whose patterns – WISE photometric measurements in this case – are inconsistent with the model. Among the detected anomalies we find artefacts, such as objects with spurious photometry due to blending, but more importantly also real sources of genuine astrophysical interest. Among the latter, OCSVM has identified a sample of heavily reddened AGN/quasar candidates distributed uniformly over the sky and in large part absent from other WISE-based AGN catalogues. It also allowed us to find a specific group of sources of mixed types, mostly stars and compact galaxies. By combining the semi-supervised OCSVM algorithm with standard classification methods it will be possible to improve the latter by accounting for sources that are not present in the training sample but are otherwise well represented in the target set. Anomaly detection adds flexibility to automated source separation procedures and helps verify the reliability and representativeness of the training samples. It should thus be considered an essential step in supervised classification schemes to ensure the completeness and purity of the produced catalogues.
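The train-on-known, flag-the-inconsistent workflow described above can be illustrated with a minimal sketch. It assumes scikit-learn's `OneClassSVM` and purely synthetic data: the two feature columns, the cluster positions, and the `nu` and `gamma` values are illustrative assumptions, not the features or settings used in the paper.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic "known" sources standing in for a spectroscopically identified
# training set. The two columns are hypothetical colour-like features;
# all numbers are made up for illustration.
train = rng.normal(loc=[0.3, 2.5], scale=[0.25, 0.5], size=(500, 2))

# Synthetic target set: sources resembling the training sample plus a small
# injected group located far from it in feature space.
target_known_like = rng.normal(loc=[0.3, 2.5], scale=[0.25, 0.5], size=(200, 2))
target_injected = rng.normal(loc=[2.5, 6.0], scale=0.1, size=(20, 2))
target = np.vstack([target_known_like, target_injected])

# Standardise features using the training sample only, then fit the
# one-class model. nu and gamma are assumed values, not tuned ones.
scaler = StandardScaler().fit(train)
model = OneClassSVM(kernel="rbf", nu=0.05, gamma=0.5)
model.fit(scaler.transform(train))

# predict() returns +1 for sources consistent with the model, -1 otherwise.
labels = model.predict(scaler.transform(target))
is_anomaly = labels == -1

# Fraction of the training sample itself that falls outside the model.
train_flagged_fraction = float(
    np.mean(model.predict(scaler.transform(train)) == -1)
)

print("flagged in target set:", int(is_anomaly.sum()), "of", len(target))
print("injected sources flagged:", int(is_anomaly[-20:].sum()), "of 20")
print("training sources flagged (fraction):", round(train_flagged_fraction, 3))
```

In this sketch `nu` controls roughly what fraction of the training sample is allowed to fall outside the model, so some ordinary sources are always flagged alongside genuine outliers. Flagged objects would therefore still need the kind of follow-up inspection the abstract describes to separate artefacts from real sources.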
Key words: infrared: galaxies / infrared: stars / galaxies: statistics / stars: statistics / Galaxy: fundamental parameters
The catalogues of outlier data are only available at the CDS via anonymous ftp to cdsarc.u-strasbg.fr or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/606/A39
© ESO, 2017