Probabilistic multi-catalogue positional cross-match

F.-X. Pineau; S. Derriere; C. Motch; F. J. Carrera; F. Genova; L. Michel; B. Mingo; A. Mints; A. Nebot Gómez-Morán; S. R. Rosen; A. Ruiz Camuñas

doi:10.1051/0004-6361/201629219

Volume 597 (January 2017)

A&A, 597 (2017) A89

Issue: A&A Volume 597, January 2017
Article Number: A89
Number of pages: 28
Section: Numerical methods and codes
DOI: https://doi.org/10.1051/0004-6361/201629219
Published online: 10 January 2017

F.-X. Pineau¹, S. Derriere¹, C. Motch¹, F. J. Carrera², F. Genova¹, L. Michel¹, B. Mingo³, A. Mints⁴,⁵, A. Nebot Gómez-Morán¹, S. R. Rosen³ and A. Ruiz Camuñas²

¹ Observatoire astronomique de Strasbourg, Université de Strasbourg, CNRS, UMR 7550, 11 rue de l’Université, 67000 Strasbourg, France
² IFCA (CS-IC-UC), Avenida de los Castros, 39005 Santander, Spain
³ Department of Physics & Astronomy, University of Leicester, Leicester, LE1 7RH, UK
⁴ Leibniz-Institut für Astrophysik Potsdam (AIP), An der Sternwarte 16, 14482 Potsdam, Germany
⁵ Max-Planck Institute for Solar System Research, Justus-von-Liebig-Weg 3, 37077 Göttingen, Germany

Received: 30 June 2016
Accepted: 23 August 2016

Abstract

Context. Catalogue cross-correlation is essential to building large sets of multi-wavelength data, whether to study the properties of populations of astrophysical objects or to build reference catalogues (or time series) from survey observations. Nevertheless, automated processes that rely on limited information about large numbers of sources, detected at different epochs with various filters and instruments, inevitably produce spurious associations. We need both statistical criteria to select the detections to be merged as unique sources and statistical indicators that help to reach compromises between the completeness and the reliability of the selected associations.

Aims. We lay the foundations of a statistical framework for multi-catalogue cross-correlation and cross-identification based on explicit simplified catalogue models. A proper identification process should rely on both astrometric and photometric data. Under some conditions, the astrometric part and the photometric part can be processed separately and merged a posteriori to provide a single global probability of identification. The present paper addresses almost exclusively the astrometric part and specifies the proper probabilities to be merged with photometric likelihoods.

Methods. To select matching candidates in n catalogues, we used the χ (or, equivalently, the χ²) test with 2(n−1) degrees of freedom. We thus call this cross-match a χ-match. In order to use Bayes’ formula, we considered exhaustive sets of hypotheses based on combinatorial analysis. The volume of the χ-test domain of acceptance (a 2(n−1)-dimensional acceptance ellipsoid) is used to estimate the expected numbers of spurious associations. We derived priors for those numbers using a frequentist approach relying on simple geometrical considerations. Likelihoods are based on standard Rayleigh, χ and Poisson distributions that we normalized over the χ-test acceptance domain. We validated our theoretical results by generating and cross-matching synthetic catalogues.
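As an illustration of the candidate-selection step only, the following Python sketch applies a χ² test with 2(n−1) degrees of freedom to one detection per catalogue. It is not the authors' implementation. It assumes circular Gaussian positional errors, a flat-sky (tangent-plane) approximation, and a statistic built from the weighted squared offsets to the inverse-variance weighted mean position. All function names and the default completeness value are hypothetical choices made for this example.

```python
import math


def chi2_cdf_even_dof(x, m):
    """CDF of a chi-square variable with 2*m degrees of freedom.

    For an even number of degrees of freedom the CDF has the closed form
    1 - exp(-x/2) * sum_{j=0}^{m-1} (x/2)^j / j!.
    """
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, m):
        term *= half / j
        total += term
    return 1.0 - math.exp(-half) * total


def chi2_threshold(n_catalogues, completeness):
    """Chi-square value below which a fraction `completeness` of true
    associations is expected, for 2*(n_catalogues - 1) degrees of freedom."""
    if n_catalogues < 2 or not 0.0 < completeness < 1.0:
        raise ValueError("need n_catalogues >= 2 and 0 < completeness < 1")
    m = n_catalogues - 1
    lo, hi = 0.0, 1.0
    while chi2_cdf_even_dof(hi, m) < completeness:  # bracket the root
        hi *= 2.0
    for _ in range(200):  # bisection on the monotonic CDF
        mid = 0.5 * (lo + hi)
        if chi2_cdf_even_dof(mid, m) < completeness:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def chi2_statistic(positions, sigmas):
    """Sum of inverse-variance weighted squared offsets to the weighted mean.

    positions: list of (x, y) tangent-plane coordinates, one per catalogue.
    sigmas: list of 1-sigma circular positional errors, in the same unit.
    """
    weights = [1.0 / s ** 2 for s in sigmas]
    wsum = sum(weights)
    xbar = sum(w * x for w, (x, _) in zip(weights, positions)) / wsum
    ybar = sum(w * y for w, (_, y) in zip(weights, positions)) / wsum
    return sum(
        w * ((x - xbar) ** 2 + (y - ybar) ** 2)
        for w, (x, y) in zip(weights, positions)
    )


def is_chi_match(positions, sigmas, completeness=0.9973):
    """True if the set of detections passes the chi-square acceptance test."""
    threshold = chi2_threshold(len(positions), completeness)
    return chi2_statistic(positions, sigmas) <= threshold
```

For two catalogues the statistic reduces to d²/(σ₁² + σ₂²), where d is the separation of the two detections, and the threshold reduces to −2 ln(1 − γ) for a completeness γ. Under the assumptions stated above, the statistic is symmetric in its inputs, so the outcome of the test does not depend on the order in which the catalogues are listed.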

Results. The results we obtain do not depend on the order used to cross-correlate the catalogues. We applied the formalism described in the present paper to build the multi-wavelength catalogues used for the science cases of the Astronomical Resource Cross-matching for High Energy Studies (ARCHES) project. Our cross-matching engine is publicly available through a multi-purpose web interface. In the longer term, we plan to integrate this tool into the CDS XMatch Service.

Key words: methods: data analysis / methods: statistical / catalogs / astrometry

© ESO, 2017
