Volume 570, October 2014
Article Number A87
Number of page(s) 8
Section Catalogs and data
Published online 22 October 2014

© ESO, 2014

1. Introduction

The Initial Gaia Source List (IGSL) was commissioned by the Gaia Data Processing and Analysis Consortium (DPAC) in 2006 to be a combination of the best optical astrometry and photometry information on celestial objects available at the Gaia launch: a snapshot of the sky as we knew it before Gaia. The method adopted was to cross-match large-area catalogs into one database and then select the best parameters based on the typical precisions of each contributing catalog.

Originally the main purposes of the IGSL were threefold: (i) to provide a comparison for science alerts for the first year, before the first release of the Gaia results was available; (ii) to simplify the interconnection of the numerous Gaia auxiliary catalogs; (iii) to provide a list of objects fainter than the Gaia limit for calibration of the charge transfer efficiency. To these ends the first versions of the IGSL, delivered in 2007 (IGSL1) and 2010 (IGSL2), included all entries from the input catalogs even if there was a chance they were not real.

In 2011 a decision was taken to make the IGSL the starting point for the initial data treatment cross-matching routines to simplify the operational procedures. As a result, it was decided to reduce the size of the IGSL by removing objects fainter than the Gaia limit. The result is a smaller, cleaner catalog that, for operational reasons, was frozen before the launch. This frozen version is the IGSL3, hereafter simply IGSL.

The formal DPAC mandate for the IGSL is to fulfill the following broad requirements: all-sky positions, proper motions, and magnitudes for objects to a limit of Gaia magnitude G = 21 where possible, e.g., where there are large (>10 000 square degrees) catalogs that reach that limit. The proper motions and magnitudes will be provided on a best effort basis, nominally with precisions of 10 mas/yr and 0.3 mag, respectively, but obviously limited by the currently available large catalogs. The DPAC Core Processing Coordination Unit (CU3) catalogs of the Quasi Stellar Objects and the Ecliptic Pole Catalog should be included with no selection on magnitudes to directly support the CU3 processes that require those resources. The Hipparcos objects will be included with no selection on magnitudes to aid in the production of the Hundred Thousand Proper Motion catalog (de Bruijne & Eilers 2012).

2. Source catalogs

A quick perusal of compilation catalogs available in 2007 would suggest that the Naval Observatory Merged Astrometric Dataset (NOMAD, Zacharias et al. 2005) would have been a good choice for an IGSL. The adoption of NOMAD does not, however, diminish the work required to fulfill the goals of the IGSL. Nearly all the astrometric and photometric parameters would have needed updating, often returning to the original sources. It was decided that it was easier to construct the IGSL from scratch.

The main driving requirement was to find all objects over the whole sky brighter than magnitude 21 in the expected Gaia G band. We have included publicly available large (e.g., >10 000 square degrees) catalogs and some smaller special-area catalogs. To obtain an estimate of the G magnitude, we require an optical color (e.g., B−V, V−I, g−r) and apply the transformations in Jordi (2012). Many public catalogs were included in the original database, and the final decision on which ones would actually be used to construct the IGSL was taken only after various beta versions were produced.

The procedure we adopted was to start with the Guide Star Catalog (GSC) as a master list and add the entries from large catalogs to this list with a large matching radius of 5′′ with positions at J2000. This epoch was chosen because it is close to the epoch of the large catalogs that do not have proper motions, and the option to use the epoch of observation was considered too complicated (for example, the GSC23 has observations at over 8000 epochs) for any possible gain in matching precision. If more than one of the added entries matched the same master-list entry, the closest was considered the correct match, and new entries were generated from the others. In this way the master list grew with each new catalog. The source catalogs and their order of inclusion for producing the IGSL version 3 were:

  • GSC2.3 – The Second Guide Star Catalog version 2.3 (Lasker et al. 2008): this catalog forms the bulk of the photometry and defines the red and blue magnitudes (BJ and RF) because this is the sky survey with the largest coverage in a homogeneous photometric system. The only variation from the public version is that we removed the duplicate entries discussed in Sect. 4.2 of Lasker et al. (2008). This was done by keeping only the Tycho-2 entry, or otherwise the first occurrence, among entries within 10 mas of each other. As described in Sect. 4.3 of Lasker et al. (2008), the errors are not formal errors and cannot be used to set confidence limits, so we set the position errors to 400 mas.

  • Tycho-2 (Høg et al. 2000): this catalog forms the backbone of all the major ground based catalogs currently available. It was made from a combination of the Tycho star mapper observations on the Hipparcos satellite (Hoeg et al. 1997), the Astrographic Catalogue, and 143 other ground-based catalogs.

  • UCAC4 – USNO CCD Astrograph Catalog version 4 (Zacharias et al. 2013): this is the most precise all-sky astrometric catalog in the range V = 10 − 16 currently available. This catalog provides a red magnitude but is not in a standard bandpass.

  • SDSS – Sloan Digital Sky Survey data release 9: we used the positions, magnitudes, and epoch of observation for all entries with r < 24 from the PhotoPrimary catalog of the ninth data release. This amounted to $4.7 \times 10^{8}$ entries in the SDSS footprint. We prioritized the astrometric and photometric data from this dataset for all the objects fainter than the UCAC4 limit. Since there were no proper motions in the SDSS when we used these positions, we propagated them from the epoch of observation to 2000 using PPMXL proper motions. Owing to an error in interpreting this catalog, the classifications are inverted; i.e., stars are classed 1 and non-stars 0.

  • 2MASS – Two Micron All-Sky Survey Point Source Catalog (Skrutskie et al. 2006): this catalog is mainly a subset of the large Schmidt catalogs except for the red stars that were too faint. Magnitudes were calculated from a very rough extrapolation of the J and K infrared magnitudes to the BJ and RF listed in Appendix A. The photometric errors of the extrapolated estimates were assumed to be 0.5 mag, and the positional errors were set to 300 mas. This positional error is higher than the quoted one in the 2MASS catalog, but because these objects do not have proper motions and the IGSL will be used now, e.g. 2014/2015, we have set the error high to be conservative, also considering that the 2MASS positions that enter are only for the faintest red objects. Since we only considered the PSC, all objects in this catalog were classified as stellar.

  • PPMXL – Positions and Proper Motions “Extra Large” Catalog (Roeser et al. 2010): this catalog was produced as a combination of the USNO-B (Monet et al. 2003) and the Two Micron Sky Survey point source catalog (Epchtein et al. 1999). Its positions and proper motions should be the most precise available for the objects fainter than the UCAC4 limit.

The order of this list follows the evolution of the production of the IGSL. As we are compiling catalogs, we naturally include false and erroneous entries, and for some objects this leads to confusion. For example, a bright star may get matched to noise rather than its true bright counterpart. The order of the inclusion can be optimized to minimize confusion at, for example, the bright end of the IGSL. The faint end, being dominated by the GSC23 and PPMXL where most of the false entries originated, will suffer the same confusion regardless of the order of inclusion. This is discussed below in the conclusions. The original IGSL deliveries used the GSC23 as the primary Schmidt-plate-based all-sky catalog, whereas the IGSL version 3 used the PPMXL because the former did not provide proper motions.
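The SDSS entry in the list above mentions propagating positions from the epoch of observation to 2000 using PPMXL proper motions. A minimal sketch of such a linear propagation is given below. The function name, the argument units, and the assumption that the right-ascension proper motion already includes the cos(Dec) factor are illustrative choices, not the procedure actually used for the IGSL.

```python
import math

MAS_PER_DEG = 3.6e6  # milliarcseconds in one degree


def propagate_to_epoch(ra_deg, dec_deg, pmra_cosdec_masyr, pmdec_masyr,
                       epoch_obs, epoch_target=2000.0):
    """Linearly move a position from epoch_obs to epoch_target (years).

    Assumption: pmra_cosdec_masyr is mu_alpha * cos(dec) in mas/yr,
    and the motion is small enough for a first-order treatment.
    """
    dt = epoch_target - epoch_obs
    ddec = pmdec_masyr * dt / MAS_PER_DEG
    # Convert the on-sky RA motion back to a coordinate offset in RA.
    dra = pmra_cosdec_masyr * dt / MAS_PER_DEG / math.cos(math.radians(dec_deg))
    return (ra_deg + dra) % 360.0, dec_deg + ddec
```

An object observed after 2000 is moved backwards in time (negative `dt`), so the same function covers both cases. The first-order formula degrades very close to the celestial poles, where a full spherical treatment would be needed.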

For operational reasons a number of smaller mission specific catalogs were also included. These were:

  • GEPC – The Gaia Ecliptic Pole Catalog, version 3.0 (Altmann 2013): an approximately one-square-degree catalog around the northern and southern ecliptic poles produced specifically for calibrating Gaia. The two regions were defined as 62 squares centered on RA, Dec 06:00:00, −66:33:41 and 18:00:00, +66:33:41. The observations were made with the MEGACAM on the Canada-France-Hawaii Telescope for the northern ecliptic pole and with WFI on the ESO 2.2 m telescope for the southern ecliptic pole. In this catalog there is a “stellarity” parameter that goes from 0 to 1; all objects with a stellarity greater than 0.7 were considered stars and the rest non-stars. An error in the translation meant that the IGSL galaxy classification for the non-stars is sometimes larger than 1; the stellar sources are all correctly classified as 0. Since the GEPC did not include all bright objects, all entries in the IGSL in this region of the sky are indicated as GEPC objects. The extra objects will therefore have auxEPC = 1 but idEPC = 0 as a source identification in the IGSL.

  • LQRF – The CU3 early version of Large Quasar Reference Frame (Andrei et al. 2009): this is a compilation of QSOs with precise positions produced as part of the Gaia auxiliary catalog development. For these objects we have defaulted to the LQRF positions and set the proper motions to zero because they are all confirmed QSOs.

  • OGLE – Optical Gravitational Lensing Experiment version III: at the request of the Gaia science alerts team, we included objects in the OGLE bulge, LMC, SMC and Southern-EPC catalogs to improve the completeness of the IGSL in very crowded regions. The OGLE data are provided as catalogs of overlapping observations, so many objects on the borders of chips were repeated and ended up as duplicates. To remove these duplicates, if a detection was within an arcsecond of another detection, we only kept the first entry.

There were also ~100 objects in the following catalogs, mostly bright objects, that were not found in the above source catalogs; they were forced to be included in the IGSL regardless of magnitude:

3. Production of the IGSL

There is no “correct” way to match large catalogs. All procedures will in certain cases fail, and the goal of any adopted procedure is to minimize the failures in the sense that is most beneficial to the purpose of the result. A very small matching radius will miss some real matches and result in some double entries, while a large one will have erroneous matches – especially when matching observations or catalogs from different epochs. In the case of the IGSL we adopt a large matching radius to minimize the number of double entries at the risk of making erroneous matches. In a compilation catalog such as the IGSL, the result is also a “compilation” of the individual catalog errors. The requirement that objects have at least two magnitudes from which to calculate a G magnitude slightly cleans the final result, and statistically the IGSL is reliable overall; however, individual objects will have mismatches.

All the source catalogs were included in a MySQL database, indexed using the DIF facility (Nicastro & Calderone 2010), and each catalog matched to a master list using a nearest-neighbor approach with a limit of 5′′. The large matching radius was adopted to minimize the number of duplicate entries and to match the resolution limit of the GSC23, which was the original starting catalog. If, from a new catalog, two entries are matched to the same master-list object, a new entry is generated, and only the closest of the two is listed as a match. Once the master list is generated and all the catalogs have been matched to it, the production of the IGSL is just a combination of a few SQL scripts.
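The matching rule described above can be illustrated with a brute-force sketch. It is not the MySQL/DIF implementation used for the IGSL: the function names, the (ra, dec) tuple layout, and the in-memory lists are assumptions made only to show the logic of "nearest neighbor within 5′′, closest wins, the rest become new entries".

```python
import math


def sep_arcsec(p, q):
    """Angular separation in arcseconds between two (ra, dec) pairs in degrees."""
    r1, d1, r2, d2 = map(math.radians, (*p, *q))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def match_to_master(master, catalog, radius=5.0):
    """Match catalog entries to the master list and grow the master list.

    Returns {master_index: catalog_index}. Catalog entries that are
    unmatched, or that lose a contested match, are appended to master.
    """
    best = {}  # master index -> (separation, catalog index)
    n_master = len(master)
    for j, pos in enumerate(catalog):
        # Nearest master-list neighbour of this catalog entry.
        cands = [(sep_arcsec(pos, master[i]), i) for i in range(n_master)]
        sep, i = min(cands, default=(math.inf, None))
        # Keep only the closest claimant for each master-list object.
        if sep <= radius and (i not in best or sep < best[i][0]):
            best[i] = (sep, j)
    matched = {j for _, j in best.values()}
    master.extend(pos for j, pos in enumerate(catalog) if j not in matched)
    return {i: j for i, (_, j) in best.items()}
```

The quadratic search is only practical for toy inputs; an indexed spatial query, as provided by a sky-indexing facility in the database, would replace the inner loop at catalog scale.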

Initially all entries are included. Positions are assigned following the priority order: LQRF, UCAC4, Tycho-2, SDSS, PPMXL, GEPC, OGLE, GSC23. Proper motions are assigned following the priority order: UCAC4, Tycho-2, PPMXL, GSC23 and the LQRF objects have proper motions set to zero.
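The priority rule amounts to taking, for each object, the value from the first catalog in the list that provides one. A minimal sketch follows; the priority lists are copied from the text above, while the dictionary layout and function names are hypothetical.

```python
POSITION_PRIORITY = ["LQRF", "UCAC4", "Tycho-2", "SDSS",
                     "PPMXL", "GEPC", "OGLE", "GSC23"]
PROPER_MOTION_PRIORITY = ["UCAC4", "Tycho-2", "PPMXL", "GSC23"]


def first_available(values_by_catalog, priority):
    """Return (catalog, value) for the highest-priority catalog with a value."""
    for name in priority:
        value = values_by_catalog.get(name)
        if value is not None:
            return name, value
    return None, None


def assign_proper_motion(pm_by_catalog, in_lqrf):
    """LQRF objects get zero proper motion; others follow the priority order."""
    if in_lqrf:
        return "LQRF", (0.0, 0.0)
    return first_available(pm_by_catalog, PROPER_MOTION_PRIORITY)
```

For an object with positions from both the SDSS and the GSC23, `first_available(positions, POSITION_PRIORITY)` selects the SDSS value because it appears earlier in the list.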

The classification is set to star for all UCAC/Tycho-2/OGLE objects as these catalogs have no classification indicator. The QSOs in the LQRF are also classified as stellar. The classification is then assigned in the following priority order: SDSS, GSC23, GEPC. We note that, as discussed earlier, the classification is inverted for the SDSS and is sometimes greater than 1 for the GEPC non-stars. In summary, if an object is classed as zero it is a star, and if not it is a non-star, except for classifications from the SDSS, e.g., where sourceClassification = 2, where the rule is inverted.
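Because the SDSS inversion is easy to get wrong, the summary rule can be written out explicitly. The sketch below is an interpretation of the text, not code from the IGSL pipeline; the function name is hypothetical, and how a user determines that a classification came from the SDSS is left to the caller as a boolean flag.

```python
def is_star(classification, from_sdss):
    """Interpret an IGSL classification value following the summary rule.

    Non-SDSS sources: 0 means star, anything else (including GEPC
    values larger than 1) means non-star.
    SDSS sources: the rule is inverted, so 0 means non-star.
    """
    if from_sdss:
        return classification != 0
    return classification == 0
```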

The magnitudes are assigned following the equations in Appendix A. The red and blue magnitudes are taken from magnitudes in the published catalogs following the priority order: Tycho, GSC23, PPMXL, SDSS, GEPC, OGLE, sky2000, 2MASS. The Gaia magnitudes follow this priority order: SDSS, Tycho-2, GEPC, OGLE, or transforms of the red and blue magnitudes as derived previously. The priorities for assignment are summarized in Table 1.

Table 1

Summary of the different catalog contributions in the IGSL.

Fig. 1

Distribution of object density per heal-pixel 6th-level region of the Initial Gaia Source List. The stripes across the galactic plane are due to the SDSS Segue scans, which are more complete and have a higher resolution than the other large catalogs based on Schmidt photographic plates, which are also visible.


Once a Gaia G magnitude has been calculated, all objects with G < 21 are included in the IGSL. There are many objects that do not have a G magnitude because the source catalogs do not provide a red and a blue magnitude from which to calculate G. There are, for example, many entries with magnitudes brighter than RF = 20 or BJ = 21 that could reasonably be assumed to have G < 21; however, they may also be defects, unmatched entries, or have unreliable magnitudes, so we have not included them. The exceptions to this rule are the objects in the LQRF, GEPC, HIP, SKY2000, and SPSS catalogs, which are included with no constraint on magnitudes.
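The selection just described reduces to a short predicate. In this sketch the function name, the use of `None` for a missing G magnitude, and the set of catalog labels attached to each object are assumptions for illustration; the limit and the list of exempt catalogs are taken from the text.

```python
ALWAYS_INCLUDED = {"LQRF", "GEPC", "HIP", "SKY2000", "SPSS"}


def include_in_igsl(g_mag, catalogs, g_limit=21.0):
    """Decide whether an object enters the IGSL.

    g_mag is None when no G magnitude could be computed; catalogs is the
    collection of contributing catalog names for the object.
    """
    if ALWAYS_INCLUDED & set(catalogs):
        return True  # exempt catalogs: no magnitude constraint
    return g_mag is not None and g_mag < g_limit
```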

All entries are provided with a sourceID as described in De Angeli et al. (2007) and further developed in Bastian (2013). The running number and position for the heal-pixel (Górski et al. 2005) identification is provided at the first match of the contributing catalog to the master list. The sourceID is a combination of the source heal-pixel level 12 ID (healpix12ID) and the running number in the heal-pixel level 6 (runningnumber), bit-combined via the MySQL relation SourceID = (healpix12ID << 35) + (runningnumber << 7). This intrinsically indicates that the objects are entered in the main database by the Data Processing Center Torino, which has a DPAC center identification of 0. This SourceID will remain the identification of the Gaia objects unless the object is found to be multiple, in which case new SourceIDs are generated for all components. We do not indicate components since we did not try to identify binary systems. Since the position in the IGSL is very rarely from the first catalog, the heal-pixel corresponding to the published IGSL position is not always consistent with the one used in the sourceID.
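The bit combination can be reproduced directly from the relation quoted above. The sketch below composes a sourceID and recovers its two parts; the function names are hypothetical, and the decomposition assumes, as stated in the text for the IGSL, that nothing other than the running number is stored between bits 7 and 35 (center identification 0, no component number).

```python
def make_source_id(healpix12_id, running_number):
    """SourceID = (healpix12ID << 35) + (runningnumber << 7)."""
    return (healpix12_id << 35) + (running_number << 7)


def split_source_id(source_id):
    """Recover (healpix12_id, running_number) from an IGSL-style sourceID.

    Assumes the running number fits in the 28 bits between bit 7 and
    bit 35 and that the center identification and component bits are 0.
    """
    healpix12_id = source_id >> 35
    remainder = source_id - (healpix12_id << 35)
    return healpix12_id, remainder >> 7
```

Because the heal-pixel index occupies the highest bits, sorting by sourceID also groups objects by level 12 heal-pixel, which is convenient for spatially ordered processing.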

4. The Catalog

4.1. The main catalog

The IGSL has 1 222 598 530 entries and is currently included in the Gaia main database and at the CDS Strasbourg and is available for bulk download. It is composed of two tables:

  1) IgslSource: the main data with positional, photometric, and classification information as listed in Appendix A. Each entry is 147 bytes.

  2) SourceCatalogIDs: a list of the identifiers in the various input catalogs, as well as the Hipparcos catalog on request from the Gaia team. Each entry is 98 bytes. We note that the identifiers are sometimes reported verbatim and are sometimes simplified as shown in Appendix A.

Owing to differing precision and completeness in the source catalogs, the distribution of the IGSL is biased. This is visible in the outline of the SDSS footprint and the Schmidt plate boundaries in the distribution of object densities per region of the IGSL in Fig. 1. The checkered pattern has already been noted in the PPMXL, and a similar pattern is seen in the GSC23; both are due to extra photometric information on multiple plates, better object detection on multiple plates, or multiple entries of objects at the plate boundaries.

4.2. Known problems

After the IGSL was frozen and delivered to the DPAC, a number of problems were found. Some are trivial, for example the ecliptic coordinates were calculated with a B1950 rather than J2000 transformation, or the inversion of the SDSS classifications. Others were more complicated. For example, the inclusion of the Tycho-2 used the catalog “mean right ascension” and “mean declination”; however, for some systems found to be binary in the Tycho catalog, these positions are identical (Hoeg et al. 1997). The software that generated the IGSL did not foresee the possibility of objects with identical positions, and the first entry of each system was overwritten by the second entry. If the first entry was in one of the other input catalogs, it was included later without the link to the Tycho-2; if the entry was only in the Tycho-2, it was not included in the IGSL.

4.3. The Attitude Star Catalog toggle

The IGSL, when delivered, also included a binary toggle “toggleASC” that indicated that an entry was considered sufficiently isolated and bright enough to be used in the on-ground attitude reconstruction of Gaia. Recent use found that the subset included many repeated entries from the source catalogs and that many entries violated the isolation criteria. This subset was regenerated and will be the subject of a future article.

5. IGSL4

The IGSL was developed to satisfy the operational requirements of the Gaia mission until the Gaia observations themselves produced an improved all-sky catalog of celestial objects. It is, however, also being employed for various other scientific uses, and we have therefore produced a version of the IGSL that corrects the known problems where possible. In particular, building the master list from the faint catalogs before the bright catalogs led to the mismatch of many bright objects to faint noise in the denser catalogs. A better procedure, and one that has been adopted for the IGSL4, is to build the master list starting with the bright sparse catalogs. The IGSL4 has been built by combining the catalogs in the following order: Hipparcos, Tycho-2, SKY2000, UCAC4, GSC23, PPMXL, LQRF, TMASS, SDSS, GEPC, OGLE. It has also been delivered to the CDS Strasbourg and is linked as an update of the DPAC version.

Since the matching of the contributing catalogs in the IGSL4 is different from that of the IGSL, the running numbers and sometimes the heal-pixel changed. The sourceIDs therefore changed from the IGSL3 to the IGSL4. For consistency we matched the IGSL4 objects to the IGSL3 and assigned the sourceID of the best match in the DPAC version to the new version. There will be ambiguous matches where the new matching produced a different combination, and in the IGSL4 there are entries that were not in the IGSL3; these were assigned new sourceIDs. In addition we have added estimates of the Gaia red and blue magnitudes based on the transformations provided in Jordi (2012) and included in Appendix A.

The IGSL3 has been made public to allow preparation for the results of Gaia. The SourceIDs used by Gaia will be those of the IGSL unless the object is found to be multiple, in which case new SourceIDs are generated. The IGSL4 has been made public to allow pre-Gaia simulations and astrometric usage with a catalog free of the problems in the version delivered to the DPAC.


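The number of heal-pixels is $N_{\rm pix} = 12 N_{\rm side}^{2}$ and $N_{\rm side} = 2^{k}$, where $k$ is the map level (or order). Pixel area is $\Omega_{\rm pix} = 4\pi / N_{\rm pix}$ sr. For level 6 (12), it is $N_{\rm pix} = 49\,152$ ($201\,326\,592$) and $\Omega_{\rm pix} = 0.84$ deg$^{2}$ ($0.74$ arcmin$^{2}$).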
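The pixel counts and areas quoted for levels 6 and 12 can be checked with a few lines, assuming the standard HEALPix relations (12 base pixels, each split into four at every level, all pixels of equal area on the full sphere). The function names are hypothetical.

```python
import math


def healpix_npix(level):
    """Number of pixels at a given level: Npix = 12 * Nside^2, Nside = 2^level."""
    nside = 2 ** level
    return 12 * nside ** 2


def pixel_area_deg2(level):
    """Area of one equal-area pixel in square degrees."""
    full_sky_deg2 = 4 * math.pi * math.degrees(1.0) ** 2  # about 41 253 deg^2
    return full_sky_deg2 / healpix_npix(level)
```

With these definitions, `healpix_npix(6)` gives 49152 and `healpix_npix(12)` gives 201326592, while `pixel_area_deg2(6)` rounds to 0.84 deg² and `pixel_area_deg2(12) * 3600` rounds to 0.74 arcmin², in agreement with the values above.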


As problems are indicated they are compiled on the web page and accounted for in the Gaia processing downstream when possible.


Future updates and identification of problems will be provided at


We would like to thank Mario G. Lattanzi, Alessandro Spagna, Roberto Morbidelli, Rosario Messineo, Ulrich Bastian, Jose Hernandez, Uwe Lammers, Claus Fabricius, and Thomas Boch for constructive comments and feedback during the construction of the IGSL. The authors acknowledge the support of ASI under contract I/058/10/0 (Gaia Mission – The Italian Participation to DPAC); INAF through the TECNO-INAF CRA (Customizable Instrument Workstation Software); the Marie Curie 7th European Community Framework Program grant No. 247593 Interpretation and Parameterization of Extremely Red COOL dwarfs (IPERCOOL) International Research Staff Exchange Scheme. This research has made use of: the Centre de Données astronomiques de Strasbourg; the Second Guide Star Catalog developed as a collaboration between the Space Telescope Science Institute and the Osservatorio Astrofisico di Torino; the Two Micron All Sky Survey which is a joint project of the University of Massachusetts and the Infrared Processing and Analysis Center/California Institute of Technology; and the Sloan Digital Sky Survey funded by the Alfred P. Sloan Foundation, the Participating Institutions, the National Science Foundation, and the US Department of Energy Office of Science.


Appendix A: The IGSL parameters, cross identifications, and photometric transformations

Table A.1

IGSL parameters.

Table A.2

IGSL cross catalog identifiers.

Table A.3

Photometric transformations blue/red magnitudes (BJ/Rf) to Gaia G/RVS/red/blue magnitudes (G/GRVS/GRP/GBP).
