Galaxy classification: deep learning on the OTELO and COSMOS databases

José A. de Diego; Jakub Nadolny; Ángel Bongiovanni; Jordi Cepa; Mirjana Pović; Ana María Pérez García; Carmen P. Padilla Torres; Maritza A. Lara-López; Miguel Cerviño; Ricardo Pérez Martínez; Emilio J. Alfaro; Héctor O. Castañeda; Miriam Fernández-Lorenzo; Jesús Gallego; J. Jesús González; J. Ignacio González-Serrano; Irene Pintos-Castro; Miguel Sánchez-Portal; Bernabé Cedrés; Mauro González-Otero; D. Heath Jones; Joss Bland-Hawthorn

doi:10.1051/0004-6361/202037697


Volume 638 (June 2020)

A&A, 638 (2020) A134


Issue		A&A Volume 638, June 2020


Article Number		A134
Number of page(s)		15
Section		Extragalactic astronomy
DOI		https://doi.org/10.1051/0004-6361/202037697
Published online		25 June 2020


José A. de Diego¹,², Jakub Nadolny²,³, Ángel Bongiovanni⁴,⁵, Jordi Cepa²,⁵,³, Mirjana Pović⁶,⁷, Ana María Pérez García⁵,⁸, Carmen P. Padilla Torres²,³,⁹, Maritza A. Lara-López¹⁰, Miguel Cerviño⁸, Ricardo Pérez Martínez⁵,¹¹, Emilio J. Alfaro⁷, Héctor O. Castañeda¹², Miriam Fernández-Lorenzo⁷, Jesús Gallego¹³, J. Jesús González¹, J. Ignacio González-Serrano¹⁴,⁵, Irene Pintos-Castro¹⁵, Miguel Sánchez-Portal⁴,⁵, Bernabé Cedrés²,³, Mauro González-Otero²,³, D. Heath Jones¹⁶ and Joss Bland-Hawthorn¹⁷

¹ Instituto de Astronomía, Universidad Nacional Autónoma de México, Apdo. Postal 70-264, 04510 Ciudad de México, Mexico
e-mail: jdo@astro.unam.mx
² Instituto de Astrofisica de Canarias (IAC), 38200 La Laguna, Tenerife, Spain
³ Departamento de Astrofisica, Universidad de La Laguna (ULL), 38205 La Laguna, Tenerife, Spain
⁴ Instituto de Radioastronomía Milimétrica (IRAM), Av. Divina Pastora 7, Local 20, 18012 Granada, Spain
⁵ Asociacion Astrofisica para la Promocion de la Investigacion, Instrumentacion y su Desarrollo, ASPID, 38205 La Laguna, Tenerife, Spain
⁶ Ethiopian Space Science and Technology Institute (ESSTI), Entoto Observatory and Research Center (EORC), Astronomy and Astrophysics Research Division, PO Box 33679 Addis Ababa, Ethiopia
⁷ Instituto de Astrofísica de Andalucía, CSIC, Glorieta de la Astronomía s/n, 18080 Granada, Spain
⁸ Depto. Astrofísica, Centro de Astrobiología (INTA-CSIC), ESAC Campus, Camino Bajo del Castillo s/n, 28692 Villanueva de la Cañada, Spain
⁹ Fundación Galileo Galilei, Telescopio Nazionale Galileo, Rambla José Ana Fernández Pérez, 7, 38712 Breña Baja, Santa Cruz de la Palma, Spain
¹⁰ DARK, Niels Bohr Institute, University of Copenhagen, Lyngbyvej 2, Copenhagen 2100, Denmark
¹¹ ISDEFE for European Space Astronomy Centre (ESAC)/ESA, PO Box 78 28690 Villanueva de la Cañada, Madrid, Spain
¹² Departamento de Fisica, Escuela Superior de Fisica y Matematicas, Instituto Politécnico Nacional, Mexico DF, Mexico
¹³ Departamento de Física de la Tierra y Astrofísica, Facultad CC Físicas, Instituto de Física de Partículas y del Cosmos, IPARCOS, Universidad Complutense de Madrid, 28040 Madrid, Spain
¹⁴ Instituto de Fisica de Cantabria (CSIC-Universidad de Cantabria), 39005 Santander, Spain
¹⁵ Department of Astronomy & Astrophysics, University of Toronto, Toronto, Canada
¹⁶ English Language and Foundation Studies Centre, University of Newcastle, Callaghan, NSW 2308, Australia
¹⁷ Sydney Institute for Astronomy, School of Physics, University of Sydney, Sydney, NSW 2006, Australia

Received: 10 February 2020
Accepted: 25 April 2020

Abstract

Context. The accurate classification of hundreds of thousands of galaxies observed in modern deep surveys is imperative if we want to understand the universe and its evolution.

Aims. Here, we report the use of machine learning techniques to classify early- and late-type galaxies in the OTELO and COSMOS databases using optical and infrared photometry and available shape parameters: either the Sérsic index or the concentration index.

Methods. We used three classification methods for the OTELO database: (1) u − r color separation, (2) linear discriminant analysis using u − r and a shape parameter classification, and (3) a deep neural network using the r magnitude, several colors, and a shape parameter. We analyzed the performance of each method by sample bootstrapping and tested the performance of our neural network architecture using COSMOS data.

Results. The accuracy achieved by the deep neural network is greater than that of the other classification methods, and it can also operate with missing data. Our neural network architecture is able to classify both OTELO and COSMOS datasets regardless of small differences in the photometric bands used in each catalog.

Conclusions. In this study we show that the use of deep neural networks is a robust method to mine the cataloged data.

Key words: galaxies: general / methods: statistical

© ESO 2020

1. Introduction

Galaxy morphological classification plays a fundamental role in descriptions of the galaxy population in the universe, and in our understanding of galaxy formation and evolution. Galaxy morphology is related to key physical, evolutionary, and environmental properties, such as system dynamics (Djorgovski & Davis 1987; Gerhard et al. 2001; Debattista et al. 2006; Falcón-Barroso et al. 2019; Romanowsky & Fall 2012), the stellar formation history (Kennicutt 1998; Bruzual & Charlot 2003; Kauffmann et al. 2003; Lovell et al. 2019), gas and dust content (e.g., Lianou et al. 2019), galaxy age (Bernardi et al. 2010), and interaction and merging events (e.g., Romanowsky & Fall 2012). Early galaxy classification strategies were based on the visual aspect of the objects, differentiating among spiral, elliptical, lenticular, and irregular galaxy types according to their resolved morphology. Examples of these strategies are the original classification schemes by Hubble (1926) and De Vaucouleurs (1959). This methodology reached a historic milestone during the last decade through the citizen science initiative known as Galaxy Zoo. It stands out as the largest effort made to visually classify galaxies, covering more than 900 000 objects from the Sloan Digital Sky Survey (SDSS; Fukugita et al. 1996) brighter than r_SDSS = 17.7, with proven reliability (Lintott et al. 2011). Subsequently, this crowd-sourced astronomy project also included the analysis of datasets from the Kilo-Degree Survey (KiDS) imaging data in the Galaxy and Mass Assembly (GAMA) fields, classifying typical edge-on galaxies at z < 0.15 (Holwerda et al. 2019), and the quantitative visual classification of approximately 48 000 galaxies up to z ∼ 3 in three Hubble Space Telescope (HST) fields of the Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey (CANDELS; Simmons et al. 2017). Using the visual classification approach, the morphology and size of luminous, massive galaxies at 0.3 < z < 0.7 targeted by the Baryon Oscillation Spectroscopic Survey (BOSS; Dawson et al. 2013) of SDSS-III were also determined (Masters et al. 2011) using HST and Cosmic Evolution Survey¹ (COSMOS; Scoville et al. 2007) data.

However, visual classification has become unfeasible in modern surveys for two reasons. First, most of the galaxies detected with larger telescopes and more sophisticated instruments are barely resolved, which makes the identification of their morphological type very difficult. Second, the number of discovered galaxies has increased dramatically since the introduction of digital surveys dedicated to probing larger and deeper volumes in the universe. This issue will be even more critical in the near future, when the next generation of large surveys, such as the Large Synoptic Survey Telescope (Tyson 2002) or the Euclid mission (Laureijs et al. 2011), produce petabytes of information and trigger the need for time-domain astronomy (Hložek 2019), far exceeding the capacity of available human resources to manage this information. For this reason, the automated classification of galaxies has become an intense area of research in modern astronomy.

Previous research into automated galaxy-classification algorithms has focused on colors, shape, and morphological parameters related to galaxy light distribution, such as concentration and asymmetry (e.g., Abraham et al. 1994; Bershady et al. 2000; Conselice 2003, 2006; Pović et al. 2009, 2013, 2015; Deng 2013). Joint automated and visual classification procedures have been implemented in extragalactic surveys such as COSMOS (Cassata et al. 2007; Zamojski et al. 2007) and GAMA (Alpaslan et al. 2015). Another approach involves the fitting of spectral energy distributions (SEDs) using galaxy templates (Ilbert et al. 2009). In a complementary fashion, Strateva et al. (2001) investigated the dichotomous classification into early- and late-type (ET and LT) galaxies. For these authors, the ET group includes the E, S0, and Sa morphological types, while the LT group comprises Sb, Sc, and Irr galaxies. Furthermore, using the well-known tendency of LT galaxies to be bluer than ET galaxies, Strateva et al. (2001) propose the u − r color to separate these galaxy types. Sérsic and concentration indexes have also been used, alone or in combination with the u − r color, to separate ET from LT galaxies (e.g., Conselice 2003; Kelvin et al. 2012; Deng 2013; Vika et al. 2015). A far more complicated and expensive classification, in terms of computational and observational resources, consists in fitting a set of either empirical or modeled SED templates to the galaxy continuum (e.g., Coleman et al. 1980; Kinney et al. 1996). Currently, there are some public codes that are able to perform such template-based classifications (e.g., LePhare: Arnouts et al. 1999; Ilbert et al. 2006).

Classification can be addressed in machine learning through supervised learning techniques, which consist in training a function that maps inputs to outputs learning from input–output pairs, and using this function to assign new observations in two or more predefined categories. Supervised learning techniques include decision trees (Barchi et al. 2020), random forests (Miller et al. 2017), linear discriminant analysis (LDA; Murtagh & Heck 1987), support vector machines (Huertas-Company et al. 2008), Bayesian classifiers (Henrion et al. 2011), and neural networks (Ball et al. 2004), among others.

Machine learning algorithms are increasingly used for classification in large astronomical databases (e.g., Abolfathi et al. 2018). In particular, LDA is a common classifying method used in statistics, pattern recognition, and machine learning. Linear discriminant analysis classifiers attempt to find linear boundaries that best separate the data. Recently, LDA has been used for galaxy classification into spiral and elliptical morphological types (Ferrari et al. 2015), classification of Hickson’s compact groups of galaxies (Abdel-Rahman & Mohammed 2019), and galaxy merger identification (Nevin et al. 2019).

In recent years, neural networks have become very popular in different research areas because of their ability to perform outstandingly accurate classifications, as well as regression and series analyses. A typical neural network is made up of a number of hidden layers, each with a certain quantity of neurons that perform tensor operations. There are several network types, which are oriented to solving different problems (a brief explanation of different networks can be found in Baron 2019). For example, Busca & Balland (2018) used a one-dimensional convolutional neural network (CNN) for classification and redshift estimates of quasar spectra extracted from the BOSS. Much of the recent research has focused on two-dimensional CNN classification of galaxy images (e.g., Serra-Ricart et al. 1996; Huertas-Company et al. 2015; Dieleman et al. 2015; Domínguez Sánchez et al. 2018; Pérez-Carrasco et al. 2019; Walmsley et al. 2020).

In the future, neural networks will probably gain more importance and become the primary technique for classification of astronomical images. However, there are two drawbacks that limit the use of CNNs in astronomical research at present. The first is the network bandwidth, which prevents the download of large amounts of heavy images obtained in remote observatories. The second drawback is the computational and hardware resources needed to train a two-dimensional CNN with tens of thousands of images.

Dense (or fully connected) neural networks (DNN) are used to solve general classification problems applied to tabulated data. In astronomy, DNNs have been applied to morphological type classification in low-redshift galaxies. Thus, Storrie-Lombardi et al. (1992) designed a simple DNN architecture for morphological classification of 5217 galaxies drawn from the ESO-LV catalog (Lauberts & Valentijn 1989) using 13 parameters (most of them geometrical) in five different classes, obtaining an accuracy of 56%. Naim et al. (1995) used the same architecture for 830 bright galaxies (B ≤ 17) and 24 parameters, reducing the parameter space dimension through principal components analysis. Serra-Ricart et al. (1993) used DNN autoencoders for unsupervised classification of galaxies into three major classes: Sa+Sb, Sc+Sd, and S0+E. Sreejith et al. (2018) applied a DNN to a sample of 7528 galaxies at redshifts z < 0.06 extracted from the Galaxy And Mass Assembly survey (GAMA²), achieving an accuracy of 89.8% for spheroid- versus disk-dominated classification.

These earlier works showed that DNNs are capable of performing accurate classification tasks on processed data such as photometry, colors, and shape parameters of low-redshift galaxies (Storrie-Lombardi et al. 1992; Naim et al. 1995; Ball et al. 2004). However, compared with image-oriented CNNs, little attention has been paid recently to the use of DNNs for galaxy classification, even though these networks do not require the large quantity of resources used by CNNs. Moreover, both neural network software development (e.g., TensorFlow, Abadi et al. 2016) and hardware computation power (both central and graphics processing units) have increased dramatically, boosting the capabilities of DNN applications.

In this paper we extend the use of DNNs to the morphological classification of galaxies up to redshifts z ≤ 2. We compare the performance of different galaxy classification techniques applied to a sample of galaxies extracted from the photometric OTELO database (Bongiovanni et al. 2019) with a fitted Sérsic profile (Nadolny et al. 2020). These techniques are (1) the Strateva et al. (2001) u − r color algorithm; (2) the LDA machine learning algorithm, which includes both the u − r color and a shape parameter, either the Sérsic index or the concentration index (Kelvin et al. 2012); and (3) a DNN that uses optical and near-infrared photometry, and a shape parameter for objects available in both OTELO and COSMOS catalogs. We find that a simple, easily trainable DNN yields a highly accurate classification for ET and LT OTELO galaxies. Moreover, we apply our DNN architecture to a set of tabulated COSMOS data with some differences in the photometric bands measured with respect to OTELO, and find that our architecture also performs accurate classification of COSMOS galaxies. Finally, we use the same DNN architecture but substituting the Sérsic index with the concentration index (Shimasaku et al. 2001) for both OTELO and COSMOS datasets.

This paper is organized as follows. Section 2 describes the different techniques used to classify galaxies. In Sect. 3 we show the results and compare the different techniques. Finally, in Sect. 4 we present our conclusions.

2. Methodology

The current investigation involves the automatic classification of galaxies into two dichotomous groups, namely ET and LT galaxies, using both photometric measurements and a factor that depends on the shape of the galaxy’s light distribution. Machine learning algorithms for automatic classification parse data and learn how to assign subjects to different classes. These algorithms require both training and test datasets that consist of labeled data. The training dataset is used to fit the model parameters, and the test dataset to provide an unbiased assessment of the model performance. If the algorithm requires tuning the model hyperparameters, such as the number of layers and hidden units in a DNN architecture, a third labeled dataset called the validation dataset is required to evaluate different model trials (the test dataset must be evaluated only by the final model). Once the final model architecture is attained, it is trained on the union of the training and validation datasets, and then evaluated using the test dataset.

In this section we present our samples of galaxies extracted from OTELO and COSMOS. We use the observed photometry and colors, that is, neither k nor extinction corrections were performed. In order to maximize the sample size while keeping a well-sampled set in redshift, data have been limited in photometric redshift (z_phot ≤ 2) but not in flux, thus no cosmological inferences can be performed from our sample. However, at the end of Sect. 3 we present a brief analysis of the results obtained for flux-limited samples. We describe the photometry and the shape factors of these data. We then present the implementation of the different classification methodologies used: the u − r color, LDA, and DNN. Finally, we present the bootstrap procedure that we use to compare the results obtained with these methodologies.

2.1. OTELO samples

OTELO is a very deep blind survey performed with the red tunable filter OSIRIS instrument of the 10.4 m Gran Telescopio Canarias (Bongiovanni et al. 2019). OTELO data consist of images obtained in 36 adjacent narrow bands (FWHM 12 Å) covering a window of 230 Å around λ = 9175 Å. The catalog includes ancillary data ranging from X-rays to far infrared. Point spread function-model photometry and library templates were used for separating stars, AGNs, and galaxies.

The OTELO catalog comprises 11 237 galaxies. Nadolny et al. (2020) matched OTELO with the output from GALAPAGOS2 (Häussler et al. 2007; Häußler et al. 2013) over high-resolution HST images. Not all the OTELO galaxies were detected by GALAPAGOS2, which returned a total of 8812 sources. Nadolny et al. (2020) account for automated detection of multiple matches produced by more than one source lying inside the OTELO Kron radius in a high-resolution F814W band image. These latter authors attribute these multiple matches to close companions (gravitationally bound or in projection), mergers, or resolved parts of the host galaxy. In any case, sources with multiple matches were excluded from our analysis because they could affect low-resolution photometry. Finally, we included further constraints to extract our OTELO samples (see below).

2.1.1. Sérsic index and photometry sample

OTELO uses LePhare templates to fit the SED of galaxies to obtain photometric morphological type classification and redshift estimates. We used this morphological classification to assign the galaxies to ET and LT classes. The best model fitting is recorded under the MOD_BEST_deepN numerical coded entries in the OTELO catalog (Bongiovanni et al. 2019). The ET class includes galaxies coded as “1” in the OTELO catalog, which were best fitted by the E/S0 template from Coleman et al. (1980). The LT class comprises OTELO galaxies coded from “2” to “10”, which were best fitted by different late-type galaxy templates, namely Sbc, Scd, and Irr (Coleman et al. 1980), and starburst-class templates from SB1 to SB6 (Kinney et al. 1996). Bongiovanni et al. (2019) estimate that the fraction of inaccurate SED fittings for the galaxies contained in the OTELO catalog may amount to up to ∼4%. Therefore, our results may be affected if there are ET galaxies miscoded differently from “1” in OTELO, or any of the LT galaxies miscoded as “1”. This could affect, for example, early-type spirals such as Sa galaxies, which are not explicitly included in the OTELO template set. However, the UV SED for ellipticals and S0 galaxies is completely different from Sa and other LT galaxies. The OTELO catalog also includes GALEX-UV data that allow us to identify ET galaxies even in the local universe. Thus, we conclude that recoding the OTELO classification in our galaxy sample as ET and LT classes yields a negligible number of misclassified objects (certainly much less than the OTELO fraction of ∼4%) and does not affect our results.

The Sérsic profile is a parametric relationship that expresses the intensity of a galaxy as a function of the distance from its center:

$I(R) = I_{\rm e} \, e^{-b \left[ \left( \frac{R}{R_{\rm e}} \right)^{1/n} - 1 \right]},$ (1)

where I is the intensity at a distance R from the galaxy center, R_e is the half-light radius, I_e is the intensity at radius R_e, b (∼2n − 1/3) is a scale factor, and n is the Sérsic index. Sérsic profiles have been employed for galaxy classification (e.g., Kelvin et al. 2012; Vika et al. 2015). This index provides a geometrical description of the galaxy concentration; for a Sérsic index n = 4 we obtain the de Vaucouleurs profile typical of elliptical galaxies, while setting n = 1 gives the exponential profile describing spiral galaxies.
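As an illustration, Eq. (1) can be evaluated directly. The following is a minimal sketch, not the profile-fitting code used by GALAPAGOS2/GALFIT; the function name `sersic_intensity` and its argument names are hypothetical, and the scale factor uses the approximation b ∼ 2n − 1/3 quoted above.

```python
import math


def sersic_intensity(R, I_e, R_e, n):
    """Intensity I(R) of a Sersic profile, following Eq. (1).

    R   : distance from the galaxy center
    I_e : intensity at the half-light radius
    R_e : half-light radius
    n   : Sersic index
    """
    # Approximate scale factor quoted in the text: b ~ 2n - 1/3
    b = 2.0 * n - 1.0 / 3.0
    return I_e * math.exp(-b * ((R / R_e) ** (1.0 / n) - 1.0))


# Compare an exponential (n = 1) and a de Vaucouleurs (n = 4) profile
# at a few radii, in units of R_e and I_e.
for radius in (0.5, 1.0, 2.0):
    print(radius,
          sersic_intensity(radius, 1.0, 1.0, 1.0),
          sersic_intensity(radius, 1.0, 1.0, 4.0))
```

By construction, both profiles return I_e at R = R_e, whatever the value of n.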

Our OTELO Sérsic index and photometry (OTELO SP) sample consists of 1834 galaxies at redshifts z ≤ 2 extracted from the OTELO catalog (listed under the Z_BEST_deepN code). The sample includes ugriz optical photometry from the Canada-France-Hawaii Telescope Legacy Survey³ (CFHTLS), JHKs near-infrared photometry from the WIRcam Deep Survey⁴ (WIRDS), and Sérsic index estimates obtained using GALAPAGOS2/GALFIT (Häußler et al. 2013; Peng et al. 2002, 2010) on the HST-ACS publicly available data in the F814W band. The sample comprises only galaxies with Sérsic indexes between n = 0.22 and n = 7.9; Sérsic indexes outside this range are not reliable because of an artificial limit imposed by the Sérsic-profile-fitting algorithm in GALAPAGOS2 (Häussler et al. 2007). In addition, the sample does not include galaxies with Sérsic index values less than three times their estimated errors. For a detailed description of the Sérsic-profile-fitting process we refer to Nadolny et al. (2020).

Figure 1 shows the sample distributions of magnitudes in the r band and photometric redshifts extracted from the OTELO catalog. We note that the sample is not limited in flux, and therefore it is not a complete sample in the volume defined by the redshift limit z_phot ≤ 2 (see the discussion about magnitude-limited samples below). The redshift distribution presents concentrations at redshifts 0.04, 0.11, 0.34, 0.90, and 1.72 superimposed onto a bell-like distribution with a maximum around z_phot ≈ 0.8 and a strong decay from z_phot ≈ 1.3. The photometric data are incomplete, which affects the available number of galaxies for those classification procedures that cannot effectively manage missing data.

Fig. 1.

Comparative distribution of brightnesses in the r band and photometric redshifts for the 1834 galaxies in the OTELO SP sample. Bottom left panel: r magnitude vs. photometric redshift z_phot plot shows the galaxies in the sample, differentiating between LT galaxies (black circles) and ET galaxies (red squares). Top panel: SP sample photometric redshift distribution. Right panel: SP sample r magnitude distribution.

The sample is randomly divided into a training set (70% of the available galaxies) used for the algorithm training, and a test set (30%) used to yield an unbiased estimate of the efficiency of the model. Choosing the proportions of training and test sample sizes depends on a balance between the model performance and the variance in the estimates of the statistical parameters (in our case the accuracy, True ET Rate, and True LT Rate, as explained below). Rule-of-thumb proportions often used in machine learning are 90:10 (i.e., 90% training, 10% testing), 80:20 (inspired by the Pareto principle), and 70:30 (our choice). In our case, the 70:30 proportion is justified because it fulfills the large-enough-sample condition (another rule of thumb) that the sample size must be at least 30 to ensure that the conditions of the central limit theorem are met. Thus, the number of expected ET galaxies in the SP test sample is N_gal p_et p_test ≈ 30, where N_gal = 1834 is the sample size, p_et ≈ 0.055 is the proportion of ET galaxies in the SP sample, and p_test = 0.3 is the proportion of galaxies in the test sample.
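The arithmetic behind this rule of thumb can be sketched as follows. The snippet only uses the sample size, the split proportions, and the minimum count of 30 quoted in the text; the helper names are hypothetical. It computes the minimum class fraction for which the expected number of objects of that class in the test set reaches 30.

```python
N_GAL = 1834    # OTELO SP sample size (from the text)
MIN_COUNT = 30  # rule-of-thumb minimum sample size (from the text)


def expected_test_count(n_gal, p_class, p_test):
    """Expected number of objects of a given class in the test set."""
    return n_gal * p_class * p_test


def min_class_fraction(n_gal, p_test, min_count=MIN_COUNT):
    """Smallest class fraction giving at least min_count expected test objects."""
    return min_count / (n_gal * p_test)


# Minimum class fraction required by each rule-of-thumb split.
for label, p_test in (("90:10", 0.1), ("80:20", 0.2), ("70:30", 0.3)):
    print(label, round(min_class_fraction(N_GAL, p_test), 4))
```

For a rare class such as the ET galaxies, a smaller test fraction raises the required class proportion, which is why the 70:30 split is the most comfortable of the three options for this sample.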

2.1.2. Concentration and photometry sample

The concentration is widely used to differentiate ET from LT galaxies. Concentration provides a direct measurement of the intensity distribution in the image of a galaxy. For that reason, the concentration is easier to obtain than the Sérsic index, which requires fitting several parameters to the Sérsic profile.

Here we use the definition (Bershady et al. 2000; Scarlata et al. 2007):

$C = 5 \log_{10} \left( \frac{r_{80}}{r_{20}} \right),$ (2)

where r₈₀ and r₂₀ are the 80% and 20% light Petrosian radii, respectively, obtained from the HST F814W band images. We chose the F814W band concentration for compatibility with COSMOS (Scarlata et al. 2007). The data were limited to a redshift z ≤ 2. The final OTELO concentration and photometry (OTELO CP) sample consists of 2292 galaxies, with 114 classified as ET and 2178 as LT. Figure 2 shows the sample distributions of magnitudes in the r band and photometric redshifts, which is similar to the case of the SP sample discussed above. The CP sample was also divided in two subsamples: a training subsample containing 1604 (70%) of the objects, and a test subsample with 688 (30%) of the galaxies.
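Equation (2) translates into a one-line function. This is a minimal sketch with a hypothetical function name; it assumes that the two Petrosian radii have already been measured and are expressed in the same units.

```python
import math


def concentration(r80, r20):
    """Concentration index C = 5 log10(r80 / r20), following Eq. (2)."""
    if r80 <= 0 or r20 <= 0:
        raise ValueError("Both radii must be positive.")
    return 5.0 * math.log10(r80 / r20)


# A more centrally concentrated light profile has a larger r80/r20 ratio
# and therefore a larger concentration index.
print(concentration(r80=3.0, r20=1.0))
print(concentration(r80=6.0, r20=1.0))
```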

Fig. 2.

As Fig. 1 but for the 2292 galaxies in the OTELO CP sample.

2.2. COSMOS samples

We expect that our DNN architecture can be applied to galaxy classification in other databases. Therefore, we checked its reliability using two COSMOS enhanced data products: the Zurich Structure & Morphology Catalog v1.0 (ZSMC, Scarlata et al. 2007; Sargent et al. 2007) and the COSMOS photometric redshifts v1.5 (CPhR, Ilbert et al. 2009). Those catalogs have 131 532 and 385 065 entries, respectively. We merged both databases, obtaining 128 442 matches, from which we chose a sample of galaxies with Sérsic index estimates in the range 0.2 < n < 8.8, and another sample with the same concentration radii used in OTELO. Both samples are limited to redshifts z < 2 and include photometry in the CFHT u, Subaru BVgriz, UKIRT J and CFHT K bands, along with classification entries. Thus, the galaxy records included all the available data from the CPhR bands except the CFHT i′ magnitudes (we chose the Subaru i band also included in the catalog).

The resulting COSMOS Sérsic index and photometry (COSMOS SP) sample consists of 34 688 galaxies, 28 951 of which had been classified as LT and 5737 as ET. With such a large number of galaxies, we can limit the training set to 5000 galaxies (a fraction of approximately 14% of the sample), and raise the size of the testing set to 29 688 galaxies (approximately 86% of the sample) in order to reduce the variance of the results. Analogously, the COSMOS concentration and photometry (COSMOS CP) sample consists of 105 758 galaxies, distributed in 95 781 LT and 9977 ET. We set the corresponding training and testing sets to 10 000 and 95 758 galaxies, respectively.

2.3. Classification procedures

We used a classification baseline and three classification methods for the OTELO samples. The baseline consists in classifying all the galaxies into the most frequent morphological group. Any classification by a more sophisticated method should improve the baseline accuracy. For the COSMOS samples we only used the classification baseline and the DNN architecture developed for the OTELO samples, as we were interested only in probing this architecture.
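The baseline accuracy follows directly from the class counts. The sketch below uses the COSMOS SP and OTELO CP counts quoted in Sects. 2.1.2 and 2.2; the function name is hypothetical.

```python
from collections import Counter


def baseline_accuracy(labels):
    """Return the most frequent class and the accuracy obtained by
    assigning every object to that class."""
    counts = Counter(labels)
    majority_class, majority_count = counts.most_common(1)[0]
    return majority_class, majority_count / len(labels)


# Class counts quoted in the text.
cosmos_sp = ["LT"] * 28951 + ["ET"] * 5737
otelo_cp = ["LT"] * 2178 + ["ET"] * 114

for name, labels in (("COSMOS SP", cosmos_sp), ("OTELO CP", otelo_cp)):
    cls, acc = baseline_accuracy(labels)
    print(name, cls, round(acc, 4))
```

Because both samples are strongly dominated by LT galaxies, the baseline accuracy is already high, so the accuracy alone is not enough to judge a classifier; this motivates reporting per-class rates as well.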

2.3.1. Color classification

The first classification method uses a color discriminant. After testing several colors, we focus on the u − r color as proposed by Strateva et al. (2001). These authors use a simple color discriminant such that any galaxy with a u − r color redder than 2.22 is classified as ET, and as LT if u − r < 2.22. This method was applied only to the SP and CP samples drawn from OTELO. We also investigated other possible color discriminants that will be presented later. Data records with missing u − r colors were disregarded, reducing the SP sample to 1787 galaxies and the CP sample to 2189.
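The color cut can be written in a few lines. This is a minimal sketch with hypothetical names; records with a missing magnitude return `None` so that they can be discarded, as described above. The text does not specify the class assigned to a color exactly equal to 2.22, so assigning it to LT here is an assumption.

```python
import math

U_R_CUT = 2.22  # u - r threshold from Strateva et al. (2001)


def classify_by_color(u, r, cut=U_R_CUT):
    """Return 'ET', 'LT', or None when the u - r color is missing."""
    if u is None or r is None or math.isnan(u) or math.isnan(r):
        return None
    # Assumption: a color exactly equal to the cut is labeled LT.
    return "ET" if (u - r) > cut else "LT"


# Hypothetical (u, r) magnitudes; the last record lacks the u band.
records = [(24.0, 21.0), (22.0, 21.0), (float("nan"), 21.5)]
labels = [classify_by_color(u, r) for u, r in records]
usable = [lab for lab in labels if lab is not None]
print(labels, len(usable))
```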

2.3.2. Linear discriminant analysis

The second classification method is LDA. The aim of LDA is to find a linear combination of features which separates different classes of objects. The resulting combination can be interpreted as a hyperplane in the input feature space that separates the classes. We note that the Strateva et al. (2001) u − r color separation method can be regarded as an LDA which defines the u − r = 2.22 plane normal to the u − g and g − r vectors. As in the previous method, LDA was only applied to the SP and CP OTELO samples, and data records with missing u − r colors were disregarded.

Two problems with machine learning techniques are the management of missing data and the curse of dimensionality. Missing data (e.g., a photometric band) usually results in removing objects with incomplete records from the dataset. The curse of dimensionality appears because increasing the number of variables in a classification scheme means that the volume of the space increases very quickly and therefore the data become sparse and difficult to group. The curse of dimensionality can be mitigated by dimensionality reduction techniques such as principal component analysis (PCA), but dimensionality reduction may introduce unwanted effects (data loss, nonlinear relations between variables, and the number of components to be kept; Carreira-Perpiñán 2001; Shlens 2014) that tend to blur differences between the groups. Alternative methodologies to deal with these problems are under development, for example by Cai & Zhang (2018), who introduce an adaptive classifier to cope with both missing data and the curse of dimensionality for high-dimensional LDA. To avoid these problems, we chose to limit our LDA model to the Sérsic index and the single highly discriminant u − r color, as has already been done in the galaxy classification literature (e.g., Kelvin et al. 2012; Vika et al. 2015).
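To illustrate a two-feature LDA of this kind, the sketch below implements the standard two-class linear discriminant with a pooled covariance matrix using NumPy. It is not the implementation used in this work: the function names are hypothetical, and the data are synthetic Gaussian stand-ins for the u − r color and the Sérsic index, chosen only to make the example runnable.

```python
import numpy as np


def fit_lda(X, y):
    """Two-class LDA with a pooled covariance matrix.

    X : (n_samples, n_features) array; y : array of 0 (LT) and 1 (ET).
    Returns the weight vector w and the threshold c of the rule w.x > c.
    """
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class covariance matrix
    S = ((X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)) / (len(X) - 2)
    w = np.linalg.solve(S, mu1 - mu0)
    # Midpoint threshold shifted by the log ratio of the class priors
    c = w @ (mu0 + mu1) / 2.0 - np.log(len(X1) / len(X0))
    return w, c


def predict_lda(X, w, c):
    """Return 1 (ET) where the linear score exceeds the threshold, else 0 (LT)."""
    return (X @ w > c).astype(int)


# Synthetic stand-in data: columns are (u - r, Sersic index).
rng = np.random.default_rng(0)
lt = rng.normal([1.2, 1.0], [0.4, 0.5], size=(900, 2))
et = rng.normal([2.6, 4.0], [0.3, 0.8], size=(100, 2))
X = np.vstack([lt, et])
y = np.r_[np.zeros(900, dtype=int), np.ones(100, dtype=int)]

w, c = fit_lda(X, y)
accuracy = (predict_lda(X, w, c) == y).mean()
print(w, c, accuracy)
```

The decision boundary w · x = c is a straight line in the (u − r, n) plane, which generalizes the single vertical cut at u − r = 2.22 used by the color method.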

2.3.3. Deep neural network

The third method of classification involves a DNN. The sample was analyzed using the Keras library for deep learning. Keras is a high-level neural network application programming interface (API) written in Python and distributed as open-source software through GitHub. Currently, Keras is available for both the Python and R computer languages (Chollet 2017; Chollet & Allaire 2017). In astronomy, Keras has already been used for image classification of galaxy morphologies (Pérez-Carrasco et al. 2019; Domínguez Sánchez et al. 2018) and spectral classification and redshift estimates of quasars (Busca & Balland 2018), and is included in the astroNN package⁵.

As in the other methods, we use a training and a test set to teach and check the DNN model, respectively. The difference from the other methods is that the structure of their learning discriminant function is predetermined, while the DNN architecture should be tuned on the fly. To achieve this goal, we split the training set in the OTELO samples into (i) a teaching set (80% of the original training set), and (ii) a validation set (the remaining 20%). Compared with OTELO, the COSMOS samples consist of many more galaxies. Therefore, we limited the size of the training sets to 5000 galaxies for the COSMOS SP sample, and 10 000 for the CP sample, and conserved the respective teaching set and validation set proportions. We use the teaching set to tune the DNN model, and the validation set to check the loss and accuracy functions that describe the DNN classification capability. Once we have achieved a satisfactory result, the DNN architecture has been optimized to classify the validation set, but the performance may be different for other datasets. To generalize the result, we use the whole original training set to retrain the tuned DNN model, and we then classify the test set galaxies. Therefore, the test set galaxies were used neither to train nor to fine tune the DNN model, but only to evaluate the DNN performance.
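The nested split described above can be sketched with the Python standard library. The function name, the random seed, and the use of rounding to set the subset sizes are assumptions made for illustration; only the 70:30 and 80:20 proportions come from the text.

```python
import random


def split_indices(n, train_frac=0.7, teach_frac=0.8, seed=0):
    """Randomly split n record indices into teaching, validation, and test sets.

    The training set holds train_frac of the records; it is then divided
    into a teaching set (teach_frac of the training set) and a validation set.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(train_frac * n)
    train, test = idx[:n_train], idx[n_train:]
    n_teach = round(teach_frac * n_train)
    teach, valid = train[:n_teach], train[n_teach:]
    return teach, valid, test


# Example with the size of the OTELO CP sample (2292 galaxies).
teach, valid, test = split_indices(2292)
print(len(teach), len(valid), len(test))
```

After the architecture has been tuned on the teaching and validation sets, the two are merged again (`teach + valid`) to retrain the final model, and the test indices are used only once for the final evaluation.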

An appealing feature of DNNs is the ease with which they deal with missing data. In practice, it is enough to substitute the missing values in each normalized variable with zeros, which cancels their products with the network weights. The DNN then treats missing values as if they carried no useful information and ignores them. Of course, it is better to have no missing values, but DNNs allow the user to treat them without dropping data entries or estimating missing values from other variables.
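A minimal sketch of this zero substitution is given below. It assumes that normalization means standardizing each variable to zero mean and unit variance using only the observed values, and that missing entries are encoded as None; both choices are assumptions made for the illustration, and the function name is hypothetical.

```python
import math

def normalize_and_fill(column):
    """Standardize the observed values of one variable and set missing ones to 0.

    Missing entries are represented by None. After standardization the mean of
    the observed values is 0, so a zero multiplies the network weights to give
    no contribution from that variable.
    """
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    variance = sum((v - mean) ** 2 for v in observed) / len(observed)
    std = math.sqrt(variance) or 1.0  # guard against a constant column
    return [0.0 if v is None else (v - mean) / std for v in column]

# Hypothetical u - r colors with one missing measurement.
u_minus_r = [1.0, 2.0, None, 3.0]
filled = normalize_and_fill(u_minus_r)
print(filled)
```

With this encoding a missing value is indistinguishable from a value exactly at the mean of the variable, which is why the substitution is applied after normalization and not before.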

Baron (2019) provides a succinct description of DNNs, and a complete explanation of Keras elements can be found in Chollet (2017) and Chollet & Allaire (2017). Tuning a DNN is a trial-and-error procedure aimed at finding an appropriate architecture and setup. As the numbers of input variables, units, and layers increase, the DNN tends to overfit if the training set is small. For this reason, we kept our DNN model as simple as possible whilst obtaining a high-accuracy classification.

We use standard layers and functions for our model that are already available from Keras. For the interested reader, our DNN architecture consists of two dense layers of 64 units each with rectified linear unit (ReLU) activations, and an output dense layer of a single unit with sigmoid activation. The model was compiled using the RMSprop optimizer (an iterative gradient descent method), a binary cross-entropy loss function, and the accuracy metric. We kept the default values for the Keras RMSprop optimizer, i.e., a learning rate of 0.001 and a weight parameter for previous batches of ρ = 0.9. These values are appropriate for most DNN problems, and moderate changes do not affect the results. We set the number of training epochs to avoid overfitting, and the training batch sizes to appropriate values for the number of records in the DNN training sample in each case.
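To make the architecture concrete, the following framework-free sketch reproduces the forward pass of a network with the layer sizes and activations quoted above, together with the binary cross-entropy loss. It is not the Keras code used in this work: the weight initialization, the helper names, and the choice of nine inputs (the OTELO case) are assumptions made for the illustration, and no training is performed.

```python
import math
import random

def dense(inputs, weights, biases, activation):
    """Apply one fully connected layer: activation(W x + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative initialization)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
n_inputs = 9  # seven colors, the r magnitude, and the Sérsic index
layers = [init_layer(n_inputs, 64, rng), init_layer(64, 64, rng), init_layer(64, 1, rng)]
activations = [relu, relu, sigmoid]

def predict(features):
    """Forward pass: returns the sigmoid output, a number between 0 and 1."""
    values = features
    for (weights, biases), activation in zip(layers, activations):
        values = dense(values, weights, biases, activation)
    return values[0]

def binary_cross_entropy(label, probability):
    """Loss for one object with true label 0 or 1."""
    return -(label * math.log(probability) + (1 - label) * math.log(1 - probability))

# Number of trainable weights and biases in the three layers.
n_parameters = sum(len(w) * len(w[0]) + len(b) for w, b in layers)
probability = predict([0.0] * n_inputs)
print(n_parameters, probability)
```

For nine inputs the network has 4865 trainable parameters, a small number compared with typical image-classification networks, which is consistent with the aim of keeping the model simple to limit overfitting.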

2.3.4. Bootstrap

We used bootstrap (e.g., Efron & Tibshirani 1993; Chihara & Hesterberg 2018) to obtain reliable statistics that describe the performance of each classification technique. Bootstrapping is a widely used nonparametric methodology for evaluating the distribution of a statistic using random resampling with replacement. Thus, we calculated the classification accuracy and other classification statistics through 100 runs for the u − r color, LDA, and DNN methods. For each run, we also divided the bootstrap random sample into a training set (70%) and a test set (30%).
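The bootstrap procedure can be sketched as follows. The example is a simplified illustration: the statistic used here is the accuracy of a trivial majority-class predictor, the labels are synthetic, and summarizing the runs by their mean and standard deviation is an assumption of the sketch rather than a statement about how the quoted uncertainties were derived.

```python
import random
import statistics

def bootstrap_statistic(records, statistic, n_runs=100, train_fraction=0.7, seed=0):
    """Return the mean and standard deviation of `statistic` over bootstrap runs.

    Each run resamples the records with replacement, splits the resample into
    training and test parts, and evaluates statistic(train, test).
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_runs):
        resample = rng.choices(records, k=len(records))  # sampling with replacement
        n_train = round(train_fraction * len(resample))
        train, test = resample[:n_train], resample[n_train:]
        values.append(statistic(train, test))
    return statistics.mean(values), statistics.stdev(values)

def majority_accuracy(train, test):
    """Toy statistic: accuracy of predicting the most frequent training label."""
    majority = max(set(train), key=train.count)
    return sum(label == majority for label in test) / len(test)

# Hypothetical labels: 0 = LT, 1 = ET, with 5% ET galaxies.
labels = [1] * 50 + [0] * 950
mean_accuracy, std_accuracy = bootstrap_statistic(labels, majority_accuracy)
print(mean_accuracy, std_accuracy)
```

Any classifier can be plugged in through the `statistic` argument, provided that it is fitted on the training part and evaluated on the test part of each resample.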

3. Results

To determine a minimal set of attributes that are able to distinguish between ET and LT galaxies, we focus on two directly observable characteristics: photometry and shape. Results obtained in previous studies were limited to nearby galaxies. For example, Strateva et al. (2001) used photometry from 147 920 SDSS galaxies with magnitude g^* ≤ 21 and redshifts z ≲ 0.4 to build a binary classification model based on the u − r = 2.22 discriminant color, which they tested on a sample of 287 galaxies visually labeled as ET or LT, recovering 94 out of 117 (80%) ET, and 112 out of 170 (66%) LT galaxies. Deng (2013) used a sample of 233 669 SDSS-III DR8 galaxies with redshifts 0.01 < z < 0.25 and reported a concentration index discriminant to separate ET from LT galaxies in the r-band that achieved an accuracy of (96.43 ± 0.04)%. Vika et al. (2015) used both the u − r color and the Sérsic index in the r-band to classify a sample of 142 nearby (z < 0.01) galaxies, dividing the u − r versus n_r plane in quadrants; most ET galaxies were located in the u − r > 2.3 and n_r > 2.5 quadrant (28 out of 34, i.e., 82% of ETs were correctly classified).

3.1. Sérsic index and photometry samples

3.1.1. Baseline classification

The baseline classification is the simplest classification method. It assigns all the objects in the sample to the most frequent class. This classification is helpful for determining a baseline performance that is used as a benchmark for other classification methods. For this task, we selected all the galaxies in our OTELO SP sample. In total, there are 1834 galaxy records, 99 of them classified as ET galaxies (≈5.4%), and 1735 as LT galaxies (≈94.6%). The two groups are strongly unbalanced, so the baseline classification achieves a high overall accuracy of 94.6%, which any other classification method should exceed.
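The baseline accuracy follows directly from the class counts quoted above, as the short sketch below shows. The function name and the dictionary layout are choices made for the illustration.

```python
def baseline_accuracy(class_counts):
    """Accuracy obtained by assigning every object to the most frequent class."""
    total = sum(class_counts.values())
    majority_class = max(class_counts, key=class_counts.get)
    return majority_class, class_counts[majority_class] / total

# Class counts quoted in the text for the OTELO SP sample.
otelo_sp = {"ET": 99, "LT": 1735}
majority, accuracy = baseline_accuracy(otelo_sp)
print(majority, round(accuracy, 3))  # LT 0.946
```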

3.1.2. Color classification

A preliminary study was performed to determine which colors yield a split between ET and LT galaxies that outperforms the baseline. Table 1 shows several examples of the measured accuracy for selecting appropriate single color discriminants. We note that several colors did not perform better than the baseline classification (94.6%), but those involving the u and a red band usually yield the most accurate results. Both u − J and u − i colors perform marginally better than u − r, although u − J has a larger number of missing records. We base the rest of the color analysis on the u − r color to allow a direct comparison with the results of Strateva et al. (2001).

Table 1.

One color separation.

Table 2 shows an example of the confusion matrix for a single u − r color bootstrap run, yielding an accuracy of 0.959 ± 0.009. Table 3 shows the Accuracy, True ET Rate, and True LT Rate for the different databases and classification methods used in this paper, obtained through the bootstrap procedure. The True ET Rate and True LT Rate both indicate the proportion of ET and LT galaxies, respectively, recovered through the classification procedure. For the u − r color, the statistics yield an average Accuracy of 0.96 ± 0.02, a True ET Rate of 0.8 ± 0.3, and a True LT Rate of 0.97 ± 0.02. The True ET Rate is the least precise of all the statistics in all the samples because of the relatively low number of ET galaxies.
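The three statistics can be computed from the four entries of a confusion matrix as sketched below. The counts in the example are hypothetical and are not the values of Table 2; the argument names are chosen for the illustration.

```python
def classification_rates(true_et, false_lt, false_et, true_lt):
    """Compute Accuracy, True ET Rate, and True LT Rate from confusion-matrix counts.

    true_et:  ET galaxies classified as ET
    false_lt: ET galaxies classified as LT
    false_et: LT galaxies classified as ET
    true_lt:  LT galaxies classified as LT
    """
    total = true_et + false_lt + false_et + true_lt
    accuracy = (true_et + true_lt) / total
    true_et_rate = true_et / (true_et + false_lt)
    true_lt_rate = true_lt / (true_lt + false_et)
    return accuracy, true_et_rate, true_lt_rate

# Hypothetical counts for a 500-galaxy test set; not the values of Table 2.
accuracy, et_rate, lt_rate = classification_rates(
    true_et=20, false_lt=5, false_et=10, true_lt=465)
print(accuracy, et_rate, lt_rate)
```

Because the True ET Rate is a ratio over the small number of ET galaxies in the test set, a change of a few objects moves it far more than the True LT Rate, which illustrates why it is the least precise of the three statistics.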

Table 2.

Color u − r confusion matrix.

Table 3.

Comparison of classification methods for SP samples.

Bootstrap yields a u − r color discriminant for ET and LT separation of 2.0 ± 0.2, as shown in Fig. 3. The agreement with Strateva et al. (2001), u − r = 2.22 (no error estimate is provided by these authors), is remarkable considering that the galaxies studied by these authors have redshifts in the interval 0 < z ≤ 0.4 while our sample extends to z ≤ 2. Figure 4 shows the u − r color distribution as a function of the redshift for the OTELO SP sample. It is worth noting that LT galaxies in this sample tend to be bluer at redshifts z ≳ 0.5, possibly due to an enhanced star-forming activity as also pointed out by Strateva et al. (2001). This feature, along with the scarcity of ET galaxies at z > 1 (about 13% of all the ET galaxies in the OTELO SP sample), explains the agreement between the results of Strateva et al. (2001) and ours despite the redshift differences.

Fig. 3.

Classification for the OTELO SP sample of 1834 galaxies through the u − r color and LDA algorithms. The u − r color vs. logarithm of the Sérsic index n plot shows the original morphological type classification in the OTELO catalog reduced to LT galaxies (black circles) and ET galaxies (red squares). The dotted blue line indicates the u − r color separation, and the dashed green line the LDA separation by the u − r color and the Sérsic index. The logarithmic scale for the Sérsic index makes the comparison with the concentration in Fig. 10 (which is already a logarithmic quantity) easier, but it bends the LDA line. The reader should take into account that this is not a flux limited sample.

Fig. 4.

Comparative distribution of the u − r color and redshifts for the 1834 galaxies in the OTELO SP sample. Bottom left panel: u − r color vs. photometric redshift z_phot plot shows the original morphological type classification in the OTELO catalog reduced to LT galaxies (black circles) and ET galaxies (red squares). The dotted blue line indicates the u − r color separation. Top panel: SP sample photometric redshift distribution. Right panel: SP sample u − r color distribution.

The distribution of the bootstrap Accuracy for the u − r color classification is shown in the upper panel of Fig. 5. Most of the u − r color accuracies are larger than the baseline, but the two extreme bootstrap runs with accuracies lying in the 0.915–0.92 interval fail to detect any ET galaxy.

Fig. 5.

Histogram of accuracies for 100 galaxy classification bootstrap runs for the OTELO SP sample. The solid black lines correspond to the baseline accuracy distribution of the whole sample of 1834 OTELO SP galaxies. The red histogram and the Gaussian fit represented by a dashed line show the u − r color accuracy distribution obtained from 100 bootstrap runs. As for the u − r color, the green histogram and the dotted line show the LDA accuracy distribution. Similarly, the blue histogram and the dash-dotted line show the DNN accuracy distribution.

The True ET Rate is analogous to the true positive rate, and the True LT Rate is the complement of the false positive rate (false positive rate = 1 − True LT Rate). These statistics are used in receiver operating characteristic (ROC) curves to represent the ability to discriminate between two groups as a function of a variable threshold, usually the likelihood of the classification (e.g., Baron 2019). Figure 6 shows the distribution of bootstrap values in the True ET Rate versus True LT Rate plane. Every point in this figure corresponds to the 50% probability threshold of the ROC curve (not shown) for each bootstrap run. The closer the point to the top-left corner, the better the classification. The data point located at True LT Rate = 1, True ET Rate = 0 corresponds to the two u − r color bootstrap runs that failed to detect any ET galaxy. Below, for the LDA and DNN classification methods, we increase the number of predictor variables used to enhance the distinction between ET and LT galaxies.

Fig. 6.

True ET Rate vs. True LT Rate values for 100 bootstrap runs in each classification method on the OTELO SP sample. The closer to the upper-left corner, the better the classification result. DNN runs consistently yield the best classifications.

3.1.3. Linear discriminant analysis classification

Although the use of colors is an improvement on the baseline classification, and the single u − r color method is very easy to implement, we have additional data at our disposal that allow more powerful classification techniques. In particular, it is very helpful to include a parameter associated with the galaxy morphology that can be inferred from optical or near-infrared observations. The Sérsic profile in Eq. (1) describes the intensity of a galaxy as a function of the distance from its center regardless of the galaxy colors, and thus can be useful for our purpose.

The dataset combining u − r colors and Sérsic indexes has been probed using linear discriminant analysis. The sample with complete records consisted of 1787 galaxies, which were split into a training group of 1251 and a test group of 536. Figure 3 shows the LDA separation in the u − r color versus Sérsic index n plane for the test galaxies. The logarithmic scale for the Sérsic index axis makes the visual comparison with the concentration index (which is already a logarithmic quantity) easier, but at the cost of showing a bent LDA line. The u − r color is the main discriminant, but the Sérsic index helps to separate the ET and LT sets more clearly. The separation line is located at u − r = (2.756 ± 0.002) − (0.14125 ± 0.00007) n, where n is the Sérsic index. An example of the confusion matrix for the test set LDA classification is shown in Table 4. For a total of 536 test galaxies, only 15 (7 + 8) were misclassified, yielding a classification accuracy of 0.972 ± 0.008 in this particular case.
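The separation line quoted above can be applied as a simple decision rule. The sketch below uses the central values of the fitted coefficients and assumes that ET galaxies lie on the redder, higher Sérsic index side of the line; this orientation is an assumption of the example, as are the function names and the two test galaxies.

```python
def lda_boundary_color(sersic_index, intercept=2.756, slope=0.14125):
    """u - r value of the quoted separation line at a given Sérsic index n."""
    return intercept - slope * sersic_index

def classify_lda(u_minus_r, sersic_index):
    """Label a galaxy 'ET' if it lies on the red side of the line, else 'LT'.

    Assumption: ET galaxies occupy the redder, higher-n side of the boundary.
    """
    return "ET" if u_minus_r > lda_boundary_color(sersic_index) else "LT"

# Hypothetical galaxies given as (u - r, n).
print(classify_lda(2.5, 4.0))  # boundary at n = 4 is 2.756 - 0.565 = 2.191
print(classify_lda(1.2, 1.0))  # boundary at n = 1 is 2.61475
```

The small slope shows that the color threshold changes by only about 0.14 mag per unit of Sérsic index, in line with the statement that the u − r color is the main discriminant.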

Table 4.

LDA confusion matrix.

Linear discriminant analysis improves on both the baseline and the u − r color classifications, as shown in Table 3 and Fig. 5. The average True ET Rate of 0.80 is similar to that of the u − r color, and the True LT Rate of 0.979 is marginally larger. Altogether, including the Sérsic index yields a moderate improvement in the average accuracy (from 0.96 to 0.970) and reduces the accuracy uncertainty by 60% (from 0.02 to 0.008) with respect to the u − r color discriminant.

The LDA classification presented above is a simple machine learning methodology that shows the potential of this kind of algorithm. As with most machine learning methods, LDA does not incorporate an easy solution to deal with missing data, although the research in this area has been continuous over the last 50 years (e.g., Jackson 1968; Chan et al. 1976; Cai & Zhang 2018). Therefore, the usual way to deal with missing values is simply dropping incomplete records. This is a major problem when dealing with cross-correlated data gathered from multiple catalogs because missing data is a frequent characteristic of catalog entries. Thus, to prevent a drastic reduction in the amount of complete records, we are forced to put a limit on the number of photometric colors.

3.1.4. DNN classification

Classification based on DNNs allows us to overcome the missing data problem that limits the number of feasible variables of other machine learning solutions. This feature by itself justifies its application in astronomical databases, where records are often incomplete. In the following, we show the results obtained for both OTELO and COSMOS photometry and Sérsic index samples.

OTELO. We applied a very simple DNN to the OTELO catalog. First we computed the colors u − r, g − r, r − i, r − z, r − J, r − H, and r − Ks, and we introduced these colors as inputs in the DNN along with the r magnitude and the Sérsic index, that is, a total of nine input factors feeding the DNN. One example of the 100 random samplings analyzed with our DNN classification is shown in Table 5. For this particular example, the classification accuracy is 0.985 ± 0.006. We highlight the fact that, because of the missing data management, the number of cases included in the DNN classification (551) is larger than those for the u − r color and LDA methods (536), despite the differences in the number of input factors (9 for the DNN versus 2 for the LDA or 1 for the u − r color) which in most machine learning techniques would lead to a larger number of incomplete records being left out.

Table 5.

OTELO DNN confusion matrix.

The mean accuracy for our 100 DNN samplings is 0.985 ± 0.007, as shown in Table 3 and in Fig. 5. The True ET Rate is 0.84, marginally larger than the u − r and LDA values, and the True LT Rate is the highest of the three methods tested.

Table 6 shows the eight discrepancies between the DNN and OTELO classifications for the test sample data set presented in Table 5.

Table 6.

OTELO DNN mismatches.

Figure 7 presents the HST images in the F814W band for these eight galaxies. For the visual classification, we have taken into account the galaxy elongation and the light distribution in the HST image; the GALFIT model helps to indicate the shape and orientation, and the image residuals indicate a possible lack of fitting or possible substructures not visible in the HST image. Elongated and fuzzy images support a LT visual classification, while a round and soft appearance points to an ET galaxy. From our visual check, we conclude that six out of the eight galaxies with different class ascription are correctly classified by our DNN algorithm. Following is a brief description of each mismatched object.

Fig. 7.

OTELO – DNN discrepancies. Relative declination vs. right ascension coordinates in arcsec. First column: combined HST images from the F814W (reddish) and F606W (greenish) bands, with galaxy ID at the top-right corner. Second column: GALFIT models for the light distribution. Third column: HST minus GALFIT model residuals. Fourth, fifth, and sixth columns: same as the previous three columns.

– ID 267. A north–south oriented disk galaxy with a fuzzy northeast portion. The bulge of the galaxy broadly dominates the disk component. Compatible with a Sab class. Visual classification as LT.

– ID 496. A rounded smooth galaxy with a visual LT companion at the northwest and a star at the southwest. Visual classification as ET.

– ID 1895. Appears as a rounded and compact galaxy in the HST F814W image (our detection image used for GALAPAGOS). However, visual inspection of the HST F606W image shown in Fig. 8 reveals that there is a companion source not detected in F814W. Using our web-based graphic user interface⁶ we find that this companion, probably an LT galaxy that is also undetected in the OTELO deep image, enters the ellipse which was used to extract photometry. It is likely that a composite SED could be well fitted by LT templates instead of a single-population one. Visual classification as ET with an unresolved companion.

Fig. 8.

HST images of ID 1895 and GALFIT models. First column: high-resolution HST images in the F814W band. Second column: GALFIT models for the light distribution. Third column: HST minus GALFIT model residuals. Fourth, fifth, and sixth columns: as the previous columns, but for the HST F606W images. There is a fuzzy object near the eastern border of the F606W image that is not resolved in the OTELO deep image.

– ID 2818. A round shaped galaxy, the image of residuals suggests possible over-subtraction. Visual classification as ET.

– ID 3680. A small, fuzzy, northwest to southeast oriented disk galaxy. Visual classification as LT.

– ID 4010. A rounded galaxy, with a LT companion at the southeast. Visual classification as ET.

– ID 4923. A faint and fuzzy galaxy. Visual classification as LT.

– ID 10207. A west-east oriented fuzzy small disk galaxy. Visual classification as LT.

COSMOS. We used the COSMOS dataset to check for the reliability of our DNN architecture. Using the ZSMC and CPhR catalogs we built a sample of 34 688 galaxies for which the photometry and Sérsic indexes are available. Photometric bands in the CPhR catalog do not exactly match OTELO’s bands. We have included Subaru’s BV bands, and have excluded the H band which is absent from the CPhR database. Thus, the COSMOS data used in this work consist of nine photometric bands (compared with eight in the case of OTELO catalog) and the Sérsic index. Because the OTELO and COSMOS bands are different, we had to train our DNN model again. As in the case of OTELO, we fed the DNN with the Sérsic index, the r magnitudes, and the colors relative to the r band. We did so without changing the DNN architecture except for the number of inputs. Despite the differences between the two datasets, we shall see that our DNN architecture reaches a high classification accuracy also for the COSMOS data.

Table 7 shows the confusion matrix for one of the 100 random samplings that we used to characterize the COSMOS DNN. For this sampling in particular, the classification accuracy is 0.967 ± 0.001. Figure 9 and Table 3 show the distribution of accuracies for 100 DNN classification trials obtained from the COSMOS dataset. The mean accuracy for these trials is 0.967 ± 0.002, well above the relatively low baseline of 0.835 ± 0.002, which corresponds to 28 951 LT galaxies out of a total of 34 688 objects included in our COSMOS SP sample. Not only is the COSMOS SP baseline lower than OTELO’s (0.946), but the DNN performance is also lower: 0.967 for COSMOS compared with 0.985 for OTELO. The True ET Rate of 0.91 for COSMOS SP is similar within the errors to that of OTELO (0.84), but the True LT Rate for COSMOS is slightly lower (0.979) than that for OTELO (0.993).

Fig. 9.

Histogram of accuracies for 100 galaxy classification bootstrap runs using the COSMOS SP sample. The solid black line corresponds to the baseline accuracy distribution of a sample of 34 688 COSMOS galaxies. The blue histogram and dash-dotted line show the DNN accuracy distribution for a test sample of 29 688 galaxies.

Table 7.

COSMOS DNN confusion matrix.

Applying the same DNN architecture to the OTELO and COSMOS datasets, the method yields high classification accuracy in both cases. Band differences between both datasets may contribute to the accuracy results. We note that OTELO optical bands were gathered from CFHTLS data, but most of the COSMOS optical bands used were measured by Subaru. The OTELO H band is missing in COSMOS, while the COSMOS BV bands, which are not included in our OTELO dataset, are heavily correlated with the gr bands.

The high classification accuracies for both the OTELO and the COSMOS datasets suggest that our proposed DNN architecture may be applicable to a large number of databases that encompass both visible and infrared photometric bands and an estimate of the Sérsic index.

3.2. Concentration and photometry samples

The Sérsic index that we used in the LDA and DNN classification methods detailed above is obtained through a parametric fitting that is difficult to achieve when dealing with low-resolution images. On the contrary, the radius containing a given fraction of the galaxy total brightness is easier to estimate and can be measured directly. In this section we repeat our previous analysis of the OTELO and COSMOS databases, but using samples obtained through the concentration index defined as the ratio between the radii containing 80% and 20% of the galaxy brightness.

Table 8 shows the results obtained with the OTELO and COSMOS CP samples. As in the Sérsic index samples, the DNN classification yields the highest accuracy for OTELO (0.980), and also yields very accurate results for COSMOS (0.971). In general, the results are comparable with those obtained using the SP sample.

Table 8.

Comparison of classification methods for CP samples.

Figure 10 shows the distribution of the u − r colors versus the concentration index, along with the u − r color and the LDA separation boundaries. The u − r color separation is 2.1 ± 0.3, in agreement with the values for the OTELO SP sample (2.0 ± 0.2), and Strateva et al. (2001, u − r = 2.22). The LDA separation is located at

$\begin{aligned} u-r = (3.882 \pm 0.002) - (0.5342 \pm 0.0003) C, \end{aligned}$

where C is the concentration. The same trend of LT galaxies getting bluer at redshifts z > 0.5 can be seen in Fig. 11.

Fig. 10.

Classification for the OTELO CP sample of 2292 galaxies through the u − r color and LDA algorithms. The u − r color vs. concentration plot shows the original morphological type classification in the OTELO catalog reduced to LT galaxies (black circles) and ET galaxies (red squares). The dotted blue line indicates the u − r color separation, and the dashed green line the LDA separation by the u − r color and the concentration. The reader should take into account that this is not a flux limited sample.

Fig. 11.

As Fig. 4 but for the 2292 galaxies in the OTELO CP sample.

Figure 12 shows the distributions for the accuracies of the baseline, u − r color, LDA and DNN classifications performed on the OTELO CP sample. As in the OTELO SP sample, the DNN yields the best accuracy, then LDA and finally the u − r color classification.

Fig. 12.

As Fig. 5 but for the 100 galaxy classification bootstrap runs for the 2292 galaxies of the OTELO CP sample.

The distribution of DNN accuracies for the COSMOS CP sample is shown in Fig. 13. Compared with the COSMOS SP sample, the proportion of LT galaxies is larger (95 781 LT out of 105 758 galaxies), yielding a more accurate baseline (0.906). The DNN accuracies are comparable, with a lower True ET Rate and a marginally larger True LT Rate for the COSMOS CP sample.

Fig. 13.

Histogram of accuracies for 100 galaxy classification bootstrap runs using the COSMOS CP sample. The solid black line corresponds to the baseline accuracy distribution of a sample of 105 758 COSMOS galaxies. The blue histogram and dash-dotted line show the DNN accuracy distribution for a test sample of 95 758 galaxies.

3.3. Magnitude limited samples

Our aim in this paper is to use machine learning techniques to distinguish between ET and LT galaxies. Thus, our samples are selected from a redshift-limited region with the only requirement of containing enough galaxies in every redshift interval for accurate training and testing of the machine learning algorithm.

However, neither OTELO nor COSMOS SP and CP samples were flux limited to produce a complete sample of galaxies in the volume defined by z ≤ 2. This leads us to question the possible cosmological inferences of our results.

In this section, we present the results of the machine learning algorithms using flux-limited samples for both training and testing sets. Figure 14 shows the cumulative distribution of galaxies by r magnitudes for all the samples analyzed so far. With respect to the low-brightness tails, both OTELO SP and CP samples have similar cumulative distributions that flatten around a magnitude of r ≃ 26. This flattening may be considered as a rough measurement of completeness. For comparison, Bongiovanni et al. (2019) estimate from the OTELO-Deep image measurements that the OTELO catalog reaches a 50% completeness flux at magnitude 26.38. For COSMOS, the SP sample flattens around r ≃ 23, while the CP sample does so at r ≃ 24. Since COSMOS samples cover a larger volume, their high-brightness tails extend to galaxies approximately 1.5 magnitudes brighter than those in the much more confined OTELO volume.

Fig. 14.

Cumulative distribution of galaxies by r magnitudes. OTELO SP and CP samples are 3 and 2 magnitudes deeper than the COSMOS SP and CP samples, respectively. However, COSMOS sweeps a larger volume as the brightest end of the cumulative distribution implies.

We check our machine learning algorithms using flux-limited samples. Table 9 shows the results of 100 bootstrap runs on different OTELO r-magnitude-limited samples. We highlight the fact that all the u − r color, LDA, and DNN accuracies are consistent within the errors. However, for brighter samples, we can observe a downward trend in accuracies and upward trend in uncertainties for the LDA and DNN classifications, while the u − r color results remain basically without change. Analogously, Table 10 shows the limit r magnitude (Col. 1), the Sérsic index (Col. 2), the baseline (Col. 3), and the DNN accuracy for the COSMOS SP and CP samples. In this case, the training set size is always 5000 for the SP and 10 000 for the CP samples, except for the CP limit magnitude r ≤ 22 with a sample size of 9852, for which the training set was set to 5000. As in the OTELO case, we notice the consistency of the DNN accuracies within the errors, and the trend towards lower accuracies and larger uncertainties.
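The construction of magnitude-limited samples can be sketched as follows. The toy catalog, its field names, and the limits are hypothetical and serve only to show how the sample size and the baseline change as the magnitude limit becomes brighter; they do not reproduce the values of Tables 9 and 10.

```python
def magnitude_limited(sample, r_limit):
    """Keep galaxies at least as bright as r_limit (smaller magnitude = brighter)."""
    return [galaxy for galaxy in sample if galaxy["r"] <= r_limit]

def lt_fraction(sample):
    """Fraction of LT galaxies, i.e., the baseline accuracy when LT is the majority."""
    return sum(galaxy["type"] == "LT" for galaxy in sample) / len(sample)

# Hypothetical toy catalog; magnitudes and types are illustrative only.
catalog = [
    {"r": 22.1, "type": "ET"}, {"r": 23.4, "type": "LT"},
    {"r": 24.8, "type": "LT"}, {"r": 25.5, "type": "LT"},
    {"r": 26.3, "type": "LT"}, {"r": 26.9, "type": "LT"},
]
for r_limit in (25, 26, 27):
    subsample = magnitude_limited(catalog, r_limit)
    print(r_limit, len(subsample), round(lt_fraction(subsample), 3))
```

In this toy example the brighter limits leave both fewer galaxies and a lower LT fraction, the two ingredients that can make a classifier less accurate and its accuracy estimate less precise.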

Table 9.

Mean accuracy for OTELO samples at different magnitude limits in the r-band.

Table 10.

Mean accuracy for COSMOS samples at different magnitude limits in the r-band.

There are two effects that may account for the trends in accuracy and uncertainty observed in the LDA and DNN classification methods. On the one hand, we detect a tendency for a lower proportion of LT galaxies in brighter samples, indicated by the baseline decrease. As the a priori probabilities of a galaxy being ET or LT become more alike, the uncertainty in the classification increases. On the other hand, as the sample size shrinks in brighter samples, so do the subsets reserved for training and testing (70% and 30% of the sample, respectively). This shrinking of the sample size leads to a less satisfactory training and a less precise testing. Both effects, the baseline decrease and the sample shrinking, tend to reduce the classification accuracy. With respect to the u − r classification, the u − r color discriminant is determined by low-redshift galaxies (see Figs. 4 and 11) that tend to dominate flux-limited samples. Thus, the discriminant remains constrained around a value of 2, and the classification accuracy remains around 0.96. For the brighter SP and CP samples, with magnitude limit r ≤ 25, the LDA and the DNN accuracies are similar to that of the u − r color. In the other two magnitude-limited cases (r ≤ 26 and r ≤ 27), the DNN presents the highest accuracy, and the accuracy of the LDA is higher than that of the u − r color.

These results show that all the machine learning methods for classification presented in this paper are robust for both limited and unlimited flux samples.

4. Conclusions

Neural networks are becoming increasingly important for image classification and will play a fundamental role in mining future databases. However, many of the current astronomical databases consist of catalogs of tabulated data. Machine learning techniques are often used to analyze astronomical tabulated data, but analysis through DNNs is far less frequent and limited to low-redshift galaxies.

Here, we provide a consistent and homogeneous comparison of the popular techniques used in the literature for binary ET and LT morphological type classification of galaxies up to redshift z ≤ 2. We used data from the OTELO catalog for classifying galaxies by means of (i) the single u − r color discriminant, (ii) LDA using u − r color and the shape parameter (Sérsic or concentration index), and (iii) DNN fed by visual-to-NIR photometry and shape parameter. We also applied the DNN architecture developed for OTELO on COSMOS to probe its reliability and reproducibility in a different database.

Both Sérsic index and concentration index shape parameters yield comparable results, but using the concentration index allowed us to increase the size of the available OTELO and COSMOS data. All the machine learning methodologies for galaxy classification tested in this paper are robust and produce comparable results for both limited and unlimited flux samples. Accuracy, True ET Rate, and True LT Rate estimates show that DNN outperforms the other two methods and allows the user to classify more objects because of the missing data management.

These results show that DNN classification is a powerful and reliable technique to mine existing optical astronomical databases. For unresolved objects, the morphological identification is unattainable, the spectrum of a dim object is very difficult to obtain, and multiwavelength data are usually unavailable. For most objects, photometric visible and near infrared observations are the only (and usually incomplete) accessible data.

This study indicates that DNN classification may address the mining of currently available astronomical databases better than other popular techniques.

An important limitation for all machine learning techniques is the availability of labeled data, that is, data that have already been classified or measured. This limited us to a binary ET and LT classification and to impose a redshift threshold. Incorporating reliable synthetic data for classification training is an important goal if we wish to overcome these limitations.

Our results provide compelling support for extending the DNN classification to targets other than binary morphological classification of galaxies, such as separating stars from galaxies, deciphering the spectral type of stars, and detecting rare events. The application of DNN is not restricted to classification problems. Our results strongly suggest that DNN methods can also be very effective in exploring other issues such as, for example, photometric redshift estimates.

¹ http://cosmos.astro.caltech.edu

² http://www.gama-survey.org

³ http://www.cfht.hawaii.edu/Science/CFHTLS/

⁴ https://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/en/cfht/wirds.html

⁵ https://astronn.readthedocs.io/en/latest/

⁶ http://research.iac.es/proyecto/otelo/pages/data-tools/analysis.php

Acknowledgments

The authors are grateful to the referee for careful reading of the paper and valuable suggestions and comments. This work was supported by the project Evolution of Galaxies, of reference AYA2014-58861-C3-1-P and AYA2017-88007-C3-1-P, within the “Programa estatal de fomento de la investigacion cientifica y tecnica de excelencia del Plan Estatal de Investigacion Cientifica y Tecnica y de Innovacion (2013–2016)” of the “Agencia Estatal de Investigacion del Ministerio de Ciencia, Innovacion y Universidades”, and co-financed by the FEDER “Fondo Europeo de Desarrollo Regional”. JAD is grateful for the support from the UNAM-DGAPA-PASPA 2019 program, the UNAM-CIC, the Canary Islands CIE: Tricontinental Atlantic Campus 2017, and the kind hospitality of the IAC. MP acknowledges financial support from the Ethiopian Space Science and Technology Institute (ESSTI) under the Ethiopian Ministry of Innovation and Technology (MoIT), and from the Spanish Ministry of Economy and Competitiveness (MINECO) through projects AYA2013-42227-P and AYA2016-76682C3-1-P. APG, MSP and RPM were supported by the PNAYA project: AYA2017–88007–C3–2–P. MC and APG are also funded by Spanish State Research Agency grant MDM-2017-0737 (Unidad de Excelencia María de Maeztu CAB). EJA acknowledges support from the Spanish Government Ministerio de Ciencia, Innovación y Universidades through grant PGC2018-095049-B-C21. M.P. and E.J.A. also acknowledge support from the State Agency for Research of the Spanish MCIU through the Center of Excellence Severo Ochoa award for the Instituto de Astrofísica de Andalucía (SEV-2017-0709). JG receives support through the project AyA2018-RTI-096188-B-100. MALL acknowledges support from the Carlsberg Foundation via a Semper Ardens grant (CF15-0384). JIGS receives support through the Proyecto Puente 52.JU25.64661 (2018) funded by Sodercan S.A. and the Universidad de Cantabria, and PGC2018–099705–B–100 funded by the Ministerio de Ciencia, Innovación y Universidades.
Based on observations made with the Gran Telescopio Canarias (GTC), installed at the Spanish Observatorio del Roque de los Muchachos of the Instituto de Astrofísica de Canarias, on the island of La Palma. This work is (partly) based on data obtained with the instrument OSIRIS, built by a Consortium led by the Instituto de Astrofísica de Canarias in collaboration with the Instituto de Astronomía of the Universidad Autónoma de México. OSIRIS was funded by GRANTECAN and the National Plan of Astronomy and Astrophysics of the Spanish Government.

References

Abadi, M., Agarwal, A., Barham, P., et al. 2016, ArXiv e-prints [arXiv:1603.04467] [Google Scholar]
Abdel-Rahman, H. I., & Mohammed, S. A. 2019, NRIAG J. Astron. Geophys., 8, 180 [CrossRef] [Google Scholar]
Abolfathi, B., Aguado, D. S., Aguilar, G., et al. 2018, ApJS, 235, A42 [NASA ADS] [CrossRef] [Google Scholar]
Abraham, R. G., Valdes, F., Yee, H. K. C., et al. 1994, ApJ, 432, 75 [NASA ADS] [CrossRef] [Google Scholar]
Alpaslan, M., Driver, S., Robotham, A. S. G., et al. 2015, MNRAS, 451, 3249 [NASA ADS] [CrossRef] [Google Scholar]
Arnouts, S., Cristiani, S., Moscardini, L., et al. 1999, MNRAS, 310, 540 [NASA ADS] [CrossRef] [Google Scholar]
Ball, N. M., Loveday, J., Fukugita, M., et al. 2004, MNRAS, 348, 1038 [NASA ADS] [CrossRef] [Google Scholar]
Barchi, P. H., de Carvalho, R. R., Rosa, R. R., et al. 2020, Astron. Comput., 30, 100334 [CrossRef] [Google Scholar]
Baron, D. 2019, ArXiv e-prints [arXiv:1904.07248] [Google Scholar]
Bernardi, M., Shankar, F., Hyde, J. B., et al. 2010, MNRAS, 404, 2087 [NASA ADS] [Google Scholar]
Bershady, M. A., Jangren, A., & Conselice, C. J. 2000, AJ, 119, 2645 [NASA ADS] [CrossRef] [Google Scholar]
Bertin, E., & Arnouts, S. 1996, A&AS, 117, 393 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Bongiovanni, Á., Ramón-Pérez, M., Pérez García, A. M., et al. 2019, A&A, 631, A9 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Bruzual, G., & Charlot, S. 2003, MNRAS, 344, 1000 [NASA ADS] [CrossRef] [Google Scholar]
Busca, N., & Balland, C. 2018, MNRAS, submitted [arXiv:1808.09955] [Google Scholar]
Cai, T. T., & Zhang, L. 2018, ArXiv e-prints [arXiv:1804.03018] [Google Scholar]
Carreira-Perpiñán, M. 2001, PhD Thesis, Dept. of Computer Science, University of Sheffield, UK [Google Scholar]
Cassata, P., Guzzo, L., Franceschini, A., et al. 2007, ApJS, 172, 270 [NASA ADS] [CrossRef] [Google Scholar]
Chan, L. S., Gilman, J. A., & Dunn, O. J. 1976, J. Am. Stat. Assoc., 71, 842 [CrossRef] [Google Scholar]
Chihara, L. M., & Hesterberg, T. C. 2018, Mathematical Statistics with Resampling (Hoboken, NJ: John Wiley & Sons) [CrossRef] [Google Scholar]
Chollet, F. 2017, Deep learning with Python (Manning Publications Co.) [Google Scholar]
Chollet, F., & Allaire, J. J. 2017, Deep learning with R (Manning Publications Co.) [Google Scholar]
Coleman, G. D., Wu, C.-C., & Weedman, D. W. 1980, ApJS, 43, 393 [NASA ADS] [CrossRef] [Google Scholar]
Conselice, C. J. 2003, ApJS, 147, 1 [NASA ADS] [CrossRef] [Google Scholar]
Conselice, C. J. 2006, MNRAS, 373, 1389 [NASA ADS] [CrossRef] [Google Scholar]
Dawson, K. S., Schlegel, D. J., Ahn, C. P., et al. 2013, AJ, 145, 10 [Google Scholar]
De Vaucouleurs, G. 1959, in Handbuch der Pysik/Encyclopedia of Physics, ed. S. Flügge (Berlin, Heidelberg: Springer), 11, 275 [NASA ADS] [CrossRef] [Google Scholar]
Debattista, V. P., Mayer, L., Carollo, C. M., et al. 2006, ApJ, 645, 209 [NASA ADS] [CrossRef] [Google Scholar]
Deng, X.-F. 2013, Res. Astron. Astrophys., 13, 651 [NASA ADS] [CrossRef] [Google Scholar]
Dieleman, S., Willett, K. W., & Dambre, J. 2015, MNRAS, 450, 1441 [NASA ADS] [CrossRef] [Google Scholar]
Djorgovski, S., & Davis, M. 1987, ApJ, 313, 59 [NASA ADS] [CrossRef] [Google Scholar]
Domínguez Sánchez, H., Huertas-Company, M., Bernardi, M., Tuccillo, D., & Fischer, J. L. 2018, MNRAS, 476, 3661 [NASA ADS] [CrossRef] [Google Scholar]
Efron, B., & Tibshirani, R. J. 1993, An Introduction to the Bootstrap (CRC Press) [Google Scholar]
Falcón-Barroso, J., van de Ven, G., Lyubenova, M., et al. 2019, A&A, 632, A59 [CrossRef] [EDP Sciences] [Google Scholar]
Ferrari, F., de Carvalho, R. R., & Trevisan, M. 2015, ApJ, 814, 55 [NASA ADS] [CrossRef] [Google Scholar]
Fukugita, M., Ichikawa, T., Gunn, J. E., et al. 1996, AJ, 111, 1748 [NASA ADS] [CrossRef] [Google Scholar]
Gerhard, O., Kronawitter, A., Saglia, R. P., et al. 2001, AJ, 121, 1936 [NASA ADS] [CrossRef] [Google Scholar]
Häussler, B., McIntosh, D. H., Barden, M., et al. 2007, ApJS, 172, 615 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Häußler, B., Bamford, S. P., Vika, M., et al. 2013, MNRAS, 430, 330 [NASA ADS] [CrossRef] [Google Scholar]
Henrion, M., Mortlock, D. J., Hand, D. J., et al. 2011, MNRAS, 412, 2286 [NASA ADS] [CrossRef] [Google Scholar]
Hložek, R. 2019, PASP, 131, 118001 [CrossRef] [Google Scholar]
Holwerda, B. W., Kelvin, L., Baldry, I., et al. 2019, AJ, 158, 103 [NASA ADS] [CrossRef] [Google Scholar]
Hubble, E. P. 1926, ApJ, 64, 321 [NASA ADS] [CrossRef] [Google Scholar]
Huertas-Company, M., Rouan, D., Tasca, L., et al. 2008, A&A, 478, 971 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Huertas-Company, M., Gravet, R., Cabrera-Vives, G., et al. 2015, ApJS, 221, 8 [NASA ADS] [CrossRef] [Google Scholar]
Ilbert, O., Arnouts, S., McCracken, H. J., et al. 2006, A&A, 457, 841 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Ilbert, O., Capak, P., Salvato, M., et al. 2009, ApJ, 690, 1236 [NASA ADS] [CrossRef] [Google Scholar]
Jackson, E. C. 1968, Biometrics, 24, 835 [CrossRef] [Google Scholar]
Kauffmann, G., Heckman, T. M., White, S. D. M., et al. 2003, MNRAS, 341, 33 [NASA ADS] [CrossRef] [Google Scholar]
Kelvin, L. S., Driver, S. P., Robotham, A. S. G., et al. 2012, MNRAS, 421, 1007 [NASA ADS] [CrossRef] [Google Scholar]
Kennicutt, R. C. 1998, ARA&A, 36, 189 [Google Scholar]
Kinney, A. L., Calzetti, D., Bohlin, R. C., et al. 1996, ApJ, 467, 38 [NASA ADS] [CrossRef] [Google Scholar]
Lauberts, A., & Valentijn, E. A. 1989, The Surface Photometry Catalogue of the ESO-Uppsala Galaxies (Garching: European Southern Observatory) [Google Scholar]
Laureijs, R., Amiaux, J., Arduini, S., et al. 2011, ArXiv e-prints [arXiv:1110.3193] [Google Scholar]
Lianou, S., Barmby, P., Mosenkov, A. A., et al. 2019, A&A, 631, A38 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Lintott, C., Schawinski, K., Bamford, S., et al. 2011, MNRAS, 410, 166 [NASA ADS] [CrossRef] [Google Scholar]
Lovell, C. C., Acquaviva, V., Thomas, P. A., et al. 2019, MNRAS, 490, 5503 [CrossRef] [Google Scholar]
Masters, K. L., Maraston, C., Nichol, R. C., et al. 2011, MNRAS, 418, 1055 [NASA ADS] [CrossRef] [Google Scholar]
Miller, A. A., Kulkarni, M. K., Cao, Y., et al. 2017, AJ, 153, 73 [NASA ADS] [CrossRef] [Google Scholar]
Murtagh, F., & Heck, A. 1987, Multivariate Data Analysis (Dordrecht: D. Reidel Publ. Co.) [CrossRef] [Google Scholar]
Nadolny, J., Bongiovanni, A., Cepa, J., et al. 2020, A&A, submitted [Google Scholar]
Naim, A., Lahav, O., Sodre, L., et al. 1995, MNRAS, 275, 567 [NASA ADS] [CrossRef] [Google Scholar]
Nevin, R., Blecha, L., Comerford, J., & Greene, J. 2019, ApJ, 872, 76 [NASA ADS] [CrossRef] [Google Scholar]
Peng, C. Y., Ho, L. C., Impey, C. D., & Rix, H.-W. 2002, AJ, 124, 266 [NASA ADS] [CrossRef] [Google Scholar]
Peng, C. Y., Ho, L. C., Impey, C. D., et al. 2010, AJ, 139, 2097 [NASA ADS] [CrossRef] [Google Scholar]
Pérez-Carrasco, M., Cabrera-Vives, G., Martinez-Marín, M., et al. 2019, PASP, 131, 108002 [CrossRef] [Google Scholar]
Pović, M., Sánchez-Portal, M., Pérez García, A. M., et al. 2009, ApJ, 706, 810 [NASA ADS] [CrossRef] [Google Scholar]
Pović, M., Huertas-Company, M., Aguerri, J. A. L., et al. 2013, MNRAS, 435, 3444 [NASA ADS] [CrossRef] [Google Scholar]
Pović, M., Márquez, I., Masegosa, J., et al. 2015, MNRAS, 453, 1644 [NASA ADS] [CrossRef] [Google Scholar]
Romanowsky, A. J., & Fall, S. M. 2012, ApJS, 203, 17 [NASA ADS] [CrossRef] [Google Scholar]
Sargent, M. T., Carollo, C. M., Lilly, S. J., et al. 2007, ApJS, 172, 434 [NASA ADS] [CrossRef] [Google Scholar]
Scarlata, C., Carollo, C. M., Lilly, S., et al. 2007, ApJS, 172, 406 [NASA ADS] [CrossRef] [Google Scholar]
Scoville, N., Aussel, H., Brusa, M., et al. 2007, ApJS, 172, 1 [NASA ADS] [CrossRef] [Google Scholar]
Serra-Ricart, M., Calbet, X., Garrido, L., et al. 1993, AJ, 106, 1685 [CrossRef] [Google Scholar]
Serra-Ricart, M., Gaitan, V., Garrido, L., et al. 1996, A&AS, 115, 195 [Google Scholar]
Shimasaku, K., Fukugita, M., Doi, M., et al. 2001, AJ, 122, 1238 [NASA ADS] [CrossRef] [Google Scholar]
Shlens, J. 2014, ArXiv e-prints [arXiv:1404.1100] [Google Scholar]
Simmons, B. D., Lintott, C., Willett, K. W., et al. 2017, MNRAS, 464, 4420 [CrossRef] [Google Scholar]
Sreejith, S., Pereverzyev, S., Kelvin, L. S., et al. 2018, MNRAS, 474, 5232 [CrossRef] [Google Scholar]
Storrie-Lombardi, M. C., Lahav, O., Sodre, L., et al. 1992, MNRAS, 259, 8P [NASA ADS] [CrossRef] [Google Scholar]
Strateva, I., Ivezić, Ž., Knapp, G. R., et al. 2001, AJ, 122, 1861 [Google Scholar]
Tyson, J. A. 2002, Proc. SPIE, 2201, 10 [NASA ADS] [CrossRef] [Google Scholar]
Vika, M., Vulcani, B., Bamford, S. P., et al. 2015, A&A, 577, A97 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Walmsley, M., Smith, L., Lintott, C., et al. 2020, MNRAS, 491, 1554 [CrossRef] [Google Scholar]
Zamojski, M. A., Schiminovich, D., Rich, R. M., et al. 2007, ApJS, 172, 468 [NASA ADS] [CrossRef] [Google Scholar]

Table 1. One color separation.

Fig. 2. As Fig. 1 but for the 2292 galaxies in the OTELO CP sample.

Fig. 6. True ET Rate vs. True LT Rate values for 100 bootstrap runs in each classification method on the OTELO SP sample. The closer a point lies to the upper-left corner, the better the classification result. DNN runs consistently yield the best classifications.

Fig. 9. Histogram of accuracies for 100 galaxy classification bootstrap runs using the COSMOS SP sample. The solid black line corresponds to the baseline accuracy distribution of a sample of 34 688 COSMOS galaxies. The blue histogram and dash-dotted line show the DNN accuracy distribution for a test sample of 29 688 galaxies.

Fig. 11. As Fig. 4 but for the 2292 galaxies in the OTELO CP sample.

Fig. 12. As Fig. 5 but for the 100 galaxy classification bootstrap runs for the 2292 galaxies of the OTELO CP sample.

Fig. 13. Histogram of accuracies for 100 galaxy classification bootstrap runs using the COSMOS CP sample. The solid black line corresponds to the baseline accuracy distribution of a sample of 105 758 COSMOS galaxies. The blue histogram and dash-dotted line show the DNN accuracy distribution for a test sample of 95 758 galaxies.

Fig. 14. Cumulative distribution of galaxies by r magnitude. OTELO SP and CP samples are 3 and 2 magnitudes deeper than the COSMOS SP and CP samples, respectively. However, COSMOS sweeps a larger volume, as the bright end of the cumulative distribution implies.