Stellar atmosphere parameters with MA$\chi$, a MAssive compression of $\chi^2$ for spectral fitting
P. Jofré^{1}, B. Panter^{2}, C. J. Hansen^{3}, A. Weiss^{1}
1  Max-Planck-Institut für Astrophysik, Karl-Schwarzschild-Str. 1,
85741 Garching, Germany
2  Institute for Astronomy, University of Edinburgh, Royal
Observatory, Blackford Hill, Edinburgh EH9 3HJ, UK
3  European Southern Observatory (ESO), Karl-Schwarzschild-Str. 2, 85748
Garching, Germany
Received 7 January 2010 / Accepted 19 April 2010
Abstract
MA$\chi$ is a new tool to estimate parameters from stellar spectra. It is based
on the maximum-likelihood method, with the likelihood compressed in a
way that the information stored in the spectral fluxes is conserved.
The size of the compressed data set is given by the number of parameters,
rather than by the number of flux points. The optimum speed-up reached
by the compression is the ratio of the size of the data set to the number of
parameters. The method has been tested on a sample of low-resolution
spectra from the Sloan Extension for Galactic Understanding and
Exploration (SEGUE) survey for the estimate of metallicity, effective
temperature and surface gravity, with accuracies of 0.24 dex,
130 K and 0.5 dex, respectively. Our stellar
parameters and those recovered by the SEGUE Stellar Parameter Pipeline
agree reasonably well. A small sample of high-resolution VLT-UVES
spectra is also used to test the method and the results are compared
to a more classical approach. The speed and multi-resolution capability
of MA$\chi$, combined with its performance compared with other methods, indicate
that it will be a useful tool for the analysis of upcoming spectral
surveys.
Key words: techniques: spectroscopic - surveys - stars: fundamental parameters - methods: data analysis - methods: statistical
1 Introduction
The astronomical community has conducted many massive surveys of the Universe, and many more are either ongoing or planned. Recently completed is the Sloan Digital Sky Survey (SDSS, York et al. 2000), notable for assembling in a consistent manner the positions, photometry and spectra of millions of galaxies, quasars and stars, and covering a significant fraction of the sky. The size of such a survey allows us to answer many questions about the structure and evolution of our universe. More locally, massive surveys of stars have been undertaken to reveal their properties, for example the Geneva-Copenhagen Survey for the solar neighborhood (Nordström et al. 2004) and the ELODIE library (Prugniel et al. 2007). These surveys provide a copious amount of stellar photometry and spectra from the solar neighborhood, broadening the range of stellar types studied. The SEGUE catalog (Sloan Extension for Galactic Understanding and Exploration, Yanny et al. 2009), a component of SDSS, contains additional imaging data at lower galactic latitudes, to better explore the interface between the disk and halo population.
By statistically analyzing the properties of these stars such as chemical abundances, velocities, distances, etc., it has been possible to match the structure and evolution of the Milky Way to the current generation of galaxy formation models (Wyse 2006; Bond et al. 2010; Ivezić et al. 2008; Jurić et al. 2008).
Studies of large samples of stars of our galaxy are crucial tests for the theory of the structure and formation of spiral galaxies. Even though the statistics given by SEGUE or the Geneva-Copenhagen survey agree with the models and simulations of galaxy formation, there is some contention over whether the accuracies of measurements adequately support the conclusions. New surveys such as RAVE (Steinmetz et al. 2006) and Gaia (Perryman et al. 2001) will give extremely high accuracies in velocities and positions, while LAMOST (Zhao et al. 2006) and parts of the SDSS-III project, APOGEE and SEGUE-2 (Rockosi et al. 2009), will give high-quality spectra and therefore stellar parameters and chemical abundances of millions of stars over all the sky. These more accurate properties and larger samples will increase our knowledge of the Milky Way and answer wider questions about the formation and evolution of spiral galaxies.
As data sets grow it becomes of prime importance to create efficient and automatic tools capable of producing robust results in a timely manner. A standard technique for estimating parameters from data is the maximum-likelihood method. In a survey such as SEGUE, which contains more than 240 000 stellar spectra, each with more than 3000 flux measurements, it becomes extremely time-consuming to do spectral fitting with a brute-force search on the multi-dimensional likelihood surface, even more so if one wants to explore the errors on the recovered parameters.
Efforts to reduce the computational burden of characterizing the likelihood surface include Markov-Chain Monte-Carlo methods (Ballot et al. 2006), where a chain is allowed to explore the likelihood surface to determine the global solution. An alternative method is to start at an initial estimate and hope that the likelihood surface is smooth enough that a local search will converge on the solution. An example of this approach can be found in Allende Prieto et al. (2006) and Gray et al. (2001), based on the Nelder-Mead downhill simplex method (Nelder & Mead 1965), where comparisons of the likelihood at the vertices of a simplex give the direction toward the maximum. In both cases, the likelihood is evaluated only partially, leaving large amounts of parameter space untouched. Moreover, the time needed to find the maximum depends on the starting point and the steps used to evaluate the next point.
Another parameter estimation method utilizes neural networks, for example Snider et al. (2001) and Re Fiorentin et al. (2007). While these non-linear regression models can obtain quick and accurate results, they are entirely dependent on the quality of the training data, in which many parameters must be previously known. Careful attention must also be paid to the sampling of the model grid which is used to generate the neural network.
An alternative approach to spectral analysis is the Massively Optimized Parameter Estimation and Data compression method (MOPED, Heavens et al. 2000). This novel approach to the maximum-likelihood problem involves compressing both data and models to allow very rapid evaluation of a set of parameters. The evaluation is fast enough to do a complete search of the parameter space on a finely resolved grid of parameters. Using carefully constructed linear combinations, the data are weighted and the size is reduced from a full spectrum to only one number per parameter. This number, with certain caveats (discussed later), contains all the information about the parameter contained in the full data. The method has been successfully applied in the fields of CMB analysis (Bond et al. 1998), medical image registration and galaxy shape fitting (Tojeiro et al. 2007). A complete background of the development of the MOPED algorithm can be found in Tegmark et al. (1997, hereafter T97), Heavens et al. (2000, hereafter H00) and Panter et al. (2003).
We present a new derivative of MOPED, MA$\chi$ (MAssive compression of $\chi^2$), to analyze stellar spectra. To estimate the metallicity history of a galaxy, for instance, MOPED needs models where the spectra are the sum of single stellar populations. Here we study one population, meaning the metallicity has to be estimated from the spectrum of each single star. It is therefore necessary to develop a specific tool for this task: MA$\chi$. In Sect. 2 we describe the method, giving a summary of the derivation of the algorithm. In Sect. 3 we then test the method on the basic stellar atmosphere parameter estimation of a sample of low-resolution spectra of SEGUE stars, and in Sect. 4 we compare our results with the SEGUE Stellar Parameter Pipeline. Finally, in Sect. 5 we check the method on a small sample of high-resolution VLT-UVES spectra, where we compare our results with a more classical approach. A summary of our conclusions is given in Sect. 6.
2 Method
In this section we describe our method to treat the likelihood surface. We first review the classical maximum-likelihood method, where a parametric model is used to describe a set of data. Then we present the algorithm that compresses the data and hence speeds up the likelihood estimation. Finally we show the proof for the lossless nature of the compression procedure.
2.1 Maximum-likelihood description
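Suppose the data (e.g. the flux of a spectrum) are represented by $N$ real numbers $x_1, x_2, ..., x_N$, which are arranged in an $N$-dimensional vector $\mathbf{x}$. They are the flux measurements at $N$ wavelength points. Assume that each data point $x_i$ has a signal part $\mu_i$ and a noise contribution $\sigma_i$:

$$x_i = \mu_i + \sigma_i. \qquad (1)$$

If the noise has zero mean, $\langle \mathbf{x} \rangle = \boldsymbol{\mu}$, we can think of $\mathbf{x}$ as a random variable of a probability distribution $\mathcal{L}(\mathbf{x}, \Theta)$, which depends in a known way on a vector $\Theta$ of $m$ model parameters. In the present case $m=3$ with

$$\Theta = (\theta_1, \theta_2, \theta_3) = ([\mathrm{Fe/H}],\, T_{\rm eff},\, \log g), \qquad (2)$$

with the iron abundance $[\mathrm{Fe/H}]$, effective temperature $T_{\rm eff}$ and surface gravity $\log g$. A certain combination of the single parameters $\theta_j$ ($j = 1,2,3$) also produces a theoretical model $\mathbf{F} = \mathbf{F}(\Theta)$, and we can build an $m$-dimensional grid of theoretical models by varying the parameters. If the noise is Gaussian and the parameters have uniform priors, then the likelihood

$$\mathcal{L}_f(\mathbf{x}, \Theta) = \frac{1}{(2\pi\sigma_{\rm av}^2)^{N/2}} \exp\left(-\frac{\chi^2}{2}\right) \qquad (3)$$

gives the probability for the parameters, where $\sigma_{\rm av}^2$ is the averaged square of the noise of each data point and

$$\chi^2 = \sum_{i=1}^{N} \frac{(x_i - F_i)^2}{\sigma_i^2}. \qquad (4)$$

The position of its maximum estimates the set of parameters $\Theta_0$ that best describe the data $\mathbf{x}$. The fit between $\mathbf{F}(\Theta_0)$ and $\mathbf{x}$ is good if $\chi^2 \lesssim q$, where $q = N - m$ are the degrees of freedom. In the most basic form one finds the maximum in the likelihood surface by exploring all points in the parameter space, with each likelihood estimation calculated using all $N$ data points. This procedure is of course very time-consuming if $N$ and $m$ are large.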
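As an illustration of the brute-force search described above, the following minimal Python sketch evaluates the classical $\chi^2$ of Eq. (4) at every node of a small grid. The one-line toy model and all names (`toy_model`, `brute_force_fit`, the grid values) are hypothetical and stand in for the synthetic spectra; the "observation" is noise-free so that the recovered node is known in advance.

```python
import math
from itertools import product

def toy_model(wave, depth, width):
    """Hypothetical spectrum: unit continuum with one Gaussian absorption line."""
    return [1.0 - depth * math.exp(-0.5 * ((w - 5000.0) / width) ** 2) for w in wave]

def chi2(data, model, sigma):
    """Classical chi^2: one term per flux point, N terms in total."""
    return sum((x - f) ** 2 / s ** 2 for x, f, s in zip(data, model, sigma))

def brute_force_fit(wave, data, sigma, depths, widths):
    """Evaluate chi^2 at every grid node and return (chi2, depth, width) of the best one."""
    best = None
    for depth, width in product(depths, widths):
        value = chi2(data, toy_model(wave, depth, width), sigma)
        if best is None or value < best[0]:
            best = (value, depth, width)
    return best

wave = [4990.0 + 0.5 * i for i in range(41)]   # N = 41 flux points
sigma = [0.01] * len(wave)                     # constant noise per pixel
data = toy_model(wave, 0.5, 2.0)               # noise-free "observation"
depths = [0.1 * k for k in range(1, 10)]
widths = [1.0, 1.5, 2.0, 2.5, 3.0]
best_chi2, best_depth, best_width = brute_force_fit(wave, data, sigma, depths, widths)
```

Every node costs $N$ operations, so the total cost scales with $N$ times the number of grid nodes; this is the cost that the compression of Sect. 2.2 reduces.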
2.2 Data compression
In practice not all data points give information about the parameters, because either they are noisy or not sensitive to the parameter under study. The MOPED algorithm uses this knowledge to construct weighting vectors which neglect some data without losing information. A way to do this is by forming linear combinations of the data.
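Let us compress the information of a given parameter $\theta_1$ from our spectrum. The idea is to capture as much information as possible about this particular parameter. We define our weighting vector $\mathbf{b}_1$ as

$$\mathbf{b}_1 = \frac{1}{|\mathbf{b}_1|}\,\frac{1}{\sigma^2}\,\frac{\partial \mathbf{F}}{\partial \theta_1}, \qquad (5)$$

where the normalization $|\mathbf{b}_1|$ is chosen such that $\mathbf{b}_1 \cdot \mathbf{b}_1 = 1$, and the factor $1/\sigma^2$ is applied to each data point separately. This definition assures that each data point of the spectrum is weighted less if $\sigma$ is large and weighted more if the sensitivity in the flux, meaning the derivative of $\mathbf{F}$ with respect to the parameter, is high. To create this vector we need, on one hand, the information about the behavior of the parameters from a theoretical fiducial model, and on the other hand, the error measurement from the data.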
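A weighting vector of this kind can be sketched numerically as follows. This is only an illustration under stated assumptions: the derivative is approximated by a central finite difference around the fiducial parameter, the vector is normalized to unit length, and the toy model and the names `toy_model` and `weighting_vector` are hypothetical.

```python
import math

def toy_model(wave, depth):
    """Hypothetical spectrum: unit continuum with one Gaussian line of given depth."""
    return [1.0 - depth * math.exp(-0.5 * ((w - 5000.0) / 2.0) ** 2) for w in wave]

def weighting_vector(model_fn, wave, theta, sigma, step=1e-3):
    """b = (1/sigma^2) dF/dtheta, scaled to unit length.

    The derivative is taken by central difference around the fiducial value theta.
    """
    up = model_fn(wave, theta + step)
    down = model_fn(wave, theta - step)
    raw = [(u - d) / (2.0 * step) / s ** 2 for u, d, s in zip(up, down, sigma)]
    norm = math.sqrt(sum(r * r for r in raw))
    return [r / norm for r in raw]

wave = [4990.0 + 0.5 * i for i in range(41)]
sigma = [0.01] * len(wave)
b = weighting_vector(toy_model, wave, 0.5, sigma)   # fiducial depth of 0.5
```

In this toy case the weight is concentrated at the line centre and is negligible in the continuum, which mirrors the behavior of the vectors shown in Fig. 1.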
Figure 1 shows the example of the weighting vectors for the parameters of the set (2) using a fiducial model with , K and . The upper panel corresponds to the synthetic spectrum at the SDSS resolving power (R = 2000) and the panels B, C and D to the weighting vector for metallicity, effective temperature and surface gravity, respectively. This figure is a graphical representation of Eq. (5), and helps to better visualize the weight in relevant regions of the spectrum.
For metallicity (panel B), the major weight is concentrated in the two peaks at 3933 and 3962 Å, corresponding to the Ca II K and H lines seen in the synthetic spectrum in panel A. The second important region with weights is close to 5180 Å, corresponding to the Mg Ib triplet. A minor peak is seen at 4300 Å, which is the G-band of the CH molecule. These three features are metallic lines and therefore show a larger dependency on metallicity than the rest of the vector, which is dominated by a mean weight close to zero.
For the temperature (panel C) the greatest weight is focused on the hydrogen Balmer lines at 4101, 4340 and 4861 Å. Peaks at the previous metallic lines are also seen, but with a minor amplitude. Finally, the surface gravity (panel D) presents the strongest dependence on the Mg I triplet. The wings of the Ca II and Balmer lines also present small dependence. As in the case for metallicity, the rest of the continuum is weighted by a mean close to zero.
Figure 1: (A) Fiducial model with parameters , K and . Other panels indicate the weighting vectors according to Eq. (5) for metallicity (B), temperature (C) and surface gravity (D). 

The weighting is done by multiplying these vectors with the spectrum, where the information kept from the spectrum is given by the peaks of the weighting vector. We define this procedure as ``compression'', where the information about the parameter $\theta_j$ is expressed as

$$y_j = \mathbf{b}_j^{t} \cdot \mathbf{x}, \qquad j = 1, \dots, m, \qquad (6)$$

with $\mathbf{x}$ the spectrum. With the weighting vector defined in Eq. (5) we can compress data ($\mathbf{x}$) and model ($\mathbf{F}$) by using Eq. (6). Because $\mathbf{b}_j$ and $\mathbf{x}$ (or $\mathbf{F}$) are $N$-dimensional, the product $y_j$ in (6) is a number, which stores the information about the parameter $\theta_j$.
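In order to perform a compression for all the parameters simultaneously, we require that $y_k$ is uncorrelated with $y_j$, with $j \neq k$. This means that the vectors $\mathbf{b}_j$ must be orthogonal, i.e. $\mathbf{b}_k^{t} \cdot \mathbf{b}_j = 0$. Following the procedure of H00 we find the other $\mathbf{b}_k$ by the Gram-Schmidt orthogonalization

$$\mathbf{b}_k = \frac{1}{|\mathbf{b}_k|}\left[\frac{1}{\sigma^2}\frac{\partial \mathbf{F}}{\partial \theta_k} - \sum_{q=1}^{k-1}\left(\frac{1}{\sigma^2}\frac{\partial \mathbf{F}}{\partial \theta_k}\cdot\mathbf{b}_q\right)\mathbf{b}_q\right]. \qquad (7)$$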
For $m = 3$ we have the numbers $y_{1,x}$, $y_{2,x}$, $y_{3,x}$ for the data and $y_{1,F}$, $y_{2,F}$, $y_{3,F}$ for the models, corresponding to the parameters of $\Theta$ given in Eq. (2).
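The orthogonalization step can be sketched in a few lines of Python. The sketch below applies plain Gram-Schmidt with unit-length normalization to a list of input vectors, which play the role of the noise-weighted derivatives; this is an assumption made for illustration, the function names are hypothetical and the input numbers are arbitrary.

```python
import math

def dot(u, v):
    """Scalar product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(raw_vectors):
    """Return orthonormal vectors built, in order, from the input vectors."""
    basis = []
    for raw in raw_vectors:
        residual = list(raw)
        for b in basis:
            proj = dot(raw, b)                       # component along an earlier vector
            residual = [r - proj * bi for r, bi in zip(residual, b)]
        norm = math.sqrt(dot(residual, residual))
        basis.append([r / norm for r in residual])
    return basis

# Three arbitrary, linearly independent vectors of length N = 4.
raw = [[1.0, 1.0, 0.0, 0.0],
       [1.0, 0.0, 1.0, 0.0],
       [1.0, 0.0, 0.0, 1.0]]
b1, b2, b3 = gram_schmidt(raw)
```

The order of the input matters: the first vector keeps its direction, and each later vector only keeps the part that is not already described by the earlier ones.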
Goodness of fit
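The definition of $\chi^2$, which is the sum of the squared differences of the data and the model, motivates us to define our goodness of fit. With $y_{j,x}$ and $y_{j,F}$ the compressed data and model of Eq. (6), and $\sigma_{\rm av}^2$ the averaged square of the noise, the ``compressed'' $\chi^2$ for a single parameter is the squared difference of the compressed data and model

$$\chi^2_{c,\theta_j} = \frac{(y_{j,F} - y_{j,x})^2}{\sigma_{\rm av}^2}, \qquad (8)$$

which gives the probability for the parameter $\theta_j$

$$\mathcal{L}_{c,j}(\mathbf{x}, \theta_j) = \frac{1}{(2\pi\sigma_{\rm av}^2)^{1/2}} \exp\left(-\frac{\chi^2_{c,\theta_j}}{2}\right). \qquad (9)$$

Because the $y_j$ numbers are by construction uncorrelated, the ``compressed'' likelihood of the parameters is obtained by multiplication of the likelihood of each single parameter

$$\mathcal{L}_c(\mathbf{x}, \Theta) = \prod_{j=1}^{m} \mathcal{L}_{c,j}(\mathbf{x}, \theta_j) = \frac{1}{(2\pi\sigma_{\rm av}^2)^{m/2}} \exp\left(-\frac{\chi^2_c}{2}\right), \qquad (10)$$

where

$$\chi^2_c = \sum_{j=1}^{m} \chi^2_{c,\theta_j} = \frac{1}{\sigma_{\rm av}^2}\sum_{j=1}^{m} (y_{j,F} - y_{j,x})^2 \qquad (11)$$

is the compressed $\chi^2$.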
As in the classical approach, the peak of the compressed likelihood estimates the parameters $\Theta_0$ that generate the model $\mathbf{F}(\Theta_0)$, which best reproduces the data $\mathbf{x}$. The advantage of using the compressed likelihood is that it is fast, because the calculation of $\chi^2_c$ needs only $m$ iterations and not $N$ as in the usual computation. The search of the maximum point in the likelihood is therefore done in an $m$-dimensional grid of y-numbers and not of $N$-length fluxes.
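The following Python sketch illustrates the compressed search on a toy problem with $m = 2$ parameters and $N = 81$ flux points. It is a sketch under simplifying assumptions, not the implementation used here: the noise is constant, the model is linear in its two hypothetical line strengths, the observation is noise-free, and all names are illustrative.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def profile(wave, centre, width=2.0):
    """Gaussian line profile used by the toy model."""
    return [math.exp(-0.5 * ((w - centre) / width) ** 2) for w in wave]

wave = [4980.0 + 0.5 * i for i in range(81)]          # N = 81 flux points
g1, g2 = profile(wave, 4995.0), profile(wave, 5005.0)

def model(a, c):
    """Toy spectrum with two line strengths a and c (m = 2 parameters)."""
    return [1.0 - a * p - c * q for p, q in zip(g1, g2)]

def unit(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

# Weighting vectors for constant noise: orthonormalised dF/da and dF/dc.
b1 = unit([-p for p in g1])
raw2 = [-q for q in g2]
proj = dot(raw2, b1)
b2 = unit([r - proj * b for r, b in zip(raw2, b1)])

def compress(spectrum):
    """Reduce N fluxes to m = 2 y-numbers."""
    return (dot(b1, spectrum), dot(b2, spectrum))

sigma_av = 0.01
strengths = [0.1 * k for k in range(1, 8)]
# The model grid is compressed once and can be reused for every observed spectrum.
y_grid = {(a, c): compress(model(a, c)) for a in strengths for c in strengths}

true_node = (strengths[2], strengths[5])
y_data = compress(model(*true_node))                  # noise-free "observation"

def chi2_compressed(y_model):
    """Compressed chi^2: m terms instead of N."""
    return sum((ym - yd) ** 2 for ym, yd in zip(y_model, y_data)) / sigma_av ** 2

best_node = min(y_grid, key=lambda node: chi2_compressed(y_grid[node]))
```

Each node now costs two operations instead of 81, which is the ratio $N/m$ quoted as the optimum speed-up.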
Figure 2: Correlations between the Fisher matrix values obtained from the full and compressed likelihoods for metallicity (upper panel) and effective temperature (lower panel) for 75 randomly selected stars of SEGUE. The line corresponds to the one-to-one relation.

2.3 No loss of information by the compression
The Fisher Information Matrix describes the behavior of the likelihood close to the maximum. Here we use it only to show the lossless compression offered by the MOPED method, but a more extensive study is given in T97 and H00.
To understand how well the compressed data constrain a solution we consider the behavior of the logarithm of the likelihood $\mathcal{L}$ near the peak. Below we denote by $\mathcal{L}$ a generic likelihood that can correspond to the compressed likelihood $\mathcal{L}_c$ or the full one $\mathcal{L}_f$, because the procedure is the same. In the Taylor expansion the first derivatives $\partial \ln \mathcal{L} / \partial \theta_j$ vanish at the peak and the behavior around it is dominated by the quadratic terms

$$\ln \mathcal{L} = \ln \mathcal{L}_{\rm peak} + \frac{1}{2} \sum_{j,k} \frac{\partial^2 \ln \mathcal{L}}{\partial \theta_j \partial \theta_k}\, \Delta\theta_j\, \Delta\theta_k. \qquad (12)$$

The likelihood function is approximately a Gaussian near the peak and the second derivatives, which up to a sign are the components of the Fisher matrix $\mathcal{F}$, measure the curvature of the peak. Because the dependence of the parameters is not correlated, the matrix is diagonal with values

$$\mathcal{F}_{jj} = -\frac{\partial^2 \ln \mathcal{L}}{\partial \theta_j^2}. \qquad (13)$$

In Fig. 2 we plotted the values $\mathcal{F}_{11}$ and $\mathcal{F}_{22}$, corresponding to the parameters of metallicity and effective temperature, respectively, for likelihoods of 75 randomly selected stars of SEGUE, using the classical (full) and the compressed (comp) data set. The line corresponds to a one-to-one relation. The values correlate well, as expected.
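The curvature at the peak can be estimated numerically from a likelihood evaluated on a grid. The sketch below assumes the usual convention in which a diagonal Fisher value is minus the second derivative of $\ln \mathcal{L}$ at the peak, and approximates it by a central second difference; the Gaussian test likelihood, its peak position and width, and the function names are hypothetical.

```python
def fisher_diagonal(log_like, theta_peak, step):
    """Minus the central second difference of ln L at the peak."""
    second = (log_like(theta_peak + step) - 2.0 * log_like(theta_peak)
              + log_like(theta_peak - step)) / step ** 2
    return -second

def log_like(theta, theta0=-1.5, err=0.25):
    """Hypothetical Gaussian log-likelihood in one parameter, up to a constant."""
    return -0.5 * ((theta - theta0) / err) ** 2

fisher_value = fisher_diagonal(log_like, -1.5, 0.05)
error_estimate = fisher_value ** -0.5   # recovers the width of the Gaussian
```

Applying the same function to the full and to the compressed likelihood of one star gives the pair of values that is compared in Fig. 2.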
Figure 3 shows full and compressed likelihood surfaces of metallicity and temperature for four randomly selected stars, where eight equally spaced contour levels have been plotted in each case. This is another way to visualize the statement that the Fisher matrices correlate. The curvature of both likelihoods is the same close to the peak, as predicted. In Fig. 3 we also overplotted the maximum point of the compressed likelihood as a triangle and the maximum point of the full one as a diamond. It can be seen that the maximum points of the compressed and full data set lie inside the first contour level of the likelihoods. This shows that the compressed data set gives the same solution as the full one, even after the dramatic compression.
Figure 3: Likelihoods of the compressed data set (left) and full data set (right) in the parameter space of effective temperature and metallicity for four randomly selected SEGUE stars. In each panel eight equally spaced contour levels are plotted. The triangle and diamond correspond to the maximum point of the compressed and the full data set, respectively. Both maxima lie in the first confidence contour level, meaning that both data sets reach the same solution for these two parameters.

In the following sections the numerical values of $\chi^2$ correspond to the classical reduced $\chi^2$.
3 Implementation on low-resolution spectra
The sample of stars used to test the method consists of F-G dwarf stars. Based on the metallicity given by the SEGUE Stellar Parameter Pipeline, we chose metal-poor stars, as will be explained below. This choice allows us to avoid the problem of the saturation of metallic lines such as Ca II K (Beers et al. 1999), which is a very strong spectral feature that serves as a metallicity indicator in the low-resolution spectra from SEGUE. These stars fall in the temperature range where Balmer lines are sensitive to temperature and the spectral lines are not affected by molecules (Gray 1992). With these considerations it is reasonable to assume that the spectra will behave similarly under changes of metallicity, temperature and gravity. This allows the choice of a random fiducial model from our grid of models for the creation of the weighting vectors, which will represent well the dependence of the parameters in all the stars.
3.1 Data
We used a sample of SEGUE stellar spectra (Yanny et al. 2009), part of the Seventh Data Release (DR7, Abazajian et al. 2009) of SDSS. The survey was performed with the 2.5-m telescope at the Apache Point Observatory in southern New Mexico and contains spectra of 240 000 stars. All the spectra have ugriz photometry (Fukugita et al. 1996). From a color-color diagram we identified the dwarf stars and selected them using the sqlLoader (Thakar et al. 2004) under the constraints that the object must be a star and have colors of $0.65 < u-g < 1.15$ and $0.05 < g-r < 0.55$. We verified the spectral type by also downloading the values ``seguetargetclass'' and ``hammerstype'', which classify our stars mainly as F-G dwarfs. We also constrained the metallicity to be in the range of . The latter values were taken from the new SEGUE Stellar Parameter Pipeline (Allende Prieto et al. 2008; Lee et al. 2008a,b), where the 999 indicates that the pipeline has not estimated the metallicity of this particular star. Our final sample contains spectra of 17 274 stars with a resolving power of $R = 2000$ and a signal-to-noise up to 10. The wavelength range is and the spectra have absolute flux.
3.2 Grid of models
The synthetic spectra were created with the synthesis code SPECTRUM (Gray & Corbally 1994), which uses the new fully blanketed stellar atmosphere models of Kurucz (1992) and computes the emergent stellar spectrum under the assumption of local thermodynamic equilibrium (LTE). The stellar atmosphere models assume the solar abundances of Grevesse & Sauval (1998) and a plane-parallel line-blanketed model structure in one dimension. For the creation of the synthetic spectra, we set a microturbulence velocity of 2 km s$^{-1}$, based on the atmosphere model value. The line-list file and atomic data were taken directly from the SPECTRUM webpage. In these files, the lines were taken from the NIST Atomic Spectra Database and the Kurucz website. No molecular opacity was considered in the model generation.
We created an initial three-dimensional grid of synthetic spectra starting from the ATLAS9 grid of stellar atmosphere models of Castelli & Kurucz (2003) by varying the parameters of Eq. (2). They cover a wavelength range from 3800 to 7000 Å in steps of , based on the wavelength range of SDSS spectra together with our line list. This wavelength range is broad enough for a proper continuum subtraction, as described in Sect. 3.3. The spectra have an absolute flux and were finally smoothed to a resolving power of R = 2000, according to SDSS resolution.
In order to have a finer grid of models, we linearly interpolated the fluxes created for the initial grid. It has models with with dex, with K and with dex. The scaling factor for the normalization varies linearly from 0.85 to 1.15 in steps of 0.01 (see below). Our final grid has models. We set a linearly varying $[\alpha/\mathrm{Fe}]$ of for stars in the metallicity range of and for stars with , to reproduce the abundance of $\alpha$ elements in the Milky Way, as in Lee et al. (2008a). These varying $\alpha$ abundances were also calculated with interpolation of fluxes created from solar and $\alpha$-enhanced ($[\alpha/\mathrm{Fe}] = 0.4$) stellar atmosphere models. We did not include the $\alpha$ abundances as a free parameter, because we aim to estimate the metallicity from lines of $\alpha$ elements (Ca, Mg), meaning we would obtain a degeneracy in both parameters. We preferred to create a grid where the $\alpha$ abundances were already ``known''.
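The refinement of the grid by linear interpolation of fluxes can be illustrated with a short Python sketch. The two five-pixel spectra, the temperatures attached to them and the function name are made up for the example; only the pixel-by-pixel linear interpolation between two neighboring grid nodes is the point.

```python
def interpolate_spectrum(flux_lo, flux_hi, p_lo, p_hi, p):
    """Linear interpolation, pixel by pixel, between two neighboring grid spectra."""
    t = (p - p_lo) / (p_hi - p_lo)
    return [(1.0 - t) * a + t * b for a, b in zip(flux_lo, flux_hi)]

# Hypothetical normalized fluxes of two grid models at 5750 K and 6000 K.
flux_5750 = [1.00, 0.80, 0.40, 0.80, 1.00]
flux_6000 = [1.00, 0.70, 0.20, 0.70, 1.00]
flux_5900 = interpolate_spectrum(flux_5750, flux_6000, 5750.0, 6000.0, 5900.0)
```

The same function can be applied along each parameter axis in turn to fill a finer grid between the nodes of the initial one.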
Table 1: Strongest lines in F-G dwarf stars.
Figure 4: Radial velocities found using minimum fluxes of strong lines given by Table 1 (MAX) compared with those of the SEGUE database. The difference of the radial velocities given by the SEGUE database from those obtained by our method is indicated in the legend as offset, with its standard deviation as scatter. 

The grid steps between the parameters are smaller than the accuracies expected from the low-resolution SEGUE data. The extra time required to calculate the compressed likelihood in this finer grid is not significant, and retaining the larger size allows us to demonstrate the suitability of the method for future more accurate data capable of using the full grid.
3.3 Matching models to data
Data and model needed to be prepared for the analysis. First of all, the data needed to be corrected from the vacuum wavelength frame of the observations to the air wavelength frame of the laboratory. Secondly, we needed to correct for the Doppler effect zc. This was done by using the flux minimum of the lines indicated in Table 1 except for the Mg Ib triplet, because this feature is not seen clearly enough in every spectrum, driving our automatic zc calculation to unrealistic values. A comparison of the radial velocity found with this method with the value given by the SEGUE database is shown in Fig. 4. The difference of the radial velocities (SEGUE - MAX) has a mean (offset) of 20.15 km s$^{-1}$ with standard deviation (scatter) of 7.85 km s$^{-1}$. The effect of this difference on the final parameter estimation is discussed in Sect. 4. Once our data were corrected for the Doppler effect, they had to be interpolated to the wavelength points of the models to get the same data points.
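The two corrections can be sketched as follows. This is an illustration under assumptions that are not taken from the text: the vacuum-to-air conversion is the formula quoted in the SDSS documentation (Morton 1991), the velocity is derived from a single line instead of the set of Table 1, the search window of 10 Å is arbitrary, and the H$\beta$ rest wavelength and all function names are approximate or hypothetical.

```python
import math

C_KMS = 299792.458  # speed of light in km/s

def vacuum_to_air(wave_vac):
    """Vacuum-to-air conversion quoted in the SDSS documentation; wavelengths in Angstrom."""
    return wave_vac / (1.0 + 2.735182e-4 + 131.4182 / wave_vac ** 2
                       + 2.76249e8 / wave_vac ** 4)

def radial_velocity(wave, flux, rest_wave, half_window=10.0):
    """Velocity from the flux minimum found within half_window of the rest wavelength."""
    window = [(f, w) for w, f in zip(wave, flux) if abs(w - rest_wave) <= half_window]
    _, observed = min(window)            # pixel with the lowest flux
    return C_KMS * (observed - rest_wave) / rest_wave

# Toy line whose minimum is displaced to 4862.0 Angstrom.
wave = [4850.0 + 0.5 * i for i in range(41)]
flux = [1.0 - 0.5 * math.exp(-0.5 * ((w - 4862.0) / 1.5) ** 2) for w in wave]
velocity = radial_velocity(wave, flux, 4861.3)   # approximate H-beta rest wavelength
```

Because the minimum is located on the pixel grid, the velocity resolution of such an estimate is limited by the pixel size, which may contribute to the scatter seen in Fig. 4.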
For automatic fitting of an extensive sample of stars with a large grid of models, we decided to normalize the flux to a fitted continuum. In this way the dependence of the parameters is concentrated mainly on the line profiles. The choice of the normalization method is a difficult task, because none of them is perfect. It becomes especially difficult in regions where the spectrum shows too many strong absorption features, such as the Ca II lines of Table 1. Using the same method for models and data, difficult regions behave similarly in both cases, which allowed us a final fitting in these complicated regions as well. We adopted the normalization method of Allende Prieto et al. (2006), because it works well for the extended spectral range of SDSS spectra. It is based on an iterative polynomial fitting of the pseudo-continuum, considering only the points that lie inside the range of 4$\sigma$ above and 0.5$\sigma$ below. Then we divided the absolute flux by the final pseudo-continuum. Due to noise, the continuum of the final flux may not necessarily be at unity. We included a new degree of freedom in the analysis that scaled this subtracted continuum. This scaling factor for the normalization is then another free parameter, which means that we have finally four parameters to estimate: three stellar atmosphere parameters and the scaling factor. The final step is to choose the spectral range for the analysis.
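The idea of the iterative, asymmetrically clipped continuum fit can be sketched in Python. The sketch is a simplification and not the adopted procedure itself: a straight line replaces the polynomial, the number of iterations is arbitrary, the toy spectrum is noise-free, and all names are hypothetical.

```python
import math

def fit_line(x, y):
    """Closed-form least-squares straight line y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def pseudo_continuum(wave, flux, n_iter=10, upper=4.0, lower=0.5):
    """Refit repeatedly, keeping points within upper*sigma above and lower*sigma below the fit."""
    keep = list(range(len(wave)))
    for _ in range(n_iter):
        a, b = fit_line([wave[i] for i in keep], [flux[i] for i in keep])
        resid = [flux[i] - (a + b * wave[i]) for i in keep]
        sigma = math.sqrt(sum(r * r for r in resid) / len(resid))
        new_keep = [i for i, r in zip(keep, resid) if -lower * sigma <= r <= upper * sigma]
        if len(new_keep) < 3 or new_keep == keep:
            break
        keep = new_keep
    a, b = fit_line([wave[i] for i in keep], [flux[i] for i in keep])
    return [a + b * w for w in wave]

# Toy spectrum: sloped continuum with one strong absorption line at 4100 Angstrom.
wave = [4000.0 + i for i in range(201)]
flux = [2.0 + 0.001 * (w - 4000.0) - 0.8 * math.exp(-0.5 * ((w - 4100.0) / 3.0) ** 2)
        for w in wave]
continuum = pseudo_continuum(wave, flux)
normalised = [f / c for f, c in zip(flux, continuum)]
```

The tight lower limit removes the absorption lines after a few iterations, while the loose upper limit keeps the continuum points, so the fit climbs toward the upper envelope of the spectrum.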
3.4 Compression procedure
We chose the fiducial model with parameters K, and , to calculate the weighting vectors $\mathbf{b}_1$, $\mathbf{b}_2$, $\mathbf{b}_3$ and $\mathbf{b}_4$, corresponding to metallicity, effective temperature, gravity and scaling factor for normalization, respectively. Then, we calculated the set of y-numbers using Eq. (6) for each point in the grid of synthetic spectra by projecting the $\mathbf{b}$ vectors onto the spectrum, resulting in a four-dimensional y-grid, with every point a single number. With the respective $\mathbf{b}$ vectors we calculated the y-numbers for the observed spectra and the expression (11) for every point in the grid. Finally we found the minimum value in the grid, which corresponds to the maximum point of the compressed likelihood.
To refine our solution we found the ``real'' minimum with a quadratic interpolation. For the errors, we looked at the models within the confidence contour of $\chi^2 = \chi^2_{\min} + \Delta\chi^2$, with $\Delta\chi^2 = 4.72$ corresponding to the 1$\sigma$ error in a likelihood with four free parameters (for further details see chapter 14 of Press 1993).
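Both steps can be sketched in one dimension with a few lines of Python. The sketch assumes equally spaced grid nodes, a parabola through the grid minimum and its two neighbors, and the tabulated value $\Delta\chi^2 = 4.72$ for a 68.3% region with four free parameters; the metallicity grid, the quadratic test function and the function names are hypothetical.

```python
DELTA_CHI2 = 4.72  # 68.3% confidence region for four free parameters

def parabolic_minimum(x, y):
    """Vertex of the parabola through three equally spaced points; x[1] is the grid minimum."""
    step = x[1] - x[0]
    return x[1] + 0.5 * step * (y[0] - y[2]) / (y[0] - 2.0 * y[1] + y[2])

def one_sigma_range(values, chi2, chi2_min):
    """Smallest and largest grid value inside the chi2_min + DELTA_CHI2 contour."""
    inside = [v for v, c in zip(values, chi2) if c <= chi2_min + DELTA_CHI2]
    return min(inside), max(inside)

# Hypothetical chi^2 along the metallicity axis, with its true minimum at -1.47.
feh_grid = [-2.0 + 0.1 * k for k in range(11)]
chi2 = [((v + 1.47) / 0.2) ** 2 for v in feh_grid]

i_min = chi2.index(min(chi2))
refined = parabolic_minimum(feh_grid[i_min - 1:i_min + 2], chi2[i_min - 1:i_min + 2])
lower, upper = one_sigma_range(feh_grid, chi2, chi2[i_min])
```

The slice used for the parabola requires that the grid minimum does not sit on the edge of the grid; a solution on the edge would indicate that the grid does not cover the star.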
4 Application to SEGUE spectra
We chose to use the wavelength range of [3850, 5200] Å as information about metallicity, temperature and gravity, which is available in the lines listed in Table 1. These spectral lines are strong in F-G stars, meaning they can be identified at low resolution without difficulty. The wings of Balmer lines are sensitive to temperature and the wings of strong Mg lines are sensitive to gravity (Fuhrmann 1998). Because iron lines are not strong enough to be distinguished from the noise of our spectra, the Ca II K and Mg Ib lines are our indicators of metallicity (Beers et al. 2000; Allende Prieto et al. 2006; Lee et al. 2008a).
It is important to discuss the carbon feature known as the G-band at 4304 Å in our spectral window. The fraction of carbon-enhanced metal-poor (CEMP) stars is expected to increase with decreasing metallicity (Beers et al. 1992) to about 20% at below (Lucatello et al. 2006), possibly implying a strong carbon feature in the observed spectrum. As discussed by Marsteller et al. (2009), a strong G-band may also affect the measurement of the continuum at the Ca II lines, which could result in an underestimation of the stellar metallicity. We commented in Sect. 2 on the influence of the lines in the weighting vectors used for the compression and we saw that the G-band also displays minor peaks (see Fig. 1). The role of this minor dependence compared with those from the Ca II, Balmer and Mg I lines can be studied by comparing the final parameter estimation when using the whole spectral range or only those regions with the lines of Table 1, where the G-band is not considered. A further motivation for performing an analysis of an entire range against spectral windows is also explained below. The implications of these tests in terms of the fraction of CEMP stars are discussed below.
4.1 Whole domain vs. spectral windows
The classical fitting procedure uses every data point; therefore, a straightforward method to speed up the analysis would be to mask those parts of the spectrum which do not contain information about the parameters, essentially those that are merely continuum. Certainly, by considering only the Ca II, Balmer and Mg Ib lines, the number of operations becomes smaller, increasing the processing speed of calculations. This is clumsy however: an empirical decision must be made about the relevance of pixels, no extra weighting is considered, and the time taken for the parameter estimation is still long if one decides to do it for many spectra. The use of the $\mathbf{b}$ vectors in the MA$\chi$ method means that very little weight is placed on pixels which do not significantly change with the parameter under study, automatically removing the sectors without lines.
The remarkable result of the MA$\chi$ method is that it is possible to determine the maximum of the compressed likelihood in 10 ms with a present-day standard desktop PC. This is at least 300 times faster than the same procedure when doing an efficient evaluation of the uncompressed data.
The 1$\sigma$ confidence contours indicate errors in the parameters of 0.24 dex in metallicity on average, 130 K in temperature and 0.5 dex in gravity. Examples of fits are shown in Fig. 5. The upper plot shows the fit between a randomly selected SEGUE spectrum and the best model: the crosses correspond to the observed spectrum and the dashed red line to the model, with stellar atmosphere parameters in the legend, as well as the resulting reduced $\chi^2$ of the fit. The plot in the bottom is the fit of the same star, but considering only the data points within the line regions identified in Table 1, as discussed above. Again, the red dashed line indicates the model with parameters in the legend. The $\chi^2$ of the fit is also given.
Because the lines contain the most information about metallicity, temperature and gravity, we expect to obtain the same results whether we use all the data points or only those corresponding to the lines. The legends in Fig. 5 show the parameters estimated in both analyses. The small differences between them are within the 1$\sigma$ errors. We plotted in the upper panel of Fig. 6 the comparison for metallicity (left panel), temperature (middle) and gravity (right) of our sample of stars when using the whole spectral range (``whole'') or only spectral windows with the lines (``win''). The offsets and scatters of the distribution as well as the one-to-one relation are indicated in the legend of each plot. Here we randomly selected 150 stars to plot in Fig. 6 to better visualize the correlations of the results. Offsets and scatters are calculated with the entire star sample. Metallicity shows excellent agreement, with a small scatter of 0.094 dex, as shown in the legend of the plot. Temperature has a negligible offset of 38 K and a scatter of 69 K, which is also less than the accuracies obtained in the temperature estimation. Gravity has the largest offset, 0.1 dex, and a scatter of 0.34 dex, but this is still within the errors.
Figure 5: Example of the fit between a randomly selected SEGUE star (crosses) and a synthetic spectrum (red dashed line). The legend indicates the stellar atmosphere parameters of the model and the value of the reduced $\chi^2$ of the fit. The upper panel is the fit using all the points of the spectral region [3850, 5200] Å. The lower panel is the fit using only the data points where the lines of Table 1 are located.

Figure 6: Upper panel: metallicity (left), effective temperature (middle) and surface gravity (right) obtained using MA$\chi$ for the entire spectral range of [3900, 5200] (whole) compared to that with selected spectral windows (win) for a sample of 150 randomly selected stars. The offset (mean difference of ``win - whole'') of the results and its scatter (standard deviation) is indicated at the bottom right of each plot. The line has a slope of unity. Lower panel: as upper panel, but investigating the influence of using our radial velocities (rv MAX) or the SEGUE ones (rv SEGUE). Offset and scatters are calculated from the difference ``SEGUE - MAX''.

The negligible offsets obtained when using the entire spectral range against the limited regions are an encouraging result in terms of the effect of the G-band on the spectrum. As pointed out above, the dependence on this feature in the parameters under study is not as strong as for the rest of the lines of Table 1. This is translated into less weight for the compression, as seen in Fig. 1, meaning the G-band does not play a crucial role in our compressed data set for the parameter recovery.
Let us remark that this does not mean a lack of CEMP stars in our sample; we simply do not see them in the compressed space. A spectrum with a strong observed CH molecular band gives a similar compressed χ^{2} to one with a weak band, because the compression does not consider variations in carbon abundance. The full χ^{2}, on the other hand, will certainly be larger for the spectrum with a strong G-band, because our models do not have different carbon abundances.
In any case, the fraction of CEMP stars is high and it is certainly interesting to locate them with the MA method in the future. This implies that we need to create another dimension in the grid of synthetic spectra (models with varying carbon abundances) and increase the number of parameters to analyze to five. One more dimension certainly means a heavier grid of models, but in terms of parameter recovery there is little difference between using four or five y-numbers for the compressed calculation. This study goes beyond the scope of this paper, however, where we only introduce the MA method.
4.2 Effect on radial velocities
As mentioned in Sect. 3.3 and seen in Fig. 4, radial velocities given by the SEGUE database have a mean difference of 20 km s^{-1} compared to ours. To study the effect of this difference on the final parameter estimation, we analyzed a subsample of 2670 randomly selected stars of our sample using SEGUE radial velocities. The comparison of the metallicity, effective temperature and surface gravity obtained when using our radial velocities (``rv MAX'') and the SEGUE ones (``rv SEGUE'') can be seen in the lower panel of Fig. 6. In this plot, a small sample of 150 randomly selected stars was again used to better visualize the results, but offsets and scatters were calculated considering the 2670 stars. A difference of 20 km s^{-1} produces offsets and scatters in the three parameters, with dex for metallicity, K for temperature and dex for gravity. These offsets are negligible when compared with the 1σ errors obtained in the parameter estimation.
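To give a sense of scale for the velocity difference discussed above, the following minimal sketch converts a radial velocity into a wavelength shift with the non-relativistic Doppler formula. The function names and the reference wavelength of 4500 Å (chosen only because it lies inside the fitted spectral range) are assumptions made for illustration and are not taken from the analysis itself.

```python
# Speed of light in km/s (exact SI value).
C_KM_S = 299_792.458


def doppler_shift(wavelength_aa, rv_km_s):
    """Non-relativistic wavelength shift in Angstrom for a radial velocity in km/s."""
    return wavelength_aa * rv_km_s / C_KM_S


def to_rest_frame(observed_aa, rv_km_s):
    """Move an observed wavelength to the rest frame (non-relativistic approximation)."""
    return observed_aa / (1.0 + rv_km_s / C_KM_S)


# Illustrative values: 20 km/s velocity difference at an assumed 4500 Angstrom.
shift = doppler_shift(4500.0, 20.0)
rest = to_rest_frame(4500.0 + shift, 20.0)
print(f"shift = {shift:.2f} Angstrom")
```

Under these assumptions a 20 km s^{-1} difference corresponds to a shift of about 0.3 Å in the blue part of the spectrum.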
Figure 7: Upper plot: comparison of the results from the SEGUE Stellar Parameter Pipeline (SSPP) with different methods: this work (MAX, black), (Lee et al. 2008a, NGS1, blue), (Re Fiorentin et al. 2007, ANNSR, red) and (Allende Prieto et al. 2006, k24, green). Lower part: comparisons of individual methods for a randomly selected subsample of stars. Each plot has a line with slope of unity. The different offsets and scatters between the results are indicated in Table 2. In each plot, a selection of 150 random stars has been used. 

4.3 Comparison with the SEGUE Stellar Parameter Pipeline
The SEGUE Stellar Parameter Pipeline (SSPP, Allende Prieto et al. 2008; Lee et al. 2008b,a) combines different techniques to estimate the stellar parameters of SEGUE stars. Some of them fit the data to a grid of synthetic spectra, such as k24 (Allende Prieto et al. 2006) and k13, NGS1 and NGS2 (Lee et al. 2008a). Others, the ANNSR and ANNRR methods of Re Fiorentin et al. (2007), are artificial neural networks (ANN) trained on a grid of synthetic (S) or real (R) spectra to estimate the parameters of real spectra (R). Further options are line indices (Wilhelm et al. 1999) and the Ca II K line index (Beers et al. 1999).
We compare our results with the final adopted SSPP value (ADOP), and those of the k24, NGS1 and ANNSR grids of synthetic spectra. Each of these results can be found in the SSPP tables of the SDSS database. The grids of models are created using Kurucz stellar atmosphere models, but k24 and NGS1 cover the wavelength range of [4400, 5500] Å. The ANNSR grid includes the Ca II triplet close to 8600 Å.
Figure 7 shows the correlations between the results of these methods and our own. As in Fig. 6, we plotted a random selection of 150 stars to better visualize the correlations, but the offsets and scatters were calculated using the entire sample of 17 274 stars. The left panels correspond to the metallicity, the middle ones to temperature and the right ones to the surface gravity. The top row of plots shows a general comparison of all the different methods, plotted with different colors. The y-axis indicates the ADOP results and the x-axis the methods MA (black), NGS1 (blue), ANNSR (red) and k24 (green). The one-to-one line is also plotted in the figures. For each pair of methods, a histogram of the differences between their results was fitted with a Gaussian to obtain estimates of the offset (mean) and the scatter (standard deviation). The results are summarized in Table 2. In the lower panel of Fig. 7 we plotted the comparisons of individual methods for 150 randomly selected stars. These panels help to visualize how the single methods of the pipeline and our own correlate with each other. Figure 7 and Table 2 show that our results agree reasonably well with the pipeline, and that the scatter is of the same order of magnitude as that between the individual SSPP methods.
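The offset and scatter statistics described above can be sketched as follows. This is a simplified stand-in, not the procedure used for Table 2: instead of fitting a Gaussian to a binned histogram, it takes the sample mean and sample standard deviation of the differences, which is approximately what such a fit returns for a well-populated distribution without strong outliers. The function name and the synthetic data are assumptions made for illustration.

```python
import random
import statistics


def offset_and_scatter(results_a, results_b):
    """Mean and sample standard deviation of the element-wise differences a - b."""
    diffs = [a - b for a, b in zip(results_a, results_b)]
    return statistics.fmean(diffs), statistics.stdev(diffs)


# Synthetic illustration only: two hypothetical methods measuring the same stars,
# the second one biased by +0.3 dex and noisier than the first.
rng = random.Random(42)
truth = [rng.uniform(0.0, 2.0) for _ in range(5000)]
method_a = [t + rng.gauss(0.0, 0.1) for t in truth]
method_b = [t + 0.3 + rng.gauss(0.0, 0.2) for t in truth]

offset, scatter = offset_and_scatter(method_a, method_b)
print(f"offset = {offset:.2f}, scatter = {scatter:.2f}")
```

In this toy case the recovered offset is close to the injected bias, and the scatter is close to the quadratic sum of the two noise levels, which is why the scatter between two methods is larger than the internal error of either one.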
Considering individual parameters, our metallicities tend to be 0.3 dex lower than the adopted metallicity of SSPP (ADOP). A similarly large offset, 0.22 dex, exists between ANNSR and NGS1. This could be due to the inclusion of the Ca II lines in the fit, which are not included in the NGS1 and k24 fits. ANNSR also includes the Ca II lines, and its offset of 0.17 dex with respect to our results is smaller than the 1σ error. If the same lines are used in the fitting, no offset should be seen, as will be shown in Sect. 5.
The temperature, on the other hand, shows a small offset of 61 K with respect to the ADOP value of the pipeline. It is encouraging to see that we obtain the best agreement, except for ANNSR, which has no offset. A large mean difference is found between k24 and NGS1, where the offset is 244 K. The scatter in the various comparisons varies from 100 K (NGS1 vs. ADOP) to 200 K (k24 vs. ANNSR); our scatter of 112 K with respect to ADOP is one of the lowest values.
Finally, log g shows the largest offset. We derived gravity values 0.51 dex higher than the ADOP ones. The worst case is the comparison between our method and ANNSR, with 0.63 dex. Between the methods of the pipeline, the largest difference (0.27 dex) is found between ANNSR and NGS1. Our best agreement for gravities is with NGS1, with an offset of 0.38 dex. The scatter for gravities varies from 0.23 dex (NGS1 vs. ADOP) to 0.48 dex (ANNSR vs. k24 and MA vs. k24). A possible approach to correct our gravities would be to shift the zero point by 0.38 dex to agree with those of NGS1. We prefer to accept that gravity is our least constrained parameter, given the lack of sensitive features except for the wings of the Mg Ib lines and the noise in the spectra, which prevents log g from being constrained effectively. To deal with this problem, k24 and NGS1 smooth the spectra to half of the resolution, gaining signal-to-noise by this procedure. We remark that the k24 grid also includes colors to constrain the temperature. This automatically leads to different gravity values. A more extensive discussion of this aspect will be given in Sect. 5.6. The right panels of Fig. 7 show that the gravity estimates of the different methods are rather uncorrelated with each other compared with the other parameters.
Figure 8: Histograms of the differences of the results obtained using 8 different fiducial models for the compression. Left: metallicity, middle: temperature and right: gravity. The parameters of the fiducial model are indicated on the right side of each plot, labeled as [[Fe/H], T_{eff}, log g]. Each histogram has a Gaussian fit, whose peak and standard deviation (σ) are indicated in the top legend.

Table 2: Offsets and scatter (σ) of the differences (row - column) between the methods: MA (this work).
4.4 Systematic errors: choice of fiducial model
The weighting vectors are basically derivatives with respect to the parameters. With the assumption that the responses of the model to these derivatives are continuous, the choice of the fiducial model (FM) used to build the vectors is free. This assumption, however, is only correct for certain regions of the parameter space. For example, Balmer lines are good temperature indicators for stars no hotter than 8000 K (Gray 1992), and for cold stars spectral lines are affected by molecular lines. By limiting our candidate stars to the range of 5000 K to 8000 K, we know that we have clean spectra with Balmer lines as effective temperature indicators. This allows us to choose a FM with any temperature in that range to construct our weighting vector, which will represent well the dependence of the model spectrum on temperature. Similar assumptions are made for the other two model parameters.
In order to test this assumption we studied the effect of using different fiducial models on the final results of the parameter estimation. In Fig. 8 we plot the results for metallicity (left), temperature (middle) and gravity (right) for 150 randomly selected SEGUE stars. There are four sets of different fiducial models [A,B] used for the vector calculation. The parameters of the FM are indicated at the right side in each set. From top to bottom, the first set compares the results of a cold ( K) and a hot ( K) fiducial model of our grid of synthetic spectra, and the second a metal-poor ( ) and a metal-rich ( ) one. The third considers two different gravities ( and ). For these three sets, all other parameters are kept identical. Finally, the last set uses models in which all three parameters vary. The histogram in each plot corresponds to the difference between the results obtained with fiducial models A and B. The Gaussian fit of the histogram is overplotted, and its mean and standard deviation (σ) are indicated in the legend at the top.
Metallicity shows a well-defined behavior under different vectors. There are negligible offsets when varying only one parameter in the FM, except for the case where the FM is totally different. But even in that case the offset of 0.15 dex is less than the 1σ errors of 0.25 dex obtained in the metallicity estimation. We conclude that this parameter does not show real systematic offsets due to the choice of fiducial model. The scatter obtained in the metallicity is also less than the 1σ errors of the results. Only in the last case (bottom left) does it become comparable with the errors. Because gravity is a poorly constrained parameter (see discussions above) and the y-numbers are uncorrelated, the effect of using different gravities in the FM should be completely negligible in the results for the other parameters. This can be seen in the third panel from the top of Fig. 8, where metallicity and temperature show extremely small offsets and scatters.
The derivation of temperature using different fiducial models shows a similar behavior. The discrepancies between results when using different FMs are on the order of 50 K, except when the metallicity is varied. The mean difference of the final temperature estimates when using a metal-poor fiducial model and a metal-rich one is 141 K, which is smaller than some differences seen in Table 2 between the methods of the SEGUE Stellar Parameter Pipeline. This probably happens because the spectrum of a metal-rich star presents more lines than that of a metal-poor one, which translates into temperature-dependent regions in the vector that do not exist in the data. In the case of SEGUE data, many of the weak lines seen in a metal-rich synthetic spectrum and/or in the vector of a metal-rich fiducial model are hidden by the noise in the observed spectra, and the assumption of a temperature dependence due to these weak lines may not be correct. For the analysis of low signal-to-noise spectra, it is preferable to choose a FM with a rather low metallicity. But even when using a high metallicity for the FM, the offset is comparable with the 1σ error of 150 K for the temperature estimations. We conclude that temperature does not show a significant systematic offset due to the choice of fiducial model either.
Finally, gravity shows larger discrepancies in the final results when varying the FM. The differences can be as large as 0.4 dex in the worst case, when using different metallicities in the FM. This could also be because a metal-rich FM contains more lines and therefore greater sensitivities to gravity, which is not the case in our low signal-to-noise spectra. These sensitivities are confused with the noise in the observed spectrum. Scatters of 0.5 dex are in general on the order of the errors.
5 Application to highresolution spectra
As a check, MA was also tested on a smaller sample of 28 high-resolution spectra. For consistency with the low-resolution implementation, metal-poor dwarf stars were again selected.
5.1 Data
The spectra were obtained with UVES, the Ultraviolet Echelle Spectrograph (Dekker et al. 2000) at the ESO VLT 8 m Kueyen telescope in Chile. The resolving power is and the signal-to-noise is typically above 300 in our spectra. Because MA is an automatic fitting tool for synthetic spectra, we first had to be sure that the observed spectra could be fitted by our grid of synthetic spectra. This means that we had to have easily distinguishable unblended spectral lines and a clear continuum. For these reasons we selected a part of the red setting of the UVES spectra, which covers the wavelength range 580-680 nm, where there are many unblended lines.
5.2 Grid of highresolution models
When modeling spectra in high resolution, there are some differences to the low-resolution case, and the following considerations have to be made:
 A value for microturbulence must be set, because the shape of strong lines depends sensitively on the value chosen for v_{t}. This is not the case in low resolution, where there is no need to fine-tune this parameter. In high resolution the set of basic stellar atmosphere parameters is therefore [Fe/H], T_{eff}, log g and v_{t}. To do a proper parameter estimation we would need to create a four-dimensional grid of synthetic spectra, varying all the parameters indicated above. This goes beyond the purpose of this paper, where we aim to check the applicability of our method. Therefore we fixed the microturbulence parameter to a typical value, guided by the results of a standard ``classical'' spectral analysis (see below).
 To compute synthetic spectra, a list of lines with their wavelengths and atomic data must be included. The atomic data consist of oscillator strengths and excitation potentials, which are an important source of differences in the final shape of a given line. For the comparison, consistent line lists are needed.
 In order to make a proper fit to a broad wavelength range, the models should be perfect. Our models were computed under the simplifying assumption of LTE and with plane-parallel atmosphere layers; the atomic data have errors, too. There are too many features to fit, so a global fit becomes difficult, almost impossible. For this reason we selected windows of a limited spectral range and fitted the observed spectra only in these.
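Restricting a fit to selected spectral windows, as described in the last point, amounts to evaluating the goodness of fit only on the pixels that fall inside those windows. The sketch below illustrates this with a reduced χ^{2}; the function names, the window limits and the toy spectrum are hypothetical and serve only to show the masking step.

```python
def in_windows(wavelength, windows):
    """True if the wavelength lies inside any (lower, upper) window."""
    return any(lower <= wavelength <= upper for lower, upper in windows)


def reduced_chi2(wave, flux, model, sigma, windows=None, n_params=3):
    """Reduced chi-square over the pixels inside the windows (all pixels if None)."""
    terms = [
        ((f - m) / s) ** 2
        for w, f, m, s in zip(wave, flux, model, sigma)
        if windows is None or in_windows(w, windows)
    ]
    dof = len(terms) - n_params
    if dof <= 0:
        raise ValueError("not enough pixels for the number of fitted parameters")
    return sum(terms) / dof


# Toy spectrum: 200 pixels, every residual equal to exactly one sigma.
wave = [6000.0 + 0.5 * i for i in range(200)]
model = [1.0] * len(wave)
sigma = [0.25] * len(wave)
flux = [1.25] * len(wave)

# Hypothetical windows around two lines (limits in Angstrom are illustrative).
windows = [(6010.0, 6015.0), (6060.0, 6070.0)]

chi2_all = reduced_chi2(wave, flux, model, sigma)
chi2_win = reduced_chi2(wave, flux, model, sigma, windows)
```

With this toy input the two windows contain 32 pixels, so the windowed value is 32/29 while the full-range value is 200/197; in real data the two differ mainly through the features that the models reproduce poorly outside the windows.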
We determined the three parameters temperature, metallicity and gravity using neutral and ionized iron lines. Neutral lines are sensitive to temperature and singly ionized ones to gravity (Fuhrmann 1998). The metallicity was effectively obtained from both Fe I and Fe II, for a given temperature and gravity. In the wavelength range of 580-680 nm, we have six Fe II lines and 20 Fe I lines that are unblended and strong enough to be present in most metal-poor stars of our sample. The lines are indicated in Table 3, where wavelengths and log gf values are listed. The log gf values and excitation potentials are taken from Nissen et al. (2002) and the VALD database (Kupka et al. 2000).
Our grid of high-resolution models covers a range in metallicity of , in effective temperature of , in surface gravity of and in scaling factor for normalization from 0.85 to 1.15. The grid steps are the same as in Sect. 3.2. The models have and v_{t} = 1.2 km s^{-1}, according to the values obtained for our stars with the ``classical'' method (see below).
Figure 9: Fit of the spectrum of the star HD 195633 (points) with the best model (red dashed line). The parameters of the synthetic spectrum are [Fe/H] = 0.5, K and , with . The blue line corresponds to the model with [Fe/H] = 0.6, K and = 3.86 ( , obtained with the ``classical'' approach. Each panel represents a spectral window where the lines from Table 3 are labeled at the bottom. 

Table 3: Fe I and II lines used for the highresolution fits.
5.3 Preparing the observed data for synthetic spectral fitting
The reduction of the data was carried out in the same way as in Sect. 3.3, with the following difference in the normalization procedure. At this resolution, neither the polynomial-fitting approach by Allende Prieto et al. (2006) nor IRAF was able to find a good pseudo-continuum automatically. Hence, the observed spectra were normalized interactively using Midas (Crane & Banse 1982), and the synthetic spectra were computed with normalized flux. The compression procedure was the same as in Sect. 3.4. Compressing the grid of high-resolution spectra takes more time than for the low-resolution ones, but this has to be carried out only once to obtain a new grid of y-numbers.
5.4 Results of the highresolution analysis using MA
Figure 9 shows an example fit of HD 195633 (points) with the best MA model (red dashed line) and the model with parameters found with the ``classical'' method (blue dashed line; see below). Each panel represents a spectral window with the lines of Table 3 used for the parameter estimation. All three parameters were determined simultaneously and the best fit corresponds to the model with parameters of [Fe/H] = 0.5, K and log g = 4.61. The final parameters of all stars in this sample are given in Table 4. The first column of the table indicates the name of the star and the next six are the parameters found with two different MA analyses. The first set (``free'') corresponds to the standard approach of determining all parameters simultaneously from the spectrum only. In the second variant (``restricted'') the gravity is determined independently using Eq. (14), which will be introduced below. The last four columns are the parameters obtained from the ``classical'' approach (below).
Table 4: Parameters of the stars (Col. 1) obtained with MA for both types of analysis discussed in the text.
5.5 Results of a ``classical'' analysis
In order to compare our MA results, we analyzed the spectra with a classical procedure, determining the stellar parameters through an iterative process. The method is basically the same as in, e.g., Nissen et al. (2002):
For the effective temperature it relies on the infrared flux method, which provides the coefficients to convert colors to effective temperatures. We used the (VK) color and the calibration of Alonso et al. (1996), after converting the K 2MASS filter (Skrutskie et al. 1997) to the Johnson filter (Bessell 2005) and dereddening the color. The extinction was taken from Schlegel's dust maps (Schlegel et al. 1998) in the few cases where we could not find the values in Nissen et al. (2004,2002).
Surface gravities were determined using the basic parallax relation

log (g/g_{\odot}) = log (M/M_{\odot}) + 4 log (T_{eff}/T_{eff,\odot}) + 0.4 V_{0} + 0.4 BC + 2 log \pi + 0.12,   (14)

where V_{0} is the V magnitude corrected for interstellar absorption, BC the bolometric correction and \pi the parallax in arcsec. We adopted a different mass value for each star based on those of Nissen et al. (2002), which are between 0.7 and 1.1 M_{\odot}. The bolometric correction BC was calculated with the solar calibration of as in Nissen et al. (1997).
The equivalent widths of neutral and ionized iron lines were used to determine the metallicity [Fe/H]. They were measured using Fitline (François et al. 2007), which fits Gaussians to the line profiles. The equivalent widths computed from these Gaussian fits were converted to abundances by running MOOG (Sneden 1973; Sobeck, priv. comm.) with the plane-parallel LTE MARCS model atmospheres of Gustafsson et al. (2008), which differ from those used for MA. The line list contains the lines of Table 3 and additional ones taken from bluer wavelengths in the range 300-580 nm (Hansen, priv. comm.).
Finally, the value for the microturbulence was found by requiring that all the equivalent widths of neutral lines should give the same Fe abundance as the ionized lines.
In order to obtain the four parameters in this way, we started with an initial guess for each of them. Because the parameters are interdependent, all values change after each pass through the above steps; we therefore iterated until the values showed a negligible change.
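The gravity step of this procedure can be sketched in a few lines. The sketch assumes the parallax relation in the form commonly quoted from Nissen et al. (1997), log (g/g_sun) = log (M/M_sun) + 4 log (T_eff/T_eff,sun) + 0.4 V_0 + 0.4 BC + 2 log π + 0.12 with π in arcsec; the solar values, the function name and the example numbers are assumptions made for illustration and are not taken from Table 4.

```python
import math

# Assumed solar reference values (not quoted in the text).
LOGG_SUN = 4.44    # log of the solar surface gravity in cgs units
TEFF_SUN = 5777.0  # solar effective temperature in K


def logg_from_parallax(mass_msun, teff_k, v0_mag, bc_mag, parallax_arcsec):
    """log g from mass, temperature, dereddened V magnitude, BC and parallax.

    Uses the assumed relation
    log(g/g_sun) = log(M/M_sun) + 4 log(Teff/Teff_sun)
                   + 0.4 V0 + 0.4 BC + 2 log(parallax) + 0.12.
    """
    return (
        LOGG_SUN
        + math.log10(mass_msun)
        + 4.0 * math.log10(teff_k / TEFF_SUN)
        + 0.4 * v0_mag
        + 0.4 * bc_mag
        + 2.0 * math.log10(parallax_arcsec)
        + 0.12
    )


# Consistency check: a solar twin at 10 pc (parallax 0.1 arcsec) whose bolometric
# magnitude V0 + BC equals 4.70 should recover the solar gravity.
logg_solar_twin = logg_from_parallax(1.0, TEFF_SUN, 4.70, 0.0, 0.1)

# Hypothetical metal-poor turnoff star, values chosen only for illustration.
logg_example = logg_from_parallax(0.8, 6000.0, 8.5, -0.2, 0.012)
```

The zero point of 0.12 in this form corresponds to a solar bolometric magnitude of 4.70, which is why the solar-twin check returns the assumed solar log g.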
We are aware that our metallicity based on Fe I lines ignores any non-LTE effects (Asplund 2005; Collet et al. 2005). But because neither the classical nor the MA analysis takes these into account, the results obtained by both methods can be compared directly.
Figure 10: Upper panels: Correlations between the results obtained with the automatic fitting (MAX) and the classical method (EW) for metallicity (left), temperature (middle) and gravity (right). In each panel, the one-to-one line is overplotted and a legend gives the mean difference (in the sense ``EW - MA'') and its standard deviation. Here the three parameters were determined simultaneously and directly from the spectra by our method. The stars with special symbols are indicated in the legend and correspond to special cases where the offset between the two methods is particularly large (discussed in text). Lower panels: As upper panels, but MA results were obtained forcing the method to find the best-fitting model close to the gravities computed from the parallax formula, Eq. (14). Individual data can be found in Table 4 (columns ``MA, free'' for the upper case and ``MA, restr.'' for the lower one).

The upper panels of Fig. 10 show the results of metallicity (left), temperature (middle) and gravity (right) for the entire sample of stars. The x-axis shows the results of the automatic fitting MA and the y-axis the ``classical'' parameter estimation (EW). The one-to-one line is overplotted in each figure and the legend indicates the mean of the differences (MA - EW) and its standard deviation, denoted as offset and scatter, respectively. Each point corresponds to a star of Table 4 with parameters obtained by the ``free'' method.
Gravity shows a scatter of dex and a negligible offset. The determination of the gravity from Fe II lines has always been a problematic task: Fuhrmann (1998), in an extensive study of parameters of nearby stars, showed that surface gravities of F-type stars located at the turnoff point can easily differ by up to 0.4 dex if derived from either LTE iron ionization equilibrium or parallaxes. The number of ionized lines (gravity dependent) present in the spectra is usually smaller than that of neutral ones (temperature dependent). For an automatic fitting of weighted spectra that contain only six Fe II lines compared to 20 Fe I lines, a scatter of 0.46 dex is reasonable.
The result for the effective temperature shows a very small offset of only 10 K, but a quite large scatter of K. Given that two different methods were employed to determine it, this still appears to be acceptable. The scatter may also be affected by our fixed value for the microturbulence, which was taken from the average of the values found with the EW approach (see Table 4), but which in individual cases may differ severely.
One can decrease the scatter in the temperature difference by removing the three most discrepant objects from the sample. The star HD 63598 (triangle) has an offset of 398 K. The Schlegel maps give a dereddening for this star that is unrealistically large, so instead we set it to zero. This led to a temperature that was too low. Jonsell et al. (2005) have found a temperature of 5845 K for this star, reducing the difference to 233 K with respect to our result. The star G005040 (asterisk) is the second case, where the offset is 462 K. For this star, the continuum subtraction was not perfect in every spectral window, resulting in a fit where the parameters were quite unreliable. Finally, the weak lines of the observed spectrum of HD 19445 pushed MA to the border of the grid. The parameters in this case were undetermined. By removing these three stars from our sample, the scatter for temperature is reduced to 197 K.
Metallicity, on the other hand, shows very good agreement, with a negligible offset of 0.02 dex and a scatter of 0.16 dex (upper left panel of Fig. 10). The lines used for the automatic fitting (MA) and for the classical analysis (EW) are in most cases identical, except for the lowest metallicities, where some of them are hardly visible. In these cases the classical method also resorted to other lines outside the MA wavelength range. The atomic data are identical, but the values for microturbulence are not, as mentioned before. In view of all this, the very good correlation of metallicity is encouraging.
5.6 ``Restricted'' parameter recovery
An example of the individual fit of the best MA model to the observed spectra was shown in Fig. 9 for HD 195633, which is shown as a filled symbol in Fig. 10 and is one of the objects with the most pronounced discrepancies (see also Table 4). Nevertheless, the two synthetic spectra appear to fit the observations equally well ( for the MA model and for that obtained from the classical analysis). We could not decide from the χ^{2} values which method leads to a more accurate parameter determination for this star. One would need information about this star that is independent of the spectrum to reach conclusions about its parameters.
It is also interesting to notice from the upper panels of Fig. 10 that stars with large discrepancies in gravity (for example those with special symbols) also give large discrepancies in temperature, as expected. They show a quite unsatisfying fit for the Fe II lines. Motivated by this, we did an additional test by restricting MA to the determination of the parameters with input values for log g obtained from the EW method, i.e. we found a local maximum of the likelihood in a restricted area. To do this we chose the three gravity values from our grid of synthetic spectra closest to the classical one found with Eq. (14) (see Col. 9 of Table 4) and searched for the maximum within this range. It may be that there is no local maximum in this range, and the final estimate then goes to the border of the grid, as in the case of CD3018140, where . The general tendency is nevertheless a local maximum close to the input EW value.
Now the agreement with the classical method for the effective temperatures became excellent, with a negligible offset of only 7.42 K and a scatter of 128 K, as seen in the lower panels of Fig. 10. Gravity was also better constrained, with a small scatter of 0.14 dex and a negligible offset. Note that we did not necessarily obtain identical values, because the MA gravity is obtained by using the value from Eq. (14) only as input, and we were looking for a final solution close to this value. The behavior of the metallicity does not change with respect to the ``free'' case, demonstrating the robustness of the determination of this parameter. The results obtained when using parallaxes for the initial guess of log g are summarized in Table 4 in the columns under the heading ``MA, restr.''.
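The ``restricted'' search can be pictured as an ordinary grid search in which only the three grid gravities closest to the input value are allowed. The following sketch illustrates that idea on a toy grid; the function names, the grid values and the toy χ^{2} surface are hypothetical, and the real grid steps are those of Sect. 3.2.

```python
def three_closest(grid_values, target):
    """The three grid values nearest to the target, in ascending order."""
    return sorted(sorted(grid_values, key=lambda g: abs(g - target))[:3])


def restricted_best_fit(chi2_by_model, logg_input):
    """Best model among those whose gravity is one of the three closest to the input.

    chi2_by_model maps (feh, teff, logg) tuples to chi-square values.
    """
    logg_grid = {key[2] for key in chi2_by_model}
    allowed = set(three_closest(logg_grid, logg_input))
    candidates = {k: v for k, v in chi2_by_model.items() if k[2] in allowed}
    return min(candidates, key=candidates.get)


# Toy chi-square surface whose unrestricted minimum sits at log g = 5.0.
chi2_by_model = {
    (feh, teff, logg): 100.0 * (feh + 0.5) ** 2
    + ((teff - 6000.0) / 100.0) ** 2
    + (logg - 5.0) ** 2
    for feh in (-1.0, -0.5)
    for teff in (5800.0, 6000.0, 6200.0)
    for logg in (3.5, 4.0, 4.5, 5.0)
}

free_fit = min(chi2_by_model, key=chi2_by_model.get)
restricted_fit = restricted_best_fit(chi2_by_model, logg_input=3.86)
```

In this toy case the free fit returns log g = 5.0, while the restricted fit is confined to the values 3.5, 4.0 and 4.5 and ends at 4.5, the edge of the allowed range. This mirrors the situation described above in which no local maximum of the likelihood exists inside the restricted interval.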
It is instructive to discuss the implication of this comparison. For the ``classical'' EW method we determined the parameters making use of the best information available and of the freedom to adapt the method to each star individually. The iterative process allowed us to decide where to stop the iteration or, if no satisfying convergence could be reached, to move to another spectral window, to use other lines, or to disregard problematic lines. Moreover, the continuum could be separately subtracted for each line, thus creating locally perfect normalized flux levels. For the effective temperature, the more reliable infrared flux method employing photometric data could be used instead of relying only on spectra. The advantage of determining log g independently was already demonstrated.
With our automatic MA method, on the other hand, we attempted to estimate the parameters only from the spectral information, without fine-tuning the models or the fit procedure for each star. Given this, the agreement with the fully interactive method is surprisingly good. We demonstrated that by using additional information for one parameter (here log g) it becomes even better for the remaining two quantities. Table 4 shows our final parameter estimates for our stars when we make use of the parallaxes as additional information.
The result of this test is that we demonstrated our ability to estimate the basic stellar atmosphere parameters quickly and accurately enough for a substantial sample of stars. While individually severe outliers may occur, the method is accurate overall for a whole population of stars. For this purpose the EW method would be much too slow and tedious. Our method may also serve as a quick and rough estimate for a more detailed followup analysis.
6 Summary and conclusions
We described MA, a new derivative of MOPED for the estimation of parameters from stellar spectra. In this case the parameters were the metallicity, effective temperature and surface gravity. The method reduces the data to a compressed data set of three numbers, one per parameter. Assuming that the noise is independent of the parameters, the compressed data contain as much information about the parameters as the entire data set. As a result, the likelihood surface around the peak is locally identical for the entire and compressed data sets, and the compression is ``lossless''. This massive compression, with the degree of compression given by the ratio of the size of the data to the number of parameters, allows the cost function calculation for a parameter set to be sped up by the same factor. For SDSS data, the spectra are of the order of 1000 data points, each with a corresponding error, which must be compared to a similarly sized model. The speed-up factor in this case is then at least 1000/3, i.e. about 333.
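The compression summarized above can be sketched as follows. The code follows the standard MOPED construction as we understand it (weighting vectors built from the derivatives of the fiducial model and orthogonalized in a Gram-Schmidt fashion), under the simplifying assumption of a diagonal noise covariance. All function names and the toy derivatives are illustrative; they are not the grid or the vectors used in this work.

```python
import math


def moped_vectors(derivs, var):
    """Weighting vectors b_m for diagonal noise with variances var.

    derivs[m][i] is the derivative of the fiducial model flux at pixel i
    with respect to parameter m.
    """
    vectors = []
    for d in derivs:
        b = [di / vi for di, vi in zip(d, var)]        # C^-1 dmu/dtheta_m
        norm2 = sum(di * bi for di, bi in zip(d, b))   # dmu^T C^-1 dmu
        for q in vectors:                              # remove earlier directions
            proj = sum(di * qi for di, qi in zip(d, q))
            b = [bi - proj * qi for bi, qi in zip(b, q)]
            norm2 -= proj * proj
        scale = math.sqrt(norm2)
        vectors.append([bi / scale for bi in b])
    return vectors


def compress(vectors, spectrum):
    """One y-number per parameter: y_m = b_m . x."""
    return [sum(bi * xi for bi, xi in zip(b, spectrum)) for b in vectors]


def compressed_chi2(y_data, y_model):
    """Chi-square in compressed space (the y-numbers have unit variance)."""
    return sum((yd - ym) ** 2 for yd, ym in zip(y_data, y_model))


# Toy setup: 1000 pixels and three parameters with made-up derivative shapes.
n_pix = 1000
var = [0.01 + 0.001 * (i % 7) for i in range(n_pix)]
derivs = [
    [math.sin(0.01 * i) for i in range(n_pix)],
    [math.cos(0.02 * i) for i in range(n_pix)],
    [(i / n_pix) ** 2 for i in range(n_pix)],
]
vectors = moped_vectors(derivs, var)
spectrum = [1.0 + 0.1 * math.sin(0.05 * i) for i in range(n_pix)]
y = compress(vectors, spectrum)
speed_up = n_pix / len(y)
```

By construction the vectors satisfy b_m^T C b_q = δ_mq, so the y-numbers are uncorrelated and have unit variance. Each likelihood evaluation then needs only three subtractions and squares instead of about 1000, which is the origin of the factor 1000/3 quoted above.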
This extremely fast multiple-parameter estimation makes the MA method a powerful tool for the analysis of large samples of stellar spectra. We have applied it to a sample of 17 274 metal-poor dwarf stars with low-resolution spectra from SEGUE, using a grid of synthetic spectra with the parameter range of [2.5, 0.5] dex in metallicity, [5000, 8000] K in effective temperature and [3.5, 5] dex in surface gravity, covering a wavelength range of [3850, 5200] Å.
From the Ca II, Balmer and Mg Ib lines, which are the strongest absorption features identified in SDSS spectra, we estimated the metallicity with an average accuracy of 0.24 dex, the temperature with 130 K and log g with 0.5 dex, corresponding to the 1σ errors. Surface gravity is a poorly constrained parameter using these data, mainly due to the lack of features sensitive to this parameter (apart from some degree of sensitivity of the wings of the Mg Ib triplet) and the considerable noise in the spectra when compared to high-resolution spectra. Information additional to the spectra, such as photometry, would help to constrain the gravity better.
MA has the option to fit different spectral windows simultaneously. We have compared estimates of the parameters using the whole spectrum and only those data ranges where known lines exist. Both analyses take approximately the same time and the recovered parameters agree excellently. This suggests that for these low-resolution spectra there is no need to carefully and laboriously select the spectral windows to be analyzed with MA; the method does this automatically as part of its weighting procedure.
With the assumption that the spectra behave similarly under changes of the parameters, the choice of the fiducial model for the compression is free. We have created eight compressed grids using different fiducial models for the vector calculation and obtain agreement between the recovered parameters. Given the low signaltonoise of our data, it is better to use a more metalpoor fiducial model, as the dependence on the parameters will be focussed on the strong lines.
We have comprehensively investigated the correlations of our results with those obtained from the SEGUE Stellar Parameter Pipeline (Lee et al. 2008a), which reports results from a number of different methods. The results from MA agree well with those of the various pipelines, and any differences are consistent with those between the various accepted approaches. We are aware of the MATISSE (Recio-Blanco et al. 2006) method of parameter estimation, which uses a different combination of weighting data but is closer to the MA approach than the standard pipeline methods. We look forward to comparing our results with those of MATISSE when they become available.
More specifically, our temperatures agree excellently with the averaged temperatures of the SSPP, with a negligible offset of 61 K and a low scatter of 112 K. Our metallicities tend to be 0.32 dex lower than the SSPP averages. The small scatter of 0.23 dex suggests that the different spectral features used in the analysis (mainly Ca II lines) could shift the zero point of the metallicity. The most pronounced discrepancy is found in surface gravity, where MA reports values 0.51 dex (with a scatter of 0.39 dex) higher than the averaged value of the pipeline. This is consistent with the discrepancies seen between the other methods.
We have also tested MA on a sample of 28 high-resolution spectra from VLT-UVES, where no offset in metallicity is seen. In this case we have carefully chosen the models and spectral range for comparing our parameter estimation against a "classical" approach: temperature from photometry, surface gravity from parallaxes and metallicity from equivalent widths of neutral and ionized iron lines. We have calculated the parameters ourselves to avoid the additional systematic offsets that different methods would introduce in the stellar parameter scales. These results were compared with the automatic fitting of 20 Fe I and 6 Fe II lines (which coincide in most cases with those used for the equivalent width calculations) made with MA. We obtained a large scatter in gravity and temperature (0.44 dex and 220 K, respectively), but no systematic offset. In some cases the continuum normalization was not perfect, making it difficult to fit every line properly, especially the Fe II lines. These lines produce a scatter in gravity, which in turn drives a scatter in temperature.
We have seen in our fits that our best model does not differ very much from the model with the parameters found by the "classical" approach. Motivated by this, we decreased the scatter in gravity by using information additional to the spectrum, i.e. by using the gravities determined from parallaxes as input values. This forced our method to find a local maximum of the likelihood close to this input gravity value. Fixing the gravity gave an improved agreement in temperature, now with a scatter of 128 K. Metallicity does not change when the gravities are forced, illustrating the robustness of the determination of this parameter.
MA is an extremely rapid fitting technique and as such is independent of the models and data used. It will work for any star for which an appropriate grid of synthetic spectra can be calculated. Although the grid calculation is time-consuming, it only needs to be performed once, allowing an extremely rapid processing of individual stars.
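The split between a one-off grid computation and a fast per-star fit can be sketched as follows. This is a minimal illustration under stated assumptions: the random arrays stand in for a real synthetic grid and for weighting vectors normalised so that each compressed datum has unit variance, and the brute-force search over all models is a simplification chosen for clarity, not necessarily the search strategy used in this work.

```python
import numpy as np

def compress_grid(weights, grid_fluxes):
    """Project every model spectrum onto the weighting vectors (done once)."""
    return grid_fluxes @ weights.T            # shape (n_models, n_params)

def best_model(weights, compressed_grid, observed_flux):
    """Return the index of the model with the lowest compressed chi^2."""
    y_obs = weights @ observed_flux           # n_params numbers per star
    chi2 = np.sum((compressed_grid - y_obs) ** 2, axis=1)
    return int(np.argmin(chi2)), chi2

# Hypothetical sizes and random stand-ins for real models and weights.
rng = np.random.default_rng(1)
n_models, n_pix, n_params = 500, 1000, 3
grid_fluxes = rng.normal(1.0, 0.1, size=(n_models, n_pix))
weights = rng.normal(size=(n_params, n_pix))

compressed_grid = compress_grid(weights, grid_fluxes)   # one-off cost

observed = grid_fluxes[123]                   # noise-free "star" for the demo
index, chi2 = best_model(weights, compressed_grid, observed)

# Each model comparison uses n_params numbers instead of n_pix fluxes.
speed_up = n_pix / n_params
```

In this sketch each comparison against a model costs three operations instead of a thousand, in line with the optimum speed-up being the ratio of the number of data points to the number of parameters.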
MA is one of the fastest approaches to parameter determination, and its accuracy is comparable with that of other methods. To develop MA further in preparation for the next generation of surveys (e.g. Gaia, APOGEE or LAMOST), we plan to expand the range of our model grid and to integrate photometric data to improve the determination of the parameters, especially surface gravity. We must be aware that in a grid of synthetic spectra with a broader parameter range, a set of weighting vectors from one single fiducial model will not represent well the dependence on the parameters over the entire sample. To cover the entire sample with well-represented dependences, we plan to compute different compressed grids, each of them with vectors calculated from a different fiducial model. The parameter estimate in this case will be made with the different compressed grids using a final convergence test. After convergence, the final parameters should be estimated from the compressed grid whose fiducial model has parameters close to the final result.
Acknowledgements. This work is part of the PhD Thesis of Paula Jofré and is funded by an IMPRS fellowship. We thank Richard Gray, Martin Asplund, Francesca Primas and Jennifer Sobeck for their helpful comments in interpreting our results, Alan Heavens for algorithmic advice and especially Carlos Allende Prieto, for all the great advice and for sharing with the authors the continuum subtraction routine. P. Jofré is thankful to Timo Anguita and Manuel Aravena for programming tips and to Thomas Mädler for his careful reading of the manuscript and his support in getting this paper out. Finally, the authors gratefully acknowledge the many and detailed positive contributions made by the referee. Funding for the SDSS and SDSS-II has been provided by the Alfred P. Sloan Foundation, the Participating Institutions, the National Science Foundation, the U.S. Department of Energy, the National Aeronautics and Space Administration, the Japanese Monbukagakusho, the Max Planck Society, and the Higher Education Funding Council for England. The SDSS Web Site is http://www.sdss.org/.
The SDSS is managed by the Astrophysical Research Consortium for the Participating Institutions. The Participating Institutions are the American Museum of Natural History, Astrophysical Institute Potsdam, University of Basel, University of Cambridge, Case Western Reserve University, University of Chicago, Drexel University, Fermilab, the Institute for Advanced Study, the Japan Participation Group, Johns Hopkins University, the Joint Institute for Nuclear Astrophysics, the Kavli Institute for Particle Astrophysics and Cosmology, the Korean Scientist Group, the Chinese Academy of Sciences (LAMOST), Los Alamos National Laboratory, the Max-Planck-Institute for Astronomy (MPIA), the Max-Planck-Institute for Astrophysics (MPA), New Mexico State University, Ohio State University, University of Pittsburgh, University of Portsmouth, Princeton University, the United States Naval Observatory, and the University of Washington.
References
Abazajian, K. N., Adelman-McCarthy, J. K., Agüeros, M. A., et al. 2009, ApJS, 182, 543
Allende Prieto, C., Beers, T. C., Wilhelm, R., et al. 2006, ApJ, 636, 804
Allende Prieto, C., Sivarani, T., Beers, T. C., et al. 2008, AJ, 136, 2070
Alonso, A., Arribas, S., & Martinez-Roger, C. 1996, A&A, 313, 873
Asplund, M. 2005, ARA&A, 43, 481
Ballot, J., García, R. A., & Lambert, P. 2006, MNRAS, 369, 1281
Beers, T. C., Preston, G. W., & Shectman, S. A. 1992, AJ, 103, 1987
Beers, T. C., Rossi, S., Norris, J. E., Ryan, S. G., & Shefler, T. 1999, AJ, 117, 981
Beers, T. C., Chiba, M., Yoshii, Y., et al. 2000, AJ, 119, 2866
Bessell, M. S. 2005, ARA&A, 43, 293
Bond, J. R., Jaffe, A. H., & Knox, L. 1998, Phys. Rev. D, 57, 2117
Bond, N. A., Ivezic, Z., Sesar, B., Juric, M., & Munn, J. 2010, ApJ, 716, 1
Castelli, F., & Kurucz, R. L. 2003, in Modelling of Stellar Atmospheres, ed. N. Piskunov, W. W. Weiss, & D. F. Gray, IAU Symp., 210, 20P
Collet, R., Asplund, M., & Thévenin, F. 2005, A&A, 442, 643
Crane, P., & Banse, K. 1982, Mem. Soc. Astron. Ital., 53, 19
Dekker, H., D'Odorico, S., Kaufer, A., Delabre, B., & Kotzlowski, H. 2000, in SPIE Conf. Ser. 4008, ed. M. Iye, & A. F. Moorwood, 534
François, P., Depagne, E., Hill, V., et al. 2007, A&A, 476, 935
Fuhrmann, K. 1998, A&A, 338, 161
Fukugita, M., Ichikawa, T., Gunn, J. E., et al. 1996, AJ, 111, 1748
Gray, D. F. 1992, The observation and analysis of stellar photospheres, ed. D. F. Gray
Gray, R. O., & Corbally, C. J. 1994, AJ, 107, 742
Gray, R. O., Graham, P. W., & Hoyt, S. R. 2001, AJ, 121, 2159
Grevesse, N., & Sauval, A. J. 1998, Space Sci. Rev., 85, 161
Gustafsson, B., Edvardsson, B., Eriksson, K., et al. 2008, A&A, 486, 951
Heavens, A. F., Jimenez, R., & Lahav, O. 2000, MNRAS, 317, 965
Ivezic, Z., Sesar, B., Juric, M., et al. 2008, ApJ, 684, 287
Jonsell, K., Edvardsson, B., Gustafsson, B., et al. 2005, A&A, 440, 321
Juric, M., Ivezic, Z., Brooks, A., et al. 2008, ApJ, 673, 864
Kupka, F. G., Ryabchikova, T. A., Piskunov, N. E., Stempels, H. C., & Weiss, W. W. 2000, Baltic Astron., 9, 590
Kurucz, R. L. 1992, Rev. Mex. Astron. Astrofis., 23, 45
Lee, Y. S., Beers, T. C., Sivarani, T., et al. 2008a, AJ, 136, 2022
Lee, Y. S., Beers, T. C., Sivarani, T., et al. 2008b, AJ, 136, 2050
Lucatello, S., Beers, T. C., Christlieb, N., et al. 2006, ApJ, 652, L37
Marsteller, B., Beers, T. C., Thirupathi, S., et al. 2009, AJ, 138, 533
Nelder, J. A., & Mead, R. 1965, Comput. J., 7, 308
Nissen, P. E., Hoeg, E., & Schuster, W. J. 1997, in Hipparcos - Venice '97, ESA Special Publication, 402, 225
Nissen, P. E., Primas, F., Asplund, M., & Lambert, D. L. 2002, A&A, 390, 235
Nissen, P. E., Chen, Y. Q., Asplund, M., & Pettini, M. 2004, A&A, 415, 993
Nordström, B., Mayor, M., Andersen, J., et al. 2004, A&A, 418, 989
Panter, B., Heavens, A. F., & Jimenez, R. 2003, MNRAS, 343, 1145
Perryman, M. A. C., de Boer, K. S., Gilmore, G., et al. 2001, A&A, 369, 339
Press, W. H. 1993, Science, 259, 1931
Prugniel, P., Soubiran, C., Koleva, M., & Le Borgne, D. 2007 [arXiv:0703658], unpublished
Re Fiorentin, P., Bailer-Jones, C. A. L., Lee, Y. S., et al. 2007, A&A, 467, 1373
Recio-Blanco, A., Bijaoui, A., & de Laverny, P. 2006, MNRAS, 370, 141
Rockosi, C., Beers, T. C., Majewski, S., Schiavon, R., & Eisenstein, D. 2009, in astro2010: the Astronomy and Astrophysics Decadal Survey, Astronomy, 2010, 14
Schlegel, D. J., Finkbeiner, D. P., & Davis, M. 1998, ApJ, 500, 525
Skrutskie, M. F., Schneider, S. E., Stiening, R., et al. 1997, in Astrophysics and Space Science Library, The Impact of Large Scale Near-IR Sky Surveys, ed. F. Garzon, N. Epchtein, A. Omont, B. Burton, & P. Persi, 210, 25
Sneden, C. A. 1973, PhD Thesis, The University of Texas at Austin
Snider, S., Allende Prieto, C., von Hippel, T., et al. 2001, ApJ, 562, 528
Steinmetz, M., Zwitter, T., Siebert, A., et al. 2006, AJ, 132, 1645
Tegmark, M., Taylor, A. N., & Heavens, A. F. 1997, ApJ, 480, 22
Thakar, A., Szalay, A. S., & Gray, J. 2004, in Astronomical Data Analysis Software and Systems (ADASS) XIII, ed. F. Ochsenbein, M. G. Allen, & D. Egret, ASP Conf. Ser., 314, 38
Tojeiro, R., Heavens, A. F., Jimenez, R., & Panter, B. 2007, MNRAS, 381, 1252
Wilhelm, R., Beers, T. C., & Gray, R. O. 1999, AJ, 117, 2308
Wyse, R. F. G. 2006, Mem. Soc. Astron. Ital., 77, 1036
Yanny, B., Rockosi, C., Newberg, H. J., et al. 2009, AJ, 137, 4377
York, D. G., Adelman, J., Anderson, J. E., Jr., et al. 2000, AJ, 120, 1579
Zhao, G., Chen, Y., Shi, J., et al. 2006, Chin. J. Astron. Astrophys., 6, 265
Footnotes
MOPED: MOPED is protected by US Patent 6,433,710, owned by The University Court of the University of Edinburgh (GB).
registration: http://www.blackfordanalysis.com
webpage: http://www1.appstate.edu/dept/physics/spectrum/spectrum.html
Database: http://physics.nist.gov/PhysRefData/ASD/index.html
website: http://kurucz.harvard.edu/linelists.html
IRAF: IRAF is distributed by the National Optical Astronomy Observatory, which is operated by the Association of Universities for Research in Astronomy, Inc., under contract with the National Science Foundation.
All Tables
Table 1: Strongest lines in FG dwarf stars.
Table 2: Offsets and scatter of the differences (row - column) between the methods: MA (this work).
Table 3: Fe I and II lines used for the high-resolution fits.
Table 4: Parameters of the stars (Col. 1) obtained with MA for both types of analysis discussed in the text.
All Figures
Figure 1: (A) Fiducial model with parameters , K and . Other panels indicate the weighting vectors according to Eq. (5) for metallicity (B), temperature (C) and surface gravity (D).
Figure 2: Correlations between the Fisher matrix values obtained from the full and compressed likelihoods for metallicity (upper panel) and effective temperature (lower panel) for 75 randomly selected stars of SEGUE. The line corresponds to the one-to-one relation.
Figure 3: Likelihoods of the compressed data set (left) and full data set (right) in the parameter space of effective temperature and metallicity for four randomly selected SEGUE stars. In each panel eight equally spaced contour levels are plotted. The triangle and diamond correspond to the maximum point of the compressed and the full data set, respectively. Both maxima lie in the first confidence contour level, meaning that both data sets reach the same solution for these two parameters.
Figure 4: Radial velocities found using minimum fluxes of strong lines given by Table 1 (MAX) compared with those of the SEGUE database. The difference of the radial velocities given by the SEGUE database from those obtained by our method is indicated in the legend as offset, with its standard deviation as scatter.
Figure 5: Example of the fit between a randomly selected SEGUE star (crosses) and a synthetic spectrum (red dashed line). The legend indicates the stellar atmosphere parameters of the model and the value of the reduced χ² of the fit. The upper panel is the fit using all the points of the spectral region [3850, 5200] Å. The lower panel is the fit using only the data points where the lines of Table 1 are located.
Figure 6: Upper panel: metallicity (left), effective temperature (middle) and surface gravity (right) obtained using MA for the entire spectral range of [3900, 5200] Å (whole) compared to that with selected spectral windows (win) for a sample of 150 randomly selected stars. The offset (mean difference of "win - whole") of the results and its scatter (standard deviation) are indicated at the bottom right of each plot. The line has a slope of unity. Lower panel: as upper panel, but investigating the influence of using our radial velocities (rv MAX) or the SEGUE ones (rv SEGUE). Offsets and scatters are calculated from the difference "SEGUE - MAX".
Figure 7: Upper plot: comparison of the results from the SEGUE Stellar Parameter Pipeline (SSPP) with different methods: this work (MAX, black), (Lee et al. 2008a, NGS1, blue), (Re Fiorentin et al. 2007, ANNSR, red) and (Allende Prieto et al. 2006, k24, green). Lower part: comparisons of individual methods for a randomly selected subsample of stars. Each plot has a line with slope of unity. The different offsets and scatters between the results are indicated in Table 2. In each plot, a selection of 150 random stars has been used.
Figure 8: Histograms of the differences of the results obtained using 8 different fiducial models for the compression. Left: metallicity, middle: temperature and right: gravity. The parameters of the fiducial model are indicated on the right side of each plot, labeled as [[Fe/H], Teff, log g]. Each histogram has a Gaussian fit, whose peak and standard deviation are indicated in the top legend.
Figure 9: Fit of the spectrum of the star HD 195633 (points) with the best model (red dashed line). The parameters of the synthetic spectrum are [Fe/H] = 0.5, K and , with . The blue line corresponds to the model with [Fe/H] = 0.6, K and = 3.86 ( , obtained with the "classical" approach. Each panel represents a spectral window where the lines from Table 3 are labeled at the bottom.
Figure 10: Upper panels: Correlations between the results obtained with the automatic fitting (MAX) and the classical method (EW) for metallicity (left), temperature (middle) and gravity (right). In each panel, the one-to-one line is overplotted, together with a legend containing the mean difference (in the sense "EW - MA") and its standard deviation. Here the three parameters were determined simultaneously and directly from the spectra by our method. The stars with special symbols are indicated in the legend and correspond to special cases where the offset between the two methods is particularly large (discussed in the text). Lower panels: As upper panels, but MA results were obtained forcing the method to find the best-fitting model close to the gravities computed from the parallax formula, Eq. (14). Individual data can be found in Table 4 (columns "MA, free" for the upper case and "MA, restr." for the lower one).
Copyright ESO 2010