We found that the separation into such small temperature regions yielded
improved parametrization results. This is understandable as the classification
results for the stellar parameters, especially $\log g$ and [M/H], depend
upon the presence of spectral features in a DISPI which are also closely
related to the temperature of a star. This effect was also found in
Weaver & Torres-Dodgen (1997), using spectra in the near-infrared and classifying
them in terms of MK stellar types and luminosity classes. For our ANN work we chose simple temperature ranges, aiming also at database subsamples of similar size. Although some of the intervals roughly correspond to the temperatures of certain MK classes, which are characterized by certain line ratios, i.e. common physical characteristics, the division was chosen mainly to allow a reasonable training time for the networks. Another
reason was to see what can be learned from DISPIs in different temperature
regimes in principle. The mid-temperature samples (M-samples) were defined for the range in which the Balmer jump and the hydrogen Balmer lines change their meaning as indicators for temperature and surface gravity (see e.g. Napiwotzki et al. 1993).
Under real conditions one would have to employ a broad classifier to first separate
DISPIs into smaller (possibly overlapping) temperature ranges. Such a classifier could again be based on neural networks, or on other methods such as minimum-distance classification. Each temperature sample was finally divided into two disjoint parts, the training data and the application data. This means that
our classification results (see Sect. 6) are from DISPIs in the
gaps of our training grid.
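To make the minimum-distance idea concrete, the following Python sketch assigns a DISPI to a broad temperature range by its Euclidean distance to class centroids, and then performs a disjoint training/application split; the function names, data and sample sizes are hypothetical placeholders, not the pipeline used in this work.

```python
import numpy as np

def train_centroids(dispis, labels):
    """Mean DISPI (class centroid) for each broad temperature range."""
    return {lab: dispis[labels == lab].mean(axis=0)
            for lab in np.unique(labels)}

def assign_range(dispi, centroids):
    """Minimum-distance rule: pick the range whose centroid is closest."""
    return min(centroids,
               key=lambda lab: np.linalg.norm(dispi - centroids[lab]))

# Example with random placeholder data: 90 DISPIs of 30 pixels each,
# pre-labelled into the broad ranges 'L', 'M', 'H'.
rng = np.random.default_rng(0)
dispis = rng.normal(size=(90, 30))
labels = np.repeat(np.array(['L', 'M', 'H']), 30)
centroids = train_centroids(dispis, labels)
print(assign_range(dispis[0], centroids))

# Disjoint training/application split, as used for each temperature sample:
idx = rng.permutation(len(dispis))
train_idx, appl_idx = idx[:45], idx[45:]   # no DISPI appears in both parts
```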
The generalization performance of a network, i.e. its ability to classify previously unseen data, is influenced by three factors: the size of the training set (and how representative it is), the architecture of the network, and the physical complexity of the specific problem, which also
includes the presence of noise. Though there are distribution-free, worst-case
formulae for estimating the minimum size of the training set (based on
the so-called VC dimension; see also Haykin 1999), these are often
of little value in practical problems. As a rule of thumb, it is
sometimes stated (Duda et al. 2000) that there should be $W \times 10$ different training samples in the training set, $W$ denoting the total
number of free parameters (i.e. weights) in the network. In our
network without extinction there were 452 weights. Thus, in some
cases, there were fewer training samples than free
parameters. However, we found good generalization performance (see
Sect. 6 and results therein). This may be due both to (1)
the "similarity" of the DISPIs in a specific
range, giving
rise to a rather smooth (well-behaved) input-output function to be
approximated, and (2) redundancy in the input space. Both reduce the effective number of free parameters in the network.
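To make this rule of thumb concrete, the short sketch below counts the free parameters $W$ of a fully connected feed-forward network and compares a training-set size against the $W \times 10$ guideline; the layer sizes are hypothetical and do not reproduce the 452-weight topology used here.

```python
def count_weights(layer_sizes, biases=True):
    """Total number of free parameters W in a fully connected
    feed-forward network, optionally counting bias weights."""
    return sum((n_in + biases) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Hypothetical topology: 30 inputs, one hidden layer of 12 units,
# 3 outputs (Teff, log g, [M/H]); not the actual network of this paper.
W = count_weights([30, 12, 3])      # -> 411 free parameters
rule_of_thumb = 10 * W              # ~10 W training samples (Duda et al. 2000)
n_train = 330                       # e.g. sample L1 without extinction
print(f"W = {W}, 10W = {rule_of_thumb}, training samples = {n_train}")
# Here n_train << 10 W, yet generalization can still be good if the
# input-output mapping is smooth and the inputs are redundant.
```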
We also tested whether there are significant differences between
determining each parameter separately in different networks and determining all parameters
simultaneously. In the first case each network has only one output node, while in the latter case the network has multiple outputs. If the parameters ($T_{\rm eff}$, $\log g$, etc.) were independent of each other, one could train a
network for each parameter separately. However, we know that the
stellar parameters influence a stellar energy distribution
simultaneously, at least for certain parameter ranges (e.g. hot stars show metal lines less clearly than cool stars).
Also, for specific spectral features, changes in the chemical composition [M/H] can sometimes mimic gravity effects (see for example Gray 1992). Varying
extinction can cause changes in the slope of a stellar energy distribution
which are similar to those resulting from a different temperature.
sample | temperature range | number of DISPIs (without/with ext.)
L1 | $2000\,{\rm K} \le T_{\rm eff} < 4000\,{\rm K}$ | 330/2300
L2 | $4000\,{\rm K} \le T_{\rm eff} < 6000\,{\rm K}$ | 570/3980
L3 | $6000\,{\rm K} \le T_{\rm eff} < 8000\,{\rm K}$ | 500/3500
M1 | $8000\,{\rm K} \le T_{\rm eff} < 10\,000\,{\rm K}$ | 400/2800
M2 | $10\,000\,{\rm K} \le T_{\rm eff} < 12\,000\,{\rm K}$ | 180/1200
H1 | $12\,000\,{\rm K} \le T_{\rm eff} < 20\,000\,{\rm K}$ | 390/2700
H2 | $T_{\rm eff} \ge 20\,000\,{\rm K}$ | 450/3100
Recently, Snider et al. (2001) determined stellar parameters for low-metallicity stars from stellar spectra (wavelength range from 3800 to 4500 Å). They reported better classification results when training networks on each parameter separately. We tested several network topologies with the number of output nodes ranging from 1 to 3 (in the case of extinction, from 1 to 4) in different combinations of the parameters. It was found that single-output networks did not improve the results. We therefore classified all parameters simultaneously.
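As an illustration, both setups can be sketched in a few lines, here with scikit-learn's MLPRegressor standing in for the networks actually used; the data arrays are random placeholders.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 30))      # placeholder DISPI inputs
Y = rng.normal(size=(400, 3))       # placeholder targets: Teff, log g, [M/H]

# One network with three output nodes (all parameters simultaneously):
joint = MLPRegressor(hidden_layer_sizes=(12,), max_iter=2000,
                     random_state=0).fit(X, Y)

# Three single-output networks, one per parameter:
separate = [MLPRegressor(hidden_layer_sizes=(12,), max_iter=2000,
                         random_state=0).fit(X, Y[:, i])
            for i in range(3)]
# The two approaches can then be compared via their residuals
# on held-out application data.
```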