The eye is extremely good at finding patterns and comparing shapes. In this first paragraph, we shall visually analyze the color histograms and CPFs. Of course, this analysis is only qualitative, and no claim is made regarding the significance of these descriptions. They are meant to draw the reader's attention to features that might eventually become significant, or that may disappear when more data become available. In the next section, we will reconsider these comparisons with the cold (and less imaginative) eye of statistical tests.
In this section, we will apply statistical tools to the available dataset in order to shed some light on the similarities and differences between the classes of objects.
The problem at hand is to compare samples of 1D continuous distributions (colors, e.g. V-R) in order to decide whether they are statistically compatible. We will consider the MBOSS classes two by two. For that purpose, we shall use the t-test, the f-test, and the Kolmogorov-Smirnov (KS) test, which are described in more detail in Appendix B; each of them produces a probability Prob. Low values of Prob indicate that the distributions are statistically incompatible, but larger values can only be interpreted as stating that the distributions are not incompatible, not that they are equal; this is also discussed in more detail in Appendix B.
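As an illustration of the "two by two" comparison, the following minimal sketch enumerates every unordered pair of classes and applies a comparison function to each. The class names, the V-R values, and the helper names (`compare_all_pairs`, `mean_difference`) are hypothetical and invented for this example; the placeholder comparison is a simple difference of means, not one of the tests of Appendix B.

```python
from itertools import combinations
from statistics import mean


def compare_all_pairs(samples, compare):
    """Apply compare(a, b) to every unordered pair of classes.

    `samples` maps a class name to a list of color values.
    Returns a dict keyed by (name_a, name_b), with names in sorted order.
    """
    return {
        (name_a, name_b): compare(samples[name_a], samples[name_b])
        for name_a, name_b in combinations(sorted(samples), 2)
    }


def mean_difference(a, b):
    """Placeholder comparison: difference of the sample means."""
    return mean(a) - mean(b)


# Hypothetical V-R values, invented for illustration only.
colors = {
    "class_A": [0.45, 0.52, 0.61, 0.58],
    "class_B": [0.60, 0.66, 0.71],
    "class_C": [0.40, 0.47, 0.55, 0.49, 0.51],
}

results = compare_all_pairs(colors, mean_difference)
for pair, value in results.items():
    print(pair, round(value, 3))
```

Any function taking two samples can be passed as `compare`, so the same loop could serve for each of the three tests.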
In order to have a known reference for comparison when studying the real MBOSS populations, we introduced two pairs of artificial subsets of the objects. They are defined as follows:
Table 3 lists the mean colors of the different classes. The color of an object is a function of the nature of its surface and of the reddening and resurfacing it has experienced. For a given population, the mean color therefore provides information on the equilibrium reached between the reddening due to aging and the different resurfacing processes.
The question we address in this section is whether the mean colors of the different classes are significantly different. The traditional way to compare the means of distributions is to use Student's t-test; the implementation used for this work is described in Appendix B.2.1. The values of t and Prob are listed in Table C.3; the results for the artificial classes are displayed in Table C.4.
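To make the quantities t and Prob concrete, here is a minimal sketch of a two-sample Student's t-test. It assumes the classical pooled-variance form (both samples sharing the same true variance) and obtains the two-sided probability by numerically integrating the Student density with Simpson's rule; whether this matches the implementation of Appendix B.2.1 in every detail is an assumption. The function names and the sample values are hypothetical.

```python
import math
from statistics import mean, variance


def t_pdf(x, dof):
    """Probability density of Student's t distribution with `dof` degrees of freedom."""
    log_norm = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
                - 0.5 * math.log(dof * math.pi))
    return math.exp(log_norm - (dof + 1) / 2 * math.log1p(x * x / dof))


def two_sided_t_prob(t, dof, steps=2000):
    """P(|T| > |t|), via Simpson's rule on [0, |t|] (steps must be even).

    Accurate enough for illustration; very small probabilities lose
    precision because they are computed as 1 minus the central area.
    """
    a = abs(t)
    if a == 0.0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, dof) + t_pdf(a, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    central_area = 2.0 * total * h / 3.0  # the density is symmetric about 0
    return max(0.0, 1.0 - central_area)


def student_t_test(a, b):
    """Pooled-variance Student's t-test; returns (t, dof, prob)."""
    n1, n2 = len(a), len(b)
    dof = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / dof
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))
    return t, dof, two_sided_t_prob(t, dof)


# Hypothetical V-R values, invented for illustration only.
sample_1 = [0.45, 0.52, 0.61, 0.58, 0.49]
sample_2 = [0.60, 0.66, 0.71, 0.63]
t, dof, prob = student_t_test(sample_1, sample_2)
print(f"t = {t:.3f}, dof = {dof}, Prob = {prob:.4f}")
```

A small Prob indicates that the two means are unlikely to be drawn from the same parent mean. If the variances of the two samples differ markedly, the unequal-variance (Welch) form of the test would be the more appropriate choice.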
The variance of the color distribution contains some information on the diversity of the population, and on the range covered by the reddening and resurfacing processes. For instance, one could expect that, although reaching a different mean equilibrium, the aging, the collisions and the cometary activity broaden the color distribution in a similar way, ranging from bluish, fresh ice to deep red, undisturbed, aged surfaces.
In this section, we will determine whether the variances of the color distributions are significantly different (independently of their means, which can be either similar or different). This is quantified using the f-test, described in Appendix B.2.2. The values of F and Prob are listed in Table C.5.
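As an illustration of how F and Prob can be obtained, the sketch below implements a standard two-sided variance-ratio test. F is taken as the ratio of the larger to the smaller sample variance, and the probability follows from the regularized incomplete beta function, evaluated here with a textbook continued-fraction expansion. This is an assumed, generic formulation, not necessarily the one of Appendix B.2.2; the function names and the data are hypothetical.

```python
import math
from statistics import variance


def _beta_cf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_test(a, b):
    """Two-sided F-test; returns (F, prob). Assumes both variances are nonzero."""
    var_a, var_b = variance(a), variance(b)
    if var_a >= var_b:
        f, dof_num, dof_den = var_a / var_b, len(a) - 1, len(b) - 1
    else:
        f, dof_num, dof_den = var_b / var_a, len(b) - 1, len(a) - 1
    x = dof_den / (dof_den + dof_num * f)
    prob = 2.0 * reg_inc_beta(0.5 * dof_den, 0.5 * dof_num, x)
    if prob > 1.0:
        prob = 2.0 - prob
    return f, prob


# Hypothetical V-R values, invented for illustration only.
narrow = [0.55, 0.57, 0.56, 0.58, 0.54]
broad = [0.35, 0.72, 0.50, 0.64, 0.41]
f_value, f_prob = f_test(narrow, broad)
print(f"F = {f_value:.2f}, Prob = {f_prob:.4f}")
```

Because the larger variance is always placed in the numerator, F is at least 1 and the result does not depend on the order in which the two samples are passed.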
Obviously, the information carried by a distribution is not entirely contained in its first two moments (mean and variance). A more complete comparison of the color distributions is therefore of interest. The ideal statistical tool for this purpose is the KS test (described in Appendix B.2.3), in which the two samples are compared through their complete Cumulative Probability Function (CPF). The values of d and the associated probability Prob are listed in Table C.6 for the real classes of objects. Those for the test classes are available only electronically.
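The comparison of two CPFs can be sketched as follows. The statistic d is the largest absolute difference between the two empirical cumulative functions; the probability is estimated here with the usual asymptotic Kolmogorov series, together with a common small-sample correction to the effective sample size. Both the approximation and the function names are assumptions made for this illustration and may differ from the procedure of Appendix B.2.3; the data are hypothetical.

```python
import math


def ks_statistic(a, b):
    """Maximum absolute difference between the two empirical CPFs."""
    xs, ys = sorted(a), sorted(b)
    n1, n2 = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n1 and j < n2:
        value = min(xs[i], ys[j])
        # Step both CPFs past every point equal to `value` (handles ties).
        while i < n1 and xs[i] == value:
            i += 1
        while j < n2 and ys[j] == value:
            j += 1
        d = max(d, abs(i / n1 - j / n2))
    return d


def kolmogorov_q(lam, max_terms=100):
    """Asymptotic series Q(lam) = 2 * sum_{j>=1} (-1)^(j-1) exp(-2 j^2 lam^2)."""
    total = 0.0
    sign = 1.0
    previous = 0.0
    for j in range(1, max_terms + 1):
        term = sign * 2.0 * math.exp(-2.0 * j * j * lam * lam)
        total += term
        if abs(term) <= 1e-3 * previous or abs(term) <= 1e-8 * total:
            return min(1.0, max(0.0, total))
        sign = -sign
        previous = abs(term)
    return 1.0  # the series did not converge: lam is tiny, so Prob is ~1


def ks_test(a, b):
    """Two-sample KS test; returns (d, prob) using the asymptotic approximation."""
    d = ks_statistic(a, b)
    n_eff = math.sqrt(len(a) * len(b) / (len(a) + len(b)))
    lam = (n_eff + 0.12 + 0.11 / n_eff) * d
    return d, kolmogorov_q(lam)


# Hypothetical V-R values, invented for illustration only.
sample_1 = [0.45, 0.52, 0.61, 0.58, 0.49, 0.55]
sample_2 = [0.60, 0.66, 0.71, 0.63, 0.57]
d, ks_prob = ks_test(sample_1, sample_2)
print(f"d = {d:.3f}, Prob = {ks_prob:.4f}")
```

The asymptotic probability becomes less reliable for very small samples, so the resulting Prob should be read with the same caution as the other tests when a class contains only a few objects.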
Copyright ESO 2002