Updated sunspot group number reconstruction for 1749–1996 using the active day fraction method^{⋆}
^{1} Department of Physics, University of Helsinki, 00014 Helsinki, Finland
^{2} Space Climate Research Unit, University of Oulu, 90014 Oulu, Finland
email: Ilya.Usoskin@oulu.fi
^{3} Sodankylä Geophysical Observatory, University of Oulu, 90014 Oulu, Finland
^{4} Ioffe PhysicalTechnical Institute, 194021 St. Petersburg, Russia
Received: 4 October 2016
Accepted: 6 March 2017
Aims. Sunspot number series are composed from observations of hundreds of different observers that require careful normalization to standard conditions. Here we present a new normalized series of the number of sunspot groups for the period 1749–1996.
Methods. The reconstruction is based on the active day fraction (ADF) method, which is slightly updated with respect to previous works, and a revised database of sunspot group observations.
Results. Stability of some key solar observers has been evaluated against the composite series. The Royal Greenwich Observatory dataset appears relatively stable since the 1890s but is approximately 10% too low before that. A declining trend of 10–15% in the quality of Wolfer’s observations is found between the 1880s and 1920s, suggesting that using him as the reference observer may lead to additional uncertainties. Wolf (small telescope) appears relatively stable between the 1860s and 1890s, without any obvious trend. The new reconstruction reflects the centennial variability of solar activity as evaluated using the singular spectrum analysis method. It depicts a highly significant feature of the modern grand maximum of solar activity in the second half of the 20th century, being a factor 1.33–1.77 higher than during the 18 and 19th centuries.
Conclusions. The new series of the sunspot group numbers with monthly and annual resolution is provided forming a basis for new studies of the solar variability and solar dynamo for the last 250 yr.
Key words: Sun: activity / sunspots
Monthly values of the reconstructed sunspot are available at the CDS via anonymous ftp to cdsarc.ustrasbg.fr (130.79.128.5) or via http://cdsarc.ustrasbg.fr/vizbin/qcat?J/A+A/601/A109
© ESO, 2017
1. Introduction
Sunspots, the dark spots on the Sun, are easily observed even with a basic optical instrument, and this was done by many generations of professional and amateur astronomers throughout the centuries. As a result, the sunspot number forms the longest systematic scientific series used as a quantitative index of the level of variable solar activity (Hathaway 2015). Because of its length, the sunspot number series includes records from hundreds of observers with different optical instruments, measuring/recording techniques, habits, etc. This unavoidably calls for a need to reduce the data of individual observers to a “standard observer”, which implies not only a person but material/instrument and conditions. Because of this, the sunspot number is a relative number.
The first consistent sunspot number series was produced by Rudolf Wolf of Zürich who calibrated different observers to his own observational conditions. To reduce data from different observers, Wolf used a simple linear scaling of sunspot counts from each observer to standard observers (the socalled k −factors). This is often referred to as daisychaining, especially when the number of standard observers is large. Later, the Wolf sunspot number (WSN) series was continued as the international sunspot number series (ISN) at the Royal Observatory of Belgium and the Solar Influences Data Center, Sunspot Index and Longterm Solar Observations (SILSO)^{1}). However, several inhomogeneities have been found in the WSN/ISN, and an updated ISN series (version 2, denoted as ISN_v2) had been released (Clette et al. 2014). It is important to notice that ISN_v2 uses Adolf Wolfer, not Rudolf Wolf, as a “standard observer” leading to the higher (by a factor of 1.667) overall ISN values versus the “classical” WSN/ISN datasets. The ISN_v2 still uses the k −factor methodology for calibration of different observers. We also note that the original raw data for the WSN series are not available in a digital format, making a full revision of this series currently impossible, although a progress in this direction is on its way (Friedli 2016).
The WSN/ISN series is based on the counts of both sunspot groups and individual sunspots, with the former being weighted with a factor of ten: (1)where G and S are the numbers of sunspot groups and individual sunspots, respectively, and k is a correction factor, characterizing each observer. However, resolving individual spots may be imprecise with poor instrumentations, and a new series, based only on sunspot groups, was proposed, called the group sunspot number, GSN (Hoyt et al. 1994; Hoyt & Schatten 1998). The GSN is more robust than WSN regarding observational conditions (e.g., Usoskin 2017). There is still a potential problem related to the grouping of individual spots, which may have been considered by earlier observers in a different manner to that currently accepted (Clette et al. 2014). This uncertainty is related to both WSN/ISN and GSN but can be fixed by redefining groups in historical sunspot drawings (Arlt et al. 2013). The GSN series produced by Hoyt & Schatten (1998) also uses the linear scaling and daisychaining method to reduce different data to the same reference observer, for which the RGO was chosen. The GSN is constantly scaled up by a factor 12.08 to make it comparable with the WSN series. The main advantage of the GSN series is that Hoyt & Schatten (1998) had collected and published the original database of raw data, including all the records of individual observers. This makes it possible to revise the entire series if needed. Since some corrections and additions have been recently made to this dataset, a revised database of the sunspot group numbers, separate for each observer, is published (Vaquero et al. 2016, referred to as V16 hereafter). The GSN series was revised by Svalgaard & Schatten (2016) who performed a full recalibration of the observers using a modified daisychaining method with a reduced number of links (the “backbone” method). The revised backbone GSN series suggests that the level of solar activity was relatively high in the 18th and 19th centuries, much higher than that implied by the original GSN series by Hoyt & Schatten (1998) and by WSN.
Thus, all the earlier series were based on the parametric k −factor calibration method (daisychain, linear scaling). However, it has been shown recently (Lockwood et al. 2016a,b; Usoskin et al. 2016a) that the linear k −factor methodology may be inaccurate when applied to sunspot numbers and needs to be replaced by a modern nonparametric method. Two such methods have been proposed recently: the activeday fraction (ADF) method (Usoskin et al. 2016b, referred to as U16 hereafter) and the method based on the ratio of the number of individual sunspots to that of sunspot groups (Friedli 2016). Both these methods use absolute calibration of observers to the standard one and are thus free of daisychaining and arbitrary choices.
Here we provide a new sunspot number reconstruction using the ADF method originally introduced by Usoskin et al. (2016b), the revised and corrected dataset of the sunspot groups (Vaquero et al. 2016), a larger set of observers, and a slight refining of the calibration method and estimate of its uncertainties.
2. Data
2.1. The reference dataset
As the reference dataset we used, similarly to U16, the database^{2} of sunspot groups of the RGO. RGO data is available from 1874 onwards, but the early part of the database may suffer from unstable quality. Which period may be affected by this is still debated Cliver & Ling (Cliver & Ling; see Sect. 5), but it is conservatively considered that the series is fairly homogeneous since 1900 (Clette et al. 2014; Usoskin et al. 2016b). Accordingly, we use the RGO data for the period 1900–1976 as the reference dataset to calibrate observers, but the whole RGO dataset (1874–1976) is included in this work as an observer (see below). This period includes seven complete solar cycles, # 14 through 20. As the group size we used the observed (uncorrected for foreshortening, viz. as observed from Earth) umbral area of the sunspot groups in units of msd (millionths of the solar disk). We have tested the robustness of the results against the exact length of the reference dataset (Sect. 5.1).
2.2. Observers
Here we considered major observers with long records of sunspot data covering the periods of the 18th through 20th centuries. We used the same set of observers as in U16, except for Stark, whose reliability is unsettled (Hoyt & Schatten 1998). Further, we added 11 more observers for the 20th century, extending the database up to 1996, which is comparable with the GSN (Hoyt & Schatten 1998). As data for individual observers, we used the daily number of sunspot groups collected by Vaquero et al. (2016). This database^{3} is based on the initial data of sunspot group records gathered by Hoyt & Schatten (1998) but includes important updates and corrections.
Data for Schwabe were taken from a recent compilation by Arlt et al. (2013)^{4} based on digitized drawings and notes of Schwabe. In this compilation, sunspots were regrouped using modern definition of sunspot groups which is different from the original Schwabe grouping (Senthamizh Pavai et al. 2015).
All the observers used in this study are listed in Table 1. Their data coverage is shown in Fig. 1.
Observers used in this work.
Fig. 1 Observational periods of the observers used in this study (see also Table 1). The reference dataset of RGO is shown in orange. We note the periods used for calibration of the observers may be shorter than the total observational periods shown here. 

Open with DEXTER 
3. Calibration method
3.1. Calibration of observers
Each observer has been calibrated to the reference RGO dataset, following the ADF method invented by U16. The ADF is the ratio of active days (with at least one sunspot group reported) to the total number of observational days per month. The method is based on comparing the ADF statistic of an observer to be calibrated with that of the reference dataset. The fraction of active days within a month is a robust indicator of solar activity around solar minima and makes it possible to calibrate different observers (Harvey & White 1999; Kovaltsov et al. 2004; Vaquero et al. 2012; Vaquero et al. 2015). Here we have slightly refined the original method, in particular in the part related to the compilation of monthly values (see Sect. 4). We have also revised the indirect calibration of Staudacher (see Sect. 3.4).
The ADF method used here is slightly modified with respect to the original one (U16) in the following:

When computing the ADF for individual observers, we did notapply here the limitation of considering only months with thenumber of observational days n ≥ 3, applied in U16. This has almost no effect for observers with sufficiently high observation frequency, in particular in the 19th and 20th centuries, but may distort the statistic for observers with low data coverage and uneven distribution of observational days. Accordingly, we have applied this limitation for Derfflinger and Hershel whose data coverage fraction was 11% and 5%, respectively.

When constructing the conversion matrix (Sect. 3.3), we accounted for the uncertainties in the definition of the observational threshold S_{s}, while only the mean S_{s} values were used by U16.

Data of Staudacher were calibrated differently here (see Sect. 3.4).

When calculating the monthly mean G −values and their uncertainties from daily values we used here a Monte Carlo method, while a weighted averaging method was used by U16.
The effect of these improvements are discussed in Sect. 6.1.
3.2. Assessment of the observer’s quality
The calibration method is based on the idea that the “quality” of each observer is characterized by his/her observational acuity, or an observational threshold area S_{S}. The threshold implies that the observer can see and report all the groups with the area larger than S_{S}, while missing all smaller groups. Here we assume that the observational threshold is constant for an observer during the entire period of his/her observations but a time variability of the acuity can be considered in subsequent works. In Sect. 5, we discuss this issue in more detail. The reference dataset of RGO is assumed to be “perfect” in the sense that RGO does not miss any spots (viz. S_{S} = 0).
Similarly to U16, we first made “calibration” curves using the reference dataset. As a calibration curve, we used the cumulative distribution function (CDF) of the occurrence, in the reference dataset, of months with the given ADF. A family of such curves was produced for different values of S_{S} (all sunspot groups with an area smaller than that were considered as not observed). Thus, each calibration curve uniquely corresponds to a value of the observational threshold S_{S}. Calibration curves were produced for different values of the filling factor f (the ratio of the number of days with reported observations, including nospot observation, to the total number of days during the observation/calibration period), by randomly removing (1−f) fraction of daily values from the RGO reference dataset to simulate a realistic observer. We performed 100 such random subsamplings and calculated the mean and the asymmetric twotail 68% confidence intervals for each case.
Fig. 2 Cumulative distribution function of the reference dataset for S_{S} = 25 msd and the filling factor f = 0.92 (gray line with 1σ uncertainties) compared to that for Quimby (dots with error bars). 

Open with DEXTER 
For each observer, we constructed a CDF curve using his/her observations during the calibration period. The ADF and the subsequent CDF were calculated for all months with observations; not only months with three or more observational days, as was done in U16. This limitation was, however, applied to Derfflinger and Hershel. An example of the CDF curve for observer Quimby is shown in Fig. 2. It is important that solar activity during the calibration period roughly corresponds to that in the reference dataset (U16). The reference dataset covers a wide range of solar cycles, from moderate cycle 14 to the very high cycle 19, but weak cycles are not presented there, thus this method cannot work reliably for the periods of grand minima such as the Maunder or Dalton minimum. Accordingly, we selected for calibration of each observer periods with a relatively good coverage by the data and covering full solar cycles (except for the case of Schubert – see U16). If the observer had a sufficiently long period of direct overlap with the reference dataset, the period of overlap was used for calibration. In these cases, the reference calibration curves were also calculated for the same overlap period. The list of the selected observers and their calibration periods is presented in Table 1.
The observational threshold for each observer was defined by fitting the family of the calibration curves to the actual CDF curve of this observer, as shown in Fig. 2. The bestfit value of S_{S} and its 68% (± 1σ) confidence interval were defined by the χ^{2} method. The minimum value corresponds to the bestfit estimate of the observational threshold, while the values of S_{S} corresponding to bound the 68% confidence interval.
The values of the acuity observational threshold S_{S} are shown along with the 68% confidence intervals in the last column of Table 1 for each observer. One can see that it varies from very small numbers around zero for good observers up to 60–70 msd for poorer observers. In the cases of the National Astronomical Observatory of Japan (NAOJ) and Space Environment Laboratory (SEL), we found that their quality is better than that of RGO, that is, a formally negative threshold would have been obtained in the calibration. Since the negative threshold cannot be defined for the reference dataset, we further consider no threshold for them, assuming them to be of the same observational quality as RGO. The negative threshold would lead to a slight overestimate (1–2%) of the values of the final G −series during the second half of the 20th century.
3.3. Correction matrix
Once the observational threshold S_{S} has been defined for an observer, a correction matrix can be constructed in the following way. From the entire reference dataset, a distribution of the daily values of G_{ref} (the number of sunspot groups of all sizes in the reference dataset for a given day) as a function of G_{S} (the number of sunspot groups with the size ≥S_{s} for the same day) is constructed and normalized to unity in the “vertical” direction so that it gives a probability to observe the “true” number of groups G_{ref} for a day when the observer reported G_{S} groups. In order to account for the uncertainties of the defined value S_{S}, the matrix was constructed not only for the bestfit values (as done by U16) but averaging matrices for all the (integer) values of S_{S} from the corresponding 68% confidence interval. By construction, G_{ref} ≥ G_{S}. An example of the correction matrix is shown in Fig. 3 for Quimby.
Fig. 3 Panel a): correction matrix for Quimby giving the probability of finding the value of G_{ref} in the reference dataset on a day when G_{Quimby} sunspot groups were reported by Quimby. Panel b): pdf (probability density function) of G_{ref} to be found in the reference dataset for days with G_{Quimby} = 10 (the crosssection of panel a at the blue dashed line). 

Open with DEXTER 
Such matrices are further used to correct the actual observation with a given threshold value to the reference level.
3.4. Calibration of Staudacher
The only observer who cannot be calibrated directly using the ADF methods is Staudacher since he did not report spotless days properly (see U16 for more detail). On the other hand, Staudacher was a key observer covering the second half of the 18th century (although with sparse observations), and it is important to consider him for that period. Since the data from Staudacher overlaps with observations of two other observers, Horrebow and Schubert, who can be calibrated using the ADF method, we have performed an indirect correction of the Staudacher data via Horrebow and Schubert.
For all the days with reported observations of Staudacher, we checked Schubert’s and Horrebow’s data for observations on the same day. If none was found, we checked for observations on the previous day and, if none was found, on the next day. If no observations of Schubert or Horrebow were found on the neighboring days, we checked for the available data two days before, and finally two days after the day with Staudacher’s observation. We have checked that each Staudacher’s observation was used not more than once in the analysis. Although using the observations from 1–2 neighboring days may introduce a small uncertainty due to shortliving small groups (e.g., Willis et al. 2016), this is outweighed by the improvement of statistics. We found 138 days, when observations of Staudacher coincided with data from either Schubert or Horrebow, 120 days when the observations were separated by one day, and 44 cases when they were separated by two days, leading to 302 days altogether for calibration.
Next, for all daily values of Staudacher G_{Stau} from the subsample described above, we collected the corresponding daily values of G^{∗} for Schubert or Horrebow, already normalized to the reference level using the ADF method. As a result, we composed a correction matrix (shown in Fig. 4), which allows us to convert the number of sunspot groups reported by Staudacher to the reference observer, in the same way as used for other observers.
Fig. 4 Correction matrix of Staudacher giving the probability of finding the value of G_{ref} in the reference dataset on a day when G_{Staudacher} groups were reported by Staudacher. 

Open with DEXTER 
4. Construction of the composite series
4.1. Daily values
Using the correction matrices, we first calculated PDFs of the corrected (to the reference observer) G −values for each observer and day. An example of such PDFs is shown in Fig. 5b for the day of 19Feb.1869 for three observers whose records are available for this day: Wolf, Schmidt, and Weber (colored lines in the figure). Next, we made a sum of all the available individual PDFs for the day and renormalized it again to the unity. This makes a composite PDF of all the observers for this day (gray bars in Fig. 5b). Such composite daily PDFs of the corrected G −values were made for all days (see an example for the month of February 1869 in Fig. 5a). This dataset (available from the authors upon request) makes a basis for further computations of the monthly time series.
Fig. 5 Panel a): distribution of PDF (grayscale on the right) for corrected number of sunspot groups for February 1869. Panel b): PDF of the corrected G values for the day of 19Feb.1869 (crosssection of panel “a)” at the dashed vertical line). The composite PDF is shown by gray bars, while PDF distributions for Wolf, Schmidt, and Weber are shown as blue, red, and green lines, respectively. 

Open with DEXTER 
4.2. Monthly series
Using the daily PDF series discussed above, we constructed the monthly corrected number of sunspot groups using a Monte Carlo method. Within each month we considered all days with available observations. For each such day, we randomly took G −value corresponding to the PDF distribution (an example is shown in Fig. 5b), and then computed the corresponding monthly mean G −value as the arithmetic mean. This procedure was repeated 1000 times so that an ensemble of 1000 monthly G values was obtained. From this ensemble, we calculated the mean and the bounds of the (asymmetric) 68% twoside confidence interval (corresponding to the generally asymmetric ±1σ interval) for the monthly G −value. For the example shown in Fig. 5a (February 1869) the monthly number of sunspot groups was found to be .
There is an uncertainty related to calculation of monthly values from a small number of sparse daily observations, when a simple arithmetic average tends to overestimate (by up to 20–25%) the number of sunspot groups for periods of high activity if the number of daily observations per month is smaller than three (Usoskin et al. 2003). This may affect the values for the 18th century and Dalton minimum where data coverage was low, giving the numbers presented here as a conservative upper limit. However, this effect does not influence the calibration and correction procedure since they operate with daily data.
The final composite series is shown in Fig. 6 and is available in digital format at CDS.
Fig. 6 Monthly values of the final composite series of number of sunspot groups. Error bars (68% twoside confidence intervals) are shown in gray. This series is available at CDS. 

Open with DEXTER 
4.3. Annual series
From the monthly values, we computed the annual mean G −values and their uncertainties using the Monte Carlo method (with 1000 ensemble members) similar to that applied to compute monthly values from daily ones. The resulting series of the annual numbers of sunspot groups is shown in Fig. 7a as the black curve with the 68% confidence intervals shown in gray, and in Table A.1.
Fig. 7 Panel a): annual mean numbers of sunspot groups. The black line with gray shading depicts the result of this work with the 68% confidence interval. Numerical values are given in Table A.1. Other colored curves with symbols show reconstructions of G by H98 (Hoyt & Schatten 1998), S16 (Svalgaard & Schatten 2016), and U16 (Usoskin et al. 2016b). Panel b): ratio between the colored plots (shown in panel “a)” and following the same notation) to the result of this work. The ratio is not shown for years with low activity (G < 3). 

Open with DEXTER 
5. Consistency of the result
Fig. 8 Monthly series of sunspot group numbers by individual observers. 

Open with DEXTER 
First we computed the monthly series of G −values for each observer using the same method as described in Sect. 4.2 but applying it to data of only this observer (viz. without construction of the composite series). The resulting series are shown in Fig. 8. One can see that there is a good agreement between different observers, especially in the 19th and 20th centuries. The agreement is worse around the Dalton minimum, when the reconstructions based on data of Herschel and Derfflinger diverge, suggesting that the level of solar activity during that period is relatively uncertain. On the other hand, we stress that since the ADF method is free of daisychaining and based on a direct calibration of the observers to the reference one, the big uncertainty around the Dalton minimum does not affect other periods, even before it.
It is difficult to judge the stability of observers and their calibration from simply overplotting the series as done in Fig. 8. We also studied, as the measure of the observer’s stability, the ratio of the G −values (annually averaged to avoid noisy data), obtained using only data from this observer, to that of the composite series constructed using all but this observer’s data. In order to avoid the ratio of small numbers, we excluded years when the mean number of sunspot groups G was below three.
5.1. RGO data
As an example, Fig. 9 presents the ratio of the annual G −values only from RGO to that of the composite series without RGO. One can see that the ratio is close to unity around solar cycle maxima, as expected if the calibration was done correctly, always being within ± 20%, but there are some systematic features. The ratio is systematically too low for the first cycle covered by RGO data, before ca. 1890. This implies that the RGO data underestimate the number of sunspot groups by approximately 10% during that period. This inhomogeneity in the earlier years of the RGO dataset is relatively well known (see, e.g., Clette et al. 2014), but its exact extent is still debated (Cliver & Ling 2016). Most studies (Sarychev & Roshchina 2009; Carrasco et al. 2013; Aparicio et al. 2014; Willis et al. 2016) limit the effect of undercounts to the period before ca. 1885, which is likely related to the secondary magnifier installed at Greenwich in 1884 (Cliver & Ling 2016). However, Cliver & Ling (2016) claim that the inhomogeneity might have extended until ca. 1915. Clette et al. (2014) stated that the RGO data is homogeneous at least since 1900. Our result confirms that the RGO data suffers from the inhomogeneity (10–15% undercount of sunspot groups) only before 1890, while the ratio during the period 1890–1910 is around unity and fully consistent with that for the period after 1930. Thus, our choice of the reference period 1900–1976 is safe from this point of view. We note that while this result is consistent with others (Sarychev & Roshchina 2009; Carrasco et al. 2013; Aparicio et al. 2014; Willis et al. 2016), it differs from that of (Cliver et al. 2015; Cliver & Ling 2016) who proposed a smooth parabolic “learning curve” of the RGO before 1920.
Fig. 9 Ratio between annual mean G −values obtained using only RGO data to those from the composite series computed without RGO. Ratios for the years with low activity (G < 3) are not shown. Error bars depict the 68% confidence interval for the ratio. Blue stars correspond to the years of official solar cycle minima. 

Open with DEXTER 
There is another interesting feature in Fig. 9 related to a bump during 1910–1930, when the RGO ratio is approximately 10–15% higher than unity, suggesting that the RGO was counting more sunspot groups than other (normalized) observers. Although it may not be excluded that it is not RGO showing higher values but other observers degrading in quality during that period (see an example of Wolfer below), the number of other observers during these years was five (Fig. 1), and it is unlikely that they degraded simultaneously. We note that this period was characterized by the change of the observers generations – Wolfer, Quimby, Broger and Guillaume ceased their observations, while Luft, Brunner, and, later, the Madrid Observatory started.
The period after 1930 is characterized by the ratio around unity, implying a good consistency in the RGO data series. Thus, the RGO series depicts a fair stability and is suitable to be the reference dataset, especially after 1890.
We have also tested the stability of the results versus the exact choice of the reference dataset period. While the main reconstruction is based on the RGO period 1900–1976, we have checked other periods as well. The use of the full RGO dataset 1878–1976 as the reference period leads to a systematic decrease of the S_{s} values by ≈5 msd in comparison with the values shown in Table 1 for the calibrated observers. The final result in this case appears very close to the present one, with slightly lower G −values, within the error bars. The use of the RGO dataset for the period 1913–1976, which was stable according to Cliver & Ling (2016), leads to slightly poorer statistics and a systematic increase of the S_{s} values by 5–10 msd. The final series based on this reference period yields slightly higher G −values but still in agreement with the main result within error bars. We have also tested the effect of removing the “bump” period of 1913–1933 (discussed above) from the reference period. It appears similar to the previous case, that is, an increase in the S_{s} values by 5–10 msd and the final series consistent with the main result within error bars. We also determined that shrinking the reference period even further to 1933–1976 completely smears the result for two reasons. First, the statistic is low; only four solar cycles. But even more important is the fact that the cycles after the 1940s were very active, not being representative of the normal level of solar activity, which is the basic condition of the ADF method. For instance, there was not a single month with ADF = 0 during the period of 1933–1976. This leads to a formally very strong offset of the obtained S_{s} values being 25–40 msd greater than those in Table 1 and consequently to unrealistically high G −values. Accordingly, we conclude that the method is robust against the exact choice of the start of the reference period in a wide range, from 1878 till 1913, but the use of only high solar cycles leads to a violation of the basic assumption of the ADF method.
5.2. Wolfer
We performed a similar analysis of the stability also for Wolfer, who is the reference observer for the ISN_v2 series (Clette et al. 2014) and the primary “backbone” observer for the S16 series. The result is shown in Fig. 10. One can see that there is a clear trend implying that the quality of Wolfer as an observer was slightly degrading in time, meaning that he observed 10% more groups than others in the 1880s but 5% less than others in the 1910s. Thus, we can conclude that the ability of Wolfer to detect sunspot groups was slightly degrading, by 10–15%, through his scientific life, with respect to other observers. We have checked that this trend is not caused by the putative drift in the RGO data (Cliver & Ling 2016) by excluding the RGO dataset from the denominator of the ratio shown in Fig. 10. The trends remains qualitatively unaltered.
Fig. 10 Same as in Fig. 9 but for Wolfer. 

Open with DEXTER 
5.3. Wolf
Figure 11 shows the ratios for Wolf, who is the reference observer for the WSN and ISN_v1 series. One can see that there is a clear enhancement in the beginning of the series around 1850, when Wolf counted approximately 30% more groups relative to other observers. This is likely related to the use of another (larger) telescope by Wolf. However, since circa 1860, his quality is around unity implying a fair stability, within ±20%. Interestingly, for the period of overlap with Wolfer after 1880, the ratio for Wolf depicts a downward trend, which was interpreted by many as a sign of degradation of Wolf’s eyesight (e.g., Clette et al. 2014). However, it may not be the case since the ratio during the period 1880–1895 is fully consistent with that during the period of 1857–1875.
Fig. 11 Same as in Fig. 9 but for Wolf. 

Open with DEXTER 
5.4. Other observers
We performed a similar analysis for other observers (not shown) and found no specific features to be mentioned. We note that this method of using the ratio only works if the number of overlapping observers is high enough. Accordingly, when the number of regular observers is less than four, it becomes unclear. Unfortunately, because of this, we cannot evaluate the stability of crucial observers before 1850.
This analysis suggests that for some especially longobserving observers an assumption on the stability of their observational quality may not be exactly valid. However, this assumption makes a basis for all the existing sunspot series. It will be the subject of forthcoming work to assess this issue and to take it into account.
6. Discussion
The final composite series of the number of sunspot groups constructed by the ADF method is shown in Figs. 6 (monthly values) and 7a (annual) along with the 68% confidence intervals.
6.1. Comparison with other series
In Fig. 7, we compare the results of this work with previously published annual values of the number of sunspot groups G: the original GSN series (divided by 12.08 to obtain the values of G) for the period 1610–1995 by Hoyt & Schatten (1998), H98; the “backbone” G −series for 1610–2015 by Svalgaard & Schatten (2016), S16; and the series, also based on ADF method, for the period 1749–1899 by U16. It is important to note that the H98 series is calibrated to the reference RGO series using a k −factor method. The normalization is direct for the period of 1874–1976 covered by the RGO data and includes a daisychain normalization outside this period. The S16 series is based on the “backbone” method, which uses key backbone observers, calibrated to the reference one. The backbone observers were Staudacher, Schwabe, Wolfer and Koyama, who did not directly overlap with one another (see Table 1) and thus can be linked together only via a multistep daisychain procedure of linear normalization by means of k −factors. Wolfer was selected as the reference observer, and thus the S16 series is free of daisychain calibration only for the period 1880–1928, when direct Wolfer data is available. For all other periods, it includes a multistep daisychain normalization. The U16 series uses the RGO dataset for the period 1900–1976 as the reference, but presents data only before 1900. Normalization is performed using the ADF method, which is free of daisy chaining. The present result is also calibrated to the RGO dataset (1900–1976) using the ADF method. For the period 1900–1976, we directly applied the ADF method but used the exact overlap of the observers with the RGO data, while a statistical comparison forms a basis for normalization outside this period. This method is also free of daisy chaining.
The series are overplotted in Fig. 7a, while panel b shows the ratio of individual series G −values to those of the present result for years with the annual number of sunspot groups not smaller than three. Some specific periods can be identified for a detailed discussion.
After 1910, the present result is fully consistent with the H98 series and the ratio is around unity (1.01 ± 0.04). This is understandable since both series are directly calibrated to the RGO dataset during this period, and the quality and quantity of observers was high in the 20th century. Accordingly, the number of groups is most precisely defined for this period. However, the S16 series is systematically lower by approximately 10% (0.90 ± 0.04) in the 20th century. This discrepancy is likely related to the normalization method of Svalgaard & Schatten (2016), which uses Wolfer and Koyama as “backbones” for the 20th century. Since these two observers did not overlap, Svalgaard & Schatten (2016) used crossnormalization, including a multistep daisychain procedure to reduce Koyama backbone data to the reference Wolfer conditions, that may introduce additional uncertainties. We note that the data of Koyama agree with the result of this work (Fig. 8) within 5% for the period 1947–1976 (overlap between Koyama and the RGO reference dataset). Unfortunately, full information of the calibration for this backbone is not available from Svalgaard & Schatten (2016) to investigate this question in full detail.
For the period 1880–1900, the present result is in full agreement with the S16 series (the mean ratio over this period is r = 0.98 ± 0.05), and we consider it as a good sign, since this was the period of Wolfer (the reference observer for the S16 series) observations when no daisychain normalization was applied in the S16 series. On the other hand, the H98 series is too low by 10% (r = 0.89 ± 0.04), probably related to the inhomogeneity in the RGO data series in its earlier part (see Sect. 5.1), as noted by Clette et al. (2014) and Cliver & Ling (2016). The U16 series is slightly lower than the present one but consistent with the unity (0.96 ± 0.05).
The middle of the 19th century (1830–1870) is characterized by a great excess of the S16 series by approximately 30% (r = 1.29 ± 0.08). Keeping in mind that the S16 series agrees with our reconstruction for the period of the Wolfer observations (see above), we may propose that this discrepancy is related to the calibration of the Schwabe backbone to Wolfer via Wolf in the S16 series. As argued recently (Lockwood et al. 2016b; Usoskin et al. 2016b,a), the use of the linear k −factor as a conversion between Wolf and Wolfer data may lead to an overcorrection. The H98 series is on average consistent with the present result (r = 0.96 ± 0.1) but the ratio is inhomogeneous. While it is around unity before 1848, it is systematically lower by 10% (r = 0.89 ± 0.05) after that. This implies that the data by Wolf were likely undercorrected by Hoyt & Schatten (1998) by approximately 10%. The U16 series is insignificantly lower than the present result (r = 0.96 ± 0.07), being generally consistent with it, with the discrepancies related to the slightly updated methodology used here. This difference may be related to the different restrictions to the rare observations applied here and in U16 (see Sect. 3.1) and can serve as an estimate of the corresponding uncertainty.
For the period before the Dalton minimum, the S16 series is slightly higher than the present result (r = 1.1 ± 0.03), but the ratio is inhomogeneous. While the G −values of the S16 series are consistent with our data before 1760, they are approximately 7% higher than those in the 1760–1780s. This suggests that the normalization of Horrebow could be a reason for the discrepancy. The H98 series, on the contrary, is systematically and significantly lower (r = 0.76 ± 0.03) than the present result before the Dalton minimum, suggesting that it may be underestimated for that period. It is important to note that both the S16 and H98 series, based on the daisychain calibration procedure, dramatically lose quality before the Dalton minimum because of a lack of highquality data at the turn of the 18th and 19th centuries. This makes it very difficult, if even possible, to make a “calibration bridge” across the Dalton minimum to relate the observers of the Staudacher era to those of the Schwabe era. Nevertheless, the uncertainties of the daisychain k −factor calibration grow significantly before the Dalton minimum. The U16 series is somewhat lower than the present one for the 1750s–1790s (0.93 ± 0.03), because of the different ways to normalize the data of Staudacher (Sect. 3.4), who was the key observer for that period. Although the ADF method is free of daisychaining, uncertainties are also large (10–15%) for the 18th century (see the shaded areas in Fig. 7b) because of the sparse data. The present series agrees with the U16 one within these uncertainties, although the latter tends to run systematically over the lower bound.
To summarize, the level of sunspot group activity yielded by the new series presented here lies between those for S16 and U16, but significantly higher (by 24%) than that for the H98 series, in the second half of the 18th century before the Dalton minimum. Due to large uncertainties for that period, all the series except H98 are marginally consistent with each other. The new series is consistent with U16 and is marginally consistent with H98 ones during the 19th century, but is significantly lower (by 25–30%) than the S16 one, in particular refuting the high level of activity during the mid19th century suggested by the S16 series. However, the new series agrees with the S16 one during the 1880–1890s, more precisely, during the period of observations by Wolfer who is the reference observer for the S16 series. In the 20th century, when the quality and quantity of observational data was high, the new series is fully consistent with the H98 series but is significantly (10%) higher than the S16 one.
6.2. Centennial variability
Although the sunspot activity is dominated by the 11yr Schwabe cycle, centennial variability is also apparent in the time evolution of the composed series. It is expressed in low (e.g., during the Dalton minimum around 1800) and high (middle of the 20th century) solar cycles. There is no established way to define the centennial variability. Sometimes decadal or cycleaveraged values are used to represent the centennial evolution in consistency with cosmogenic isotope data (Usoskin 2017), or a linear trend over the envelope of solar cycles is considered (Clette et al. 2014). Here we use the nonparametric Singular Spectrum Analysis (SSA; Vautard et al. 1992), which decomposes a time series into several components with distinct temporal behaviors and is very convenient to disentangle longterm trends and quasiperiodic oscillations. The method is based on the KarhunenLoeve spectral decomposition theorem (Kittler & Young 1973) and ManéTakens embedded theorem (Mane 1981; Takens 1981).
The two first SSA components of the final composite series are shown in Fig. 12. We used the range of the time window for the SSA as 40–80 yr, where the result is stable. If the time window is too short, the 11yr cycle leaks into the first SSA component, while if it is too long, the pattern becomes smeared. One can see that the time series is decomposed into a longterm trend or centennial variability (the first SSA component) and the 11yr cycle (the second component). These two components represent 74% of the overall variability of the series.
Fig. 12 First two components of the SSA analysis of the reconstructed series (panels A) and B), respectively) The black curves depict the mean while the shaded area depicts the full range (corresponding to the time window of 40–80 yr) of the SSA component values. The red and blue lines represent the first SSA component for the S16 and H98 series, respectively. 

Open with DEXTER 
The series presented here depicts the relatively high activity in the mid18th century (G = 4.5 ± 0.5) which decreases to 3.5–4 during the entire 19th century and then rises to around 6 in the second half of the 20th century. This implies the significance of the Modern grand maximum of solar activity (Solanki et al. 2004), meaning that the level of centennial sunspot activity in the second half of 20th century was a factor 1.33–1.77 higher than in the 18th and 19th centuries.
Figure 12a shows the first SSA components also for the H98 and S16 series (the U16 series is close to the one presented here and is not shown). One can see different patterns of the centennial evolution (the primary SSA component) for these series. The H98 series yields a monotonic increase of activity by a factor 2.5 between the mid18th and late 20th century. On the contrary, the S16 series suggests an approximately constant, slightly oscillating level in the range of G between four and five, with an approximately 100yr period, without a clear grand maximum. We note however, that the existence of the Modern grand maximum is independently confirmed by data from cosmogenic isotopes (e.g., Abreu et al. 2008; Steinhilber et al. 2012; Inceoglu et al. 2015; Usoskin et al. 2014).
7. Conclusions
A new revisited series of the numbers of sunspot groups is presented for the period 1749–1996, reconstructed by applying the active day fraction method to a revised database of sunspot group observations (Vaquero et al. 2016). The new reconstruction agrees with the “classic” GSN series by Hoyt & Schatten (1998) in the 20th centuries but is systematically higher than that in the 18th century, suggesting solar activity in the mid18th century slightly higher than previously believed. On the other hand, the new series is systematically lower than that by Svalgaard & Schatten (2016) in the 18th and especially 19th century, implying that the latter overestimated the level of activity.
We have estimated the stability of some key solar observers. The RGO dataset appears relatively stable against all other observers since the 1890s but is approximately 10% too low before circa 1885, as proposed earlier (Sarychev & Roshchina 2009; Carrasco et al. 2013; Aparicio et al. 2014; Willis et al. 2016; Clette et al. 2014). However, the conclusion by Cliver & Ling (2016) that the RGO data are of uneven quality before 1915, is not confirmed. A declining trend of 10–15% in the quality of Wolfer’s observation is found between the 1880s and 1920s, suggesting that using him as the reference observer may lead to additional uncertainties. On the other hand, Wolf (small telescope) appears fairly stable between the 1860s and 1890s, without an obvious trend.
The new reconstruction reflects the centennial variability of solar activity. Using the SSA method, we decomposed different series into the primary centennial component (Fig. 12a) and the secondary 11yr solar cycle. The new series confirms the existence of the significant Modern grand maximum of solar activity in the second half of the 20th century, which appears a factor 1.33–1.77 higher than during the 18th and 19th centuries. This is different from both the H98 series, which shows a strong centennial trend with the growth of activity by a factor of 2.5 between the mid 18th and 20th centuries, and the S16 series, which shows no centennial trend. The existence of the
Modern grand maximum is known independently also from cosmogenic isotope data (e.g. Abreu et al. 2008; Steinhilber et al. 2012; Inceoglu et al. 2015).
The new series, available in Table 1 (annual values) and in CDS (monthly values), forms a basis for new studies of the solar variability and solar dynamo for the last 250 yr.
Available at http://solarscience.msfc.nasa.gov/ greenwch.shtml
Available at http://haso.unex.es/?q=content/data
www.aip.de/Members/rarlt/sunspots/schwabe, version 1.3 from 12 August 2015.
Acknowledgments
I.G.U. and G.A.K. acknowledge support by the Academy of Finland to the ReSoLVE Center of Excellence (project No. 272157).
References
 Abreu, J. A., Beer, J., Steinhilber, F., Tobias, S. M., & Weiss, N. O. 2008, Geophys. Res. Lett., 352, L20109 [NASA ADS] [CrossRef] [Google Scholar]
 Aparicio, A. J. P., Vaquero, J. M., Carrasco, V. M. S., & Gallego, M. C. 2014, Solar Phys., 289, 4335 [NASA ADS] [CrossRef] [Google Scholar]
 Arlt, R., Leussu, R., Giese, N., Mursula, K., & Usoskin, I. G. 2013, MNRAS, 433, 3165 [NASA ADS] [CrossRef] [Google Scholar]
 Carrasco, V. M. S., Vaquero, J. M., Gallego, M. C., & Trigo, R. M. 2013, New Astron., 25, 95 [NASA ADS] [CrossRef] [Google Scholar]
 Clette, F., Svalgaard, L., Vaquero, J., & Cliver, E. 2014, Space Sci. Rev., 186, 35 [NASA ADS] [CrossRef] [Google Scholar]
 Cliver, E. W., & Ling, A. G. 2016, Solar Phys., 291, 2763 [NASA ADS] [CrossRef] [Google Scholar]
 Cliver, E. W., Clette, F., Svalgaard, L., & Vaquero, J. M. 2015, Centr. Europ. Astrophys. Bull., 39, 1 [NASA ADS] [Google Scholar]
 Friedli, T. 2016, Solar. Phys., in press [Google Scholar]
 Harvey, K., & White, O. 1999, J. Geophys. Res., 104, 19759 [NASA ADS] [CrossRef] [Google Scholar]
 Hathaway, D. H. 2015, Living Rev. Sol. Phys., 12, 4 [NASA ADS] [CrossRef] [Google Scholar]
 Hoyt, D. V., & Schatten, K. H. 1998, Sol. Phys., 179, 189 [NASA ADS] [CrossRef] [Google Scholar]
 Hoyt, D., Schatten, K., & NesmesRibes, E. 1994, Geophys. Res. Lett., 21, 2067 [NASA ADS] [CrossRef] [Google Scholar]
 Inceoglu, F., Simoniello, R., Knudsen, V. F., et al. 2015, A&A, 577, A20 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Kittler, J., & Young, P. C. 1973, Pattern Recognition, 5, 335 [CrossRef] [Google Scholar]
 Kovaltsov, G. A., Usoskin, I. G., & Mursula, K. 2004, Sol. Phys., 224, 95 [NASA ADS] [CrossRef] [Google Scholar]
 Lockwood, M., Owens, M. J., Barnard, L., & Usoskin, I. G. 2016a, ApJ, 824, 54 [NASA ADS] [CrossRef] [Google Scholar]
 Lockwood, M., Owens, M. J., Barnard, L., & Usoskin, I. G. 2016b, Sol. Phys., 291, 2829 [NASA ADS] [CrossRef] [Google Scholar]
 Mane, R. 1981, in Dynamical Systems and Turbulence, eds. D. Rand, & L. Young (SpringerVerlag), Lect. Notes Math., 898, 230 [Google Scholar]
 Sarychev, A. P., & Roshchina, E. M. 2009, Sol. Syst. Res., 43, 151 [NASA ADS] [CrossRef] [Google Scholar]
 Senthamizh Pavai, V., Arlt, R., DasiEspuig, M., Krivova, N. A., & Solanki, S. K. 2015, A&A, 584, A73 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Solanki, S., Usoskin, I., Kromer, B., Schüssler, M., & Beer, J. 2004, Nature, 431, 1084 [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
 Steinhilber, F., Abreu, J., Beer, J., et al. 2012, Proc. Nat. Acad. Sci. USA, 109, 5967 [NASA ADS] [CrossRef] [Google Scholar]
 Svalgaard, L., & Schatten, K. H. 2016, Sol. Phys., 291, 2653 [NASA ADS] [CrossRef] [Google Scholar]
 Takens, F. 1981, in Dynamical Systems and Turbulence, eds. D. Rand, & L. Young (SpringerVerlag), Lect. Notes Math., 898, 366 [Google Scholar]
 Usoskin, I. G. 2017, Living Rev. Sol. Phys., 14, 3 [NASA ADS] [CrossRef] [Google Scholar]
 Usoskin, I., Mursula, K., & Kovaltsov, G. 2003, Sol. Phys., 218, 295 [NASA ADS] [CrossRef] [Google Scholar]
 Usoskin, I. G., Hulot, G., Gallet, Y., et al. 2014, A&A, 562, L10 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Usoskin, I. G., Kovaltsov, G. A., & Chatzistergos, T. 2016a, Sol. Phys., 291, 3793 [NASA ADS] [CrossRef] [Google Scholar]
 Usoskin, I. G., Kovaltsov, G. A., Lockwood, M., et al. 2016b, Sol. Phys., 291, 2685 [NASA ADS] [CrossRef] [Google Scholar]
 Vaquero, J. M., Trigo, R. M., & Gallego, M. C. 2012, Sol. Phys., 277, 389 [NASA ADS] [CrossRef] [Google Scholar]
 Vaquero, J. M., Kovaltsov, G. A., Usoskin, I. G., Carrasco, V. M. S., & Gallego, M. C. 2015, A&A, 577, A71 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Vaquero, J., Svalgaard, L., Carrasco, V., et al. 2016, Sol. Phys., 291, 3061 [NASA ADS] [CrossRef] [Google Scholar]
 Vautard, R., Yiou, P., & Ghil, M. 1992, Phys. D Nonlin. Phen., 58, 95 [NASA ADS] [CrossRef] [Google Scholar]
 Willis, D. M., Wild, M. N., & Warburton, J. S. 2016, Sol. Phys., 291, 2519 [NASA ADS] [CrossRef] [Google Scholar]
Appendix A: Annual number of sunspot groups
All Tables
All Figures
Fig. 1 Observational periods of the observers used in this study (see also Table 1). The reference dataset of RGO is shown in orange. We note the periods used for calibration of the observers may be shorter than the total observational periods shown here. 

Open with DEXTER  
In the text 
Fig. 2 Cumulative distribution function of the reference dataset for S_{S} = 25 msd and the filling factor f = 0.92 (gray line with 1σ uncertainties) compared to that for Quimby (dots with error bars). 

Open with DEXTER  
In the text 
Fig. 3 Panel a): correction matrix for Quimby giving the probability of finding the value of G_{ref} in the reference dataset on a day when G_{Quimby} sunspot groups were reported by Quimby. Panel b): pdf (probability density function) of G_{ref} to be found in the reference dataset for days with G_{Quimby} = 10 (the crosssection of panel a at the blue dashed line). 

Open with DEXTER  
In the text 
Fig. 4 Correction matrix of Staudacher giving the probability of finding the value of G_{ref} in the reference dataset on a day when G_{Staudacher} groups were reported by Staudacher. 

Open with DEXTER  
In the text 
Fig. 5 Panel a): distribution of PDF (grayscale on the right) for corrected number of sunspot groups for February 1869. Panel b): PDF of the corrected G values for the day of 19Feb.1869 (crosssection of panel “a)” at the dashed vertical line). The composite PDF is shown by gray bars, while PDF distributions for Wolf, Schmidt, and Weber are shown as blue, red, and green lines, respectively. 

Open with DEXTER  
In the text 
Fig. 6 Monthly values of the final composite series of number of sunspot groups. Error bars (68% twoside confidence intervals) are shown in gray. This series is available at CDS. 

Open with DEXTER  
In the text 
Fig. 7 Panel a): annual mean numbers of sunspot groups. The black line with gray shading depicts the result of this work with the 68% confidence interval. Numerical values are given in Table A.1. Other colored curves with symbols show reconstructions of G by H98 (Hoyt & Schatten 1998), S16 (Svalgaard & Schatten 2016), and U16 (Usoskin et al. 2016b). Panel b): ratio between the colored plots (shown in panel “a)” and following the same notation) to the result of this work. The ratio is not shown for years with low activity (G < 3). 

Open with DEXTER  
In the text 
Fig. 8 Monthly series of sunspot group numbers by individual observers. 

Open with DEXTER  
In the text 
Fig. 9 Ratio between annual mean G −values obtained using only RGO data to those from the composite series computed without RGO. Ratios for the years with low activity (G < 3) are not shown. Error bars depict the 68% confidence interval for the ratio. Blue stars correspond to the years of official solar cycle minima. 

Open with DEXTER  
In the text 
Fig. 10 Same as in Fig. 9 but for Wolfer. 

Open with DEXTER  
In the text 
Fig. 11 Same as in Fig. 9 but for Wolf. 

Open with DEXTER  
In the text 
Fig. 12 First two components of the SSA analysis of the reconstructed series (panels A) and B), respectively) The black curves depict the mean while the shaded area depicts the full range (corresponding to the time window of 40–80 yr) of the SSA component values. The red and blue lines represent the first SSA component for the S16 and H98 series, respectively. 

Open with DEXTER  
In the text 