The SPECFIND V3.0 catalog of radio continuum cross-identifications and spectra: Reaching lower frequencies

Context. Many radio continuum catalogs with different sensitivity limits and spatial resolutions are published via the VizieR database at the Centre de Données astronomiques de Strasbourg. The diversity of spatial resolutions of different catalogs makes the cross-identification of different flux density measurements of individual sources complex. The SPECFIND tool is able to handle radio surveys at different frequencies from different instruments with different resolutions. Aims. Since the former version of the SPECFIND catalog was released ten years ago, hundreds of new radio continuum catalogs have been published. We upgraded the SPECFIND tool to reach a wider frequency range, especially the lower-frequency radio regime, as well as to have better spatial sky coverage. Methods. We adapted selection criteria and applied them to all of the radio catalogs listed in the VizieR database to define a final sample of new catalogs. We unified the new catalogs and implemented them in the SPECFIND tool. The new SPECFIND V3.0 radio cross-identification catalog was constructed using 204 input tables from 160 VizieR radio continuum catalogs to cross-identify flux density measurements of individual sources and fit their spectral slopes. We discuss the frequency and sky coverage of all processed catalogs and compare the results to the previous version. Furthermore, we present and investigate peaked spectrum (PS) sources with spectral breaks around 1.4 GHz and 325 MHz. Results. By increasing the number of input catalog tables that were implemented in SPECFIND from 115 to 204 (89 new catalog tables and two updates), we increased the number of resulting spectra from ∼107 500 to ∼340 000 and the number of cross-identified sources from ∼600 000 to ∼1.6 million. The final SPECFIND V3.0 catalog is publicly available via VizieR.
By applying SPECFIND to two subsamples of the catalogs with frequency cuts at 325 MHz and 1.4 GHz, spectral break and PS source candidates could be identified. We encourage follow-up observations of these candidates to confirm their nature because the population we identify has a relatively low reliability. Conclusions. The SPECFIND V3.0 catalog is a very useful resource and a powerful open access tool, reachable via VizieR. By tripling the number of resulting spectra and including many radio continuum surveys from the last 50 years, we provide a significantly extended catalog of cross-identified radio continuum sources. Furthermore, the SIMBAD database will be updated using the SPECFIND V3.0 catalog and will contain more radio continuum data, serving the needs of future projects.


Introduction
Within the last few decades, a variety of radio continuum observations of major parts of the sky have been conducted, leading to many different radio continuum source catalogs. The observations are usually undertaken with different interferometric arrays or single-dish telescopes, leading to a huge range of resolutions, sensitivities, and pointing precisions. The SPECFIND tool (Vollmer et al. 2005a), with applications and later versions (Vollmer et al. 2005b, 2008, 2010), was introduced to handle these diverse radio catalogs. It uses catalogs from the VizieR database (http://vizier.u-strasbg.fr/viz-bin/VizieR; Ochsenbein et al. 2000) of the Centre de Données astronomiques de Strasbourg (CDS) to cross-identify radio sources of the different catalogs and produces spectral energy distributions (SEDs) over a large frequency range in the radio continuum regime. Since the release of the SPECFIND V2.0 catalog, several of the recently published radio continuum catalogs have significantly improved the sky coverage, especially in the low-frequency regime, for example, the TIFR GMRT Sky Survey Alternative Data Release (TGSSADR, Intema et al. 2017), the GaLactic and Extragalactic All-sky MWA survey (GLEAM, Hurley-Walker et al. 2017), or the LOFAR Two-metre Sky Survey DR1 (LOTSS DR1, Shimwell et al. 2019). Therefore, we upgraded SPECFIND to include these along with other VizieR catalog tables with at least 150 entries. This has nearly doubled the number of ingested catalog tables and increased the number of input sources by a factor of 60, with more than 5 million input sources in SPECFIND V3.0. The SPECFIND V3.0 catalog and full Appendix figures are only available at the CDS via anonymous ftp to cdsarc.u-strasbg.fr (130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/cat/VIII/104.
The radiation in the radio continuum is dominated by emission originating from relativistic cosmic ray electrons (CREs) gyrating around magnetic field lines and, while doing so, emitting nonthermal synchrotron emission perpendicular to the magnetic field orientation. An ensemble of relativistic electrons whose energies span a wide range and follow a power-law distribution (see, e.g., Pacholczyk & Swihart 1970; Condon 1992) leads to the observed SED of synchrotron radiation following a power law, where the observed flux density S_ν at frequency ν is proportional to ν^α with the spectral index α. If considered in log-log space (log(frequency) vs. log(intensity)), the spectral index α is the slope of the linear relation, commonly α ∼ −0.7 (Condon 1992). A contribution of thermal emission, which has a flat spectral slope of α = −0.1 in log-log space, is expected depending on the frequency (e.g., 20% at 6 cm wavelength, Condon 1992). The majority of radio continuum sources show a resulting linear slope in the radio regime. Therefore, the SPECFIND tool uses a linear fit for the spectra. Nevertheless, spectral flattening or inversion can occur toward lower frequencies due to synchrotron self-absorption or free-free absorption, and spectral steepening can occur toward higher frequencies due to the aging of CREs (see, e.g., O'Dea 1998).
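The power-law relation above can be illustrated with a minimal Python sketch. It is not part of the SPECFIND code; the function names and the example source (100 mJy at 325 MHz with α = −0.7) are hypothetical and chosen only for illustration.

```python
import math


def two_point_spectral_index(s1, nu1, s2, nu2):
    """Spectral index alpha for S_nu proportional to nu**alpha,
    derived from two flux densities (same unit) at two frequencies
    (same unit)."""
    return math.log(s1 / s2) / math.log(nu1 / nu2)


def flux_at(nu, s_ref, nu_ref, alpha):
    """Extrapolate a pure power-law spectrum from (nu_ref, s_ref) to nu."""
    return s_ref * (nu / nu_ref) ** alpha


# Hypothetical source: 100 mJy at 325 MHz with alpha = -0.7.
s_1400 = flux_at(1400.0, 100.0, 325.0, -0.7)
# Recover the slope in log-log space from the two points.
alpha = two_point_spectral_index(100.0, 325.0, s_1400, 1400.0)
```

In log-log space the two points lie on a straight line whose slope is the recovered α.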
Gigahertz-peaked spectrum (GPS), high-frequency peaked (HFP), or compact steep spectrum (CSS) sources are powerful radio continuum sources showing inverted spectra with a positive spectral slope in the megahertz regime up to a turnover frequency with a negative spectral index toward higher frequencies (see the reviews O'Dea 1998; O'Dea & Saikia 2021). The spectral break of GPS sources can be observed in different sources, such as quasars, active galactic nuclei or galaxies. Variable GPS sources are often connected to blazars (e.g., Tinti et al. 2005; Ross et al. 2021). The classification of a source is based on the turnover frequency and the turnover curvature, as well as on a spectral index of α ≥ 0.5 below the associated turnover frequency. While CSS sources are the least compact (up to ∼20 kpc) in comparison to the other two, they show their turnover frequency in the megahertz regime ≤500 MHz. GPS sources have turnover frequencies in the gigahertz regime of ∼0.5−5 GHz. They are more compact (∼1 kpc) than CSS sources. HFP sources are defined to have turnover frequencies ≥5 GHz and are very compact (≤1 kpc). These three different types of sources are referred to as peaked spectrum (PS) sources (O'Dea & Saikia 2021) and could represent an age sequence, where HFP sources are younger stages of GPS sources, which ultimately transform into CSS sources and then even into larger and more powerful radio sources. This aging scenario is concluded from observations of turnover frequencies and linear sizes of different objects (e.g., Fanti et al. 1990). The turnover of the spectrum toward lower frequencies can be explained by two different mechanisms: synchrotron self-absorption or free-free absorption (e.g., Snellen et al. 1998).
In this new SPECFIND upgrade, we are able to detect sources that peak at around 1.4 GHz and 325 MHz. Although they belong to the overall class of PS sources, we call the sources with turnover frequencies around 1.4 GHz that satisfy α ≥ 0.5 below the turnover frequency GPS source candidates. We call sources with turnover frequencies around 325 MHz megahertz-peaked spectrum (MPS) source candidates. Furthermore, we classify a sample of PS sources that have α ≥ 0.3 below the turnover frequency and α ≤ −0.3 above the turnover frequency as concave source candidates.
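The thresholds quoted in the two preceding paragraphs can be summarized in a short Python sketch. It is a simplification under stated assumptions: only the turnover frequency and the spectral indices are used (linear size and turnover curvature are ignored), the handling of the exact boundary values at 0.5 GHz and 5 GHz is our choice, and all function names are hypothetical.

```python
def ps_class(turnover_ghz):
    """Rough PS subclass from the turnover frequency alone.
    Assumption: 0.5 GHz is assigned to CSS and 5 GHz to HFP."""
    if turnover_ghz <= 0.5:
        return "CSS"
    if turnover_ghz < 5.0:
        return "GPS"
    return "HFP"


def meets_ps_slope(alpha_low):
    """Spectral index criterion below the turnover frequency."""
    return alpha_low >= 0.5


def is_concave_candidate(alpha_low, alpha_high):
    """Concave source candidate: alpha >= 0.3 below and
    alpha <= -0.3 above the turnover frequency."""
    return alpha_low >= 0.3 and alpha_high <= -0.3
```

A source with α = 0.4 below and α = −0.6 above the turnover would, for example, count as a concave source candidate without fulfilling the stricter α ≥ 0.5 criterion.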
The paper is organized as follows. In Sect. 2 we provide basic explanations of how the SPECFIND tool works. Section 3 describes the upgrade in terms of software and the selection criteria for the newly added radio continuum catalog tables. In Sect. 4 the results are presented and compared to the previous version of the SPECFIND catalog. We describe an application of the SPECFIND tool to find PS sources and provide example sources that have a clear spectral break either around 325 MHz (MPS source candidates) or around 1.4 GHz (GPS source candidates) in Sect. 5. In Sect. 6 we explain how to access the public SPECFIND V3.0 catalog via VizieR as well as the structure of the published tables. The summary and conclusions are provided in Sect. 7.

The SPECFIND tool
Generally, the SPECFIND tool cross-identifies flux density measurements of sources from radio continuum catalog tables at different frequencies from the VizieR database and fits a single power law (linear spectral slope in log-log space) to the cross-identified flux density measurements. In principle, SPECFIND allows for one break in the spectrum, which means it can fit two different slopes to the radio continuum spectrum. However, since the SPECFIND algorithm is optimized for the robust fitting of a single spectral slope, spectral breaks are rarely fit to the data (see Fig. 10 in Vollmer et al. 2010). Similarly, any curvature in a spectrum due to flattening toward lower frequencies (synchrotron self-absorption or free-free absorption) or steepening toward higher frequencies (CRE aging) is only very rarely fit by the SPECFIND algorithm. Instead, a single spectral index is determined for the part of the spectrum with the highest frequency coverage above or below the break frequency. For SPECFIND V3.0 we undertook a more detailed analysis of the cases that show a spectral break with different spectral indices on the lower and the higher-frequency part of our catalog sample (Sect. 5).
In the context of the SPECFIND cross-identification of flux density measurements of radio continuum sources the following terms are important. The different VizieR catalogs can contain one or more tables. A table in VizieR belongs to a catalog. Therefore, we often use the term "catalog table". Relevant tables contain at least sky coordinates and a radio continuum flux density. Additional parameters are the error on flux density, source size, and position angle. We produced SPECFIND input tables from these tables. If flux density measurements at different frequencies are present in a VizieR table, it is split into different SPECFIND input tables. Cross-identified flux density measurements from different tables belong to one object in the SPECFIND catalog. Each object contains at least three flux density measurements observed at independent frequencies. Every object has one associated spectrum that is the collection of the associated flux density measurements.
SPECFIND uses its own requirements for the catalog entries: coordinates in J2000 and their associated uncertainties; flux density and its associated uncertainty; major and minor axis and position angle; source name. More details on these SPECFIND catalogs and how we unified and ingested them are given in Sect. 3.3.
SPECFIND is a hierarchical code. It classifies a flux density measurement j as parent, sibling or child with respect to a given flux density measurement i at different stages, where stages 2 and 3 are refinements of stage 1.
stage 1: depending on proximity criteria:
- parent: flux density measurement j has a larger extent or was observed with a lower angular resolution than flux density measurement i,
- sibling: flux density measurement j has a comparable extent or was observed with a comparable angular resolution (within 25%) to that of flux density measurement i,
- child: flux density measurement j has a smaller extent or was observed with a higher angular resolution than flux density measurement i.
stage 2: depending on flux densities at the same frequency:
- parent: flux density measurement j has a larger extent or resolution and has a larger flux density than flux density measurement i,
- sibling: flux density measurement j has a comparable extent or resolution and has the same flux density within the errors as flux density measurement i,
- child: flux density measurement j has a smaller extent or resolution and a smaller flux density than flux density measurement i.
stage 3: depending on flux densities at different frequencies, based on the expected radio spectral index:
- parent: flux density measurement j has a larger flux density than expected from the radio spectrum that includes flux density measurement i,
- sibling: flux density measurement j fits into the radio spectrum that includes flux density measurement i,
- child: flux density measurement j has a smaller flux density than expected from the radio spectrum that includes flux density measurement i.
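To make the stage 1 terminology concrete, the following Python sketch classifies measurement j relative to measurement i from the beam sizes alone. It is only an illustration and not the SPECFIND implementation: the source extent and the positional proximity test are left out, and reading "within 25%" as a beam ratio of at most 1.25 (which keeps the sibling relation symmetric) is our assumption.

```python
def stage1_relation(beam_i_arcsec, beam_j_arcsec, tolerance=0.25):
    """Relation of measurement j with respect to measurement i,
    based only on the survey beam sizes (larger beam = lower resolution)."""
    larger = max(beam_i_arcsec, beam_j_arcsec)
    smaller = min(beam_i_arcsec, beam_j_arcsec)
    if larger / smaller <= 1.0 + tolerance:
        return "sibling"   # comparable angular resolution
    if beam_j_arcsec > beam_i_arcsec:
        return "parent"    # j was observed with a lower resolution
    return "child"         # j was observed with a higher resolution


# Hypothetical beams: 45 arcsec for i; 54 and 180 arcsec for j.
relation_similar = stage1_relation(45.0, 54.0)
relation_coarse = stage1_relation(45.0, 180.0)
```

Swapping the two arguments turns a parent into a child and leaves a sibling unchanged, which is the behavior that the family cross-checks rely on.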
At the end of this procedure flux density measurement i and its siblings are considered the same flux density measurement. Once the cross-identification based on the flux density measurements at the same frequency is done, the family dependences are verified, which means for a given flux density measurement cross-checks are performed. These checks are performed for all SPECFIND catalog entries:
- If flux density measurement j is a sibling of flux density measurement i, then flux density measurement i must also be a sibling of flux density measurement j.
- If flux density measurement j is a child of flux density measurement i, flux density measurement i must be a parent of flux density measurement j.
- If flux density measurement j is a parent of flux density measurement i, flux density measurement i must be a child of flux density measurement j.
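The three cross-checks above amount to requiring that every relation has the matching inverse relation. A minimal Python sketch, assuming a hypothetical data layout in which a dictionary maps the pair (i, j) to the relation of j with respect to i, could look as follows; it is not taken from the SPECFIND code.

```python
# Inverse relation expected for the swapped pair.
INVERSE = {"sibling": "sibling", "child": "parent", "parent": "child"}


def inconsistent_pairs(relations):
    """Return all pairs (i, j) whose swapped pair (j, i) is missing
    or does not carry the expected inverse relation."""
    bad = []
    for (i, j), relation in relations.items():
        if relations.get((j, i)) != INVERSE[relation]:
            bad.append((i, j))
    return bad


# Hypothetical family relations between three measurements.
relations = {
    (1, 2): "sibling", (2, 1): "sibling",
    (1, 3): "child", (3, 1): "parent",
    (2, 3): "child", (3, 2): "sibling",   # deliberately inconsistent
}
bad = inconsistent_pairs(relations)
```

Here only the last pair of entries violates the checks, so both (2, 3) and (3, 2) are reported.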
The heart of SPECFIND is the spectrum-finding algorithm. It uses the method of the least absolute deviation to make a linear fit in the log ν − log S_ν plane. This method is more robust against outlying points in a spectrum than a standard least-squares deviation (χ²) fit (see Press et al. 2002). For this algorithm, the best way to find a maximum number of spectra without too high a risk of spectral misidentifications is to set the flux density errors of all flux density measurements that are smaller than 30% of their flux density to 30%. In this way all catalogs have approximately the same relative error. This scaling was found heuristically by Vollmer et al. (2005a,b, 2010). It led to a high number of cross-identifications with a relatively low number of misidentifications. Moreover, these relatively large errors can compensate for some flux density measurement variability and calibration offsets, for example, known for the WENSS (Hardcastle et al. 2016).
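The robustness of a least absolute deviation fit can be illustrated with a small, self-contained Python sketch. It is a stand-in and not the routine used in SPECFIND: it exploits the fact that an optimal least absolute deviation line can always be chosen to pass through two of the data points and simply tests all pairs, which is only practical for the small number of points in a radio spectrum. The function names and the example spectrum are hypothetical.

```python
import math
from itertools import combinations


def apply_error_floor(flux, err, floor=0.30):
    """Raise flux density errors below 30% of the flux density to 30%."""
    return max(err, floor * flux)


def lad_line(x, y):
    """Least absolute deviation line y = intercept + slope * x,
    found by testing every line through two data points."""
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        if x1 == x2:
            continue  # a vertical line cannot be a spectrum
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        cost = sum(abs(yi - (intercept + slope * xi)) for xi, yi in zip(x, y))
        if best is None or cost < best[0]:
            best = (cost, slope, intercept)
    if best is None:
        raise ValueError("need at least two distinct frequencies")
    return best[1], best[2]


def fit_spectrum(freqs_mhz, fluxes_mjy):
    """Fit log10(S) = gamma + alpha * log10(nu); returns (alpha, gamma)."""
    x = [math.log10(f) for f in freqs_mhz]
    y = [math.log10(s) for s in fluxes_mjy]
    return lad_line(x, y)


# Hypothetical spectrum with alpha = -0.8 and one outlier at 1400 MHz.
freqs = [74.0, 325.0, 1400.0, 4850.0]
fluxes = [1000.0 * (f / 74.0) ** -0.8 for f in freqs]
fluxes[2] *= 3.0
alpha, gamma = fit_spectrum(freqs, fluxes)
```

In this example the fitted slope stays at −0.8 despite the outlier, whereas a least-squares fit would be pulled toward it.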
The structure of the spectrum-finding algorithm is explained in the following. For a given set of flux density measurements for which all family relations were determined, their flux density measurements at different frequencies are grouped together into an array and sorted by frequency. If the number of different frequencies is greater than two, the spectrum-finding algorithm passes through the following steps, where spectra are fitted to all SPECFIND catalog entries individually:
1. A least absolute deviation fit in the log S_ν − log ν plane is performed;
2. If the spectrum is determined more than once, the number of flux density measurements that fit into the spectrum is checked. If it decreases, the old fit parameters are used;
3. A check if flux density measurements fit into the spectrum is performed; if all flux density measurements fit, the algorithm goes to step 6;
4. If there are two flux density measurements of the same frequency, the one with the largest deviation from the fit is flagged and removed;
5. If all flux density measurements have different frequencies, the flux density measurement with the largest deviation from the fit is flagged and removed; the algorithm then goes to step 1;
6. If there are more than two independent points left and if the ratio between the second largest and the largest frequency interval is greater than 0.02, the final fit is performed;
7. The algorithm goes to step 1 and performs a second run with fixed parameters of α = −0.9 and γ = log S_ν − α log ν during the first N−4 steps of the loop, where N is the initial number of points in the spectrum (−0.9 is the mean spectral index of all radio flux density measurements);
8. If the number of fitted points with fixed γ and α exceeds that of the initial fitting procedure, this spectrum is accepted; otherwise, the spectrum of the first fitting procedure is accepted.
In order to avoid using points that are too close to one another in frequency, and therefore not independent, the frequency intervals between the different points of the spectrum are checked. The routine calculates the frequency intervals and determines the ratio between the second largest and the largest frequency interval. If this ratio is smaller than 0.02, the spectrum is rejected.
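A minimal Python sketch of this independence check is given below. Whether the intervals are measured in linear or logarithmic frequency is not stated here, so the use of linear frequency differences is an assumption, as is the function name.

```python
def passes_interval_check(freqs_mhz, min_ratio=0.02):
    """Reject a spectrum if the second largest frequency interval is
    smaller than min_ratio times the largest one.
    Assumption: intervals are differences in linear frequency."""
    freqs = sorted(set(freqs_mhz))
    if len(freqs) < 3:
        return False  # fewer than three distinct frequencies
    intervals = sorted(b - a for a, b in zip(freqs, freqs[1:]))
    return intervals[-2] / intervals[-1] >= min_ratio


# Hypothetical cases: 1400 and 1420 MHz are almost the same frequency.
clustered = passes_interval_check([74.0, 1400.0, 1420.0])
well_spread = passes_interval_check([325.0, 1400.0, 4850.0])
```

In the first case the intervals are 1326 MHz and 20 MHz, giving a ratio of about 0.015, so the spectrum is rejected.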
Then, in order to avoid ambiguous radio flux density measurements of a given frequency, which are attributed to two distinct physical objects, the "center of mass" coordinates are calculated for both objects, where the inverse of the survey resolution is used for the "mass". The ambiguous flux density measurement is then attributed solely to the object whose "center of mass" position is nearest to the flux density measurement position.
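The "center of mass" attribution can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the SPECFIND code: positions are averaged directly in right ascension and declination (no treatment of the wrap-around at RA = 0°), separations use a flat-sky approximation, and the tuple layout (ra_deg, dec_deg, beam_arcsec) is hypothetical.

```python
import math


def center_of_mass(measurements):
    """Weighted mean position; the weight ("mass") of each measurement
    is the inverse of the survey resolution."""
    weights = [1.0 / beam for _, _, beam in measurements]
    total = sum(weights)
    ra = sum(w * m[0] for w, m in zip(weights, measurements)) / total
    dec = sum(w * m[1] for w, m in zip(weights, measurements)) / total
    return ra, dec


def separation_deg(ra1, dec1, ra2, dec2):
    """Flat-sky angular separation, adequate for small offsets."""
    dra = (ra1 - ra2) * math.cos(math.radians(0.5 * (dec1 + dec2)))
    return math.hypot(dra, dec1 - dec2)


def assign_ambiguous(ra, dec, object_a, object_b):
    """Attribute an ambiguous measurement to the object whose
    center of mass is nearest."""
    dist_a = separation_deg(ra, dec, *center_of_mass(object_a))
    dist_b = separation_deg(ra, dec, *center_of_mass(object_b))
    return "a" if dist_a <= dist_b else "b"


# Hypothetical objects, each with a 45 arcsec and a 5 arcsec measurement.
object_a = [(10.000, 20.0, 45.0), (10.002, 20.0, 5.0)]
object_b = [(10.020, 20.0, 45.0), (10.020, 20.0, 5.0)]
ra_a, dec_a = center_of_mass(object_a)
owner = assign_ambiguous(10.008, 20.0, object_a, object_b)
```

Because the weight is the inverse of the resolution, the high-resolution position dominates the center of mass of each object.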
A completeness and uniqueness check for all spectra ensures that if a flux density measurement j fits the spectrum determined for flux density measurement i (where flux density measurement i is included), then flux density measurement i also appears in the spectrum of flux density measurement j. In this way it is ensured that a radio flux density measurement belongs to only one single physical object.

SPECFIND V3.0
SPECFIND V3.0 contains minor changes in terms of software. With the addition of 91 catalog tables the number of input catalog tables was almost doubled with respect to SPECFIND V2.0.

Software
The software was modified to improve the spatial cross-identification via a revised proximity criterion: if the proximity criterion is not fulfilled by one source, we now allow the exclusion of this spectral point, whereas in the previous version the entire associated object was removed. This increased the number of resulting spectra by 3%, which corresponds to several thousand spectra.

Adding new catalogs
In order to add relevant catalogs, we searched in the VizieR database for catalogs containing radio data with source positions available in the table. This search can be done directly in VizieR via the Unified Content Descriptor (UCD) search capability. The UCDs are an International Virtual Observatory Alliance (IVOA) standardized (Derriere et al. 2004; Martinez et al. 2018) description of astronomical quantities. The basic UCD search can be accessed on the VizieR web page. The other way to obtain catalogs with certain UCD criteria is via the TABFIND services, a Structured Query Language (SQL) search by the Tool for OPerations on Catalogs And Tables (TOPCAT, Taylor 2011). In order to deal with the resulting tables and to apply additional selection criteria we used TOPCAT. We obtained all radio data catalogs with positional and flux information included (UCD:"(pos.*)&(phot.flux*;em.radio*)"). The VizieR database holds more than 1200 catalogs containing radio data. Some of these catalogs consist of several tables, which leads to approximately 2400 tables containing radio data or being connected to radio data. To find useful catalogs for the SPECFIND tool, we first required the number of records, that is, the number of rows within each table, to be greater than or equal to 150.
We ended up with approximately 1000 tables to investigate manually. We discarded all tables not related to radio continuum observations of point sources, such as tables containing neutral hydrogen (HI) surveys or description tables containing no radio continuum data at all. Applying further criteria (radio continuum data between a few MHz and up to 31 GHz and no time series) led to roughly 110 tables. We discarded many deep field catalogs (e.g., the Swire field and the COSMOS field) because we need at least a few square degrees of sky coverage within one table to have an impact on the SPECFIND results. In the last step we had to discard catalogs with large beams (>30 arcmin) since the cross-matching fails if the beam is too large. In the following we summarize the selection criteria for our radio continuum catalogs:
- at least 150 flux density measurements in the table
- radio continuum data between a few MHz and up to 31 GHz
- no time series
- minimal flux density ≥0.01 mJy to avoid deep fields (e.g., the Hubble Deep Field)
- sky coverage ≥2 deg²
- beam sizes ≤30 arcmin.
These criteria led to 91 new catalog tables. Two catalog tables were replaced by a new version of the same data: VLA Low-frequency Sky Survey Redux (VLSSr, Lane et al. 2014) and Faint Images of the Radio Sky at Twenty centimeters Version 2014 (FIRST14, Helfand et al. 2015). The remaining 89 catalog tables were added to the existing 115 of version 2.0. This resulted in 204 catalog tables originating from 160 VizieR catalogs that were included in SPECFIND V3.0 (Table A.1).
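The quantitative selection criteria listed above can be expressed as a simple filter. The Python sketch below is illustrative only: the actual selection was carried out with TOPCAT and by manual inspection, the class and field names are hypothetical, and the lower frequency bound ("a few MHz") is not checked because no exact value is given.

```python
from dataclasses import dataclass


@dataclass
class TableMeta:
    """Hypothetical summary of one VizieR table."""
    name: str
    n_rows: int
    freq_ghz: float
    is_time_series: bool
    min_flux_mjy: float
    sky_coverage_deg2: float
    beam_arcmin: float


def is_selected(table):
    """Apply the quantitative selection criteria to one table."""
    return (
        table.n_rows >= 150
        and table.freq_ghz <= 31.0
        and not table.is_time_series
        and table.min_flux_mjy >= 0.01
        and table.sky_coverage_deg2 >= 2.0
        and table.beam_arcmin <= 30.0
    )


# Hypothetical tables, not actual VizieR entries.
tables = [
    TableMeta("wide_survey", 50000, 1.4, False, 2.5, 30000.0, 0.75),
    TableMeta("deep_field", 2000, 1.4, False, 0.005, 1.0, 0.03),
    TableMeta("large_beam", 900, 0.408, False, 500.0, 20000.0, 51.0),
]
selected = [t.name for t in tables if is_selected(t)]
```

Only the first table passes: the second is a deep field with too small a sky coverage and the third exceeds the beam size limit.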

Unification
The VizieR radio catalogs have been published by a wide range of authors, with a multitude of different original purposes. As such, these catalogs have a diverse range of ways of expressing the properties of radio sources. VizieR provides a high level of homogenization so that they comply with the CDS standard for catalogs. To process the radio catalogs for SPECFIND we take this process a step further to provide a higher level of interoperability of these catalogs, in particular by further unifying the catalogs for the SPECFIND tool (e.g., taking into account specific properties of radio sources).
This unification procedure was made efficient by using an ingestion and unification tool, developed at CDS. For more details on that and the entire procedure of unification of the different radio tables see Vollmer et al. (2010). The information of the observational characteristics was gathered for each catalog (mostly by manually searching in the associated paper). This included the beam size of the observations in arcsec, the minimum flux density of the observations (i.e., faintest detected source in the respective survey) in mJy and the flux density measurement error of the observations (if not mentioned, 15% was assumed). In Table A.1 we summarize the information about the observing frequency, the beam size, the minimum flux density, the number of sources, the percentage of the catalog that was processed by the SPECFIND tool and the reference with its associated VizieR catalog name.
Each ingested catalog table includes the general information (number of sources in the table, frequency and beam size of the observations) and the source information for each source (coordinates RA and Dec, flux density and its error in mJy, source size in units of the beam size, its position angle and the source name). The source name was either used directly from the source names in the VizieR table or was newly assigned. When no acronym was provided by the authors of a catalog, a unique acronym was created directly linked to the corresponding publication and defined in coordination with the Dictionary of Nomenclature. These acronyms are based on the initials of the first three authors followed by the year of publication. Since SPECFIND needs to distinguish between sources observed at different frequencies within the same VizieR table, a letter or the frequency has been added to the acronym when necessary.
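The naming rule for newly created acronyms can be sketched in Python as follows. The exact format (surname initials in upper case, four-digit year, suffix appended directly) is an assumption made for illustration, and the author names are placeholders rather than an existing publication.

```python
def make_acronym(surnames, year, suffix=""):
    """Acronym from the initials of the first three authors followed by
    the year of publication; an optional letter or frequency suffix
    distinguishes several input tables from the same VizieR table.
    Assumption: surname initials, four-digit year."""
    initials = "".join(name[0].upper() for name in surnames[:3])
    return f"{initials}{year}{suffix}"


# Placeholder author list, not a real publication.
acronym = make_acronym(["Alpha", "Beta", "Gamma", "Delta"], 2015)
acronym_a = make_acronym(["Alpha", "Beta", "Gamma", "Delta"], 2015, "a")
```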

Compatibility with V2.0
To ensure that SPECFIND versions 3.0 and 2.0 are compatible we used the SPECFIND Comparison tool, which was developed at CDS (see Vollmer et al. 2010, for further explanation). This tool is able to compare the different output spectra from both versions. It finds differences in cross-identified flux density measurements between the two versions and shows them next to each other with a view of the corresponding spectrum. It is possible to merge both cross-identification spectra or choose one spectrum. It also enables us to add the sources from V2.0 that were originally not included in V3.0. In this tool, more than 7000 spectra were inspected by hand due to either different or missing spectra in version 3.0. In this way, we created a consistent final SPECFIND V3.0 catalog. The final catalog containing the spectra of the different sources and the spectral slopes was ingested into the VizieR database ("spectra" table). Additionally, all rejected spectra ("waste" table) and the catalog list ("beam" table) were added to the VizieR database (see Sect. 6).

Results
In total, SPECFIND V3.0 found 339592 objects with corresponding radio continuum spectra by processing 204 input catalog tables (see Table A.1). These objects have at least three independent frequency points, which means data points coming from three different SPECFIND input catalog tables with radio continuum data observed at different frequency bands. The total number of cross-identified sources is ∼1.6 million.

Comparison to SPECFIND V2.0
In comparison to the ∼107 500 resulting spectra of SPECFIND V2.0, the number of output spectra was increased by more than a factor of three. In Fig. 1 the sky coverage density maps of SPECFIND versions 2.0 and 3.0 are shown next to each other. This shows the major improvement in the sky coverage that has been achieved in version 3.0, which additionally covers the southern hemisphere, where fewer catalogs are available. For GLEAM, we use the mean spectral point so as not to give undue weight to this survey.

Frequency coverage and minimum flux density
As a sanity check, we compared the two updated catalogs, VLSSr and FIRST14, with their former versions within SPECFIND V3.0. By replacing the VLSS (Cohen et al. 2007) with the VLSSr catalog, the number of VLSS sources was increased from ∼68 000 to ∼92 000. By replacing the FIRST (White et al. 1997) with the FIRST14 catalog, the number of FIRST sources was increased from ∼810 000 to ∼946 000. SPECFIND V3.0 was able to process ∼67 000 VLSSr sources, in comparison to ∼53 600 VLSS sources. The same trend is visible for the FIRST catalog, which was updated to FIRST14. We were able to cross-identify ∼59 700 FIRST14 sources, in comparison to ∼53 800 FIRST sources. In summary, we found higher numbers of cross-identified sources in the new catalogs.

Spectral indices
The spectral index distribution (Fig. 3) shows a peak around α = −0.9, which is consistent with the former SPECFIND versions. The median spectral index is α = −0.75 with a semi-interquartile range (SIQR) of 0.28. The median and SIQR agree with other measurements: the cross-identification of VLSSr and NVSS led to α = −0.82 with an SIQR of 0.11 (Lane et al. 2014), that of SUMSS and NVSS to a median spectral index of −0.83 (Mauch et al. 2003). The median spectral index within the GLEAM band is about −0.8. The distribution shows a tail to positive spectral indices, which is caused by sources with low flux densities at 325 MHz (Fig. 4). Since this tail is not present in the VLSSr-NVSS spectral index distribution, it is most probably caused by our selection bias (see Fig. 2): the WENSS survey together with the relatively shallow 5 GHz surveys (GB6, 87GB, MITG, BWE, PMN) and the deep TGSSADR survey favors the detection of sources with positive spectral indices. Additionally, we inspected the tail and discarded by hand sources with spectral indices >2 that showed source confusion. Figure 4 represents the distribution of the spectral indices as a function of the measured Westerbork Northern Sky Survey (WENSS) flux density at 325 MHz. If not available, the flux density of another catalog at 325 MHz was used. If no flux density measurement existed at that frequency, we calculated the value by interpolating the flux density at 325 MHz from the spectral fit. The general appearance is consistent with that of V2.0. The majority of the objects (dark region) have spectral indices of ∼−0.7 irrespective of the flux densities at 325 MHz. This spectral slope is expected for synchrotron emission. This offset is not large enough to explain the spectral index (SI) deviations from the mean in the wing and the bump. We inspected sources in both regions by eye using Aladin Lite and the VizieR photometric viewer.
There are many objects in the wing (S_325 < 100 mJy, SI > 0), which include WENSS/WISH and NVSS flux densities together with a flux density at a lower frequency (e.g., TGSSADR), or at higher frequency (e.g., PMN). We did not find a significant number of objects where source confusion was suspected. The objects in the bump mostly contain NVSS and WENSS/WISH sources together with sources at frequencies below 325 MHz. As before, we did not find obvious problems with these objects. We therefore conclude that the deviations from the mean SI are caused by flux density scale issues that are signal-to-noise dependent and the minimal observed flux densities or sensitivities of the different catalogs (Fig. 2). Drawing lines through S_325 = 100 mJy with spectral indices >0 and <−1.5 gives insight into which catalogs are expected to be involved in the objects in the wing and bump.
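For reference, the median and the semi-interquartile range quoted in this section can be computed from a list of spectral indices as in the following Python sketch. The SIQR is half the difference between the third and first quartiles; the quartile interpolation method ("inclusive") is our choice and the sample values are hypothetical.

```python
import statistics


def median_and_siqr(values):
    """Median and semi-interquartile range, SIQR = (Q3 - Q1) / 2."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, 0.5 * (q3 - q1)


# Hypothetical spectral indices.
sample = [-1.2, -1.0, -0.8, -0.7, -0.6]
median_alpha, siqr = median_and_siqr(sample)
```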

Number of sources
In Fig. 5 we present the distribution of the number of sources, which means the distribution of how many independent frequency points are contained in an object/spectrum. The distribution begins at a number of three sources (frequencies) since this is the minimum number to produce an output spectrum with the SPECFIND tool. There is a slight excess of the number of objects containing three or four sources, as was the case for SPECFIND V2.0 (see Vollmer et al. 2010). In SPECFIND V3.0 the maximum number of sources in a spectrum is 34, compared to 30 in SPECFIND V2.0.

Complex radio sources
The SPECFIND tool finds mainly individual radio continuum point sources. However, sources in radio continuum catalogs can occur not only as simple point sources, but with diverse appearances as extended sources, complex or double sources. Additionally, there can be source confusion, with two or more physically unrelated sources located within a beam.
During the consistency check between versions 2.0 and 3.0, we inspected a subsample of 7000 complex sources by eye. While most of these sources are separated into different final spectra, we find a ∼6% contamination of double radio lobes or confused sources in crowded fields, which are assigned to one corresponding spectrum including all sub-sources. In most cases, this is apparent in the spectrum by parallel or intersecting lines of different spectral indices. Therefore, we advise the user to always inspect the data in Aladin Lite, which is available within the VizieR catalog of SPECFIND V3.0 (see Sect. 6).
The different radio continuum surveys have different resulting beams. Therefore, with a small beam (high resolution), extended sources are resolved, whereas with a large beam (low resolution), source confusion is an issue if the source separation is smaller than the beam size. The SPECFIND tool is able to fit spectra to most of these sources (see Sect. 5 and appendix for further discussions on limitations). Nevertheless, the more complex a source is and the closer two or more different sources are, the harder it is to cross-identify the correct sources due to the different beams. In order to evaluate the results of SPECFIND V3.0, we revisited some of the sources shown in Vollmer et al. (2010) and compared the resulting spectra. These are shown in Figs. 6-8.
Most of the point sources and double sources that have been compared show the same resulting spectra in the two versions (for example Fig. 6). For the three close sources around WN B2228.2+5940A (Fig. 7), we find a merged spectrum in SPECFIND V2.0 that includes two of the three sources. In SPECFIND V3.0 two individual spectra are found for two of the three sources, while the third source has no corresponding spectrum. In each version of the SPECFIND catalog, the SPECFIND tool missed one spectrum of the three sources. We find a tendency that the merged sources of version 2.0 are well separated in V3.0. In the case of the complex source WN B2040.8+4246 (Fig. 8), version 3.0 shows a cleaner result, because the points that do not cover the full source, and thus show flux densities that are too low, are excluded.

Examples of new radio continuum source spectra
In Fig. 9 we show three examples from the ∼232 500 new spectra of SPECFIND V3.0 that are not included in V2.0. As is visible in the histogram (Fig. 5), most spectra include three to five sources. Many of these spectra include sources of the NRAO VLA Sky Survey (NVSS, Condon et al. 1998) and TGSSADR due to the large sky coverage of both catalogs. One example of this kind of spectrum, for the source NVSS J135722+732125, is shown on the left of Fig. 9. Another new spectrum, of the source NVSS J171701+191740, is shown in the middle of Fig. 9. With ten associated frequency points, it shows a spectral break at around 5 GHz. In the spectral break analysis of our entire sample, which is presented in Sect. 5, we show that we are mostly sensitive to spectral breaks below 1.9 GHz, and thus this source is not included in our sample of spectral break sources. A spectrum of the double source NVSS J123317+670808 is presented on the right of Fig. 9. The twenty associated frequency points show two distinct spectral slopes, which represent two different sources.

Data analysis - Peaked spectrum sources
The upgrades of SPECFIND V3.0 described above led to a well-sampled frequency coverage over the radio spectrum and thus enabled us to investigate possible spectral breaks of PS sources. However, we were only very rarely able to detect these sources using the SPECFIND tool in its classical design, where a single power law is fit to the data within the full frequency range. Since the SPECFIND tool tries to include as many sources as possible in the fit, spectral break sources often end up losing either the peak spectral point(s) or one side of the spectrum, either the low- or the high-frequency part. Thus, these sources mostly remained undetected.
To identify spectral break source candidates, we divided the sample of catalog tables into two frequency parts to fit individual spectral slopes to each side of the turnover frequency. We created two catalog subsamples, one subsample below and one subsample above a selected frequency cut. The corresponding frequency of the cut is included in both subsamples to allow for cross-identification. The SPECFIND tool was applied to these subsamples individually. The two complementary spectra of the two subsamples were then combined into one object if they contained the same flux density measurement at the common frequency. In a second step, the fitted spectral slopes of the two subsamples were compared for each object. This procedure was done for two different frequency cuts to find GPS source candidates around 1.4 GHz and to find MPS source candidates around 325 MHz. The comparison of the spectral slope between the high and low-frequency part of an object led to three different criteria to identify (1) spectral break sources (sb), (2) concave spectrum sources (conc), (3) GPS and MPS sources with the criterion of O'Dea (1998). The spectral slope of the subsample with frequencies lower than or equal to the frequency cut is α_low; the spectral slope of the subsample with frequencies higher than or equal to the frequency cut is α_high. The different criteria are defined as:
(1) spectral break sources (sb): |α_low − α_high| ≥ 0.3
(2) concave spectrum sources (conc): α_low ≥ 0.3 and α_high ≤ −0.3
(3) GPS and MPS sources (gps/mps): α_low ≥ 0.5 and α_high ≤ −0.3
Finally, we discarded convex sources from the spectral break sample to avoid false positive detections due to confusion. Convex PS sources fulfill criterion (1) for spectral break sources and have α_low < α_high. It turned out that most of the convex sources in our sample were composed of two distinct sources with α_1 < α_2.
The flux densities of source 1 dominate at low frequencies, those of source 2 dominate at high frequencies. Confusion of two sources with different constant spectral indices within the same resolution element most frequently leads to a convex spectrum. For objects with concave spectra, source confusion can occur because of different spatial resolutions at low and high frequencies: by inspecting by eye the NVSS maps of 100 objects with a spectral break at 325 MHz and a concave spectrum, we found 16 resolved sources. The majority of these sources are elongated with major axes between 1.5 and 3 . Sometimes, the objects contain several sources of the same survey at 325 MHz with similar flux densities. Most objects have a spectral index of the unresolved flux densities <−0.7. Whereas the high-frequency flux densities from low-resolution surveys fit this slope, the flux densities at low frequencies, which were observed with a resolution <1 , are significantly smaller because the object is resolved: the low-frequency emission is emitted only by a part of the object, mimicking a flatter spectral index. Since the 325 MHz flux density has to fit the low- and high-frequency spectrum, these cases merely represent <20%.
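The three criteria and the removal of convex objects can be summarized in a few lines of code. The following is only a minimal sketch: the function name `classify_break` is hypothetical, and the precedence of the classes (gps/mps before conc before sb, so that each object receives exactly one label) is our assumption, motivated by the fact that the candidate counts quoted for the three classes add up to the total number of spectral break sources.

```python
def classify_break(alpha_low, alpha_high):
    """Return 'gps/mps', 'conc', 'sb', or None for one combined object.

    alpha_low / alpha_high are the fitted slopes below / above the cut.
    The precedence of the classes is an assumption of this sketch.
    """
    # Criterion (1): a spectral break requires a slope change of at least 0.3.
    if abs(alpha_low - alpha_high) < 0.3:
        return None
    # Convex objects (alpha_low < alpha_high) are discarded as likely confusion.
    if alpha_low < alpha_high:
        return None
    # Criteria (3) and (2): falling high-frequency part, rising low-frequency part.
    if alpha_high <= -0.3:
        if alpha_low >= 0.5:
            return "gps/mps"
        if alpha_low >= 0.3:
            return "conc"
    return "sb"


print(classify_break(0.6, -0.8))   # rising then falling spectrum
print(classify_break(-0.3, -0.9))  # steepening spectrum
print(classify_break(-0.9, -0.3))  # convex spectrum, discarded
```

In this sketch a convex object returns `None` in the same way as an object without a significant slope change, which mirrors the fact that both are absent from the final spectral break tables.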
In Fig. 10 three examples of PS source candidates are presented. In the top row of the figure, the resulting spectral fits of SPECFIND V3.0 are shown. In the bottom row, we show the results of the subsamples with the selection frequency (325 MHz or 1.4 GHz) marked by a dashed line. For the MPS source candidate NVSS J123048+485758 on the left (I), SPECFIND fitted mostly negative spectral slopes by excluding the three low-frequency spectral points. One positive slope was found by SPECFIND by ignoring several frequency points as well. The plot below shows that we were able to find the spectral break at around 360 MHz by dividing the frequency sample. For the concave source candidate NVSS J080637+774606 in the middle column (II), the peak spectral point at 1.4 GHz was ignored in the spectral fit of SPECFIND and thus the spectrum becomes flat. By using the two subsample catalogs, we were able to reveal the spectral break around 1.4 GHz. For the GPS source candidate NVSS J120215+720024 on the right (III), SPECFIND V3.0 did not include the low-frequency points and thus no spectral break is visible in the upper plot. Looking below, the plot includes all data points and thus a spectral break is identified at 1.4 GHz.
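The idea of fitting the two sides of the selection frequency separately can be illustrated with a simple sketch. It is not the SPECFIND algorithm itself, which also performs the cross-identification: here we assume an already cross-identified list of flux densities, use an unweighted least-squares fit in log-log space, and apply it to a synthetic broken power law peaking at 1.4 GHz. The function names and the example data are hypothetical.

```python
import math


def fit_slope(freqs, fluxes):
    """Unweighted least-squares fit of log10(S) = alpha * log10(nu) + b."""
    x = [math.log10(f) for f in freqs]
    y = [math.log10(s) for s in fluxes]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    alpha = sxy / sxx
    return alpha, my - alpha * mx


def fit_two_sided(freqs, fluxes, cut):
    """Fit both sides of the cut; the cut frequency belongs to both subsamples."""
    points = list(zip(freqs, fluxes))
    low = [(f, s) for f, s in points if f <= cut]
    high = [(f, s) for f, s in points if f >= cut]
    alpha_low, _ = fit_slope(*zip(*low))
    alpha_high, _ = fit_slope(*zip(*high))
    return alpha_low, alpha_high


# Synthetic object (frequencies in MHz, flux densities in mJy) with a peak at 1.4 GHz.
freqs = [151.0, 325.0, 1400.0, 4850.0, 8400.0]
fluxes = [100.0 * (f / 1400.0) ** (0.6 if f <= 1400.0 else -0.8) for f in freqs]
alpha_low, alpha_high = fit_two_sided(freqs, fluxes, cut=1400.0)
print(round(alpha_low, 2), round(alpha_high, 2))
```

A single power law fitted to all five points of this synthetic object would average over the rising and falling parts, which is the situation shown in the top row of Fig. 10.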
Because this analysis relies on a sample of inhomogeneously obtained radio continuum flux density measurements, several caveats and biases need to be mentioned. Here, we discuss the important ones.
Flux density scales. Different radio surveys are known to have different flux density scales. We checked all catalog tables against the NVSS and SUMSS. Following Vollmer et al. (2005a,b), the uncertainty in the flux density is measured as follows: ∆S = (S_extr − S_cat)/S_cat, where S_cat is the flux density from NVSS or SUMSS. We applied a linear regression to all flux densities and the associated uncertainties of SPECFIND objects, which contain NVSS or SUMSS sources. S_extr is the flux density at 1.4 GHz of an object as calculated using the fitted SPECFIND spectrum. Spectral breaks are not taken into account. By fitting a Gaussian to the distribution of ∆S, we found that the relative offsets are in most cases comparable to the error of the measurements (Fig. A.2). The maximum ∆S of the processed catalogs is 12%. Only 6 small catalogs show ∆S larger than 10% (ATESP, B3a, [RLM94], [JAP2011], [HFT2009]4, PiGSS-L). 94 out of 147 catalogs show ∆S smaller than 5%. The statistical uncertainty or the standard deviation of the distribution increases the uncertainties of the spectral slopes, and the offsets lead to systematically lower or higher spectral indices (see Sect. 4.3). Such systematic uncertainties introduce biases in our source selections and decrease the completeness of our sample. The effect is expected (i) to decrease with an increasing number of independent flux densities within a spectrum and (ii) to be most important if the flux densities at the break frequency are affected by strong systematic uncertainties. In order to take the statistical and systematic uncertainties of the flux densities into account, the SPECFIND tool increases all flux density errors to 30% for the cross-identification, effectively smoothing out these variations but still maintaining sufficient precision to carry out the analysis. This procedure smooths spectral structure and decreases the completeness of our sample.
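As a worked illustration of the relative offset ∆S, the sketch below extrapolates a fitted spectrum to 1.4 GHz and compares it with a catalog flux density. We assume here that the fit parameters "a" and "b" describe log10(S) = a log10(ν) + b with ν in MHz and S in mJy; this convention, the function names, and the numbers are assumptions made for the example only.

```python
import math


def extrapolated_flux(a, b, nu_mhz):
    """Flux density from a fitted power law, assuming log10(S) = a*log10(nu) + b."""
    return 10.0 ** (a * math.log10(nu_mhz) + b)


def relative_offset(s_extr, s_cat):
    """Delta S = (S_extr - S_cat) / S_cat."""
    return (s_extr - s_cat) / s_cat


# Hypothetical object: fitted slope -0.8, normalized to give 110 mJy at 1400 MHz,
# compared with a hypothetical catalog flux density of 100 mJy.
a = -0.8
b = math.log10(110.0) + 0.8 * math.log10(1400.0)
s_extr = extrapolated_flux(a, b, 1400.0)
delta_s = relative_offset(s_extr, s_cat=100.0)
print(f"Delta S = {delta_s:.1%}")
```

In this example the offset of 10% would sit at the upper end of the catalog offsets quoted above.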
Spatial scales. Different interferometers are sensitive to different spatial scales on the sky. Therefore, the flux density measurements and thus the spectra can be influenced by different Fourier sampling of the source. To quantify this effect, we compared the flux densities of sources with both NVSS and FIRST data. For 4800 sources in our spectral break sample, which have both flux density measurements, we find a median flux density difference of 2.8% between the two surveys. Furthermore, the SPECFIND tool selects flux measurements that are consistent with a single spectral slope. If a source is resolved for the minority of the flux density measurements, SPECFIND will discard these lower flux density measurements (see left panel of Fig. 8). However, in the rather unlikely case that a source were only resolved in the low-frequency regime, this could lead to a concave spectrum.
Observation dates - variable sources. PS sources are expected to be variable. To deal with a limited amount of source variability, the SPECFIND tool increases the flux error to 30% for the cross-identification. Since the vast majority of the frequency measurements used in SPECFIND are uncorrelated in time, SPECFIND is able to pick out the flux measurements that are consistent with a single spectral slope. However, the smaller the number of cross-identified flux density measurements, the higher the probability of a random cross-identification. Variability can occur on timescales of years at megahertz frequencies.
At high frequencies (∼10 GHz) the variability timescale can be much shorter. Since SPECFIND PS sources have at least two flux densities measured at frequencies above the break frequency and two flux densities measured at frequencies below the break frequency, the probability of a random cross-identification is rather limited, especially at high frequencies where timescales are short.
With a frequency cut at 1.4 GHz we found 2908 spectral break source candidates, 110 concave spectrum source candidates, and 86 GPS source candidates (15 examples in Table B.1). About 70% of our spectral break/PS sources are compact (see Appendix B.4). The full list of spectral break sources is available via the VizieR database. The description of the sample properties and their comparison with two recent samples of PS sources (Callingham et al. 2017; Sotnikova et al. 2019) are presented in Appendix B.
As expected, our GPS/MPS sample is far from being complete. The comparison with the results of Callingham et al. (2017) showed that we could only correctly identify ∼23% of their PS sources. We were able to find about 50% of their PS sources with a flux density in excess of 0.16 Jy at 200 MHz in our spectral break samples. This is caused by the fact that SPECFIND needs at least 5 flux density measurements at independent frequencies, together with the inhomogeneous coverages and sensitivities of the input catalogs. We call this effect the catalog selection bias. For example, the GPS sources PKS 1934-638 and PKS 0008-421 are not present in the SPECFIND PS source candidate samples. PKS 1934-638 has a peak frequency at ∼1.4 GHz. Since there is no flux density measurement at this frequency in SPECFIND V3.0, the object was missed. It would have been there with the AT20G catalog and will certainly be there with the Australian Square Kilometer Array Pathfinder (ASKAP) Evolutionary Map of the Universe (EMU) catalog. PKS 0008-421 has a peak frequency at ∼600 MHz and is present in the SPECFIND V3.0 catalog with the two spectral slopes. Since the two spectra could not meet in a common point around the peak frequency, this object is not present in the SPECFIND PS source candidate samples. It would have been there with the AT20G or PKS90 catalogs and will certainly be there with the ASKAP EMU catalog. On the other hand, the five 3C sources (3C 48, 3C 49, 3C 138, 3C 277, and 3C 287) presented in Murgia et al. (1999) are all identified as spectral break (325 MHz) objects by SPECFIND. In addition, the nine GPS sources presented in Stanghellini et al. (1997) are all identified as HFP sources (in the main catalog) or spectral break sources (at 1.4 GHz) by SPECFIND, and 36 out of the 49 Parkes half-Jansky GPS radio galaxies (Snellen et al. 2002) are identified as GPS/MPS sources by SPECFIND.
The fraction of false positives in our GPS/MPS source candidate samples is 20%. Up to about half of our MPS source candidates that are not present in the Callingham et al. (2017) sample, or up to ∼25% of our MPS source candidate sample, are probably flat spectrum sources with strong variability at frequencies in excess of 1 GHz. Since our PS source candidate samples are not complete, we advise caution when using these samples for population studies of PS sources.
The main result is presented in the table "spectra", in which every radio continuum source has one row (see Fig. B.6 for the first rows of the VizieR table). The different objects/spectra are identified by different sequence numbers (column "Seq"). All radio continuum sources with the same sequence number belong to the same object/spectrum. In the second column, the source name is given, followed by the number of sources within the spectrum ("N"). The columns "a" and "b" are the spectral index and the abscissa of the spectral fit, respectively. The column "nu" contains the frequency of the catalog, followed by the flux density "S(nu)", its error "e" and the position of the source (Right Ascension and Declination). The column "SED" is a link to the spectral fit, which is the spectrum containing this source. The column "Radio+Opt" launches an Aladin Lite (Boch & Fernique 2014) view of the source. The last column gives the beam size of the observation.
The "beam" table contains all 204 catalog tables from the 160 VizieR catalogs that were used in SPECFIND V3.0. It is similar to Table A.1. The "waste" table has the same general structure as "spectra", but contains measurements that were cross-identified by position but did not match the power-law spectrum. These points are included in the VizieR SED plot as well.
The spectral break sources are provided in two different tables. The table "ghzbreak_cand" contains all 3104 spectral break sources with turnover frequencies around 1.4 GHz including the 196 PS (concave/GPS) source candidates. The table "mhzbreak_cand" lists the 18 075 spectral break sources with turnover frequencies around 325 MHz including the 874 PS (concave/MPS/GMPS) sources. The structure of the tables is similar to the spectra table. The first column gives the running number "No". The sources belonging to the same spectrum have the same number. The second column provides the source name. The column "cat" gives the type (spectral break (sb), concave (conc), GPS/MPS (gps/mps)), followed by the column "resolved", which states whether the source is resolved (1) or not (0), and by the spectral slope and the abscissa associated with the source. In the next column, α_low, the mean spectral slope of the lower-frequency part is given, followed by its error. In the next column, α_high, the mean spectral slope of the higher-frequency part is given, followed by its error. Then the frequency, the flux density and its error are listed. Lastly, the position (Right Ascension and Declination) and the beam/resolution are specified.
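Since every row of the "spectra" table describes one source and rows sharing a sequence number form one object, a typical first step when working with the table is to regroup the rows by "Seq". The following sketch uses a handful of hypothetical rows with a reduced set of columns; the source names and values are invented for illustration and do not come from the catalog.

```python
from collections import defaultdict

# Hypothetical rows mimicking a few columns of the "spectra" table
# (nu in MHz, S in mJy); names and values are invented.
rows = [
    {"Seq": 1, "Name": "Source A", "N": 3, "nu": 325.0, "S": 310.0},
    {"Seq": 1, "Name": "Source B", "N": 3, "nu": 1400.0, "S": 98.0},
    {"Seq": 1, "Name": "Source C", "N": 3, "nu": 4850.0, "S": 37.0},
    {"Seq": 2, "Name": "Source D", "N": 3, "nu": 151.0, "S": 820.0},
    {"Seq": 2, "Name": "Source E", "N": 3, "nu": 325.0, "S": 450.0},
    {"Seq": 2, "Name": "Source F", "N": 3, "nu": 1400.0, "S": 140.0},
]


def group_by_seq(rows):
    """Collect all sources that share a sequence number into one object."""
    objects = defaultdict(list)
    for row in rows:
        objects[row["Seq"]].append(row)
    return dict(objects)


objects = group_by_seq(rows)
# Each row should report the same N as the number of sources in its object.
consistent = all(
    row["N"] == len(members) for members in objects.values() for row in members
)
print(len(objects), consistent)
```

The same grouping applies to the spectral break tables, with the running number "No" taking the role of "Seq".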

Summary and conclusions
We present a new version of SPECFIND, which has been successfully upgraded from 115 catalog tables in version 2.0 to 204 processed catalog tables, originating from 160 VizieR catalogs, in version 3.0. 89 new catalog tables were ingested and two catalog tables were updated. The final number of resulting spectra was increased by more than a factor of three from ∼107 500 in version 2.0 to ∼340 000 in version 3.0. The number of objects with cross-identified sources was more than doubled from ∼600 000 in version 2.0 to ∼1.6 million in version 3.0. In comparison to the former version, more confused sources are separated into different objects. Nevertheless, some spectra contain multiple sources, and we note that the user should always check the spectra via the VizieR link to Aladin Lite within the catalog.
The 204 processed catalog tables span a wide frequency range from 16.7 MHz to 31 GHz and cover major parts of the sky. The new low-frequency radio catalogs with large sky coverages represent important input data for SPECFIND V3.0. By applying SPECFIND to two subsamples of the catalogs with frequency cuts at 325 MHz and 1.4 GHz, 20 982 unique spectral break and 633 PS source candidates could be identified. A conservative estimate of the fraction of resolved PS sources is about 30%. The comparison with the results of Callingham et al. (2017) showed that we could find about half of their PS sources in our spectral break sample due to our selection bias. About 23% of their PS sources could be consistently classified. The fraction of false positives in our PS candidate sample is estimated to be at maximum 20%. We encourage follow-up observations of these candidates to confirm their nature.
SPECFIND is based on the radio catalogs in VizieR, and as described, the results of SPECFIND are integrated back into VizieR as a value-added compilation with accompanying services for visualization and access to the data. Moreover, the cross-identified radio continuum sources will be ingested into the SIMBAD astronomical database. This is part of the global effort to make astronomy data interoperable, with the objective of enabling new science with combined data sets. SPECFIND is an example of interoperability based on standardization of spectral properties, and using this to unify data extracted from hundreds of individually published and heterogeneous catalogs. IVOA and CDS standards play an important role for combining data in SPECFIND, and we anticipate that interoperability of data over many spectral regimes and also for time domain data will open many new areas of research based on combined data.

Appendix B: Spectral break, GPS, and MPS sources
To identify spectral break source candidates, we divided the sample of catalog tables into two frequency parts to fit individual spectral slopes to each side of the turnover frequency. We created two catalog subsamples, one subsample below and one subsample above a selected frequency cut. The corresponding frequency of the cut is included in both subsamples to allow for cross-identification. The SPECFIND tool was applied on these subsamples individually. The two complementary spectra of the two subsamples were then combined into one object if they contained the same flux density measurement at the common frequency. The condition for the source classification can be found in Sect. 5.

B.1. Spectral break sources with peaks around 1.4 GHz
We let the SPECFIND tool find a spectral break around 1.4 GHz by dividing the sample of catalog tables into 128 catalog tables containing frequencies ≤ 1.4 GHz and 125 catalog tables containing frequencies ≥ 1.4 GHz. The overlapping catalog of both subsamples was mostly the NVSS. By doing so, we found a total number of 3104 sources with a spectral break:
(1) 2908 spectral break source candidates
(2) 110 concave spectrum source candidates
(3) 86 GPS source candidates
The 3104 spectral break source candidates have a median and semi-inter-quartile-range (SIQR) value of α_high = −0.83 ± 0.54 on the higher-frequency range and α_low = −0.34 ± 0.74 on the lower-frequency range. Since the convex sources were removed from the sample, the difference of the medians α_high − α_low = −0.49 is smaller than zero. Whereas the median slope at high frequencies is consistent with the mean slope obtained over the full frequency range (Vollmer et al. 2005a), the slope at low frequencies is significantly flatter. In Fig. B.1 we present the high and low-frequency spectral slopes of the entire two subsamples against each other, which means the spectral indices of the higher-frequency subsample (α_high) as a function of the spectral indices of the low-frequency subsample (α_low). Most of the spectral indices are located around the one-to-one relation and thus the spectral slopes of both halves of the spectrum are consistent with each other. We provide 15 examples of the 196 sources (from the marked region) in Tab. B.1 along with image cutouts (Fig. B.2). The full list of 3104 spectral break sources is available via the VizieR database.
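For reference, the median and the semi-inter-quartile range, SIQR = (Q3 − Q1)/2, quoted for the slope distributions can be computed as in the following sketch. The quartile convention (here the "inclusive" method of the Python standard library) and the example slopes are assumptions; the convention actually used for the catalog statistics is not specified in the text.

```python
import statistics


def median_and_siqr(values):
    """Median and semi-inter-quartile range, SIQR = (Q3 - Q1) / 2."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, (q3 - q1) / 2.0


# Hypothetical high-frequency slopes of five objects.
alpha_high = [-1.2, -0.9, -0.8, -0.6, -0.3]
median, siqr = median_and_siqr(alpha_high)
print(f"alpha_high = {median:.2f} +/- {siqr:.2f}")
```

Unlike the standard deviation, the SIQR is insensitive to a small number of outlying slopes, which makes it a natural choice for distributions with extended tails.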
Finally, we show an example of the SPECFIND result for the GPS source candidate NVSS J133600+743755 compared to the result of a query for this source in the VizieR Photometry viewer (Fig. B.3), which is consistent with our result. The VizieR Photometry viewer plots all of the available photometric data points extracted from photometry-enabled catalogs in VizieR that fall within a given angular distance of a source position. It covers a much wider frequency range than considered for SPECFIND because it includes catalogs with measurements in other wavebands. The data shown in the Photometry viewer are converted automatically to consistent units using characterization metadata of the magnitude and flux density columns in VizieR tables, which are attached to the proper photometry filter and system (Allen et al. 2014). This automatic extraction of photometry measurements within a radius is different from the more detailed selection criteria and cross matching provided by SPECFIND (as described in section 3). The result shown in the Photometry viewer provides a useful independent check, and the interactive viewer provides a way to explore how the radio data points compare to other wavebands. The example shown uses a 10 radius for the query, and we see that the Photometry viewer results for the radio frequency data points are in accordance with the SPECFIND V3.0 results.

Fig. B.1 (caption). The low-frequency spectral slope is on the x-axis, the high-frequency spectral slope is on the y-axis. The diagonal line is the one-to-one relation. Most of the spectral indices are located around the one-to-one relation and thus the spectral slopes of both halves of the spectrum are consistent with each other. The square marks the classification range of concave and GPS source candidates. 196 sources (110 concave and 86 GPS source candidates) fall in this range. The mean error for all points is indicated in the upper left corner.

B.2. Spectral break sources with peaks around 325 MHz
We let the SPECFIND tool find a spectral break around 325 MHz by dividing the sample of catalog tables into two subsamples. The first subsample contains 51 catalog tables with frequencies ≤ 325 MHz and the second subsample contains 168 catalog tables with frequencies ≥ 325 MHz. The overlapping catalogs are often the Texas Survey (TXS, Douglas et al. 1996), the Westerbork Northern Sky Survey (WENSS, Rengelink et al. 1997) or the Westerbork In the Southern Hemisphere (WISH, De Breuck et al. 2002). By doing so, we found a total number of 18 075 sources with a spectral break:
(1) 17 201 spectral break source candidates
(2) 327 concave source candidates
(3) 547 MPS source candidates
All 18 075 spectral break source candidates have a median and SIQR of the spectral slope of α_high = −0.88 ± 0.26 on the higher-frequency part and α_low = −0.44 ± 0.35 on the lower-frequency part. These numbers are consistent with those obtained for the sources with breaks at 1.4 GHz. In Figure B.4 we present the high and low-frequency spectral slopes of the two subsamples against each other. Examples of the source candidates (within the marked region) are given in the appendix Tab. B.2 along with image cutouts in appendix Fig. B.5. The full table of all 18 075 spectral break source candidates is available via the VizieR database.

Notes. Columns: "No" running number with the same number for sources of the same spectrum, "Name" provides the source name. The column "class" gives the type (spectral break (sb), concave (conc), GPS/MPS (gps/mps)). "α" and "b" are the spectral slope and the abscissa of the spectral fit associated with the source. "α_low" is the mean spectral slope of the low-frequency part of the spectrum with its error "er", "α_high" is the mean spectral slope of the high-frequency part of the spectrum with its error "er". "Freq" is the frequency, "S" the flux density and "S_e" its error. Lastly, the position (Right Ascension and Declination) and "res" the beam/resolution are specified. The last 6 columns refer to the source and survey given in the column "Name".

B.3. Uneven numbers of sources in both samples
We find roughly 6 times more sources with a spectral break around 325 MHz (18 075 sources) than sources with a break around 1.4 GHz (3104 sources). To investigate this difference in source numbers, we revisit the conditions that need to be fulfilled to identify spectral break sources via SPECFIND:
- the selection frequency point, either 325 MHz or 1.4 GHz, is included in both subsample spectra of the source.
- in addition to the selection frequency point, two independent frequency points must be found at each side of the peak frequency, which means each subsample has to contain three frequency points.

Figure caption (fragment): See Table B.1 for the list of sources in each of the 15 spectra. More sources are available in the VizieR catalog ghzbreak_cand.dat at the CDS.
Thus, to find a spectral break source, we need to find at least five different flux density measurements, which means five different catalog tables in total. To evaluate possible biases, we consider Fig. 2. Even though we have a similar number of catalog tables at frequencies above and below 1.4 GHz, the low-frequency catalog tables cover a large fraction of the sky and contain a large number of sources. To show this quantitatively, Table B.3 presents the total number of sources for six different frequency bins. There are roughly six times more sources at frequencies smaller than 1.4 GHz than at higher frequencies. This trend is different for the frequency cut at 325 MHz: the number of sources at frequencies higher than 325 MHz is twice as high as the number of sources at frequencies lower than 325 MHz. Therefore, for the frequency cut of the catalogs at 1.4 GHz, the probability is much higher (nine times more sources) to find a corresponding high-frequency data point than to find a low-frequency data point. This unequal number is probably the reason for a smaller total number of spectral break sources. At the frequency cut at 325 MHz, the probability to find a low-frequency data point is only slightly smaller (2/3 times the number of sources) than to find a high-frequency data point. This is probably the reason for the many more spectral break sources around 325 MHz.
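The two conditions and the resulting minimum of five flux density measurements can be written as a simple check. This is a sketch under the assumption that the frequencies of the two subsample spectra of an object are already known; the function name and the example frequencies (in MHz) are hypothetical.

```python
def can_detect_break(freqs_low, freqs_high, cut):
    """Check the two conditions for a spectral break candidate.

    freqs_low / freqs_high: independent frequencies (MHz) in the low- and
    high-frequency subsample spectra of one object, both including the cut.
    """
    low, high = set(freqs_low), set(freqs_high)
    # Condition 1: the selection frequency is part of both subsample spectra.
    if cut not in low or cut not in high:
        return False
    # Condition 2: two further independent frequencies on each side,
    # i.e. at least three frequency points per subsample.
    return len(low) >= 3 and len(high) >= 3


low = [151.0, 232.0, 325.0]
high = [325.0, 1400.0, 4850.0]
print(can_detect_break(low, high, cut=325.0))   # both conditions met
print(len(set(low) | set(high)))                # 5 independent frequencies
print(can_detect_break([151.0, 325.0], high, cut=325.0))  # too few low points
```

Because the selection frequency is shared, three points per subsample correspond to five independent frequencies in total, which is the minimum quoted above.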

B.4. Biases, false positive detections and number of compact sources -Comparison to other PS source samples
There are many studies about spectral break sources, providing new catalogs of candidates or confirming the spectral classification. This is done using new and/or existing public catalogs, like the WENSS catalog (Snellen et al. 1998), or by observing a sample of candidates at multiple frequencies. A recent study finds 261 GPS sources with frequency turnovers between 841 MHz and 1.4 GHz and additionally 1222 spectral break sources with turnover frequencies between 72 and 944 MHz within the GLEAM catalog (Callingham et al. 2017). These authors determined α_low and α_high by fitting a power law (i) to the 20 GLEAM flux densities and (ii) to the SUMSS and/or NVSS flux density point(s) together with the two central GLEAM flux density points (at 189 MHz and 212 MHz). They classified objects with α_low ≥ 0.1 and α_high > 0.1 as GPS sources. In another recent study, Sotnikova et al. (2019) found 164 GPS sources based on observations of the candidate sample from Mingaliev et al. (2013). We compared our results to these two recent studies by first creating a combined sample of our two spectral break samples. With 18 075 spectral break sources around 325 MHz and 3104 around 1.4 GHz, we performed an internal cross-match and found 1310 sources that are included in both samples. This means that the peak frequency is not well defined for a significant number of GPS/MPS source candidates. The total number of unique spectral break source candidates is 19 869. To compare these sources with the Callingham et al. (2017) sample, we determined the number of GLEAM sources in our sample. In total, 9611 GLEAM sources are present in our combined sample around 1.4 GHz and 325 MHz. Comparing these GLEAM sources to the 1483 sources of the combined spectral break and GPS samples of Callingham et al. (2017), we found 697 matching sources (47%). Compared to their 261 GPS sources, we found 117 sources (45%) in our combined sample.
From these 117 sources, 49 are classified by us as GPS/MPS sources and 11 as concave spectrum sources (51%).
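The bookkeeping behind these numbers can be reproduced with a few lines of arithmetic. All input values in the sketch below are the counts quoted in the text; only the variable names are ours.

```python
# Counts quoted in the text.
n_325, n_1400, n_both = 18075, 3104, 1310

# Sources found in both break samples are counted only once.
n_unique = n_325 + n_1400 - n_both
print(n_unique)  # 19869

# Matches with the combined spectral break and GPS samples of Callingham et al. (2017).
frac_all = 697 / 1483          # matches among their 1222 + 261 sources
frac_gps = 117 / 261           # matches among their 261 GPS sources
frac_same_class = (49 + 11) / 117  # GPS/MPS or concave among the 117 matches

for label, frac in [("all", frac_all), ("GPS", frac_gps), ("same class", frac_same_class)]:
    print(f"{label}: {frac:.0%}")
```

The three fractions round to the 47%, 45%, and 51% given above.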
To investigate the relatively low number of matches and similar classifications, we inspected 50 Callingham et al. (2017) GPS sources by eye. We found 44% of the sources in our sample. Within these sources we have 50% matching classifications. The main reasons for the differences are:
- SPECFIND missed 10% of the objects
- SPECFIND ignored a flux density measurement
- there were not enough flux density measurements at independent frequencies
- the peak flux was located above 3 GHz
- objects were HFP sources with positive spectral slopes
SPECFIND missed GPS sources because for frequencies below 500 MHz the only flux density measurement comes from GLEAM and either there are fewer than three flux density measurements at independent frequencies above 500 MHz or there are three measurements that could not be fitted by a power law.

Fig. B.4 (caption). The low-frequency spectral slope is on the x-axis, the high-frequency spectral slope is on the y-axis. The diagonal line is the one-to-one relation. Most of the spectral indices are slightly below the one-to-one relation. The entire sample shows a median and semi-inter-quartile-range (SIQR) value of the spectral slope of α_high = −0.83 ± 0.24 on the higher-frequency part and α_low = −0.61 ± 0.31 on the lower-frequency part. The square marks the classification range of concave source candidates. In this region, 874 sources (327 concave and 547 MPS source candidates) are located. The mean error for all points is indicated in the upper left corner.
The percentage of common sources between our 325/1400 MHz break sample and the sample of Callingham et al. (2017) is caused by the fact that SPECFIND needs at least 5 flux density measurements at independent frequencies together with inhomogeneous coverages and sensitivities of the input catalogs. We call this effect the catalog selection bias. Within the sample of common sources, a similar classification is found for ∼ 40% of the sources. We call this the PS classification bias.
We analyzed possible false positive detections within our common GPS/MPS sample by investigating sources of our sample, which are present in the GLEAM catalog but not present [...] In some cases the distance between the GLEAM position and those of the other surveys is larger than 10 but smaller than 20 . Based on the NVSS images and [...]

Figure caption (fragment): The title of each spectrum shows the source name and its classification: conc: concave spectrum source or mps: megahertz-peaked spectrum source. See Table B.2 for the list of sources in each of the 15 spectra. More sources are available in the VizieR catalog mhzbreak_cand.dat at the CDS.
Moreover, of the 300 MPS and 47 GPS source candidates in our samples that have a GLEAM measurement, 108 MPS source candidates and 33 GPS source candidates have no spectral index assigned in the GLEAM catalog. An inspection by eye of our MPS and GPS source candidates that are absent from the Callingham et al. (2017) PS samples and have a GLEAM spectral index α > 0.1 did not show obvious classification problems. Callingham et al. (2017) derived their own spectral indices and could determine them for 96698 unresolved GLEAM sources with δ ≥ −80° and S_200 MHz,wide ≥ 0.16 Jy, compared with 95568 sources with spectral indices selected by the same criteria in the GLEAM catalog. Based on these numbers, we do not expect that many of the cross-identified GLEAM sources without a spectral index from the GLEAM catalog possess a spectral index derived by Callingham et al. (2017). [...] which is the condition to identify a PS source. An inspection of these eight sources with the VizieR photometric viewer did not show any classification problem. We conclude that the absence of many of our MPS and GPS source candidates from the Callingham et al. (2017) PS source samples is due to the selection criteria of Callingham et al. (2017) and that our classification of most of our MPS and GPS sources is probably correct.

Furthermore, about half of the potential MPS sources that are false positives have a flat GLEAM spectral index of −0.3 < α < 0.1. These sources are probably variable flat-spectrum sources, where the variability occurs at frequencies higher than 1 GHz, like the sources J1258+2820 and J1616+4632 in Dallacasa & Orienti (2016). One of these sources is QSO B1102-24, classified as a QSO and blazar at a redshift of z = 1.66. This radio continuum source is strongly variable at frequencies above 1 GHz (Fig. B.7). We think that these sources can well be classified as PS sources.
Based on the number of MPS/GPS sources that are absent in the Callingham et al. (2017) PS samples with α < −0.3 we estimate the percentage of false positives to be 20%.
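The GLEAM spectral index ranges used in this discussion (α > 0.1, −0.3 < α < 0.1, and α < −0.3) can be expressed as a simple binning function. This is only a sketch: the function name and the labels are hypothetical, and the assignment of the exact boundary values α = 0.1 and α = −0.3 to a bin is an assumption, since the text uses strict inequalities.

```python
def gleam_alpha_bin(alpha):
    """Bin a GLEAM spectral index using the thresholds quoted in the text.

    alpha may be None for sources without a spectral index in the
    GLEAM catalog. Labels are illustrative, not catalog values.
    """
    if alpha is None:
        return "no_index"
    if alpha > 0.1:
        return "rising"   # alpha > 0.1
    if alpha > -0.3:
        return "flat"     # -0.3 < alpha <= 0.1 (boundary handling assumed)
    return "steep"        # alpha <= -0.3 (boundary handling assumed)

# Hypothetical values, for illustration only.
bins = [gleam_alpha_bin(a) for a in (0.4, -0.1, -0.8, None)]
```

In this sketch, the "steep" bin corresponds to the α < −0.3 sources on which the false positive estimate is based.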
The 164 GPS sources from Sotnikova et al. (2019) have peak frequencies in the range of 200 MHz to 25 GHz. Since 99 of these sources have peak frequencies above 2 GHz, this sample is biased toward higher frequencies. Because we are mostly sensitive to turnover frequencies around the selection frequencies of 325 MHz and 1.4 GHz, we compared our sample to the 65 GPS sources with peak frequencies below 2 GHz and found 45 of them (69%) in our combined sample of 19 869 spectral break sources around 1.4 GHz and 325 MHz (catalog selection bias). We classified 21 of these 45 sources (47%) as GPS/MPS/conc sources (PS classification bias).
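The percentages in this comparison follow directly from the source counts, as the short worked example below shows. The variable and function names are hypothetical; the counts are those quoted in the text.

```python
def percentage(part, total):
    """Return part/total as a percentage rounded to the nearest integer."""
    return round(100 * part / total)

gps_total = 164        # GPS sources in Sotnikova et al. (2019)
gps_above_2ghz = 99    # peak frequency above 2 GHz
gps_below_2ghz = gps_total - gps_above_2ghz  # comparison sources

found = 45             # present in the combined spectral break sample
classified = 21        # classified as GPS/MPS/conc among those found

detection_rate = percentage(found, gps_below_2ghz)    # catalog selection bias
classification_rate = percentage(classified, found)   # PS classification bias
```

Note that the classification rate is computed relative to the 45 detected sources, not relative to the 65 comparison sources.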
We conclude from the comparisons with both studies above that our PS source detection rate is about 50% due to our catalog selection bias. In addition, our PS source classification rate among the PS source detections is about 50%. We estimate the fraction of false positives in our sample of PS source candidates to be at most 20%.
By investigating the peak frequencies of the overlapping sources from both studies above, we can identify the turnover frequency range in which our method is most sensitive. In the 1.4 GHz sample, we find mainly spectral break sources with turnover frequencies around 1.4 ± 0.5 GHz. Additionally, sources were found that have turnover frequencies in the megahertz regime and in the higher-frequency regime (up to 15 GHz).
In the 325 MHz sample, we find mainly sources with turnover frequencies around 325 ± 175 MHz, together with additional sources that show turnover frequencies below 150 MHz and above 500 MHz.
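The two turnover frequency ranges quoted above translate into the intervals 0.9 to 1.9 GHz and 150 to 500 MHz. The following sketch checks whether a given peak frequency falls into the range where the method is most sensitive. The dictionary keys, the function name, and the inclusive treatment of the interval edges are assumptions made for illustration.

```python
# Central frequency and half-width in MHz, as quoted in the text.
SENSITIVE_RANGES_MHZ = {
    "1400MHz": (1400.0, 500.0),  # 1.4 +/- 0.5 GHz
    "325MHz": (325.0, 175.0),    # 325 +/- 175 MHz
}

def in_sensitive_range(peak_mhz, sample):
    """Return True if peak_mhz lies within center +/- half-width for sample."""
    center, half_width = SENSITIVE_RANGES_MHZ[sample]
    # Inclusive edges are an assumption; the text gives only approximate ranges.
    return center - half_width <= peak_mhz <= center + half_width

# Hypothetical peak frequencies, for illustration only.
example_a = in_sensitive_range(1200.0, "1400MHz")
example_b = in_sensitive_range(600.0, "325MHz")
```

Sources outside these intervals can still appear in the samples, as noted above, but they are recovered less frequently.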
One defining feature of PS sources is that the majority of the sources are unresolved unless observations have milliarcsecond-scale resolution (O'Dea & Saikia 2021). Due to our limited input catalog tables, we are not able to classify the sources based on their angular scale. Instead, we used the four catalogs with the highest resolution to investigate the compactness of the sources. To do so, we used the FIRST14, CRATES, CLASS, and TXS catalogs with spatial resolutions ≤ 6″. The criteria for unresolved sources are: FIRST14, column fMaj <= 6″; CRATES, column Morph=P; CLASS, column b/a=0; TXS, column Struct=P. The numbers of resolved sources are presented in Table B.4. Only two CRATES and no FIRST14 resolved PS-source candidates were found. Moreover, only one TXS PS-source candidate with a peak around 1 GHz is resolved. On the other hand, about 35% of (i) the TXS PS-source candidates with a peak around 100 MHz and (ii) the CLASS PS-source candidates are resolved. Since the criteria in the CLASS and TXS catalogs are less stringent than those of the FIRST14 and CRATES catalogs, we decided to keep their PS classification. With the column "resolved," we flag sources in our online tables of spectral break source candidates. A conservative estimate of the fraction of resolved PS sources is about 30%.
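The per-catalog criteria for unresolved sources listed above can be summarized in a small lookup function. This is a sketch under stated assumptions: each catalog row is represented as a plain dictionary keyed by the column names given in the text, and the function name and the example rows are hypothetical.

```python
def is_unresolved(catalog, row):
    """Apply the unresolved-source criterion for one catalog entry.

    row is a dict keyed by the column names quoted in the text; this
    representation is an assumption, not the catalog file format.
    """
    if catalog == "FIRST14":
        return row["fMaj"] <= 6.0   # threshold of 6 as quoted (arcsec assumed)
    if catalog == "CRATES":
        return row["Morph"] == "P"
    if catalog == "CLASS":
        return row["b/a"] == 0
    if catalog == "TXS":
        return row["Struct"] == "P"
    raise ValueError(f"no compactness criterion defined for {catalog}")

# Hypothetical rows, for illustration only.
flags = [
    is_unresolved("FIRST14", {"fMaj": 5.4}),
    is_unresolved("CRATES", {"Morph": "P"}),
    is_unresolved("CLASS", {"b/a": 0.3}),
    is_unresolved("TXS", {"Struct": "P"}),
]
```

A source for which the function returns False would be flagged in the "resolved" column under these criteria.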