A&A 482, 483-498 (2008)
DOI: 10.1051/0004-6361:20079222

Cluster analyses of gigahertz-peaked spectrum sources with self-organising maps

I. Torniainen¹ - M. Tornikoski¹ - M. Turunen¹ - M. Lainela² - A. Lähteenmäki¹ - T. Hovatta¹ - M. G. Mingaliev³ - M. F. Aller⁴ - H. D. Aller⁴

1 - TKK Helsinki University of Technology, Metsähovi Radio Observatory, Metsähovintie 114, 02540 Kylmälä, Finland
2 - Tuorla Observatory, University of Turku, Väisäläntie 20, 21500 Piikkiö, Finland
3 - Special Astrophysical Observatory, Russian Academy of Sciences, Nizhnij Arkhyz, Karachaevo-Cherkesia, 369167 Russia
4 - Department of Astronomy, University of Michigan, Ann Arbor, MI, 48109, USA

Received 9 December 2007 / Accepted 24 January 2008

Abstract
Context. Gigahertz-peaked spectrum (GPS) sources and high frequency peakers (HFPs) are among the smallest of active galactic nuclei currently believed to represent the earliest phases in the evolution of extragalactic radio sources. Recently there has been evidence of contamination by other types of radio sources among the GPS and HFP samples, but the confirmed GPS sources or HFPs also seem to form a very heterogeneous population.
Aims. We study the statistical clustering of the GPS sources and the HFPs by taking as many source parameters as possible to find homogeneous groups among the sources. We expect the clustering to give us insight into the physical parameters that play a role in different source populations.
Methods. We have collected a sample of 206 GPS sources and HFPs from the literature and gathered a massive database of various source properties, such as the redshift, the size, the polarization, the magnitudes, and the properties of the radio continuum. To visualize and to cluster these multidimensional data we used self-organising maps (SOM), which are neural networks trained by an unsupervised algorithm. We have classified the sources with an auxiliary classification to trace the locations of different types of radio continuum spectra on the map.
Results. The sources form distinctive clusters on the map, which is supported by the accordant organisation of the non-numerical parameters not used in the analysis, such as the radio morphology and the optical identification. Our results confirm that the blazars contaminating the GPS and the HFP samples are physically different from the genuine GPS sources and HFPs, and they should be excluded from the samples. The genuine GPS sources form various clusters, which indicates the existence of different subpopulations, besides the expected galaxy-quasar dualism.

Key words: galaxies: active - galaxies: quasars: general - radio continuum: galaxies

1 Introduction

The gigahertz-peaked spectrum (GPS) sources and the high frequency peakers (HFPs) are a heterogeneous group of compact (linear size LS < 1 kpc) extragalactic objects. They are active galactic nuclei (AGN) with convex radio continuum spectra, peaking in the GHz frequencies (GPS sources) or higher (HFPs). They can be divided into two types: galaxies and quasars. The galaxy-type sources are found at lower redshifts (0.1 < z < 1, O'Dea et al. 1996) and are less variable than the quasar-type sources. They also have lower turnover frequencies than the GPS quasars and exhibit symmetric VLBI morphologies, whereas the quasars usually have complex or core-jet morphologies (Stanghellini et al. 1997, 2001).

The currently favoured view is that the galaxy-type GPS sources and HFPs are intrinsically small due to their young age (e.g., Phillips & Mutel 1980,1982; Polatidis & Conway 2003) and that quasar-type sources are large-scale radio sources which appear small due to a projection effect (e.g., Stanghellini 2003). VLBI observations have revealed extended emission around some GPS sources - both galaxies and quasars. The galaxy-type sources may be explained by recurrent activity of the nucleus (e.g. Baum et al. 1990; Stanghellini et al. 1990). This was, nonetheless, found probable for only one source out of a sample of six GPS sources associated with extended emission (Stanghellini et al. 2005). The quasar-type core-jet sources associated with extended emission are most likely truly extended sources at such high redshifts that most of the large-scale structures are below detection limits (Stanghellini et al. 2005). However, there are also quasar-type GPS sources with symmetric VLBI morphology, and they may be intrinsically small and young sources.

Another explanation for the small size of the galaxy-type sources, i.e. confinement by a dense ambient medium, has been suggested by, e.g., Baum et al. (1990) and O'Dea et al. (1991). In this scenario, the source is old and has remained small in size due to external pressure that prevents the radio lobes from growing beyond the galactic center. Recently, there have not been any studies supporting this view.

In our previous papers we searched for high peaking GPS sources and studied some of the known ones. In the first paper (Tornikoski et al. 2001, hereafter Paper I) we identified several new southern high peaking sources and found variability in some of the known ones, and in the second paper (Torniainen et al. 2005, hereafter Paper II) we widened our study to the northern hemisphere and found mild to extreme variability in all the known GPS, CSS, or HFP sources monitored in Metsähovi. The change in the shape of the spectrum differed from source to source: a minority of sources maintained their convex shape and the peak frequency independent of the state of the activity, but for most of the sources the shape changed from flat to inverted as the activity increased. The majority of the sources in Paper II were quasars, and it was obvious that the quasar-type GPS samples were severely contaminated by blazars. This inspired us to study the possible contamination of GPS galaxies. For the third paper (Torniainen et al. 2007, hereafter Paper III), we collected a sample of 96 galaxy-type GPS sources and gathered as much radio data for them as possible. After studying their overall radio spectra, and the spectral and variability indices, we found that the GPS classification is well-grounded for only a third of the sources. For another third there are not enough data for firm conclusions, and the remaining third were flat- or inverted-spectrum sources.

Recently, Labiano et al. (2007a) also produced a new master list of GPS sources, from which some sources exhibiting strong radio variability had been excluded. This new list does not, however, include the findings of Papers II and III.

It has also been confirmed by other studies (e.g. Orienti et al. 2007; Tinti et al. 2005) that there are different populations among the GPS sources and the HFPs identified in the literature, some of them having truly constant convex spectral shape and some of them having only temporarily inverted shape of the spectrum.

Most of the GPS samples in the literature have been selected by combining datapoints from different catalogues originating from different epochs and picking out sources with a convex radio continuum spectrum. This has been done without paying much attention to the effect of variability when using non-simultaneous datapoints. Single-epoch multifrequency observations, or observations spanning only a couple of years, have also been used. These approaches have proved to generate very heterogeneous samples since a peaked spectrum can be caused by several different effects. In terms of sample contamination, the most severe cases are variable flat-spectrum sources observed when one flaring component dominates the spectrum and creates a temporarily inverted spectrum that lasts even for months or years. There are also consistently convex-spectrum sources with high variability as well as sources with virtually no flux density variations. There are galaxies and quasars, compact symmetric objects (CSOs), and core-jet or complex VLBI morphologies. Some sources are detected in the X-rays or the $\gamma$-rays or both, some remain undetected. For the majority of the sources, there is a very limited number of continuum observations in the radio band as well as in other parts of the spectrum, not to mention information on other source properties, e.g., the emission lines, the size, and the column densities.

There seems to be no clear and simple common factor present in our sample of genuine GPS sources (Paper II; Paper III): there are both quasars and galaxies, CSOs and core-jets, and variable and non-variable sources. Intrigued by this variety of objects classified as GPS sources, and craving for clarity about their physical nature, we wanted to take as many parameters as possible into account and run neural clustering analyses for a complete sample of GPS sources.

A self-organising map (SOM) is an unsupervised neural network used, for example, for visualization of multidimensional data, classification, and clustering. The algorithm tries to place the objects on a multidimensional map so that the Euclidean distance between the parameter vectors of similar objects is minimized. In astronomy, neural networks have been used mainly for classification of objects (e.g., Brett et al. 2004; Miller & Coe 1996; Rajaniemi & Mähönen 2002).
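To make the idea concrete, the following is a minimal sketch of a SOM training loop, not the implementation used in this work. The grid size, the linear decay of the learning rate and neighbourhood width, the Gaussian neighbourhood function, and the names `train_som` and `best_matching_unit` are all illustrative assumptions; the weighting mask discussed in Sect. 2.3 is not included.

```python
import math
import random

def best_matching_unit(weights, x):
    """Return (row, col) of the node whose weight vector is closest to x."""
    best, best_d2 = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d2 = sum((wi - xi) ** 2 for wi, xi in zip(w, x))  # squared Euclidean distance
            if d2 < best_d2:
                best, best_d2 = (r, c), d2
    return best

def train_som(data, rows=4, cols=4, n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM on a list of equal-length parameter vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    lo = [min(v[k] for v in data) for k in range(dim)]
    hi = [max(v[k] for v in data) for k in range(dim)]
    # Random initial weights inside the range spanned by the data.
    weights = [[[rng.uniform(lo[k], hi[k]) for k in range(dim)]
                for _ in range(cols)] for _ in range(rows)]
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                 # assumed linear learning-rate decay
        sigma = sigma0 * (1.0 - frac) + 0.5     # assumed shrinking neighbourhood width
        x = rng.choice(data)
        br, bc = best_matching_unit(weights, x)
        for r in range(rows):
            for c in range(cols):
                # Gaussian neighbourhood measured on the map grid, not in data space.
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
                w = weights[r][c]
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights
```

After training, calling `best_matching_unit(weights, x)` for each source gives its position on the map; sources with similar parameter vectors end up on the same or neighbouring nodes.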

In Sect. 2, we present the sample and the collection and processing of the data. In Sect. 3, the SOM analyses are presented in detail. The results are presented and discussed in Sect. 4, and the conclusions are given in Sect. 5.

2 Sample and data

We collected a sample of 206 GPS and HFP sources for a detailed study of the underlying populations of various kinds of sources, both among all the GPS sources identified in the literature and the genuine GPS sources with constant shape and peak of the spectrum.

The sources in the sample and their references are listed in Table 1. Some sources from these papers were left out since there were no coordinates for them or it was likely that there would not be a sufficient number of parameters for them to be used in the analyses.

**Table 1:** The sources and their classification.
Source	ID	Ref.	Origcl	Ref.	Aux. Class	Cluster
(1)	(2)	(3)	(4)	(5)	(6)	(7)
B0000+212	GAL	13	HFP	2	gps	6
B0002+051	QSO	13	HFP	2	gps	10
B0018+729	GAL	16	GPS	16	s	1
B0019-000	GAL	16	GPS	16	n	1
B0022-423	GAL	5	GPS	5	gps	5
B0026+346	GAL	7	GPS	7	gps	7
B0034+078	GAL	13	HFP	2	gps	10
B0039+230	EF	7	GPS	7	f	13
B0048-097	BLO	20	GPS	20	idb	12
B0105-122	GAL	15	GPS	15	n	5
B0108+388	GAL	16	GPS	16	gps	16
B0113+241	EF	13	HFP	2	f	12
B0116+319	GAL	12	GPS	12	gps	7
B0144+209	EF	7	GPS	7	gps	5
B0153+744	LPQ	6	GPS	11	c,v	14
B0159+839	QSO	5	GPS	5	f	3
B0201+113	QSO	7	GPS	7	gps	14
B0204-306	GAL	15	GPS	15	n	2
B0207-224	GAL	15	GPS	15	n	5
B0208+040	GAL	15	GPS	15	f/s	5
B0215+015	HPQ	21	HFP	2	idb	14
B0218+357	BLO	7	GPS	7	f,v	14
B0237-233	QSO	17	GPS	17	gps	14
B0238-084	GAL	5	GPS	5	idb	1
B0240-217	GAL	15	GPS	15	f/s	1
B0248+430	LPQ	17	GPS	17	f,v	14
B0316+162	GAL	16	GPS	16	f/s	2
B0320+053	GAL	15	GPS	15	f/s	1
B0326+349	QSO	3	HFP	2	f	3
B0332-403	HPQ	20	GPS	20	f,v	15
B0354+231	QSO	13	HFP	2	f,v	12
B0359-294	GAL	15	GPS	15	f/s	2
B0400+258	QSO	7	GPS	7	f,v	15
B0404+768	GAL	16	GPS	16	gps	14
B0405-280	GAL	15	GPS	15	n	2
B0405-395	GAL	15	GPS	15	s	2
B0424+328	GAL	13	HFP	2	gps	6
B0428+205	GAL	16	GPS	16	gps	7
B0431-026	GAL	15	GPS	15	f/s	4
B0437-454	QSO	5	GPS	5	f	12
B0439-337	GAL	15	GPS	15	gps	2
B0454-088	GAL	15	GPS	15	f/s	2
B0454-234	HPQ	20	GPS	20	f,v	15
B0457+024	QSO	17	GPS	17	gps	16
B0500+019	GAL	15	GPS	15	gps	11
B0507+179	QSO	10	GPS	5	f	1
B0516+087	EF	13	HFP	2	f	12
B0528+134	LPQ,BLO	16	GPS	16	idb	14
B0528-250	QSO	5	GPS	5	gps	15
B0537-441	BLO	20	GPS	20	f/s,v	15
B0552+398	QSO	24	GPS	17	gps,v	11
B0554-026	GAL	16	GPS	16	n	2
B0602+780	GAL	16	GPS	16	n	2
B0621+446	BLO	2	HFP	2	f	3
B0633+595	EF	13	HFP	2	f	15
B0636+680	QSO	2	HFP	2	gps	16
B0642+449	LPQ	13	HFP	2	gps,v	16
B0646+600	QSO	2	HFP	2	f,v	12
B0651+410	GAL	2	HFP	2	c	1
B0700+470	GAL	14	GPS	14	f/s	4
B0703+468	QSO	1	GPS	16	gps	5
B0706+460		14	GPS	14	gps,v	5
B0710+439	GAL	16	GPS	16	gps	16
B0711+356	QSO	7	GPS	7	gps,v	10
B0718+374	QSO	13	HFP	2	gps	10
B0738+313	QSO	17	GPS	17	c	14
B0741-063			GPS	18	gps	5
B0742+103	GAL	17	GPS	17	gps	14
B0743-006	QSO	8	GPS	17	gps	11
B0802+103	QSO	7	GPS	7	s	5
B0858-279	QSO	5	GPS	5	c	15
B0902+490	QSO	5	GPS	5	f/s	9
B0904+039	GAL	23	GPS	7	n	2
B0910+151	GAL	15	GPS	15	n	1
B0914+114	GAL	16	GPS	16	s	5
B0923+392	QSO	2	HFP	2	gps,v	11
B0930+493	QSO	14	GPS	14	c	9
B0941-080	GAL	17	GPS	17	f/s	1
B1013+054	QSO	13	HFP	2	f	12
B1031+567	GAL	16	GPS	16	gps	7
B1039+811	LPQ	10	GPS	9	f,v	12
B1042-269	GAL	15	GPS	15	n	2
B1043+066	QSO	2	HFP	2	c	10
B1054+004	GAL	15	GPS	15	f/s	4
B1057-797	QSO	20	GPS	20	c	9
B1100+223	EF	5	GPS	5	n	10
B1107+109	GAL	15	GPS	15	n	4
B1107-187	GAL	15	GPS	15	n	1
B1117+146	GAL	17	GPS	17	f/s	7
B1118-056	EF	5	GPS	5	f	12
B1120-274	GAL	15	GPS	15	n	8
B1127-145	QSO	17	GPS	17	c,v	14
B1132-000	GAL	15	GPS	15	f/s	4
B1133+432	EF	23	GPS	14	gps	5
B1143-245	QSO	17	GPS	17	gps	16
B1144+352	GAL	14	GPS	14	f	1
B1144+542	QSO	14	GPS	14	f	15
B1146+531	QSO	2	HFP	2	f	15
B1148-171	QSO	7	GPS	7	f	12
B1200+045	GAL	15	GPS	15	f	4
B1225+368	QSO	14	GPS	14	gps	5
B1245-197	QSO	17	GPS	17	c	14
B1323+321	GAL	16	GPS	16	gps	7
B1323+799	GAL,QSO	10	GPS	9	f	9
B1324+574	GAL	14	GPS	14	gps	8
B1333+459	QSO	2	HFP	2	c	15
B1334-127	HPQ	20	GPS	20	c,v	15
B1343-300	GAL	15	GPS	15	s	2
B1345+125	GAL	16	GPS	16	c	7
B1347-218	GAL	15	GPS	15	n	4
B1349+027	GAL	15	GPS	15	s	4
B1349-439	BLO	20	GPS	20	c	12
B1350+113	GAL	15	GPS	15	f/s	5
B1354-174	GAL	7	GPS	7	f	9
B1355+441	GAL	14	GPS	14	gps	11
B1357+769	QSO	10	GPS	9	f	12
B1358+624	GAL	16	GPS	16	gps	8
B1404+286	GAL	16	GPS	16; 2	gps,v	6
B1410+138	EF	13	HFP	2	c	2
B1422+231	QSO	13	HFP	2	f	14
B1427+109	QSO	2	HFP	2	gps	16
B1433-040	GAL	16	GPS	16	n	5
B1442+101	QSO	17	GPS	17	n	14
B1444-339	GAL	15	GPS	15	n	2
B1455+080	BLO	13	HFP	2	f,v	12
B1502+036	QSO	2	HFP	2	c	3
B1503-091	GAL	15	GPS	15	n	5
B1509+054	GAL	13	HFP	2	gps	6
B1518+046	QSO	17	GPS	17	gps	5
B1519-273	BLO	10	GPS	5	c,v	15
B1526+670	QSO	2	HFP	2	gps,v	16
B1540-077	GAL	15	GPS	15	n	1
B1543+005	GAL	15	GPS	15	n	14
B1545-120	GAL	15	GPS	15	n	5
B1548-302			GPS	7	n	4
B1553-062	GAL	15	GPS	15	f/s	2
B1557-004	GAL	15	GPS	15	n	5
B1600+335	GAL	16	GPS	16	c	3
B1601+112	BLO	13	HFP	2	f,v	12
B1601-222	GAL	15	GPS	15	gps	6
B1604+315	GAL	16	GPS	16	c	9
B1607+268	GAL	16	GPS	16	gps,v	11
B1614+051	QSO	2	HFP	2	gps	16
B1622+665	GAL	2	HFP	2	gps	6
B1638+124	GAL	15	GPS	15	c	7
B1645+635	QSO	2	HFP	2	f	13
B1646+028	GAL	15	GPS	15	f/s	5
B1714+193	QSO	13	HFP	2	f	4
B1726+769	QSO	10	GPS	9	f	4
B1732+094	GAL	15	GPS	15	gps	6
B1734+508	GAL	2	HFP	2	c	10
B1749+096	BLO	2	HFP	2	idb	12
B1751+278	GAL	5	GPS	5	n	2
B1758+388	QSO	2	HFP	2	gps	15
B1803+784	BLO	10	GPS	9	f,v	12
B1807+170	BLO	13	HFP	2	f	12
B1824+271	GAL	23	GPS	5	s,v	1
B1839+389	QSO	2	HFP	2	c	15
B1843+356	GAL	16	GPS	16	gps,v	16
B1848+283	QSO	2	HFP	2	gps	16
B1851+488	QSO	5	GPS	5	f	12
B1853+376	GAL	13	HFP	2	c	10
B1934-638	GAL	5	GPS	5	gps	6
B1936-155	HPQ	19	GPS	19	f,v	15
B1954-388	HPQ	20	GPS	20	c	12
B2000-330	QSO	5	GPS	5	c	16
B2007+777	BLO	10	GPS	9	f,v	12
B2008-068	GAL	16	GPS	16	gps	5
B2008-159	LPQ	22	GPS	19	gps,v	15
B2019+050	GAL	13	HFP	2	gps	11
B2021+614	GAL	6	GPS	11	gps	7
B2022+171	LPQ	13	HFP	2	gps,v	14
B2050+364	GAL	5	GPS	5	c	2
B2053-201	GAL	7	GPS	7	s	3
B2055+055	GAL	15	GPS	15	f/s	5
B2059+034	QSO	2	HFP	2	f	12
B2112+283	EF	13	HFP	2	c	15
B2121+053	QSO	2	HFP	2	idb	15
B2121-014	GAL	15	GPS	15	s	2
B2126-158	QSO	17	GPS	17	gps	16
B2126-185	QSO	7	GPS	7	n	4
B2128+048	GAL	17	GPS	17	gps	10
B2128-123	LPQ	20	GPS	20	c	15
B2134+004	LPQ	17	GPS	17; 2	gps	11
B2136+141	LPQ	20	GPS	20	c	14
B2149+056	GAL	15	GPS	15	gps	11
B2153-119	S	4	GPS	7	n	4
B2154-183	QSO	7	GPS	7	n	5
B2201+098	GAL	13	HFP	2	c	5
B2205+166	QSO	13	HFP	2	f	12
B2209+236	QSO	13	HFP	2	f,v	12
B2210+016	GAL	17	GPS	17	f/s	7
B2236+124	QSO	5	GPS	5	f	12
B2254+024	QSO	2	HFP	2	f	13
B2254-204	BLO	20	GPS	20	c	9
B2255-282	LPQ	20	GPS	20	idb	15
B2318+049	QSO	2	HFP	2	c	12
B2322-040	GAL	16	GPS	16	gps	5
B2323+790	GAL	5	GPS	5	n,v	2
B2327+335	QSO	2	HFP	2	c	12
B2333-528	GAL	5	GPS	5	f/s	5
B2337+264	GAL	5	GPS	5	gps	16
B2337-063	GAL	15	GPS	15	f/s	5
B2342+821	QSO	16	GPS	16	gps	8
B2352+495	GAL	17	GPS	17	gps	7
B2353+816	BLO	10	GPS	9	f	4

Notes for the columns:
(1) Source name in B1950 coordinates;
(2) Optical identification: BLO = BL Lac object, EF = empty field, GAL = galaxy,
HPQ = high polarization quasar, LPQ = low polarization quasar, S = stellar, QSO = quasar;
(3) Reference for Col. 2: 1 = Augusto et al. (2006), 2 = Dallacasa et al. (2000);
3 = Dallacasa et al. (2002); 4 = de Vries et al. (1995); 5 = de Vries et al. (1997);
6 = Impey et al. (1991); 7 = Jeyakumar et al. (2000); 8 = Labiano et al. (2007a);
9 = Mingaliev et al. (2001); 10 = NED; 11 = O'Dea et al. (1991); 12 = O'Dea et al. (2005);
13 = Orienti et al. (2006a); 14 = Snellen et al. (1995); 15 = Snellen et al. (2002a);
16 = Stanghellini et al. (1993); 17 = Stanghellini et al. (1998); 18 = Steppe et al. (1995);
19 = Tornikoski et al. (2000); 20 = Tornikoski et al. (2001); 21 = Véron-Cetty & Véron (2006);
22 = Wills et al. (1992); 23 = Xiang et al. (2005); 24 = Xiang et al. (2006);
(4) Original GPS/HFP classification of the source;
(5) Reference for Col. 4, as in Col. 3;
(6) Auxiliary spectrum classification: gps = genuine gigahertz-peaked spectrum,
n = not enough data for GPS identification, s = steep spectrum, f = flat spectrum,
f/s = flat at low frequencies, steep at high frequencies, c = convex spectrum,
idb = inverted during bursts, v = variability of Var$_{\Delta S} > 3$;
(7) Number of the cluster in which the source is located.

The parameters used in the analyses are listed together with their references in Table 2. The data for each source are available in electronic form at the CDS. Some of the parameters are described in more detail below.

2.1 Source size

The linear sizes have been collected from numerous references and they have been obtained by various instruments or VLBI networks and at various frequencies. Therefore the values are not perfectly comparable, but rather give a rough indication of the source size. In addition, in the original papers the sizes were calculated with very different values of cosmological parameters and hence were not comparable as such. Thus we recalculated the linear sizes with the latest estimates of cosmological parameters ($H_0 = 71$ km s$^{-1}$ Mpc$^{-1}$, $\Omega_{\rm M} = 0.27$, $\Omega_{\rm vac} = 0.73$) using the JavaScript calculator created by Edward L. Wright (Wright 2006).
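As an illustration of this kind of recalculation, the sketch below converts an angular size into a projected linear size for a flat universe with the parameters quoted above. It is not the calculator of Wright (2006): it neglects the radiation density, integrates with a simple Simpson rule, and the function names and the milliarcsecond input unit are assumptions made for the example.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def angular_diameter_distance_mpc(z, h0=71.0, omega_m=0.27, omega_vac=0.73, steps=1000):
    """Angular diameter distance in Mpc for a flat matter + vacuum universe."""
    def inv_e(zp):
        # 1 / E(z), with radiation neglected (assumption of this sketch).
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_vac)
    n = steps if steps % 2 == 0 else steps + 1   # Simpson's rule needs an even count
    h = z / n
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    comoving = (C_KM_S / h0) * total * h / 3.0   # line-of-sight comoving distance
    return comoving / (1.0 + z)

def linear_size_pc(theta_mas, z, **cosmo):
    """Projected linear size in parsecs for an angular size given in milliarcseconds."""
    theta_rad = theta_mas * 1e-3 / 3600.0 * math.pi / 180.0
    return theta_rad * angular_diameter_distance_mpc(z, **cosmo) * 1e6
```

With these parameters, an angular size of 1 mas at z = 1 corresponds to roughly 8 pc, which shows how a change of cosmological parameters rescales every published size at a given redshift by the same factor.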

For some sources, several different sizes were given in the literature. We selected the largest size obtained with VLBI, since when studying the compactness of the source, the largest observed size gives the most significant information. If there were sizes obtained with both VLBI and the interplanetary scintillation (IPS) method (Jeyakumar et al. 2000), we selected the VLBI size for conformity, but accepted IPS sizes when there was no other information available. The sizes span a wide range, so we used the logarithm of the size in the analyses.

2.2 Magnitudes

The Gunn system r and i magnitudes from Stanghellini et al. (1993) were converted to Cousins R and I magnitudes using equations given by Schombert et al. (1990). For the sources with only one of the Gunn magnitudes, the Cousins magnitude was estimated using the mean value of r-R or i-I of the sources in Stanghellini et al. (1993). The brightest magnitude was chosen if there were several values for one source, except when a fainter value was simultaneous with the I magnitude observation.

2.3 Radio spectrum parameters

When calculating the parameters of the radio continuum spectra, we used the method developed for Paper III. To overcome the drawbacks of varying numbers of data points and unevenly sampled frequency coverage, the frequency range 0.05-360 GHz was divided into logarithmically equidistant intervals, chosen to be as wide as the fractional interval between 8 GHz and 10 GHz. The data in each interval were bound to the logarithmic centre of the interval. The median of the flux density and the fractional variability index (Var$_{\Delta S} = (S_{\max} - S_{\min})/S_{\min}$) were calculated for each of these data bins.
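The binning described above can be sketched as follows. This is an illustration only: the paragraph does not state where the first bin edge lies, so placing it at 0.05 GHz is an assumption, as are the function name `bin_spectrum` and the choice of GHz and Jy as input units.

```python
import math
from collections import defaultdict
from statistics import median

BIN_WIDTH = math.log10(10.0 / 8.0)      # width of the 8-10 GHz interval in dex
F_MIN_GHZ, F_MAX_GHZ = 0.05, 360.0      # frequency range covered by the bins

def bin_spectrum(freqs_ghz, fluxes_jy):
    """Map bin centre (GHz) -> (median flux density, Var_dS) for non-empty bins."""
    bins = defaultdict(list)
    for f, s in zip(freqs_ghz, fluxes_jy):
        if F_MIN_GHZ <= f <= F_MAX_GHZ:
            # Assumed convention: the first bin starts exactly at F_MIN_GHZ.
            idx = int((math.log10(f) - math.log10(F_MIN_GHZ)) // BIN_WIDTH)
            bins[idx].append(s)
    result = {}
    for idx, values in sorted(bins.items()):
        # Logarithmic centre of the bin.
        centre = 10 ** (math.log10(F_MIN_GHZ) + (idx + 0.5) * BIN_WIDTH)
        var = (max(values) - min(values)) / min(values)
        result[centre] = (median(values), var)
    return result
```

A bin holding a single measurement has Var$_{\Delta S} = 0$ by construction, so sparse bins cannot signal variability on their own.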

We also calculated three other variability indices which included the error estimates for the flux densities, but chose to use the above-mentioned quantity because the errors were not available for all the data in the CATS database, and thus the values would not have been consistent for all the sources and the data bins. The results were substantially the same regardless of the choice of the variability index.

The median flux density of each bin was used to model the shape of the spectra by fitting the following equation from Kovalev et al. (2000), rearranged by Dallacasa et al. (2000),

$$\log S = a - \sqrt{b^2 + (c \log \nu - d)^2}, \qquad (1)$$

where S is the flux density at frequency $\nu$ , and a, b, c, and d are the fit parameters.
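As a small worked example of Eq. (1), the sketch below evaluates the model and reads off the turnover. The square root is smallest when $c \log \nu = d$, so the fitted curve peaks at $\log \nu_{\rm peak} = d/c$ with $\log S_{\rm peak} = a - |b|$, and far from the peak its slopes tend to $\pm |c|$. The function names are illustrative, and the base-10 logarithm is assumed.

```python
import math

def log_flux(log_nu, a, b, c, d):
    """Eq. (1): log S as a function of log nu."""
    return a - math.sqrt(b ** 2 + (c * log_nu - d) ** 2)

def peak(a, b, c, d):
    """Return (log nu_peak, log S_peak) of the fitted curve."""
    return d / c, a - abs(b)
```

For example, `peak(1.0, 0.3, 1.2, 0.6)` gives a turnover at $\log \nu = 0.5$ with $\log S = 0.7$.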

In general, Eq. (1) models the spectra in this sample very well, but since it does not model any physical properties, we made adjustments to the fits of some sources, as described below.

Some of the sources in our sample have abundant data on both sides of the turnover, so that the flux density peak and the turnover frequency are easy to determine and there are no difficulties in interpreting the fits. However, there are a number of sources for which the fits from Eq. (1) do not represent the slope of the data accurately, and for those sources, picked out by visual examination, logarithmic linear fits were also applied. In some of these cases the flat top of the peak was not used for the linear fit in order to better model the declining slope.

The majority of the sources have insufficient data for determining both spectral indices reliably, or even at all. For some sources the optically thin part of the spectrum is available and the spectral index $\alpha$ is calculated either with Eq. (1) or linearly. Some sources show only a very wide round top of the spectrum, and it is not feasible to fit spectral indices to this kind of data. Using Eq. (1), the slopes are calculated from extrapolated values as follows. When applicable, frequencies 0.1 MHz and 1 MHz are selected to represent the optically thick, and 100 GHz and 1000 GHz the optically thin part of the spectrum. These values, selected far from the turnover to make sure that the slope has levelled out, are substituted into Eq. (1) to get the respective modelled flux density values, and the basic formula for the slope in the logarithmic scales is used to derive the spectral indices: $\alpha = (\log S_2 - \log S_1)/(\log \nu_2 - \log \nu_1)$. The source spectra with the applicable fits are presented in Fig. 7.
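The extrapolation step can be sketched as below. The text does not state the frequency unit used inside Eq. (1); GHz is assumed here, so 0.1 MHz and 1 MHz become $10^{-4}$ and $10^{-3}$ GHz. The function names and the example fit parameters are hypothetical.

```python
import math

def log_flux(log_nu, a, b, c, d):
    """Eq. (1): log S as a function of log nu."""
    return a - math.sqrt(b ** 2 + (c * log_nu - d) ** 2)

def spectral_index(nu1_ghz, nu2_ghz, a, b, c, d):
    """Slope between two frequencies on the fitted curve, in log-log scale."""
    l1, l2 = math.log10(nu1_ghz), math.log10(nu2_ghz)
    return (log_flux(l2, a, b, c, d) - log_flux(l1, a, b, c, d)) / (l2 - l1)

# Hypothetical fit parameters, for illustration only.
params = (1.0, 0.3, 1.0, 0.5)
alpha_below = spectral_index(1e-4, 1e-3, *params)    # 0.1 MHz and 1 MHz, in GHz
alpha_above = spectral_index(100.0, 1000.0, *params)
```

Because the reference frequencies lie far from the turnover, the resulting indices are close to the asymptotic slopes of Eq. (1), here about +1 and -1.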

In determining the peak frequency, we used the value derived from the fit when it was applicable. For some sources the spectrum was so flat that the fit yielded peak frequencies far beyond the frequency range of the data. Then the peak frequency was determined visually or omitted depending on the shape of the spectrum. For several cases the peak frequency from the fit matched the start of the declining part of the spectrum but there was no clear information on the rising part of the spectrum. In these cases we considered the peak frequency from the fit an upper (in some cases a lower) limit of the possible turnover.

The rest frame peak frequency was calculated and for the sources with no redshift information available we used a generic value of z = 1, which is close to the median value of our sample (0.93).
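For reference, the conversion is the usual $\nu_{\rm rest} = \nu_{\rm obs}(1 + z)$. A minimal sketch with the z = 1 fallback described above, using a hypothetical function name:

```python
def rest_frame_peak_ghz(nu_obs_ghz, z=None, default_z=1.0):
    """Rest-frame peak frequency; falls back to default_z when z is unknown."""
    z_used = default_z if z is None else z
    return nu_obs_ghz * (1.0 + z_used)
```

A source with an observed turnover at 2 GHz and no measured redshift is thus assigned a rest-frame peak of 4 GHz.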

From the spectral indices, the turnover frequency, and the variability in the radio frequencies we derived other quantities to describe the shape of the radio spectrum. The symmetry of the spectrum was calculated as $\alpha_{\rm below} / (- \alpha_{\rm above})$. The curvature of the spectrum is defined as the change in the spectral indices over the spectrum, i.e. $\alpha_{\rm below} - \alpha_{\rm above}$. The width of the spectrum, FWHM in decades of frequency, was calculated from the fitted function by taking the difference of the frequencies below and above the turnover where the flux density was half of the highest value.
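These derived quantities can be sketched as follows. For the width, setting $\log S = \log S_{\rm peak} - \log 2$ in Eq. (1) gives $|c \log \nu - d| = \sqrt{L (L + 2|b|)}$ with $L = \log 2$, so the FWHM in decades follows in closed form. Using this analytic expression instead of a numerical search is a choice made for the example, and the function names are illustrative.

```python
import math

def symmetry(alpha_below, alpha_above):
    """Ratio of the rising slope to the (sign-flipped) declining slope."""
    return alpha_below / (-alpha_above)

def curvature(alpha_below, alpha_above):
    """Change in spectral index across the turnover."""
    return alpha_below - alpha_above

def fwhm_decades(b, c):
    """Full width at half maximum of Eq. (1), in decades of frequency."""
    half = math.log10(2.0)
    return 2.0 * math.sqrt(half * (half + 2.0 * abs(b))) / abs(c)
```

For a sharply peaked fit with b = 0 and c = 1 the width reduces to $2 \log 2 \approx 0.6$ decades; a larger b or a smaller c broadens the spectrum.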

Because the amount of data varied from source to source and the data were not distributed evenly in the optically thick and thin parts of the spectrum, we wanted to put more weight on the spectrum parameters of sources with abundant radio data, and less on the sources with sparse data that probably did not describe the true shape of the spectrum. In order to calculate weighting factors for the spectral indices $\alpha_{\rm below}$ and $\alpha_{\rm above}$, the numbers of non-empty and empty fitting data bins in the corresponding frequency intervals were defined as:
$n = N(\mbox{non-empty data bins})$ and
$m = N(\mbox{empty data bins})$.

Now the weighting factor $q'_{\rm b}$ for $\alpha_{\rm below}$ can be given as:

$$q'_{\rm b} = 1 - \left[n + (\pi/4)\, m\right]^{-1}, \qquad (2)$$

where the factor $\pi/4$ is used to reduce the significance of empty data bins. The final weighting factor $q_{\rm b}$ is then generated by normalizing $q'_{\rm b}$ to the interval [0, 1] with respect to all the processed $\alpha_{\rm below}$ values. The factor $q_{\rm a}$ for $\alpha_{\rm above}$ is calculated in an identical manner.

Weighting factors for the quantities derived from the spectral indices (curvature, symmetry, and FWHM) were then approximated with the geometric mean $\sqrt{q_{\rm b} q_{\rm a}}$ where applicable. These factors were then applied to the SOM training using a weighting mask matrix.
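The weighting scheme can be illustrated with the short sketch below. The text does not specify how the normalization to [0, 1] is done; min-max scaling over all sources is assumed here, and the function names are hypothetical. At least one data bin (empty or not) is assumed in each interval, since Eq. (2) is undefined otherwise.

```python
import math

def raw_weight(n_nonempty, m_empty):
    """Eq. (2): unnormalized weighting factor q' for one spectral index."""
    return 1.0 - 1.0 / (n_nonempty + (math.pi / 4.0) * m_empty)

def normalise(values):
    """Min-max scaling to [0, 1] over all sources (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def derived_weight(q_b, q_a):
    """Geometric mean used for curvature, symmetry, and FWHM."""
    return math.sqrt(q_b * q_a)
```

The geometric mean is zero whenever either index has zero weight, so a derived quantity is only trusted when both sides of the turnover are reasonably well sampled.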

The source power at 5 GHz was calculated using the median flux density of the data bin around 5 GHz. The luminosity distance was calculated with the fundamental formula from, e.g., Altschuler (1989), and the generic redshift of z = 1 was used for sources without redshift information.

2.4 Auxiliary classification

In addition to calculating numerical parameters to be used in the analyses, we used the radio spectral parameters for classifying the spectra. The spectral classes and their criteria follow the approach of Paper III:

gps - gigahertz-peaked: $\nu_{\rm peak} > 500$ MHz, $\alpha_{\rm below} > 0.5$ , Var $_{\Delta S} < 3$ , well-defined shape of the spectrum;
gps,v - variable gigahertz-peaked: as above, but Var $_\Delta S$ may be greater;
c(,v) - convex: $0 < \alpha_{\rm below} < 0.5$ , $\alpha_{\rm above} < 0$ (v if Var $_{\Delta S} > 3$ );
n - not enough data for solid classification, otherwise as gps;
idb - inverted during bursts: Var $_{\Delta S} > 6$ , shape of the upper envelope of the spectrum convex;
f(,v) - flat: no clear turnover, $-0.5 < \alpha < 0.5$ (v if Var $_{\Delta S} > 3$ );
f/s - flattening at low frequencies, then $\alpha < -0.5$ ;
s - steep: simple steep spectrum with $\alpha < -0.5$ .

This classification is used to trace the locations of different types of spectra on the maps.
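The numerical thresholds of the classes can be written out as a short Python sketch. It is deliberately partial and is not the procedure used for the classification: the function names are hypothetical, the qualitative criteria (well-defined shape, enough data, shape of the upper envelope, flattening at low frequencies) are not modelled, so the classes n, idb, and f/s are never returned, and the treatment of the boundary value Var = 3 is an assumption.

```python
def classify_peaked(nu_peak_ghz, alpha_below, alpha_above, var):
    """Classify a spectrum with a measured turnover.

    Returns 'gps', 'gps,v', 'c', 'c,v', or None if no numeric rule applies.
    """
    variable = var > 3  # boundary handling at exactly 3 is an assumption
    if nu_peak_ghz > 0.5 and alpha_below > 0.5:
        return "gps,v" if variable else "gps"
    if 0 < alpha_below < 0.5 and alpha_above < 0:
        return "c,v" if variable else "c"
    return None


def classify_unpeaked(alpha, var):
    """Classify a spectrum without a clear turnover from a single index.

    Returns 'f', 'f,v', 's', or None if no numeric rule applies.
    """
    if -0.5 < alpha < 0.5:
        return "f,v" if var > 3 else "f"
    if alpha < -0.5:
        return "s"
    return None
```

For instance, `classify_peaked(1.2, 0.8, -0.7, 1.0)` returns `'gps'`, and the same spectrum with a variability of 4.0 returns `'gps,v'`.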

**Table 2:** Parameters and their references used in the analyses.
Parameter	Label on the SOM	References
Optical identification	ID	Tornikoski et al. (2001); O'Dea et al. (2005); Stanghellini et al. (1993); Snellen et al. (2002a); Orienti et al. (2006a); Hewitt & Burbidge (1993); Jeyakumar et al. (2000); Stanghellini et al. (1998); Impey et al. (1991); de Vries et al. (1997); NED; Xiang et al. (2005); Augusto et al. (2006); Xiang et al. (2006); Snellen et al. (1995); Tornikoski et al. (2000); Wills et al. (1992); Dallacasa et al. (2000)
Redshift	z	O'Dea et al. (1996); Hewitt & Burbidge (1993); Jeyakumar et al. (2000); Véron-Cetty & Véron (2003); de Vries et al. (1997); NED; de Vries et al. (2007); Xiang et al. (2006); O'Dea et al. (2005); Orienti et al. (2006b); Stanghellini et al. (1993); Xiang et al. (2005); Labiano et al. (2007b); Stanghellini et al. (1998); Impey & Tapia (1990); Impey et al. (1991); Snellen et al. (1995); Tinti & de Zotti (2006); Dallacasa et al. (2000)
Size	Size_kpc	O'Dea & Baum (1997); Xiang et al. (2006); Dallacasa et al. (1998); Xiang et al. (2005); Orienti et al. (2006a); Jeyakumar et al. (2000); Augusto et al. (2006); Gupta et al. (2006); Labiano et al. (2007b); Stanghellini et al. (2001); Best et al. (1999); Gurvits et al. (1999)
Power at 5 GHz	P_5 GHz	Calculated as described in the text
Optical polarization	p_opt	Fugmann & Meisenheimer (1988); Impey & Tapia (1990); Visvanathan & Wills (1998); Impey et al. (1991); Marcha et al. (1996); Wills et al. (1992); O'Dea (1998)
Radio polarization	p_radio	Ricci et al. (2004); Zukowski et al. (1999); Steppe et al. (1995); Aller et al. (2003); Homan & Lister (2006)
B magnitude		Siebert et al. (1998); Véron-Cetty & Véron (2003); Labiano et al. (2007b); Dallacasa et al. (2002)
V magnitude		Véron-Cetty & Véron (2003); Dallacasa et al. (2002); O'Dea et al. (1991); Wills et al. (1992); Hewitt & Burbidge (1993); Barvainis et al. (2005); Impey & Tapia (1990); Labiano et al. (2007b); Impey et al. (1991)
R magnitude		O'Dea et al. (1996); Labiano et al. (2007b); Véron-Cetty & Véron (2003); Dallacasa et al. (2002); Stanghellini et al. (1993); O'Dea et al. (1991); Tinti & de Zotti (2006)
I magnitude		de Vries et al. (1995); Xiang et al. (2005); Guainazzi et al. (2006); Stanghellini et al. (1993); Dallacasa et al. (2002); de Vries et al. (2000)
V-R colour	V-R	Dallacasa et al. (2002), and calculated values
B-V colour	B-V	Véron-Cetty & Véron (2003), and calculated values
U-B colour	U-B	Véron-Cetty & Véron (2003), and calculated values
VLBI morphology	-	Xiang et al. (2006); O'Dea et al. (1991); Dallacasa et al. (1998); Lister et al. (2002); Xiang et al. (2005); Orienti et al. (2006a); Augusto et al. (2006); Jeyakumar et al. (2000); Gugliucci et al. (2005); Fey & Charlot (1997)
R-I colour	R-I	O'Dea et al. (1996,1991)
Hydrogen column density in X-rays	N_H_X	Siemiginowska et al. (2003); Siebert et al. (1998); Guainazzi et al. (2006); Bloom et al. (1999); Vink et al. (2006); Elvis et al. (1994)
Power law slope in X-rays	Gamma	Siemiginowska et al. (2003); Siebert et al. (1998); Guainazzi et al. (2006); Vink et al. (2006)
Hydrogen column density 21 cm	N_H_21	Gupta et al. (2006); Orienti et al. (2006b); Pihlström et al. (2003)
O-E colour	O-E	Snellen et al. (2002b)
Variability index	vi	Calculated as described in the text
Number of observations	N_vi	Calculated as described in the text
Rest frame turnover frequency	nu_peak,rest	Calculated as described in the text
Optically thick spectral index	alpha_b	Calculated as described in the text
Optically thin spectral index	alpha_a	Calculated as described in the text
Curvature	Curvature	Calculated as described in the text
Symmetry	Symmetry	Calculated as described in the text
Width of the spectrum	FWHM	Calculated as described in the text

3 Analyses

A self-organising map (Kohonen 2001) is a neural network that can be used for cluster analyses, visualization of multidimensional data, and classification. We have chosen to use this method for its intuitive way of visualizing multidimensional data, and its ability to analyse incomplete data matrices. One of its other benefits is that the network is trained in an unsupervised manner, i.e. there is no user input on the classification. Therefore the clustering is not biased by any antecedent results.

A SOM consists of neurons, which are organized in an N-dimensional grid, usually N = 2 for the most convenient visualization. In the 2-dimensional case, the lattice of neurons can be hexagonal or rectangular, and the lattice can be folded into a cylindrical or toroidal shape. In this paper, a simple flat 2-dimensional hexagonal grid of neurons was chosen.

In each neuron i, there is a randomly initialized weight vector $\vec{m}_i$ of D dimensions. The input data are also considered to consist of vectors, input vectors $\vec{x}$ , of D dimensions. Each input vector represents an observation of the input data and each dimension represents a parameter of the observation. Thus, in this paper, each input vector is a single GPS source, and each component of this vector represents one property of the source.

The map is trained by taking one input vector and comparing it with all the weight vectors to find the best-matching unit (BMU) c, the neuron whose weight vector $\vec{m}_c$ is closest (usually in Euclidean distance) to the input vector:

$\begin{displaymath}\Vert\vec{x} - \vec{m}_c\Vert = \min_i \Vert\vec{x} - \vec{m}_i\Vert. \end{displaymath}$

(3)

The weight vectors of the BMU and its topological neighbours are then updated to resemble the input vector even more:

$\begin{displaymath}\vec{m}_i(t+1) = \vec{m}_i(t) + a(t) h_{ci}(r(t))[\vec{x}(t) - \vec{m}_i(t)], \end{displaymath}$

(4)

where t denotes time (training step), a(t) is the learning rate, and $h_{ci}(r(t))$ is the neighbourhood function, which depends on the neighbourhood radius r(t). The learning rate and the neighbourhood radius typically decrease with time, so that the amount of change and the number of affected neurons decrease as more training steps are completed. The learning rate decreases from 1 to 0, usually following a function that is inversely proportional to time. The neighbourhood radius is usually large at the beginning, allowing the map to adapt more rapidly, and becomes smaller as the training progresses, so that the map is finally fine-tuned to the delicate details of the input data.
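The BMU search and a single sequential update step can be sketched in Python as follows. This is an illustration under stated assumptions, not the SOM Toolbox implementation: the function names are hypothetical, the neighbourhood function is assumed to be Gaussian (the text does not specify its form), and missing values and weighting masks are not handled.

```python
import math


def best_matching_unit(x, weights):
    """Index of the weight vector closest to x in Euclidean distance."""
    return min(range(len(weights)), key=lambda i: math.dist(x, weights[i]))


def gaussian_neighbourhood(grid_distance, radius):
    """Assumed neighbourhood function: 1 at the BMU, decaying with grid distance."""
    return math.exp(-grid_distance ** 2 / (2.0 * radius ** 2))


def sequential_update(x, weights, positions, learning_rate, radius):
    """Move every weight vector towards x, scaled by its closeness to the BMU.

    weights   : list of weight vectors (lists of floats), modified in place
    positions : list of grid coordinates, one per neuron
    Returns the index of the BMU.
    """
    c = best_matching_unit(x, weights)
    for i, w in enumerate(weights):
        h = gaussian_neighbourhood(math.dist(positions[c], positions[i]), radius)
        weights[i] = [wk + learning_rate * h * (xk - wk) for wk, xk in zip(w, x)]
    return c
```

In a full training loop, `learning_rate` and `radius` would be decreased between calls, as described above.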

This comparison and updating can be done in two different ways: using either sequential or batch training. In sequential training the comparison is done by taking one input vector at a time and updating the map before proceeding to the next input vector. We have used batch training, in which all the input data are processed once before the weight vectors are updated with the weighted averages of the samples.
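One batch training step can be sketched in Python as below. This is a minimal illustration of the standard batch rule, in which each weight vector is replaced by a neighbourhood-weighted average of the input vectors. The function name is hypothetical, a Gaussian neighbourhood is assumed, and missing values and the weighting mask are not handled.

```python
import math


def batch_update(data, weights, positions, radius):
    """Return new weight vectors after one pass over all input vectors.

    data      : list of input vectors
    weights   : list of current weight vectors (not modified)
    positions : list of grid coordinates, one per neuron
    """
    # Find the BMU of every input vector with the weights held fixed.
    bmus = [
        min(range(len(weights)), key=lambda i: math.dist(x, weights[i]))
        for x in data
    ]
    new_weights = []
    for i, old in enumerate(weights):
        numerator = [0.0] * len(old)
        denominator = 0.0
        for x, c in zip(data, bmus):
            d = math.dist(positions[c], positions[i])
            h = math.exp(-d * d / (2.0 * radius * radius))  # assumed Gaussian
            denominator += h
            for k, xk in enumerate(x):
                numerator[k] += h * xk
        if denominator > 0.0:
            new_weights.append([v / denominator for v in numerator])
        else:
            new_weights.append(list(old))  # no sample influences this neuron
    return new_weights
```

With a very small radius each neuron is updated essentially to the mean of the input vectors for which it is the BMU; a larger radius smooths the map by letting neighbouring neurons share samples.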

When the training is completed, the map will have formed a representation of the observations by adjusting its vectors according to the observed variables. In practice, this allows us to locate the sources on the map by their properties and thus to cluster similar sources together. The neurons, with the sources they harbour, can be divided into clusters by different clustering methods.

We have used SOM Toolbox version 2.0 for Matlab. The grid size and the topology were optimized to the data by the software. We chose to use the centroid method for the clustering, because it creates clusters by calculating the centroid of the whole cluster instead of creating chains between similar sources like the linkage methods do. We also tested clustering using cluster averages, the neighbourhood function, and the Ward method, but the results did not differ substantially.
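The idea of the centroid method can be illustrated with a short Python sketch of agglomerative clustering in which the two clusters with the closest centroids are merged repeatedly. This is a generic textbook version with a hypothetical function name; it is not the SOM Toolbox routine, whose details (for example, any restriction to neighbouring map units) are not reproduced here.

```python
import math


def centroid_clustering(vectors, n_clusters):
    """Merge clusters with the closest centroids until n_clusters remain.

    Returns a list of clusters, each a list of indices into `vectors`.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(vectors))]
    centroids = [list(v) for v in vectors]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = math.dist(centroids[a], centroids[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        na, nb = len(clusters[a]), len(clusters[b])
        # The centroid of the merged cluster is the size-weighted mean.
        centroids[a] = [
            (na * ca + nb * cb) / (na + nb)
            for ca, cb in zip(centroids[a], centroids[b])
        ]
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
        del centroids[b]
    return clusters
```

Applied to the trained weight vectors of the map neurons, such a routine would return groups of neuron indices; the sources then inherit the cluster of their best-matching neuron.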

4 Results and discussion

The maps are presented in Figs. 2-6, only available in electronic form via http://www.aanda.org.

The most important tools in interpreting the results of the analyses are the U-matrix and the component planes. The U-matrix is a representation of the average neighbourhood distances of each neuron, with an additional hexagon between every neighbouring neuron to illustrate the distance between the pair. If the data were clearly divided into different clusters, the U-matrix would show clear light (red) borders, representing large distances, between the neurons which belong to different clusters. Dark-coloured (blue) areas represent groups of similar neurons, where the differences between the neighbours are small. Component planes show the projection of the value of each parameter on the map grid, i.e. each plane can be thought of as a contour map describing the location of low and high values of the parameter.
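The quantity shown in the neuron cells of a U-matrix can be sketched in Python as follows. For brevity the sketch uses a rectangular grid with four neighbours per neuron instead of the hexagonal grid used in this paper, and it omits the additional cells between neighbouring neurons; the function name is hypothetical.

```python
import math


def average_neighbour_distance(weights):
    """Mean Euclidean distance from each neuron to its grid neighbours.

    weights[r][c] is the weight vector of the neuron in row r, column c
    of a rectangular grid (a simplification of the hexagonal case).
    """
    rows, cols = len(weights), len(weights[0])
    umat = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbours = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            dists = [
                math.dist(weights[r][c], weights[rr][cc])
                for rr, cc in neighbours
                if 0 <= rr < rows and 0 <= cc < cols
            ]
            umat[r][c] = sum(dists) / len(dists) if dists else 0.0
    return umat
```

Large values mark borders between dissimilar groups of neurons, and small values mark the interiors of groups of similar neurons.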

Because of the space needed for text in the neurons, the separate maps, i.e. all except the combined U-matrix and the component plane view, have been rotated 90$^\circ$ counterclockwise. In the discussion below, we use the coordinates of the separate maps, so when referring to the upper left corner of a map, the corresponding area in the combined U-matrix and component plane plot is in the upper right corner.

Below, the maps are presented and discussed in a general manner and only some specific clusters of interest are discussed in more detail.

4.1 Map of all sources and all parameters

When all sources in the sample and all the collected parameters are used, the maps presented in Figs. 2-6 are produced. Combining the information of the auxiliary classification (Fig. 3), VLBI morphology (Fig. 4) and the optical identification (Fig. 5), we can see that the map can be divided roughly into four quarters. The two upper quarters are populated with quasar-type sources, the upper right with gps-type spectra and the upper left with other types of spectra. In the lower part of the map there are mostly galaxy-type sources; again the right side is dominated by sources with gps-type spectra while the left side has types f, f/s, c, and n. Each of these quarters can be divided into several clusters of similar sources, and their typical properties can be studied in the plot of the U-matrix and the component planes (Fig. 2).

As they are not numerical quantities, neither the optical identification, VLBI morphology, nor the auxiliary classification were used by the algorithm, and therefore the formation of groups of sources similar in these properties is likely to reflect some deeper similarities between the sources. (Because the auxiliary classification has been done using the parameters of the radio spectrum, the classes are linked to different parts of the map according to the values of the radio parameters. However, the clear division of classes on the map is an indication that we have chosen to use quantities that really seem to reflect some essential properties of the sources.)

Below, we first give an overview of the four quarters and then present some individual clusters in more detail.

The upper left quarter of the map consists of flat- and convex-spectrum sources and sources with inverted spectrum during outbursts. There is pronounced variability in the sources in this part of the map; there are quasars, low polarization quasars (LPQs), highly polarized quasars (HPQs), and, in particular on the outermost edge and in the upper corner, BL Lac objects (BLOs). This upper left quarter contains sources with typical blazar properties, i.e. these sources have been misidentified as GPS sources because of the temporary GPS shape of their spectra during radio flares. This is not surprising, as it has been noticed earlier (e.g., 20; Tinti et al. 2005; Paper II) that some previously classified GPS sources show blazar-like behaviour, and, indeed, they have proved to be blazars on closer inspection.

In this ``blazar quarter'' there are also two gps sources (B0528-250, B1758+388), which seem to be misclassified by the auxiliary classification. Their radio continuum spectra show GPS-like characteristics but this may be due to lack of flux density monitoring, which would likely reveal more blazar-like behaviour. In Papers II and III we have shown that monitoring must be continued for several years in order to find out the true nature of a source. Neither of these sources has been monitored for more than $\sim$ 3 years near the turnover, so there is not yet compelling evidence that these sources would maintain their GPS-type spectrum at all levels of activity.

The lower parts of the left side of the map are mostly populated by galaxies with unknown VLBI morphologies. These sources are characterized by low redshifts and rather low radio powers at 5 GHz. Besides the radio data we have calculated, there is very little information available on the other properties of these sources. Typical spectrum types in this area are f, f/s, and n.

The right side of the map contains sources with confirmed gps-type spectra. The uppermost third of the right side is populated by high redshift quasars, which have high radio powers. The sources near the vertical mid-line of the map are larger in size and have flatter spectra and higher variability than the sources in the right corner. The sources on the uppermost rows are mostly unresolved by VLBI; however, there are two CSOs in the right corner and a group of core-jets a bit lower in the middle.

Below these quasar-type sources, in the middle of the right side, there is a bundle of galaxies, mostly with CSO morphologies. These low-redshift sources have rather low radio powers, especially on the right edge, where the sizes of the sources are also small. The bottom right of the map consists of galaxies with mostly unknown VLBI morphologies, and, in addition to gps-type spectra, there are also s, f/s, and n types of spectra. The sizes of these sources are quite large near the vertical mid-line of the map and smaller near the corner. Variability is low in the bottom rows of the map, but this may be due to the small number of observations, which indicates a lack of proper monitoring. The curvature of the spectrum is high in the bottom right corner of the map, and the spectral index above the turnover is steep in the area which extends nearly to the mid-line. The spectral index below the peak is high in the corner, but declines rapidly towards the vertical mid-line, which is due to the lack of the optically thick part of the spectrum in the auxiliary classes present in the area.

4.1.1 Some notes on individual clusters

In this paper we mainly concentrate on the outcome of the SOM analysis and the general trends we observe in the GPS source subpopulations. A more detailed analysis of the physical properties of the various subpopulations will be the topic of a subsequent paper.

In Fig. 6 the similar neurons have been clustered together by the centroid method. We have analyzed the map using different numbers of clusters, and have chosen to present the clustering with $N_{\rm clusters} = 16$ , because the clusters seem to represent well the different areas in the map, and the division of sources into clusters does not seem to be too coarse or too fine-tuned. Depending on the number of clusters, some single sources may switch clusters. The cluster memberships cannot be considered definitive but rather suggestive for individual sources. The map is plotted in Fig. 6 together with the cluster numbering generated by the algorithm. The sources in each cluster have been listed in Table 3 and a summary of the properties of the cluster is presented in Table 4. Here we present some of the most interesting clusters and their properties, together with some preliminary interpretations of the nature of the related sources.

**Table 3:** Clustering of the sources ($N_{\rm clust} = 16$).
Cluster	Sources
1	B0018+729, B0019-000, B0238-084, B0240-217, B0320+053, B0507+179, B0651+410, B0910+151, B0941-080, B1107-187, B1144+352, B1540-077, B1824+271
2	B0204-306, B0316+162, B0359-294, B0405-280, B0405-395, B0439-337, B0454-088, B0554-026, B0602+780, B0904+039, B1042-269, B1343-300, B1410+138, B1444-339, B1553-062, B1751+278, B2050+364, B2121-014, B2323+790
3	B0159+839, B0326+349, B0621+446, B1502+036, B1600+335, B2053-201
4	B0431-026, B0700+470, B1054+004, B1107+109, B1132-000, B1200+045, B1347-218, B1349+027, B1548-302, B1714+193, B1726+769, B2126-185, B2153-119, B2353+816
5	B0022-423, B0105-122, B0144+209, B0207-224, B0208+040, B0703+468, B0706+460, B0741-063, B0802+103, B0914+114, B1133+432, B1225+368, B1350+113, B1433-040, B1503-091, B1518+046/7, B1545-120, B1557-004, B1646+028, B2008-068, B2055+055, B2154-183, B2201+098, B2322-040, B2333-528, B2337-063
6	B0000+212, B0424+328, B1404+286, B1509+054, B1601-222, B1622+665, B1732+094, B1934-638
7	B0026+346, B0116+319, B0428+205, B1031+567, B1117+146, B1323+321, B1345+125, B1638+124, B2021+614, B2210+016, B2352+495
8	B1120-274, B1324+574, B1358+624, B2342+821
9	B0902+490, B0930+493, B1057-797, B1323+799, B1354-174, B1604+315, B2254-204
10	B0002+051, B0034+078, B0711+356, B0718+374, B1043+066, B1100+223, B1734+508, B1853+376, B2128+048
11	B0500+019, B0552+398, B0743-006, B0923+392, B1355+441, B1607+268, B2019+050, B2134+004, B2149+056
12	B0048-097, B0113+241, B0354+231, B0437-454, B0516+087, B0646+600, B1013+054, B1039+811, B1118-056, B1148-171, B1349-439, B1357+769, B1455+080, B1601+112, B1749+096, B1803+784, B1807+170, B1851+488, B1954-388, B2007+777, B2059+034, B2205+166, B2209+236, B2236+124, B2318+049, B2327+335
13	B0039+230, B1645+635, B2254+024
14	B0153+744, B0201+113, B0215+015, B0218+357, B0237-233, B0248+430, B0404+768, B0528+134, B0738+313, B0742+103, B1127-145, B1245-197, B1422+231, B1442+101, B1543+005, B2022+171, B2136+141
15	B0332-403, B0400+258, B0454-234, B0528-250, B0537-441, B0633+595, B0858-279, B1144+542, B1146+531, B1333+459, B1334-127, B1519-273, B1758+388, B1839+389, B1936-155, B2008-159, B2112+283, B2121+053, B2128-123, B2255-282
16	B0108+388, B0457+024, B0636+680, B0642+449, B0710+439, B1143-245, B1427+109, B1526+670, B1614+051, B1843+356, B1848+283, B2000-330, B2126-158, B2337+264

The cluster number 5 in the lower right corner of the map contains sources with CSO, CD, and unknown morphologies. There is also one core-jet object. There are altogether 26 sources, out of which 17 are galaxies, 5 are quasars, and 4 sources with empty fields or with no information. The component planes show steep spectral indices and narrow FWHM in the outermost corner, whereas, when going leftwards, the sizes of the sources increase, and the turnover frequencies decrease, which can also be seen from the auxiliary classification of the spectra. In the upper and right edges of the cluster there are gps-type spectra, whereas in the lower left part of the cluster there are sources with s and f/s types of spectra.

**Table 4:** Summary of the properties of sources in each cluster.
Cluster	z				Size [kpc]				lg (P_5GHz)				pol_opt [%]
number	N	med	min	max	N	med	min	max	N	med	min	max	N	med	min	max
1	12	0.267	0.005	0.821	5	0.09	0.00	0.27	13	26.23	22.86	26.87	2	0.55	0.47	0.63
2	16	0.692	0.235	1.195	5	0.39	0.11	2.35	19	26.80	25.58	27.84	0	-	-	-
3	4	0.456	0.156	1.100	2	40.32	0.48	80.16	6	26.57	25.75	27.81	0	-	-	-
4	10	0.680	0.530	1.344	1	0.01	0.01	0.01	14	26.81	26.51	27.43	0	-	-	-
5	18	1.339	0.178	1.980	6	0.93	0.24	39.07	26	27.32	24.94	28.02	0	-	-	-
6	8	0.192	0.077	0.735	7	0.03	0.01	0.30	8	25.83	24.89	27.09	1	0.61	0.61	0.61
7	11	0.362	0.060	1.150	10	0.25	0.04	1.12	11	26.54	25.07	27.62	4	1.18	0.30	1.44
8	3	0.650	0.431	0.735	2	1.17	0.39	1.95	4	26.89	26.77	27.30	0	-	-	-
9	4	2.630	1.970	3.147	3	0.02	0.01	0.03	7	27.83	27.15	28.57	1	9.30	9.30	9.30
10	5	1.620	0.990	1.900	2	0.14	0.04	0.24	9	27.37	26.92	27.89	1	1.00	1.00	1.00
11	7	0.740	0.473	2.365	8	0.04	0.01	0.35	9	27.26	26.98	29.07	6	2.21	0.40	7.80
12	16	0.847	0.050	1.809	5	0.01	0.01	0.05	26	27.12	24.55	27.75	6	9.75	6.00	21.50
13	2	2.230	2.081	2.379	1	0.06	0.06	0.06	3	27.77	27.42	27.95	1	1.67	1.67	1.67
14	17	1.715	0.556	3.626	14	0.16	0.02	27.27	17	28.23	26.89	29.01	6	0.71	0.30	2.66
15	18	1.644	0.501	3.095	9	0.01	0.01	0.04	20	27.94	27.20	28.49	9	10.60	1.90	27.10
16	12	2.790	0.518	3.773	9	0.06	0.01	0.25	14	28.25	27.03	28.89	2	1.57	0.87	2.27
Cluster	pol_radio [%]				V-R				B-V				U-B
number	N	med	min	max	N	med	min	max	N	med	min	max	N	med	min	max
1	3	0.58	0.04	3.40	3	1.50	0.91	1.50	3	1.07	0.00	1.34	1	0.61	0.61	0.61
2	0	-	-	-	3	1.10	0.60	1.40	2	-0.25	-0.40	-0.10	0	-	-	-
3	1	35.20	35.20	35.20	1	0.00	0.00	0.00	2	0.35	0.20	0.49	1	-0.53	-0.53	-0.53
4	0	-	-	-	0	-	-	-	0	-	-	-	0	-	-	-
5	5	3.20	1.20	7.10	2	0.73	0.21	1.25	3	0.25	0.10	1.40	1	-0.84	-0.84	-0.84
6	1	0.70	0.70	0.70	4	1.02	0.93	1.40	2	0.65	0.52	0.78	1	-0.05	-0.05	-0.05
7	6	0.17	0.00	1.21	4	1.80	-0.10	2.50	3	0.00	-0.50	0.00	0	-	-	-
8	1	1.64	1.64	1.64	1	0.30	0.30	0.30	2	0.00	0.00	0.00	0	-	-	-
9	1	3.80	3.80	3.80	0	-	-	-	1	0.60	0.60	0.60	0	-	-	-
10	1	1.09	1.09	1.09	1	0.41	0.41	0.41	1	0.35	0.35	0.35	1	-1.11	-1.11	-1.11
11	3	2.20	1.00	2.73	3	-0.38	-0.80	-0.30	3	0.30	0.06	1.00	2	-0.63	-0.94	-0.31
12	4	3.31	0.70	3.57	4	0.47	0.36	0.49	9	0.58	0.00	1.49	5	-0.63	-0.84	-0.47
13	0	-	-	-	0	-	-	-	1	0.08	0.08	0.08	0	-	-	-
14	5	3.82	2.20	5.49	6	0.91	0.00	1.50	11	0.20	-1.00	0.80	4	-0.61	-0.70	-0.37
15	11	3.10	1.43	7.10	2	0.72	0.03	1.40	12	-0.40	-2.80	0.58	3	-0.55	-0.89	-0.48
16	5	2.60	0.44	4.00	8	0.17	-4.90	1.70	8	-0.28	-1.20	1.60	1	1.70	1.70	1.70
Cluster	R-I				$\nu_{\rm peak,rest}$ [GHz]				$\alpha_{\rm b}$				$\alpha_{\rm a}$
number	N	med	min	max	N	med	min	max	N	med	min	max	N	med	min	max
1	3	1.00	0.40	1.50	9	0.91	0.26	2.10	8	0.24	0.02	0.50	10	-0.74	-0.84	1.00
2	2	0.40	0.10	0.70	14	1.55	0.74	8.87	11	0.48	0.11	0.79	19	-0.74	-1.00	-0.47
3	1	0.50	0.50	0.50	2	7.94	1.53	14.35	4	0.23	0.06	0.31	4	-0.42	-0.61	-0.19
4	0	-	-	-	9	0.85	0.30	1.85	8	0.16	0.00	1.00	11	-0.70	-1.04	-0.44
5	3	0.50	0.20	0.60	21	1.92	0.52	8.49	15	0.64	0.20	1.50	25	-1.17	-1.83	1.00
6	1	0.20	0.20	0.20	8	6.48	0.83	13.04	8	1.19	0.92	1.98	8	-0.94	-1.30	-0.31
7	3	0.40	0.30	0.60	11	0.64	0.35	5.30	9	0.56	0.24	0.77	11	-0.58	-0.90	1.00
8	1	0.30	0.30	0.30	4	1.13	0.74	2.09	3	1.16	1.10	1.22	4	-0.79	-0.84	-0.66
9	1	1.40	1.40	1.40	6	7.01	4.62	44.39	5	0.40	0.35	0.51	7	-0.51	-0.84	-0.08
10	1	1.50	1.50	1.50	9	9.86	1.58	14.00	9	0.63	0.36	0.76	9	-0.63	-0.85	-0.36
11	1	0.60	0.60	0.60	9	8.30	1.52	31.15	9	0.80	0.66	1.11	9	-0.86	-1.11	-0.66
12	0	-	-	-	15	11.29	3.42	90.90	16	0.18	0.04	0.53	18	-0.18	-0.53	1.00
13	0	-	-	-	1	5.70	5.70	5.70	3	0.44	0.05	1.00	1	-0.35	-0.35	-0.35
14	1	1.70	1.70	1.70	13	5.48	0.66	32.09	13	0.45	0.19	1.15	15	-0.62	-1.07	-0.23
15	0	-	-	-	18	20.47	6.81	52.61	15	0.46	0.21	0.70	16	-0.47	-0.62	-0.20
16	2	0.98	0.76	1.20	14	19.83	3.65	66.65	14	1.10	0.51	2.65	14	-0.95	-2.12	-0.43
Cluster	Symmetry				Curvature				FWHM				Variability index
number	N	med	min	max	N	med	min	max	N	med	min	max	N	med	min	max
1	6	-0.56	-0.82	-0.37	6	0.99	0.85	1.34	1	2.55	2.54	2.54	13	1.14	0.39	7.36
2	14	-0.36	-0.82	0.00	14	1.00	0.56	1.58	3	1.27	0.98	1.43	19	0.16	0.05	1.53
3	2	-0.19	-0.37	0.00	2	0.65	0.62	0.68	2	2.21	2.16	2.27	6	1.21	0.36	1.92
4	6	-0.43	-0.96	-0.36	6	0.92	0.44	1.12	0	-	-	-	14	0.36	0.17	1.69
5	18	-0.59	-1.49	0.08	18	1.69	1.17	3.33	11	1.01	0.73	1.22	26	0.35	0.11	2.35
6	8	0.14	0.01	1.40	8	2.14	1.43	2.62	8	1.02	0.78	1.35	8	0.79	0.35	3.66
7	9	0.00	-0.34	0.25	9	1.15	0.79	1.48	9	1.59	1.18	1.96	11	1.73	0.60	4.04
8	3	0.43	0.38	0.44	3	1.93	1.76	2.01	3	1.02	0.92	1.21	4	0.79	0.44	1.26
9	5	0.00	-0.38	0.00	5	1.01	0.70	1.24	4	1.49	1.34	1.73	7	0.62	0.26	1.71
10	9	0.00	-0.48	0.16	9	1.22	0.72	1.52	8	1.24	1.04	1.67	9	0.84	0.38	3.07
11	9	0.01	-0.29	0.05	9	1.76	1.31	2.23	9	1.25	0.92	1.37	9	2.13	0.58	8.16
12	12	0.00	0.00	0.13	12	0.37	0.04	1.05	4	3.54	2.21	5.59	26	2.97	1.00	18.16
13	1	0.09	0.09	0.09	1	0.79	0.79	0.79	0	-	-	-	3	2.52	1.86	2.93
14	13	-0.17	-0.75	0.28	13	1.12	0.47	2.12	10	1.46	1.10	2.04	17	2.49	0.95	22.75
15	15	0.00	-0.35	0.44	15	0.92	0.62	1.24	10	1.59	1.18	2.09	20	3.14	1.27	14.11
16	14	0.12	-1.19	1.59	14	1.93	1.25	3.70	14	1.18	0.84	1.50	14	1.75	0.64	4.15

Cluster	N_vi				N_H_X [10²² cm^-2]				Gamma				N_HI [10²⁰ cm^-2]
number	N	med	min	max	N	med	min	max	N	med	min	max	N	med	min	max
1	13	9.00	2.00	133.00	1	0.10	0.10	0.10	1	2.81	2.81	2.81	2	1.14	1.00	1.27
2	19	4.00	2.00	8.00	0	-	-	-	0	-	-	-	2	6.98	6.27	7.69
3	6	6.50	2.00	24.00	0	-	-	-	0	-	-	-	0	-	-	-
4	14	6.00	3.00	8.00	0	-	-	-	0	-	-	-	0	-	-	-
5	26	5.00	2.00	27.00	0	-	-	-	0	-	-	-	2	1.03	0.64	1.42
6	8	11.00	3.00	17.00	0	-	-	-	0	-	-	-	5	3.98	1.99	26.91
7	11	38.00	4.00	155.00	3	0.66	0.50	1.00	1	1.43	1.43	1.43	8	2.20	0.38	12.20
8	4	6.00	2.00	16.00	1	3.00	3.00	3.00	1	1.24	1.24	1.24	2	1.37	0.75	1.99
9	7	7.00	2.00	9.00	0	-	-	-	0	-	-	-	0	-	-	-
10	9	10.00	2.00	17.00	1	0.30	0.30	0.30	1	1.50	1.50	1.50	0	-	-	-
11	9	60.00	17.00	386.00	2	0.30	0.10	0.50	2	1.90	1.62	2.18	4	11.78	0.99	35.42
12	26	10.50	2.00	220.00	2	0.04	0.02	0.06	2	2.67	2.15	3.18	0	-	-	-
13	3	9.00	9.00	14.00	0	-	-	-	0	-	-	-	0	-	-	-
14	17	112.00	9.00	1845.00	3	0.33	0.21	3.60	3	1.47	1.24	1.56	4	1.83	0.99	2.66
15	20	26.00	7.00	164.00	3	0.04	0.03	0.07	3	2.20	1.91	2.44	0	-	-	-
16	14	9.00	2.00	114.00	6	1.83	0.44	57.00	3	1.75	1.59	3.32	2	45.40	11.00	79.80
Cluster	O-E
number	N	med	min	max
1	1	-5.25	-5.25	-5.25
2	0	-	-	-
3	1	0.96	0.96	0.96
4	1	1.15	1.15	1.15
5	0	-	-	-
6	3	2.08	1.64	2.44
7	4	2.24	0.67	2.73
8	1	2.19	2.19	2.19
9	1	0.41	0.41	0.41
10	2	0.53	0.40	0.65
11	3	0.85	0.71	2.32
12	3	1.29	0.34	1.35
13	3	0.72	0.43	2.01
14	5	2.05	1.09	2.96
15	4	0.87	0.54	1.58
16	3	1.21	0.41	1.45

The sizes of six sources in this cluster are known; the median value (0.93 kpc) is greater than the median of all the sources in the sample (0.11 kpc). The rest frame turnover frequency has a decreasing gradient towards the lower left part of the cluster; the median of the cluster is 1.92 GHz, whereas the median of all the sources is 4.91 GHz. It is also lower than in other clusters with gps-type spectra, except for the clusters number 7 and 8. However, the value is affected by undefinable turnover frequencies of the sources with s and f/s types of spectra, and therefore only represents the $\nu_{\rm peak}$ of the gps-type sources in the cluster.

There are at least three sources with some evidence of young age: B0703+468, a CSO quasar, generally suggested to be a young source by Stanghellini et al. (2005); B1225+368, low radiative age found by Murgia et al. (1999); and B2201+098 with a lower limit of kinematic age of less than 1000 years found by Gugliucci et al. (2005). None of the sources have been reported to have related extended emission around them. This cluster could represent young sources, with the youngest in the upper and right parts of the cluster and the older sources, possibly CSS sources whose turnover frequency has already decreased below the observed frequencies, in the lower left part of the cluster.

The cluster 7 harbours galaxy-type sources with mostly gps spectra and low turnover frequencies (median 0.64 GHz), small sizes (median 0.25 kpc), and CSO, CD or CT morphologies. These sources have rather high variability; the median of the highest variability index is 1.73, which is observed at the frequency of $\sim$ 8 GHz. Yet at least the source B1031+567 has been found, along with six other CSOs, to have extremely stable flux densities at this frequency band on timescales ranging from one week to ten months (Fassnacht & Taylor 2001). The radio powers are intermediate (median $\log~(P_{5~\rm GHz}) = 26.5$ ) in this cluster. Out of the total 11 sources, there are at least four sources with kinematic age estimates, ranging from $\sim$ 380 yr to $\sim$ 3000 yr (e.g., Giroletti et al. 2003; Polatidis & Conway 2003), and two sources with other hints of young age (Murgia et al. 1999). Therefore, it is likely that the sources in this cluster represent galaxy-type symmetric radio sources in their youth. However, the low turnover frequency, in fact the lowest of all clusters, is in contradiction with the evolutionary scheme, where the turnover of a new-born radio source is at high frequencies and decreases as the source grows larger. There does not seem to be any cluster of sources clearly representing the next phase of evolution of these sources, which is not totally surprising as one of the key selection criteria of the GPS classification has been a turnover frequency limit of 0.5 GHz. However, there are sources with s and f/s types of spectra, especially in the clusters 5 and 2, for which the turnover frequency has not been determined, and therefore they do not contribute to the median turnover frequency of their clusters.

The borders of this cluster do not change when the number of the clusters is changed. This provides additional support that the sources form a homogeneous population that is not likely to mix or merge with the neighbouring clusters. The unexpected combination of low turnover frequency, small linear size and confirmed very young age of the sources in this cluster may require reformulation of the views on what is the cause of the turnover.

The cluster 10 consists of gps sources and sources with c and n types of spectra. This cluster also stays well defined when the value of $N_{\rm clust}$ is changed. There are both galaxy- and quasar-type sources. The median redshift of this cluster, 1.62, reflects only the non-galaxy sources, as there are no redshifts available for any of the galaxies. The rest frame turnover frequencies are high (median 9.9 GHz) and the sizes of the two sources with known linear size are small. The radio powers are high in this cluster, with a median $\log~(P_{5~\rm GHz}) = 27.4$. Morphologically these sources are CSOs and CDs, and there are two unresolved sources. These sources could also be young sources, but different from the sources in the cluster 7, as these have substantially higher turnover frequencies and radio powers.

The cluster 6 on the right edge of the map consists of low-redshift galaxies with gps-type spectra and CSO morphologies. The sizes of the sources are small (median 0.027 kpc), and their spectra are narrow (median FWHM 1.0 decades of frequency) and have their turnovers at rather high frequencies (median 6.5 GHz). For B1404+286, there is a kinematic age estimate of 100-200 years obtained by Polatidis & Conway (2003), who also found an upper limit for the expansion velocity of B1934-638 in the cluster. Two of the sources have been suggested to exhibit recurrent activity.


Figure 1: Size vs. rest frame turnover frequency plotted for each cluster, for all the sources and for sources with auxiliary gps classification. The clusters 4 and 13 did not have any sources with information on both the size and the turnover frequency. When confirmed by statistics, the slope of the anticorrelation is plotted. The solid line depicts the anticorrelation without the outlier in the plots of the cluster 14, all the sources, and gps sources. The dashed line shows the effect of the outlier.

The main differences between this cluster and the clusters 7 and 10 are the very low redshift, the small size, the low radio power (the lowest of all clusters), the high column density of neutral hydrogen (although the cluster 10 does not have any column density information to compare to), and the high spectral curvature of the sources in this cluster. The median turnover frequency (6.5 GHz) is substantially higher than in the cluster 7.

The cluster number 11 is populated by a mixture of gps quasars and galaxies. Most of the sources have core-jet morphologies, but there is also one CSO and one unresolved object. Three of the core-jet sources exhibit high variability so that their auxiliary classification is gps,v. For the rest of the sources, variability is not pronounced (the median of the maximum variability index without gps,v sources is 1.2), even though the variability index has been calculated from a median of 35 observations. The sizes of both the variable and the non-variable sources are small (median 0.037 kpc) and the turnover frequencies high (median 8.3 GHz). One of the variable sources (B1607+268) has been estimated to be only $\sim$ 2200 yr old (Nagai et al. 2006), and one source has extended emission around it, while three others have not shown related extended emission (Stanghellini et al. 2005). The spectrum of one source has been successfully fitted with a free-free absorption model (Kameno et al. 2003).

The cluster 16 in the upper right corner of the map has a mixture of CSOs and unresolved VLBI morphologies. The sources are quasars except for two sources which are galaxies. The sources have gps-type spectra except for the three sources with gps,v and one with c spectra. The sources in this cluster are characterized by extremely high rest frame turnover frequencies (median 19.8 GHz), high curvature (median 1.9), and high column densities $N_{\rm H,X}$ and $N_{\rm HI}$, although the number of column density measurements is quite low (6 and 2, respectively). There are four sources associated with free-free absorption (FFA) in the literature (e.g., Kameno et al. 2003; Bicknell et al. 1997), for two sources there are kinematic age estimates of $\sim$ 180 yr and $\sim$ 900 yr (Polatidis & Conway 2003), and there is also the prototype recurrent source B0108+388, which is the only source with information on extended emission in the cluster. The median curvature of the spectra is 1.9, which is close to the characteristic FFA curvature value of $\sim$ 2 (the exact value depends on the homogeneity or clumpiness of the absorbing medium, Bicknell et al. 1997). This cluster may represent free-free absorbed sources.

There does not seem to be any cluster clearly hosting a population of frustrated sources. The number of column density measurements is rather low, and there are no other possible indicators of the density of the environments of the sources. Therefore the possibility cannot be ruled out completely, but it seems unlikely that any of the current clusters represent frustrated sources.

4.2 Size - turnover anticorrelation

As mentioned in Sect. 2.1, the linear size LS information is not of uniform quality. However, we believe the values are accurate enough to study the linear size - turnover frequency anticorrelation. It was discovered by Fanti et al. (1990) for CSS sources and confirmed later by O'Dea & Baum (1997) for a combined sample of CSS and GPS sources. O'Dea & Baum (1997) found a correlation of $\nu_{\rm peak} \propto LS^{-0.65}$.
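
Written in logarithmic form, the relation of O'Dea & Baum (1997) is a straight line with a slope of $-0.65$. The short worked step below only illustrates what that slope means; the constant $C$ is not given here and is left unspecified.

```latex
\begin{align}
\log \nu_{\rm peak} &= -0.65\,\log LS + C, \\
\frac{\nu_{\rm peak}(10\,LS)}{\nu_{\rm peak}(LS)} &= 10^{-0.65} \approx 0.22 .
\end{align}
```

In other words, a tenfold increase in linear size corresponds to a turnover frequency lower by a factor of about 4.5.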

The sizes and turnover frequencies (in the source frame) for each cluster have been plotted in Fig. 1. There is also a size-$\nu_{\rm peak}$ plot of all the sources and of sources with gps classification. Pearson correlation tests were performed to test whether the logarithms of the variables were linearly correlated, and a model was plotted for the correlating clusters. For most of the clusters, there were not enough size and turnover frequency data to allow the determination of the correlation, but the clusters 11 and 16 show a statistically significant anticorrelation between the variables. The anticorrelation is also confirmed for the cluster 14 if the outlier (B0201+113, with a size of 27.3 kpc obtained with the IPS method by Jeyakumar et al. 2000) is excluded.
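
The kind of test described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used for Fig. 1: the function name `loglog_pearson_and_slope` is hypothetical, the input values are synthetic and constructed to follow an exact power law, and the significance step (comparing the statistic $t = r\sqrt{(n-2)/(1-r^2)}$ with Student's $t$ distribution with $n-2$ degrees of freedom) is left out.

```python
import math


def loglog_pearson_and_slope(sizes_kpc, nu_peak_ghz):
    """Return (r, slope, intercept) for log10(nu_peak) against log10(size).

    r is the Pearson correlation coefficient of the logarithms; slope and
    intercept come from an ordinary least-squares fit in log-log space.
    """
    x = [math.log10(s) for s in sizes_kpc]
    y = [math.log10(v) for v in nu_peak_ghz]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Synthetic, made-up sizes (kpc); the turnover frequencies (GHz) are generated
# from nu_peak = 0.5 * LS**-0.65, so they are NOT measurements from the sample.
sizes = [0.01, 0.03, 0.1, 0.3, 1.0, 3.0]
peaks = [0.5 * s ** -0.65 for s in sizes]

r, slope, intercept = loglog_pearson_and_slope(sizes, peaks)
print(f"r = {r:.3f}, slope = {slope:.2f}")
```

Working with the logarithms turns a power law into a straight line, so the fitted slope can be compared directly with the exponent of the $\nu_{\rm peak}$-$LS$ relation.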

The anticorrelation also holds when the entire sample and all the gps sources are studied. When the outlier B0201+113 is included, the slope of the entire sample is the same as in the sample of O'Dea & Baum (1997), but when the outlier is excluded, our slope becomes steeper (-0.75). The same value is obtained when considering only the sources with gps-type spectra. For the clusters 11 and 14 the slope is steeper (-0.86 and -0.81, respectively), and for the cluster 16 the slope is flatter (-0.72). However, the variations may not be intrinsic but due to the low number of data points and the incoherence of the VLBI measurements.
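
The sensitivity of a log-log slope to a single large source can be illustrated with a toy calculation. The sketch below is an assumption-laden example, not a reproduction of the fits reported here: `loglog_slope` is a hypothetical helper, the base points are synthetic and follow a slope of -0.75 exactly, and the added point pairs the 27.3 kpc size with an invented turnover frequency of 5 GHz.

```python
import math


def loglog_slope(sizes_kpc, nu_peak_ghz):
    """Least-squares slope of log10(nu_peak) against log10(size)."""
    x = [math.log10(s) for s in sizes_kpc]
    y = [math.log10(v) for v in nu_peak_ghz]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Synthetic points lying exactly on nu_peak = 0.5 * LS**-0.75 (not real data).
sizes = [0.01, 0.03, 0.1, 0.3, 1.0, 3.0]
peaks = [0.5 * s ** -0.75 for s in sizes]

slope_clean = loglog_slope(sizes, peaks)

# One large source with a high turnover frequency; the 5.0 GHz value is
# hypothetical and chosen only to place the point far above the trend.
slope_with_outlier = loglog_slope(sizes + [27.3], peaks + [5.0])

print(f"without outlier: {slope_clean:.2f}")
print(f"with outlier:    {slope_with_outlier:.2f}")
```

A point at large size that sits well above the trend pulls the fitted line up at that end, which flattens the slope; this is the same direction of change as between the fits with and without B0201+113.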

The upper left end of the $LS-\nu_{\rm peak}$ distribution is populated by blazars of the clusters 12 and 15, and high peaking sources of the gps clusters 11 and 16. The small size and the high turnover frequency in blazars are due to a small viewing angle and relativistic beaming, whereas at least for four sources in the cluster 16 this explanation is unlikely because the sources are CSOs, i.e. they have large viewing angles. Therefore, the continuous distribution of data in the $LS-\nu_{\rm peak}$ plot does not necessarily imply that the sources are just scaled versions of each other or connected by evolution.
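
The two effects invoked for the blazars can be written out with the standard relations for a relativistic jet. This is a textbook sketch rather than a calculation made in this work; the symbols $\theta$ (viewing angle), $\beta$ (jet speed in units of $c$), $\Gamma$ (Lorentz factor), $\delta$ (Doppler factor), $LS_{\rm intr}$ (deprojected size), and $\nu'_{\rm peak}$ (turnover frequency in the frame co-moving with the emitting plasma) are introduced here for illustration only.

```latex
\begin{align}
LS_{\rm proj} &= LS_{\rm intr}\,\sin\theta, \\
\delta &= \frac{1}{\Gamma\,(1-\beta\cos\theta)}, \qquad \Gamma = (1-\beta^{2})^{-1/2}, \\
\nu_{\rm peak} &= \delta\,\nu'_{\rm peak}.
\end{align}
```

For a small $\theta$ the projected size shrinks with $\sin\theta$ while $\delta$ approaches $\Gamma(1+\beta) \approx 2\Gamma$, so a beamed source moves towards the small size, high turnover corner of the plot without being intrinsically small.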

5 Conclusions

We have collected a sample of 206 GPS sources and HFPs presented in the literature, and collected various parameters for them. We have analyzed the sample with self-organising neural networks using centroid clustering analyses. The method and the parameters we have used seem to describe the essence of the sources rather well, as the distributions of the VLBI morphology and the optical identification are consistent with the clustering structure, even though they were not used in the analyses.

Our results confirm the contamination of GPS samples by small, beamed blazar-type sources. Over a quarter of the cluster map is populated by variable flat-spectrum quasars and sources with inverted spectra during outbursts. These sources should be excluded from GPS samples, and the results of the GPS studies in which they have been included should be reconsidered.

Sources with confirmed gigahertz-peaked spectrum form different clusters, and it seems likely that there are various different populations of GPS sources in addition to the quasar - galaxy dualism.

Our analyses produce a cluster of very young (confirmed by kinematic age estimates) galaxy CSOs with rather low radio powers and low intrinsic turnover frequencies, which is in contradiction with the view that the youngest sources would have the highest turnover frequencies. There is also a cluster consisting of a mixture of CSO and CD quasars and galaxies with high peak frequencies and high radio powers, which could also be young sources but of a different type than in the above-mentioned cluster. We have also identified a cluster that may represent free-free absorbed sources as well as a cluster of quasars and galaxies with mostly core-jet morphologies and consistent GPS-type spectra.

We have confirmed the size-turnover frequency anticorrelation presented in the literature, with a somewhat steeper slope of -0.75. However, the slope is identical for the entire sample and the sample where only the genuine GPS sources have been included. This cannot be interpreted as a sign of any common evolution or that all the sources would be simply scaled versions of each other. A substantial fraction of the sources in the high turnover - small size end of the distribution are blazars, foreshortened by small viewing angles and having boosted emission, i.e. the mechanism connecting the small size and the high turnover frequency is different from that in truly small and possibly young sources.

Acknowledgements

The authors made use of the database CATS (Verkhodanov et al. 1997) of the Special Astrophysical Observatory. The authors acknowledge the support of the Academy of Finland to the Metsähovi observing projects. UMRAO is supported in part by funds from the NSF and by funds from the University of Michigan Department of Astronomy. This research made use of the NASA/IPAC Extragalactic Database (NED), which is operated by the Jet Propulsion Laboratory, California Institute of Technology, under contract with the National Aeronautics and Space Administration.

References

Aller, M. F., Aller, H. D., & Hughes, P. A. 2003, ApJ, 586, 33
Altschuler, D. R. 1989, Fundamentals of Cosmic Physics, 14, 37
Augusto, P., Gonzalez-Serrano, J. I., Perez-Fournon, I., & Wilkinson, P. N. 2006, MNRAS, 368, 1411
Barvainis, R., Lehár, J., Birkinshaw, M., Falcke, H., & Blundell, K. M. 2005, ApJ, 618, 108
Baum, S. A., O'Dea, C. P., de Bruyn, A. G., & Murphy, D. W. 1990, A&A, 232, 19
Best, P. N., Röttgering, H. J. A., & Lehnert, M. D. 1999, MNRAS, 310, 223
Bicknell, G. V., Dopita, M. A., & O'Dea, C. P. O. 1997, ApJ, 485, 112
Bloom, S. D., Marscher, A. P., Moore, E. M., et al. 1999, ApJS, 122, 1
Brett, D. R., West, R. G., & Wheatley, P. J. 2004, MNRAS, 353, 369
Dallacasa, D., Bondi, M., Alef, W., & Mantovani, F. 1998, A&AS, 129, 219
Dallacasa, D., Stanghellini, C., Centonza, M., & Fanti, R. 2000, A&A, 363, 887
Dallacasa, D., Falomo, R., & Stanghellini, C. 2002, A&A, 382, 53
de Vries, N., Snellen, I. A. G., Schilizzi, R. T., Lehnert, M. D., & Bremer, M. N. 2007, A&A, 464, 879
de Vries, W. H., Barthel, P. D., & Hes, R. 1995, A&AS, 114, 259
de Vries, W. H., Barthel, P. D., & O'Dea, C. P. 1997, A&A, 321, 105
de Vries, W. H., O'Dea, C. P., Barthel, P. D., & Thompson, D. J. 2000, A&AS, 143, 181
Elvis, M., Fiore, F., Wilkes, B., McDowell, J., & Bechtold, J. 1994, ApJ, 422, 60
Fanti, R., Fanti, C., Schilizzi, R. T., et al. 1990, A&A, 231, 333
Fassnacht, C. D., & Taylor, G. B. 2001, AJ, 122, 1661
Fey, A. L., & Charlot, P. 1997, ApJS, 111, 95
Fugmann, W., & Meisenheimer, K. 1988, A&AS, 76, 145
Giroletti, M., Giovannini, G., Taylor, G. B., et al. 2003, A&A, 399, 889
Guainazzi, M., Siemiginowska, A., Stanghellini, C., et al. 2006, A&A, 446, 87
Gugliucci, N. E., Taylor, G. B., Peck, A. B., & Giroletti, M. 2005, ApJ, 622, 136
Gupta, N., Salter, C. J., Saikia, D. J., Ghosh, T., & Jeyakumar, S. 2006, MNRAS, 373, 972
Gurvits, L. I., Kellermann, K. I., & Frey, S. 1999, A&A, 342, 378
Hewitt, A., & Burbidge, G. 1993, ApJS, 87, 451
Homan, D. C., & Lister, M. L. 2006, AJ, 131, 1262
Impey, C. D., Lawrence, C. R., & Tapia, S. 1991, ApJ, 375, 46
Impey, C. D., & Tapia, S. 1990, ApJ, 354, 124
Jeyakumar, S., Saikia, D. J., Pramesh Rao, A., & Balasubramanian, V. 2000, A&A, 362, 27
Kameno, S., Inoue, M., Wajima, K., Sawada-Satoh, S., & Shen, Z.-Q. 2003, PASA, 20, 213
Kohonen, T. 2001, Self-Organizing Maps 3rd Ed. (Berlin: Springer), Springer series in information sciences, 501
Kovalev, Y. A., Kovalev, Y. Y., & Nizhelsky, N. A. 2000, PASJ, 52, 1027
Labiano, A., Barthel, P. D., O'Dea, C. P., et al. 2007a, A&A, 463, 97
Labiano, A., O'Dea, C. P., Barthel, P. D., de Vries, W. H., & Baum, S. A. 2007b, ArXiv Astrophysics e-prints
Lister, M. L., Kellermann, K. I., & Pauliny-Toth, I. I. K. 2002, in Proceedings of the 6th EVN Symp., ed. E. Ros, R. W. Porcas, A. P. Lobanov, & J. A. Zensus, 135
Marcha, M. J. M., Browne, I. W. A., Impey, C. D., & Smith, P. S. 1996, MNRAS, 281, 425
Miller, A. S., & Coe, M. J. 1996, MNRAS, 279, 293
Mingaliev, M. G., Stolyarov, V. A., Davies, R. D., et al. 2001, A&A, 370, 78
Murgia, M., Fanti, C., Fanti, R., et al. 1999, A&A, 345, 769
Nagai, H., Inoue, M., Asada, K., Kameno, S., & Doi, A. 2006, ApJ, 648, 148
O'Dea, C. P. 1998, PASP, 110, 439
O'Dea, C. P., & Baum, S. A. 1997, AJ, 113, 148
O'Dea, C. P., Baum, S. A., & Stanghellini, C. 1991, ApJ, 380, 66
O'Dea, C. P., Stanghellini, C., Baum, S. A., & Charlot, S. 1996, ApJ, 470, 806
O'Dea, C. P., Gallimore, J., Stanghellini, C., Baum, S. A., & Jackson, J. M. 2005, AJ, 129, 610
Orienti, M., Dallacasa, D., & Stanghellini, C. 2007, A&A, 461, 923
Orienti, M., Dallacasa, D., Tinti, S., & Stanghellini, C. 2006a, A&A, 450, 959
Orienti, M., Morganti, R., & Dallacasa, D. 2006b, A&A, 457, 531
Phillips, R. B., & Mutel, R. L. 1980, ApJ, 236, 89
Phillips, R. B., & Mutel, R. L. 1982, A&A, 106, 21
Pihlström, Y. M., Conway, J. E., & Vermeulen, R. C. 2003, A&A, 404, 871
Polatidis, A. G., & Conway, J. E. 2003, PASA, 20, 69
Rajaniemi, H. J., & Mähönen, P. 2002, ApJ, 566, 202
Ricci, R., Prandoni, I., Gruppioni, C., Sault, R. J., & De Zotti, G. 2004, A&A, 415, 549
Schombert, J. M., Wallin, J. F., & Struck-Marcell, C. 1990, AJ, 99, 497
Siebert, J., Brinkmann, W., Drinkwater, M. J., et al. 1998, MNRAS, 301, 261
Siemiginowska, A., Aldcroft, T. L., Bechtold, J., et al. 2003, PASA, 20, 113
Snellen, I. A. G., Zhang, M., Schilizzi, R. T., et al. 1995, A&A, 300, 359
Snellen, I. A. G., Lehnert, M. D., Bremer, M. N., & Schilizzi, R. T. 2002a, MNRAS, 337, 981
Snellen, I. A. G., McMahon, R. G., Hook, I. M., & Browne, I. W. A. 2002b, MNRAS, 329, 700
Stanghellini, C. 2003, PASA, 20, 118
Stanghellini, C., Baum, S. A., O'Dea, C. P., & Morris, G. B. 1990, A&A, 233, 379
Stanghellini, C., O'Dea, C. P., Baum, S. A., & Laurikainen, E. 1993, ApJS, 88, 1
Stanghellini, C., O'Dea, C. P., Baum, S. A., et al. 1997, A&A, 325, 943
Stanghellini, C., O'Dea, C. P., Dallacasa, D., et al. 1998, A&AS, 131, 303
Stanghellini, C., Dallacasa, D., O'Dea, C. P., et al. 2001, A&A, 379, 870
Stanghellini, C., O'Dea, C. P., Dallacasa, D., et al. 2005, A&A, 443, 891
Steppe, H., Jeyakumar, S., Saikia, D. J., & Salter, C. J. 1995, A&AS, 113, 409
Tinti, S., & de Zotti, G. 2006, A&A, 445, 889
Tinti, S., Dallacasa, D., de Zotti, G., Celotti, A., & Stanghellini, C. 2005, A&A, 432, 31
Torniainen, I., Tornikoski, M., Teräsranta, H., Aller, M. F., & Aller, H. D. 2005, A&A, 435, 839 (Paper I)
Torniainen, I., Tornikoski, M., Lähteenmäki, A., et al. 2007, A&A, 469, 451
Tornikoski, M., Lainela, M., & Valtaoja, E. 2000, AJ, 120, 2278
Tornikoski, M., Jussila, I., Johansson, P., Lainela, M., & Valtaoja, E. 2001, AJ, 121, 1306
Verkhodanov, O. V., Trushkin, S. A., Andernach, H., & Chernenkov, V. N. 1997, ed. Gareth Hunt & H. E. Payne, ASP Conf. Ser., 125, 322
Véron-Cetty, M.-P., & Véron, P. 2003, A&A, 412, 399
Véron-Cetty, M.-P., & Véron, P. 2006, A&A, 455, 773
Vink, J., Snellen, I., Mack, K.-H., & Schilizzi, R. 2006, MNRAS, 367, 928
Visvanathan, N., & Wills, B. J. 1998, AJ, 116, 2119
Wills, B. J., Wills, D., Breger, M., Antonucci, R. R. J., & Barvainis, R. 1992, ApJ, 398, 454
Wright, E. L. 2006, PASP, 118, 1711
Xiang, L., Dallacasa, D., Cassaro, P., Jiang, D., & Reynolds, C. 2005, A&A, 434, 123
Xiang, L., Reynolds, C., Strom, R. G., & Dallacasa, D. 2006, A&A, 454, 729
Zukowski, E. L. H., Kronberg, P. P., Forkert, T., & Wielebinski, R. 1999, A&AS, 135, 571

6 Online Material

$\begin{figure} \par\includegraphics[angle=90,width=16.4cm,clip]{9222fig2.eps}\end{figure}$

Figure 2: U-matrix and the component planes of all parameters when all the sources in the sample are analyzed.

$\begin{figure} \par\includegraphics[angle=90,width=16.4cm,clip]{9222fig3.eps}\end{figure}$

Figure 3: Sources marked by their auxiliary classification on the grid of SOM.

$\begin{figure} \par\includegraphics[angle=90,width=16cm,clip]{9222fig4.eps}\end{figure}$

Figure 4: VLBI morphology of the sources, cso = compact symmetric object (cyan), cj = core-jet object (yellow), cd = compact double (light green), cx = complex (blue), unres = unresolved (red), unknown = no observations (pink), ct = compact triple (light blue), s = stellar (orange), gl = gravitational lens (violet), ln = linear (purple). The number in parentheses gives the number of sources with the related morphology, and the size of the dot the total number of sources in the neuron.

$\begin{figure} \par\includegraphics[angle=90,width=16.4cm,clip]{9222fig5.eps}\end{figure}$

Figure 5: Sources marked by their optical identification on the grid of SOM.

$\begin{figure} \par\includegraphics[angle=90,width=16.4cm,clip]{9222fig6.eps}\end{figure}$

Figure 6: Source names and the cluster numbers on a map of the clusters $N_{\rm clust} = 16$ .

$\begin{figure} \par\includegraphics[width=17cm]{9222fi7a.eps} \end{figure}$

Figure 7: Radio spectra of sources with the fitted curves. The median values of each data bin are marked with red colour.

$\begin{figure} \par\includegraphics[width=17cm]{9222fi7b.eps} \end{figure}$

Figure 7: continued.

$\begin{figure} \par\includegraphics[width=17cm]{9222fi7c.eps} \end{figure}$

Figure 7: continued.

$\begin{figure} \par\includegraphics[width=17cm]{9222fi7d.eps} \end{figure}$

Figure 7: continued.

$\begin{figure} \par\includegraphics[width=17cm]{9222fi7e.eps} \end{figure}$

Figure 7: continued.

$\begin{figure} \par\includegraphics[width=17cm]{9222fi7f.eps} \end{figure}$

Figure 7: continued.

$\begin{figure} \par\includegraphics[width=17cm]{9222fi7g.eps} \end{figure}$

Figure 7: continued.

$\begin{figure} \par\includegraphics[width=17cm]{9222fi7h.eps} \end{figure}$

Figure 7: continued.

$\begin{figure} \par\includegraphics[width=17cm]{9222fi7i.eps} \end{figure}$

Figure 7: continued.

$\begin{figure} \par\includegraphics[width=17cm]{9222fi7j.eps} \end{figure}$

Figure 7: continued.

$\begin{figure} \par\includegraphics[width=17cm]{9222fi7k.eps} \end{figure}$

Figure 7: continued.

$\begin{figure} \par\includegraphics[width=17cm]{9222fi7l.eps} \end{figure}$

Figure 7: continued.

$\begin{figure} \par\includegraphics[width=17cm]{9222fi7m.eps} \end{figure}$

Figure 7: continued.

Cluster	1	2	3	4	5	6	7	8
Sources	B0018+729	B0204-306	B0159+839	B0431-026	B0022-423	B0000+212	B0026+346	B1120-274
	B0019-000	B0316+162	B0326+349	B0700+470	B0105-122	B0424+328	B0116+319	B1324+574
	B0238-084	B0359-294	B0621+446	B1054+004	B0144+209	B1404+286	B0428+205	B1358+624
	B0240-217	B0405-280	B1502+036	B1107+109	B0207-224	B1509+054	B1031+567	B2342+821
	B0320+053	B0405-395	B1600+335	B1132-000	B0208+040	B1601-222	B1117+146
	B0507+179	B0439-337	B2053-201	B1200+045	B0703+468	B1622+665	B1323+321
	B0651+410	B0454-088		B1347-218	B0706+460	B1732+094	B1345+125
	B0910+151	B0554-026		B1349+027	B0741-063	B1934-638	B1638+124
	B0941-080	B0602+780		B1548-302	B0802+103		B2021+614
	B1107-187	B0904+039		B1714+193	B0914+114		B2210+016
	B1144+352	B1042-269		B1726+769	B1133+432		B2352+495
	B1540-077	B1343-300		B2126-185	B1225+368
	B1824+271	B1410+138		B2153-119	B1350+113
		B1444-339		B2353+816	B1433-040
		B1553-062			B1503-091
		B1751+278			B1518+046/7
		B2050+364			B1545-120
		B2121-014			B1557-004
		B2323+790			B1646+028
					B2008-068
					B2055+055
					B2154-183
					B2201+098
					B2322-040
					B2333-528
					B2337-063
Cluster	9	10	11	12	13	14	15	16
Sources	B0902+490	B0002+051	B0500+019	B0048-097	B0039+230	B0153+744	B0332-403	B0108+388
	B0930+493	B0034+078	B0552+398	B0113+241	B1645+635	B0201+113	B0400+258	B0457+024
	B1057-797	B0711+356	B0743-006	B0354+231	B2254+024	B0215+015	B0454-234	B0636+680
	B1323+799	B0718+374	B0923+392	B0437-454		B0218+357	B0528-250	B0642+449
	B1354-174	B1043+066	B1355+441	B0516+087		B0237-233	B0537-441	B0710+439
	B1604+315	B1100+223	B1607+268	B0646+600		B0248+430	B0633+595	B1143-245
	B2254-204	B1734+508	B2019+050	B1013+054		B0404+768	B0858-279	B1427+109
		B1853+376	B2134+004	B1039+811		B0528+134	B1144+542	B1526+670
		B2128+048	B2149+056	B1118-056		B0738+313	B1146+531	B1614+051
				B1148-171		B0742+103	B1333+459	B1843+356
				B1349-439		B1127-145	B1334-127	B1848+283
				B1357+769		B1245-197	B1519-273	B2000-330
				B1455+080		B1422+231	B1758+388	B2126-158
				B1601+112		B1442+101	B1839+389	B2337+264
				B1749+096		B1543+005	B1936-155
				B1803+784		B2022+171	B2008-159
				B1807+170		B2136+141	B2112+283
				B1851+488			B2121+053
				B1954-388			B2128-123
				B2007+777			B2255-282
				B2059+034
				B2205+166
				B2209+236
				B2236+124
				B2318+049
				B2327+335
