Identification of filamentary structures in the environment of superclusters of galaxies in the Local Universe

The characterization of the internal structure of the superclusters of galaxies (walls, filaments and knots where the clusters are located) is paramount for understanding the formation of the Large Scale Structure and for outlining the environment where galaxies evolved in the last Gyr. (i) To detect the compact regions of high relative density (clusters and rich groups of galaxies); (ii) to map the elongated structures of low relative density (filaments, bridges and tendrils of galaxies); (iii) to characterize the galaxy populations on filaments and study the environmental effects they are subject to. We employed optical galaxies with spectroscopic redshifts from the SDSS-DR13 inside rectangular boxes encompassing the volumes of a sample of 46 superclusters of galaxies, up to z=0.15. Our methodology implements different classical pattern recognition and machine learning techniques pipelined in the Galaxy Systems-Finding algorithm and the Galaxy Filaments-Finding algorithm. We detected in total 2,705 galaxy systems (clusters and groups, of which 159 are new) and 144 galaxy filaments in the 46 superclusters of galaxies. The filaments we detected have a density contrast above 3, with a mean value around 10, a radius of about 2.5 Mpc and lengths between 9 and 130 Mpc. Correlations between the galaxy properties (mass, morphology and activity) and the environment in which they reside (systems, filaments and the dispersed component) suggest that galaxies closer to the skeleton of the filaments are more massive by up to 25% compared to those in the dispersed component; 70 % of the galaxies in the filament region present early type morphologies and the fractions of active galaxies (both AGN and SF) seem to decrease as galaxies approach the filament. These results suggest that preprocessing in large scale filaments could have significant effects on galaxy evolution.


Introduction
The Large Scale Structure (LSS) of the Universe is composed of a network of groups and clusters of galaxies, elongated filaments, widely spread sheets, and voids (e.g. Peebles, 1980;Davis et al., 1982;Bond et al., 1996). Both, the ΛCDM cosmological model (e.g. Bond & Szalay, 1983;Doroshkevich & Khlopov, 1984) and recent numerical N-body simulations (e.g. Millennium, Springel et al., 2005;Bolshoi, Klypin et al., 2011;Illustris, Vogelsberger et al., 2014), reinforce that these structures are assembled under the effect of gravity, generated by the total matter content. Since the baryonic matter follows, to first order, the distribution of the dark matter, the galaxies and gas populate these substructures accordingly (e.g. Eisenstein et al., 2005). Moreover, there is increasing evidence that the galaxy properties (for instance mass, activity, morphology, luminosity, surface brightness, orientation, etc.) correlate with the LSS environment in which they are located (e.g. Smargon et al., 2012;Scoville et al., 2013;Poudel et al., 2016;Kuutma et al., 2017;Chen et al., 2017;Wang et al., 2018) or, more specifically, with the internal structure of the supercluster (e.g. Einasto et al., 2008;Gallazzi et al., 2009;Gavazzi et al., 2010;Cybulski et al., 2014;Guglielmo et al., 2018). Furthermore, theoretical studies (e.g. Cen & Ostriker, 1999) suggest that from one half to two thirds of the baryonic matter in the Universe is hidden in the filamentary structures of the LSS. Therefore, the characterization of the LSS (e.g. topology, density, temperature, dynamical state, matter distribution and its evolution along time) is an important step to place constraints on the current cosmological models.
Galaxy clusters are well studied through their gas component since they are the densest regions of the LSS. However, the gas in filaments is most likely in a not so hot (T ∼ 10 5 -10 7 K, or 0.01-1 keV) and relative low-density gas phase called Warm Hot Intergalactic Medium (WHIM). There is already some evidence of such gas from X-ray emission observed within pairs of close clusters (e.g. Ursino et al., 2015;Alvarez et al., 2018). In addition, the WHIM between pairs of clusters has been observed through the Sunyaev-Zel'dovich effect (SZ, e.g. Planck Collaboration et al., 2013). Tanimura et al. (2018) carried out statistical analyses using Planck SZ observations in the regions of superclusters. Their results show evidence of inter-cluster gas Santiago-Bautista et al.: Identification of filamentary structures in the Local Universe of temperature T ∼ 8×10 6 K. Also, Eckert et al. (2017) presented deep X-ray observations of the galaxy cluster Abell 2744, the analysis of which suggests a gas fraction of 5% to 15% for the filaments that surround the cluster and a plasma temperature of 1-2 × 10 7 K. Therefore, the characterization of these structures through observables like X-ray emission or Sunyaev-Zel'dovich effect is still challenging due to the low density and temperature of the WHIM.
An alternative is to analyze the galaxy distribution at large scales. Recently, with the availability of large sky area databases such as the Two Degree Field Redshift Survey (2dFRS, Colless et al., 2001), the 2MASS Redshift Survey (2MRS, Huchra et al., 2012) and the Sloan Digital Sky Survey (SDSS, Albareti et al., 2017), the development of accurate structure detection algorithms has become an even more important concern for astronomy. Visually, the galaxy distribution shows filamentary ridge-like structures that connect massive clusters and groups. However, the identification of these structures through a computational algorithm is not easy to achieve. A good algorithm should first produce an identification that resembles the human visual perception. It also should deliver quantitative results and be founded in a robust and well-defined numerical theory. All of this must be done in an acceptable amount of time with reasonable computational resources.
Currently, there are several filaments-finding algorithms that have been tested on the basis of N-body simulations. For example, Aragón-Calvo et al. (2007) present the "multi-scale morphology filter method" (MMF) that divides cosmic structure into nodes (clusters), filaments and walls by using a smoothing over a range of scales (from a Delaunay tessellation reconstruction, DFTE) and a morphological response filter. Another approach presented by Aragón-Calvo et al. (2010) makes use of a watershed segmentation techniques to trace the spines of the filaments. Also, Cautun et al. (2013) propose an algorithm that takes into account the density, tidal field, velocity divergence and velocity shear of the galaxies, called NEXUS. Other examples are the algorithms by González & Padilla (2010), which uses the binding energy for selecting the filament members; and the DisPerSE algorithm, by Sousbie (2011), based on the Morse theory -both utilize Delaunay-Voronoi tessellation based on density estimations.
On the other hand, several attempts were made to trace the distribution of the real cosmic web using the SDSS database. For example, the algorithm by Bond et al. (2010), called "smoothed Hessian major axis filament finder" (SHMAFF), was applied to the SDSS-DR6 after removing the "finger-of-God" (FoG) effect. Platen et al. (2011) compared three different reconstruction techniques, namely the DFTE, the "natural neighbor field estimator" (NNFE) and a Kriging interpolation, and searched for voids also in DR6. They found that DFTE works quantitatively better than the others while the Kriging and NNFE have a better performance in producing visually appealing reconstructions than DFTE. Smith et al. (2012) applied their "multi scale probability mapping" (MSPM), which combines probability and scale density information with a "Friends-of-Friends" (FoF) algorithm, over the SDSS-DR7 galaxies. Their method allowed them to recover structures from clusters to filaments of up to ∼10 h −1 Mpc. Tempel et al. (2014) applied a Bisous model on the SDSS-DR8 spectroscopic galaxies to trace the filament spines. Their method adjusts cylinders to the galaxy positions applying a stochastic metric. The "subspace constrained mean shift" (SCMS) approach, which uses a "kernel density estimator" (KDE), was applied by Chen et al. (2016) to DR7 and by Chen et al. (2015) to DR12. This method allows the identification of high density regions by smoothing the galaxy distribution. They apply this technique over slices of 0.05 in redshift for the SDSS sky area. Moreover, Alpaslan et al. (2014) found, for the GAMA survey, that there are fine filaments embedded inside the SDSS voids. These structures, 'tendrils', have a lower density than the SDSS filaments and appear to be morphologically distinct, they are more isolated and span shorter distances. A comprehensive review and comparative analysis of these algorithms can be found in Libeskind et al. (2018).
Another approach to analyze the LSS structure is to study the superclusters of galaxies. These are traditionally defined as concentrations of galaxy clusters (e.g. Abell, 1961;Einasto et al., 2001;Chow-Martinez et al., 2014), building up the cosmic web from a network of connected high density nodes; or directly from the distribution of galaxies (e.g. Luparello et al., 2011;Costa-Duarte et al., 2011;Liivamägi et al., 2012). They can also be defined kinematically by mapping galaxy peculiar velocity flows, a technique still restricted to the very nearby Universe (Tully et al., 2014;Dupuy et al., 2019). This last method is the closest to a purely gravitational potential based approach, and allows the identification of the "basins of attraction" that partition the Universe in cells or cocoons (e.g. Dupuy et al., 2019;Einasto et al., 2019). For this work we adopted the supercluster secondorder clustering definition for determining the superclusters of the sample. These systems are not virialized and the contents of the inter-cluster medium (dark matter halos, gas and galaxies) dynamically interact and organize by falling through the gravitational potential of the more massive structures, forming walls, filaments, groups and clusters. As shown in Tanaka et al. (2007), the possibility to find elongated chain-like structures increases in superclusters. Also, following the classification of superclusters by Einasto et al. (2014) in "filament-type" and "spider-type" ones, both have filaments, in a linear or radial configuration respectively. Following this approach, Cybulski et al. (2014), for instance, applied a combination of Voronoi tessellation and minimum spanning tree (MST) techniques over the Coma supercluster region in order to search for bridges between clusters of galaxies.
Motivated by the above context, we developed a methodology 1 for the identification of structures in the environment of superclusters using the galaxies embedded in them. We restrict our study to the SDSS-DR13 area, and use only galaxies with spectroscopic redshifts for our analysis. The approach we follow seeks to detect structures by using only the geometrical information of the galaxy distribution. By using different pattern recognition methods we identify high to moderatedensity galaxy systems, and low-density filaments connecting them. This allows the identification of structures over a wide range of scales (1 − 100 Mpc), from groups to long filaments. Moreover, the identified structures are validated through comparisons with previously reported catalogs. We also carried out a qualitative validation through a kernel method which is one of the most used methodologies for the detection of overdensity regions. Our aim was to investigate whether previous filament candidates in the sample, identified from chains of Abell/ACO clusters, are bona-fide structures and to characterize their galaxy populations. Finally, we studied the relation between the galaxy properties and the supercluster environment in which they reside (e.g. systems, filaments and the dispersed component of the superclusters).
This paper is organized as follows: In Section 2 we present the data for the sample of superclusters under analysis and the sample of galaxies from the SDSS survey. In Section 3 we describe in detail the implementation of mathematical tools and pattern recognition methods applied for the detection of high density regions (clusters and groups) and for the skeletonization of the low density filamentary structures. In Section 4 we describe the algorithm for detecting clusters and groups of galaxies inside superclusters' boxes, while in Section 5 we present the algorithm for finding the filaments and their skeletons. In Section 6 we describe the application of the algorithms to one of the superclusters, MSCC 310, as an example of their use. Section 7 is devoted to the validation and evaluation of the methodology and discussion of its results. In Sections 8 and 9 we present the results concerning the analyses of the galaxy properties as function of the supercluster environment. We also discuss these results and compare with previously reported results. Finally, in Section 10 we present the conclusions of this work. Through this paper we assume the Hubble constant H 0 = 70 h 70 km s −1 Mpc −1 , the matter density Ω m = 0.3, and the dark energy density Ω Λ = 0.7.

The superclusters and filament candidates
We are interested in unveiling and studying LSS filaments, which can be defined as chains of clusters connected by bridges of galaxies and probably by gas and dark matter. As mentioned previously, these elongated structures should most likely be found in superclusters since they probably just passed the quasi non-linear regime described by Zel'dovich's approximation (1970, see also the "sticking model" by Shandarin & Zel'dovich 1989). In the current evolutionary stage of LSS, superclusters are basically a network of sheets, filaments and knots (clusters and groups) of galaxies, gas and dark matter, just starting a global gravitational collapse process.
Thus, we selected a sample of superclusters of galaxies from the Main SuperCluster Catalogue (MSCC, Chow-Martinez et al., 2014), which are inside the SDSS region (in order to have a sample of galaxy data as homogeneous as possible). The original MSCC is an all-sky catalog that contains 601 superclusters, identified in a complete sample of rich Abell/ACO clusters, with updated redshifts from 0.02 to 0.15, by using a tunable FoF algorithm. From these superclusters, 166 are inside the SDSS Data Release 13 (DR13) region. For this work we selected those superclusters with 5 or more clusters having their box volume (see below) inside the SDSS-DR13 survey area. In addition, we used as reference the list of filament candidates for MSCC superclusters by Chow-Martínez et al. (in preparation), in order to select the superclusters with the most promising filaments. Roughly speaking, these filament candidates were identified as chains of at least three clusters, members of the superclusters, separated by less than 20 h −1 70 Mpc from each other. The present work also intends to validate these filament candidates by searching for the bridges of galaxies that we expect to connect them. It is worth mentioning that some of the filament candidates may reveal to be only chance configurations, with no bridges of galaxies connecting the clusters of a chain. Also, some bridges may exist, but not necessarily along the straight lines connecting the clusters.
Our final sample consists of 46 superclusters of galaxies, which are listed in Table 1. The ID of the supercluster in MSCC is listed in column 1, with its proper name in column 2, when it exists. Column 3 presents sky coordinates, RA (α) and Dec (δ), of the supercluster mean position, while column 4 shows its mean redshift. Columns 5 and 6 list the richness (number of member clusters) and the number of filament candidates found previously in each supercluster. The IDs of the Abell/ACO member clusters are listed in column 7.
For the Abell/ACO clusters and for the galaxies in the superclusters box volumes (see Section 2.3), we first transformed their radial-angular coordinates to rectangular coordinates as follows: where D C is the co-moving distance as obtained using the spectroscopic redshift and the cosmological parameters indicated above.

The SDSS galaxies
The main galaxy sample of SDSS-DR13 (Albareti et al., 2017) is a suitable database to search for filamentary structures on the LSS since: (i) it covers a large sky area (14,555 square degrees), containing various MSCC superclusters; (ii) it contains homogeneous photometric and spectroscopic data for galaxies with an astrometric precision of 0.1 arcsec rms and uncertainty in radial velocities of about 30 km s −1 (Bolton et al., 2012); (iii) it is roughly complete to the magnitude limit of the main galaxy sample (r Pet = 17.77), which corresponds to an average z ∼ 0.1, going (inhomogeneously) deeper for data releases after DR7 (Abazajian et al., 2009); (iv) at the limit of our sample, z = 0.15, the SDSS spectra are complete for galaxies brighter than M r ∼ −21. SDSS-DR7 joins the SDSS-I/II spectra for one million galaxies and quasars. It has ∼6% incompleteness due to fiber collisions (Strauss et al., 2002) and another ∼7% incompleteness attributed to pipeline misclassification (Rines et al., 2007). These spectra are included in the final data release of the SDSS-III (Alam et al., 2015). The Baryon Oscillation Spectroscopic Survey (BOSS) is part of the SDSS-III observations and has obtained spectra for another 1.4 million galaxies. The BOSS observations are divided in two main samples, LOWZ (z < 0.4) and CMASS (0.4 < z < 0.7). The SDSS-DR13 (Albareti et al., 2017) includes spectra for more than 2.6 million galaxies and quasars.
Although photometric redshifts are available for SDSS galaxies, for this work we have selected those objects listed on the SpecObj sample with spectroscopic redshifts available (downloaded from the SkyServer web service) and denoting an extragalactic object (that is, galaxies and low-z quasars). The SpecObj table contains the best and unique spectra for the same location within 2 arcsec called "sciencePrimary" objects. We considered galaxies within a redshift range from 0.01 to 0.15 and selected spectra with quality flag "good" or "marginal". Since the kind of study presented here relies on the galaxy distance measurement, we restricted our analysis to galaxies with spectroscopic redshift due to its higher accuracy. However, galaxies with photometric redshift can be included to the sample in further analyses to test if their addition increase the filament signal of detection.
For the present work we also made use of value-added sub product catalogs such as the MPA-JHU catalog (Brinchmann et al., 2004;Kauffmann et al., 2003;Tremonti et al., 2004). They calculated different galaxy properties (stellar mass, metallicity, activity type classification, star forming rate, among others) using the spectra from the SDSS-DR8 galaxies (Aihara et al., 2011). As explained by Tremonti et al. (2004), the galaxy properties in the MPA-JHU catalog are calculated by processing the galaxy spectrum in a way that even the weaker emission lines are detectable. In order to analyze the morphological distribution of the galaxies in the different supercluster environments we employed the morphological classification provided by Huertas-Company et al. (2011). They calculate a probabilistic morphological classification, for the SDSS-DR7 spectroscopic galaxies, by applying deep learning techniques that make use of their photometry. They also compare their automated classification with a sample of the Galaxy Zoo (Lintott et al., 2008(Lintott et al., , 2011 visual classification. They show that their classification in early and late types are in good agreement with the visual classification.

The superclusters' boxes
For each supercluster in Table 1 we selected all the SDSS galaxies (according to the above criteria) located inside the corresponding box volume. These boxes were defined in rectangular coordinates, in a way that their walls were set at a distance of 20 h −1 70 Mpc beyond the center of the farthest clusters in each direction, for each supercluster. This extension was applied in order to guarantee that any connection of the supercluster with external structures could be detected.  , the boxes we use here are a little bit larger, as expected, implying that we are sampling the LSS in a general way, not restricting the analysis to the densest parts of the superclusters. Our sampling of the superclusters may be compared to the one by Krause et al. (2013), that is, more embracing than the sampling done by, for example, Kopylova & Kopylov (2006) and Liivamägi et al. (2012).
In Table 2 we list the box volume (column 2), the number of galaxies inside this volume (column 3), its mean volume and surface (sky projected) number densities (columns 4 and 5), the baseline density (column 6), the number of galaxies with surface density above the baseline density (column 7), the segmentation parameter f and the number of HC groups (see Section 3.2 for details) (columns 8 and 9), the remaining FoG corrected groups of richness between 5 and 9 galaxies (column 10) and of richness higher or equal to 10 galaxies (column 11), the ranges of radius and velocity dispersion for this final list of groups (columns 12 and 13), for each supercluster.
In particular, the superclusters MSCC 236, MSCC 314 and MSCC 317 lie close to the limits of the SDSS region: although all their member clusters are inside, their boxes were reduced to a margin of 10 h −1 70 Mpc in only one direction. Figure 1 shows the diminution of mean volume density of the boxes with redshift, due to Malmquist bias. The fitted function will be used as the selection function for the SDSS galaxies considered in this work. It may be noted that the mean densities of the superclusters MSCC 55 and MSCC 579 lie far below the fit. This is also due to the positions of these superclusters close to the border of SDSS coverage: their samplings seem sparse and irregular. In fact, for MSCC 579 one can clearly see the shape of the cones of observation through the galaxy distribution. For this reason, the analysis of these superclusters and of the three cited above must be taken with caution.
It is worth noting that, since we have only the radial velocity component available (redshift), the transformation from radialangular coordinates to rectangular coordinates is more complicated for the galaxies. Their peculiar velocity may bias their redshift-space coordinate, especially when they are members of clusters and groups of galaxies, being subject to the FoG effect. Thus, for the galaxy data used to detect the filaments, we first applied a correction, to be described below, which redefined their individual D C in equations 1-3.

Mathematical tools
In what follows, we considered the N galaxies in each supercluster volume as a set of points x 1 , x 2 , ..., x N ∈ X, all part of a sample X.

Voronoi Tessellation (VT)
The Voronoi tessellation (Voronoi, 1908) of a sample X, Vor(X), can be defined as the subdivision of a 2D plane or a 3D space into cells with the property that the seed point x i ∈ X is located in the cell v i if and only if the Euclidean distance In other words, the VT divides the space into polygonal cells, centered on the seed points (in our case, galaxies), in a way that the cell walls are equidistant to all nearest seeds (e.g. Platen et al., 2011). Therefore, the density at each galaxy position x i is determined as d i = 1/v i , with v i being the volume (or area) of the cell enclosing the object x i . Scoville et al. (2013) and Darvish et al. (2015), for instance, use VT to find the high density regions in sky slices while Cybulski et al. (2014) apply VT to identify the filamentary structures in the Coma cluster region.

Hierarchical Cluster Analysis (HC)
Hierarchical clustering is a machine learning method whose objective is to group objects with similar properties. It has been used in different areas of science such as artificial intelligence, biology, medicine and business. In general, it can be used to carry out pattern recognition analysis, allowing to regroup, segment and classify any kind of data. This method is equivalent to a reduction of the dimensionality of the data and reduces considerable the computing time. In astronomy, the most popular application of HC has been for the detection of substructures inside galaxy clusters, following the algorithm developed by Serna & Gerbal (1996). This algorithm considers the positions, redshifts and potential binding energy between pairs of galaxies to detect substructures (see also Guennou et al., 2014).
Since we are interested in finding galaxy structures on scales larger than the ones for substructures and for structures that may be less strongly gravitationally bounded, we chose to use an agglomerative hierarchical clustering analysis method that considers only the positions and redshifts or 3D estimated positions of galaxies. A detailed description of the HC algorithm can be found in Theodoridis & Koutroumbas (2009);Theodoridis et al. (2010) and Murtagh & Contreras (2011). For our analysis we chose Ward's minimum variance clusterization criteria, described in detail by Murtagh & Legendre (2014). In general, Ward's method works by merging the groups following the criterion: where ∆D is a term that measures the distance between two groups c 1 and c 2 , respectively. In our case, initially each point is considered as a group, subcluster or singleton, then each group can be agglomerated with a neighbor that has the minimum ∆D distance. The agglomeration continues until all points are grouped together.
The results of the HC clusterization can be represented by a dendrogram or hierarchical tree. A dendrogram represents, in a graphical form, the connections between elements and groups at different levels of agglomeration. The height of each connection line in the tree corresponds the distance between two elements or centroids connected. This representation also allows visualizing the principal branch structures where the singletons are the final leaves. The number of desired groups N cut is, therefore, obtained by cutting the hierarchical tree at a certain level. The exact value of this level depends on the characteristics of the sample or, more properly, on the underlying physics that is used to define the groups.
Each created group can be represented by a 2D/3D Gaussian model, P j (x). This allows to classify the groups by their Gaussian properties, e.g. centroid (mean position, C j ), richness (number of members, N j ) and compactness (covariance, σ j ).

Graphs
Graph theory-based algorithms have shown to be a suitable tool to analyze complex networks. Some of the most common subjects where these algorithms are applied successfully are social networks, computer vision, statistics, business and transportation networks.
A graph is a representation of the connections in a network. It is composed of "nodes" and "edges", where each node represents an object, and the edges represents the connections between each two nodes. Also, the edges can have weights that represent the strength of the connection. An undirected graph has edges that do not have direction. We can define an undirected graph as G = (U, E, W), with n nodes (or vertices) u i ∈ U, m edges e kl ∈ E and a weight set W with a w kl for each edge e kl . The information of a graph can be represented by a square adjacency matrix. The values of the matrix entries indicate the weight of the connection between nodes. Hence, the adjacency matrix A of the graph G is defined as: where u k and u l are nodes in G. One can refer to Ueda & Itoh (1997) for a discussion about the use of graph theory approach for quantifying the LSS of the Universe.

Minimum Spanning Tree (MST)
A spanning tree connects all nodes in a graph in a way that does not produce cycles. A graph can contain several unconnected spanning trees. Since the edges in a graph can have weights, the minimum spanning tree algorithm (Graham & Hell, 1985) searches for a spanning tree that minimizes the total weight. This algorithm traces a tree-like continuous path for a group of edges and nodes in an optimal way. In particular, the Kruskal minimum spanning tree algorithm analyzes the edges in sequence, sorting them by weight. At the beginning, the shortest edge is analyzed and this would be the first tree branch. Then, the nodes are added to the tree under three conditions: (i) only one node is added to the tree; (ii) a node is added based in the number of connected edges; (iii) their edges cannot be connected to another existing node in the tree. The process continues with the following edges in the graph until all connected edges are analyzed. Finally, the tree is extracted from the graph and the process begins again with the remaining nodes until all are tested. As its name remarks, the result is a forest of optimized independent trees.

Dijkstra's shortest path
Dijkstra's algorithm (Dijkstra, 1959) is a classical method for searching the shortest path between two nodes in a graph. We define a path of length e kl between two nodes u k and u l as a sequence of connected nodes u 1 , u 2 , ..., u n if k l ∀ k, l ∈ 1, ..., n.
In general Dijkstra's algorithm works as follows.
First, an origin is selected by taking the node at the beginning of the path, u 0 . Then, a distance value is assigned to all nodes: set as zero for the origin, s(u 0 ), and as infinity for all the other nodes, s(u i ) = inf. Next, all nodes are marked as unvisited and u 0 is marked as current a. The algorithm then calculates the distance from the current node a to all the unvisited nodes connected by the edges e i as s new = s(e ai ) + w ai ; here s(e ai ) is the distance from a to the node u i and w ai is the weight of the edge e i . If s(e ai ) + w ai < s(e i ), then the distance is updated and the connected node label is updated as the current a. After visiting all neighbors of the current node, they are marked as visited. A visited node will not be checked again; then the recorded distance s(e ai ) is final and minimal. Finally, if all nodes have been visited, the algorithm stops. Otherwise, the algorithm sets the unvisited nodes with the smallest distance (from the initial node u 0 , considering all nodes in the graph) as the next "current node" and continues from the second step. A detailed description of the algorithm can be consulted in Santanu (2014).

Kernel Density Estimator (KDE)
As mentioned before, VT is used to measure the local density at each point position. However, in some cases, it fails on the identification of large overdensity regions, as mentioned by Cybulski et al. (2014). An alternative for the VT method is to apply kernel density estimators. In general, KDE methods work by adjusting a kernel function over each observation in the sample. However, the choice of the optimal kernel model and its intrinsic parameters is still under investigation in the pattern recognition community. Also, there are several attempts to apply adaptive Gaussian model kernels, in other words, to change the size of Gaussian model as a function of different parameters, for instance the distance to the nearest neighbor (Chen et al., 2016) or a weighting function (Darvish et al., 2015).
For this work we used the results from the VT method (see Section 3.1) as the input parameters for the KDE. We start by fitting an ellipsoid inside each VT cell. Thus, instead of choosing a fixed bandwidth for the kernel, we employ the eigenvalues and eigenvectors of the ellipsoids to calculate a Gaussian kernel φ Σ centered at µ with covariance matrix Σ for each observation. Therefore, each n-dimensional kernel is represented as: Then the KDE can be estimated as: where α i is a weight factor calculated from the VT cell volume (v i ) as 1/v i . The identification of the overdensity regions is done through the projection of KDE kernels in 2D planes superposing a regular rectangular grid to the data. Thus, the density estimation is obtained at each grid intersection by calculating the average density of all kernels that overlap at that point. Then, observations closer to an evaluating point will contribute more to the density estimation than points farther away from it. Consequently, the density will be higher in areas with many observations than in areas with few observations.

Transversal profiles
The distribution of galaxy properties in filaments is analyzed by constructing transversal profiles. These profiles are calculated by setting up a series of concentric cylinders with axes orientated along the filament skeletons. Then, a bin is considered to be the volume within two concentric cylinders of radius R cy and R cy + ∆R cy . The occurrence of a galaxy proxy in each bin is determined with respect to the galaxy distance from the filament skeleton D ske . The total count of galaxies per bin is weighted by the bin volume, in a similar way as making a normalized histogram. In order to compare samples of different sizes, a normalization is applied by dividing the number of events in a bin by the total number of galaxies in the sample.

Detection of high density regions
We first searched for the high relative density regions, clusters and groups of galaxies (which we will refer generically as galaxy systems), inside the studied superclusters, since these systems are the natural nodes for filaments. This was also necessary for correcting the FoG effect and having the data prepared for the application of the filaments-finding algorithm (see next Section). Furthermore, the detection of galaxy systems allows the identification of new ones possibly not known before (especially poorer galaxy groups), and the improvement of the membership estimation of the superclusters themselves.
A description of the algorithm, including the strategy we used for optimizing its parameters using simulated mock volumes, is presented in Santiago-Bautista et al. (2019). Here we review the main steps of this algorithm. First we calculate the local surface density at the position of each galaxy in the projected area of the supercluster by applying the VT method (section 3.1). The VT individual area of the galaxy can be directly converted to a surface density estimation (d i = 1/a i ), in this case in units of deg −2 . It is worth to note that the boxes we have considered for the superclusters in our sample comprehend slices in redshift-space in a range between 0.02 ≤ ∆z ≤ 0.07.
In order to identify the galaxy systems, we start by applying the HC method (section 3.2), but only to the N gal galaxies with densities above a baseline density, d bas , which should be analogue to a background density. In a certain sense, this separates supercluster galaxies from void galaxies (that is, under-dense regions). This baseline is calculated from the mean density by randomizing the galaxies in each sky projected area. Since the distribution of points in the space is not isotropic, it is not possible to set directly a background density from the projected positions of the galaxies. Therefore, it is necessary to simulate an isotropic distribution of the points, in order to set the baseline value (see, e.g. Cybulski et al., 2014). A set of 1 000 randomization of the points positions is generated, each with the same sample number over the same area. Then, the mean surface density is calculated by: where d i, j = 1/a i, j corresponds to the inverse of the area of the point x i for the randomization j.
Since the distribution of galaxies is not homogeneous among the different boxes, we calculate independent baseline values for each supercluster, see Table 2. A density contrast (δ i ) is then calculated as: That is, the N gal galaxies to which we apply the HC are the ones with a density contrast, δ i > 0 (see Santiago-Bautista et al., 2019, for an evaluation of the negligible effect of slightly changing the density threshold).
Then, we apply HC to the set of parameters (RA, Dec, 1000z) for these galaxies (the factor of 1000 is the weight for z values to be comparable to the sky coordinates values). The number of groups taken from the analysis is defined as a cut of the HC tree, fixed to N cut = N gal / f , with a segmentation parameter f , which is the expected mean number of elements per group. Currently, the selection of the optimal number of groups in clusterization methods is still a topic under investigation in the pattern recognition community, which includes the HC algorithm. A specific value of f was calculated for each supercluster (3 ≤ f ≤ 36) according to the optimization process described in Santiago-Bautista et al. (2019). This strategy was adopted because a physically motivated value for f would depend on many parameters, like the density of galaxies in each box, the sampling of these galaxies with respect to the real distribution, the redshift, among others which are difficult to estimate for our data.
Finally, we select only those systems with a number of galaxies, N j , larger than two. These pre-identified systems are, then, subject to the next step of refinement: the iterative estimation of the dynamical parameters virial mass and radius.

FoG correction
After identifying the galaxy systems we proceed to refine the galaxy membership and correct the galaxy positions for the FoG effect by using a virial approximation. We apply a simplified version of the algorithm presented by Biviano et al. (2006) for the estimation of the virial mass and radius. We do not apply the surface pressure term correction based on the concentration parameter. Avoiding such a correction can lead to an overestimation of the virial radius, however, for this geometric analysis, only a virial approximation is enough.
The virial parameters calculation algorithm works as follows: First we take the projected center and mean velocity of the system from the results of HC (C j ). The projected center is then set at the position of the brightest r-band magnitude member galaxy (BMG) within 1 σ j from the HC center, while the HC mean velocity is used directly. Those galaxies expected to belong to the system are selected among all galaxies in the sample (those with spectroscopic redshift in SDSS-DR13) that are projected inside a cylinder of radius R a , hereafter, aperture. Biviano et al. (2006) show that the dynamical analyses are similar for different aperture sizes. We chose an aperture of R a = 1 h −1 70 Mpc.
Then, for the line-of-sight direction, we select galaxies with a difference in velocity up to S a = ±3000 km s −1 with respect to the mean cluster velocity. This would correspond to three times the velocity dispersion of a rich cluster. A robust estimation of mean velocity, v LOS , and velocity dispersion, σ v , for the galaxies inside the cylinder is obtained by using Tukey's biweight method (Beers et al., 1990). An approximation of the mass M a in the aperture is computed as: where G is the gravitational constant, the 3π/2 is the deprojection factor and R h is the projected harmonic radius. We calculate the virial radius, R 3 vir = (3/4π)(M vir /ρ vir ), by assuming a spherical model for nonlinear collapse, that is, by taking the virialization density as ρ vir = 18π 2 [3H 2 (z)]/[8πG], and M a as an estimation for M vir , we have: Then, the aperture R a is updated to the calculated R vir value, the mean velocity to v LOS , and S a to σ v , defining a new cylinder. This process is repeated iteratively until the radius R vir converges. M vir is finally calculated at the end of the iteration process.
The correction for the FoG effect is carried out by adjusting the position of the N mem galaxies inside the final cylinder. This is done by scaling their comoving distances along the cylinder to the calculated virial radius.
A schematic representation of the GSyF algorithm, including the FoG correction, can be found on the left side of Figure 2.

Detection of low density regions
Since we are interested in the detection and analysis of elongated and low relative density contrast structures, we apply again a combined VT+HC method to the data, but now in the rectangular 3D space, with the positions of the galaxies corrected for the FoG effect. Thus, the VT densities are now volume densities, in units of Mpc −3 . At the beginning of this analysis the HC method is applied to all galaxies in the volume without density restrictions, that is, no baseline is applied. Density restrictions are considered later as criteria for the construction of filaments. Another difference between this application of VT+HC and the one used for the GSyF methodology is a relaxed cut in the hierarchical tree. Since we are interested in detecting more elongated representative structures, we tested values for the segmentation parameter f between 10 and 40. The direct effect of relaxing the cut is to allow the detection of groups at lower densities.
Here we need to make some practical definitions in order to describe our strategy.
-The nodes to which we will apply the method correspond, in the context we are working, to the HC group centroids. -An edge is defined as any connection between two nodes. -The real "links" between the systems are defined as the most promising edges, filtered according to their proximity and density contrast. -Spanning trees are extracted as described in section 3.4, by cutting the graph in no-cycled optimal trees. Some nodes inside a spanning tree may have been detected as galaxy systems by the GSyF algorithm.
-A "bridge" is defined as a sequence of links and nodes between two systems. -A "filament" is identified if a spanning tree bridges three or more systems connected by bridges. -If the spanning tree contains none or only one system, it is called "tendril". -The "skeleton" is the medial line of a filament. The method for finding it, which intends to reduce the dimensionality of the objects (in our case, galaxy filaments), is known as skeletonization. Figure 3 shows schematically these definitions.

Chaining the filaments
Once we have applied HC we measure the Euclidean distance D E of each group' centroid (node) against all its group neighbors. These connections (edges) can be represented by an undirected graph as described in section 3.3. The weights W of the edges are set by the Bhattacharyya coefficient, BC, defined as: The Bhattacharyya coefficient quantifies the amount of overlapping between two distributions P 1 (x) and P 2 (x). Thus, the orientation of the two groups weights the connection between them.
In the next step, we filter the edges by two criteria: First we select the edges corresponding to a D E smaller than a threshold, D max (hereafter, linking length). Secondly, we consider an edge as a real link of galaxies based on the following: i) we define a cylinder along the edge with radius of 1 h −1 70 Mpc; ii) we measure the linear density of galaxies along the cylinder; iii) if the mean linear density of the cylinder is above d = N/V (Table 2) we take the cylinder as a link of galaxies connecting the two nodes.
Each ensemble of connected links is a tree in the forest graph. We then apply Kruskal's MST technique (section 3.4) on the forest graph to identify independent trees and their dominant branches.
To proceed we need to match the list of detected spanning trees with the list of detected GSyF systems. However, due to the effect of losing sampled galaxies with increasing redshift (see Figure 1), the richness of the detected systems depends on the redshift. In other words, to have a comparable richness for two similar system, for instance one at z = 0.03 and the other at z = 0.13, we have to apply a correcting factor to the richness of the second one. To overcome this limitation, we apply the following lower limit for the richness of the systems at the supercluster redshift: log 10 N min = a log 10 z + b, with a = −1.0 and b = −0.2. This leads to a lower richness limit of N min = 30 to 5 galaxies per system, from the nearest and farthest supercluster in our sample respectively.
Having now the systems and the bridges between them (instead of the nodes and edges in the previous step) we can identify the filaments. As stated above, and following the definition by Chow-Martínez et al. (in preparation), we search for the filaments which have at least 3 galaxy systems connected by bridges.
Although isolated bridges (that is, connecting only one pair of systems) and tendrils (connections between nodes with no system embedded) are important and are also a sub-product of the algorithm, we will focus our discussion hereafter only on the filaments. The connecting edges of these filaments are then refined using Dijkstra's algorithm (section 3.5). This refinement allows the identification of the filament skeleton, i.e. the principal branch connection. According to the pattern recognition literature, a skeleton represents the principal features of an object such as topology, geometry, orientation and scale. Figure 4 shows schematically the steps of the GFiF algorithm.
The results of the filaments-finding algorithm depend on several parameters, in particular the number of HC groups, N cut (or, equivalently, f ), and the linking length D max . Therefore, it is necessary to carry out a search for the optimal combination of these parameters. In addition, in order to find the longest filaments possible inside the supercluster volume, we search for the linking length that maximizes the number of filaments in the box, that is just before they begin to percolate. The optimization for these parameters is described in detail in Santiago-Bautista et al. (2019). We found that the optimized parameter f decreases with z from about 20 to 10 galaxies in the range covered by our sample, that is, depending on the sampling, a smaller density is found with increasing z. The D max , in turn, increases with z, in a way to compensate the decrease in f .
A schematic representation of the GFiF algorithm can be found on the right side of Figure 2.

Detection algorithms in action
In order to illustrate the detection algorithms presented above, we now describe their application to one of the superclusters in our sample, MSCC 310, the Ursa-Majoris Supercluster (see Tables 1 and 2). This supercluster contains 21 Abell clusters, with redshifts in the range from 0.05 to 0.08, and it is one of the largest in volume in our sample: it occupies an area in the sky of about 1 700 deg 2 , equivalent to a volume of (116 h −1 70 Mpc) 3 (including the 20 h −1 70 Mpc added to the box limits from the farthest clusters).
The volume contains N = 12 286 SDSS galaxies with spectroscopic redshift. This corresponds, to a mean surface density

GSyF Algorithm GFiF Algorithm
Measure skeleton length l fil and filament radius R fil  Table 2. The list of parameter values used is shown in Table 5.

Application of GSyF to MSCC 310
First we applied the VT algorithm to the projected distribution of MSCC 310 galaxies and calculated their d i . Then we made 1 000 simulations to estimate d bas (15.7 deg −2 , for this case). We applied the HC algorithm to the N gal = 7 529 galaxies with δ i > 0. After calculating the best f parameter from the 30 mock simulations ( f = 6, in this case), we have taken the N cut = 1 140 groups generated from the HC application. As expected, these groups have, on average, ∼ 6 members. Of these, we retained 1 015 with N j ≥ 3.
The iterative virial refinement was initialized by assigning the center of each HC group at the brightest r-band galaxy member close to its geometrical centroid (see Table 3). For MSCC 310 groups, the mean difference between the geometrical center and brightest groups' galaxy projected position was found to be about 350 h −1 70 kpc. On average, the virial refinement needed six iterations to produce convergence to the virial radius. This refinement resulted in 122 systems with N mem ≥ 10 for the MSCC 310 volume. The refinement also detected 113 smaller systems with 5 ≤ N mem < 10.
In Table 3 we list the properties of the first 25 richest systems for the MSCC 310 supercluster. Column 1 assigns a sequential number to the systems, while column 2 presents their richness. The coordinates of the brightest member of the system are indicated in columns 3, 4 and 5, while columns 6, 7 and 8 show the coordinates of the final position of the centroid. The other calculated properties of the systems -velocity dispersion, harmonic and virial radius -are presented in columns 9, 10 and 11 respectively. Column 12 denotes the cross-reference with Abell clusters. The range of virial radii of the GSyF systems with N mem ≥ 10 in MSCC 310 was 0.7 − 2.5 h −1 70 Mpc. For groups with 5 ≤ N mem < 10 the range of virial radius lies within 0.4 − 0.9 h −1 70 Mpc. After the refinement, the projected central position of the systems changed, on average by 170 h −1 70 kpc, while the redshift was refined for some cases up to ∆z ∼ 0.001 or ∆σ v ∼ 300km s −1 .
As an example, the richest system in MSCC 310 is the cluster A1291 A. Its   Bottom, Galaxy positions after correction for FoG effects. The color represent density as calculated from 3D VT. The highest density is represented in red while green to blue represents lower densities.
Finally, as described in section 4.2, we correct the D C of the member galaxies in each system by re-scaling their dispersion range to the R vir of the system. An example of the MSCC 310 volume, before and after the correction, is shown in Figure 5.

Application of GFiF to MSCC 310
With the co-moving distances for the MSCC 310 galaxies corrected for the FoG effect, we proceeded to transform their sky coordinates to rectangular ones following equations 1, 2 and 3.
The N gal was now taken to be the total number of galaxies in the box of MSCC 310, N = 12 286, for which we applied the GFiF method. The VT algorithm was then applied to calculate volumetric numerical densities. We used a segmentation parameter of f = 16 for the HC algorithm. From that, we identified 768 low density groups and, for each pair, we calculated the D E distance between centers and the BC weight. As expected, the implementation of the HC algorithm over all galaxies detected larger groups (∼ 15 galaxies on average now) and more elongated, with a mean σ j of 1.8 h −1 70 Mpc compared with the mean σ j of 0.5 h −1 70 Mpc found with the application of GSyF. In order to filter the connections, a linking length of D max = 8 h −1 70 Mpc was used resulting on 334 edges. As described above, D max and f were obtained by the optimization process described in Santiago-Bautista et al. (2019). The second filter, the minimum mean linear density along the edge cylinders (in this case 0.008 gal.Mpc −3 ), left 273 links from the 316 connections smaller than D max . This resulted in 34 trees, to which we applied MST. Of these, only 9 are linking 3 or more systems of galaxies with a richness N mem above 11 galaxies, the rest are isolated bridges and tendrils. This result is shown in the dendrogram depicted in Figure 6 (top panel) which shows 9 dominant filaments for the MSCC 310 supercluster.
Concerning the systems embedded in the structures, from the 359 HC groups (nodes) in the spanning trees, 116 matched with the systems with N mem ≥ 10 identified with GSyF. From these, 61 were found to be in filaments (53%), 26 (22%) in bridges between pairs of systems, and 29 (25%) not connected by bridges, that is, relatively isolated.
The filaments detected by the GFiF algorithm in the MSCC 310 supercluster and their main properties are listed in Table 4. Column 1 assigns a sequential number to the filament; column 2 lists the number of systems detected by GSyF linked by the filament; column 3 shows the number of galaxies attributed to the filament; columns 4 to 6 are the mean, min. and max. redshift of the filament; column 7 corresponds to the mean number density inside the filament; column 8 is the mean transversal radius of the filament measured at 10 × d; columns 9 and 10 show the number of nodes that constitute the filament and the number of central skeleton nodes, respectively; column 11 is the length of the filament skeleton.
We can also observe the filaments inside the MSCC 310 volume in Figure 6 (bottom panel). In this panel the 9 filaments are plotted over the distribution of galaxies in a RA [deg] × Z [Mpc] plane. This projection allows the recognition of structures both in one of the coordinates of the sky plane and depth. Filaments are depicted in colors, the same colors in both panels of this figure. Isolated bridges (that connect two systems alone, without forming a filament) are represented only in the bottom panel and by black lines. Tendrils are not represented to avoid crowding. The longest paths for the filament skeletons, that is, those that connect the farthest systems of each filament, range from 18 to 62 h −1 70 Mpc and connect up to 11 systems inside the MSCC 310 volume. Moreover, we measured the paths between pairs of systems chained together by bridges; such distances range from 5 to 24 h −1 70 Mpc.

Checking the identified systems of galaxies
In order to validate our GSyF algorithm we compared the list of identified systems to different cluster and group catalogs in the region of SDSS. For MSCC 310, for instance, GSyF detected 122 systems with ten or more galaxies and another 113 systems with 5 ≤ N mem < 10. A match was considered positive if the projected For the rich clusters, first we compared our results to the original Abell/ACO catalog (Abell et al., 1989), based on the most recent parameter measurements for its clusters (e.g. Chow-Martinez et al., 2014). Also, we compared the detected systems against the central galaxy position provided by the Brightest Cluster Galaxy catalog (Lauer et al., 2014, hereafter L14). Regarding catalogs based on the SDSS spectroscopic sample we compared with the C4 cluster catalog (Miller et al., 2005), based on the SDSS-DR2. For less rich clusters and groups we compared our systems with the Multi-scale Probability Mapping clusters/groups catalog (MSPM, Smith et al., 2012) and the Tempel et al. (2011) catalog (hereafter T11), carried over the SDSS-DR7 and -DR8 respectively. For the comparisons we used all systems detected by GSyF down to a richness of 5 galaxies.
By using the tolerance cylinder described above, 19 of the 37 Abell/ACO clusters inside the MSCC 310 box were detected as systems of richness above 5 galaxies with our method (51%), while the equivalent number was 26 (76%) for the 34 clusters in C4. There are 11 clusters in the L14 catalog embedded in the volume and 8 (73%) of them have GSyF counterparts. However, by increasing the aperture to 2 h −1 70 Mpc, we increased the detection of Abell clusters to 29/37 (78%), C4 clusters to 33/34 (97%) and L14 catalog to 100%, see Table 6. The increase of 20 − 30% in cluster matches by using a larger aperture size can be related to the fact that the mean separation of member galaxies increases for lower richness systems, and the determination of the cluster center then is subject to this separation, see Table 6. For example, A1452 and A1507 B have a GSyF counterpart located at ∼1.5 h −1 70 Mpc projected distance and ∆σ v of ∼ 630 km s −1 and 120 km s −1 respectively (See Table 3, systems No. 12 and 14), while their C4 counterparts are 0.7 and 0.4 h −1 70 Mpc away respectively.
Concerning the less massive systems, there are 315 groups detected by T11 and 213 groups listed in the MSPM catalog with richness larger or equal to 5 galaxies for the MSCC 310 volume. Our algorithm detected systems that correspond to 61% (79%) of the T11 groups and 67% (78%) of the MSPM groups, within an aperture of 1 h −1 70 Mpc (2 h −1 70 Mpc) (see Table 6). This is acceptable for our purposes since we have constructed GSyF to find the clusters that present FoG effect, although we can clearly go farther towards poorer systems with it.
The region of the UMa supercluster has been previously studied by Krause et al. (2013). These authors identified 31 galaxy systems in the MSCC 310 area with a number of galaxies between 15 and 94 galaxies. We found that our GSyF sys- tems match with 24 (77%) of these clusters within an aperture of 3 h −1 70 Mpc, of which 10 are Abell clusters. The systems detected in the main portion of the MSCC 310 supercluster are depicted on a sky projected distribution, in Figure 7 (top panel), by black circles with radius equal to the measured virial radius. The system positions from the Abell, C4, L14, MSPM and T11 catalogs are depicted respectively as red, pink, cyan, blue and green points. We also observe that the system membership number detected by GSyF is in agreement, for most of the cases, with the number of members for the same systems detected by T11, C4 and MSPM (see Figure  8). Qualitatively one can observe in this figure that the richness from T11 is in better agreement with our measurements, while MSPM estimates a richness slightly lower than both ours and T11. A similar analysis can be done for the other superclusters in our sample. For example, for the Coma supercluster (MSCC 295, Figure 7, bottom panel), the GSyF algorithm detected, in total, 115 systems. Of these, we found that A1656, the richest one, is composed of 579 galaxies. The estimated virial radius and mass are, respectively, 1.96 h −1 70 Mpc and 7.7 × 10 14 M . The second richest cluster, A1367, has 243 galaxies, while its radius and mass are, respectively, 1.73 h −1 70 Mpc and 5.3 × 10 14 M . These estimations are in good agreement with those measured by Rines et al. (2003). The complete catalog of systems for each volume is available online 2 .

Checking the filament skeletons
We compared the filaments obtained using the GFiF algorithm for the MSCC 310 volume with those presented by Tempel et al. (2014) (hereafter, T14) as extracted from their Table 2. We transformed the T14 survey coordinates filament positions (see  70 Mpc. We found 40% match between our detected filaments and T14 and 80% match with our isolated bridges and tendrils. The mean difference between the medial axis of T14 filaments matching the nearest filament/tendril detected by us is ∼ 1.5 h −1 70 Mpc. T14 filaments are represented by a sequence of points forming a line. Then, the calculated separation was taken to be the distance from the T14 filament points to the edges of our filaments. Our filaments are depicted over T14 filaments in Figure 9. As can be seen, GFiF detects the most prominent (dense) filaments among the ones in T14.

Comparison with KDE density maps
Another test we did in order to validate the results from GSyF and GFiF algorithms was to apply an independent analysis to the galaxies in the MSCC 310 volume in order to corroborate the densities, nominally the KDE method, as described in section 3.6. A quantitative comparison in the space between the 3D KDE and the skeleton structures is left for upcoming works. Therefore, we restricted our analysis to 2D projections (density maps) of the 3D KDE, (XY, XZ, YZ). For this analysis we used kernels of size 1 Σ (see section 3.6). Since each kernel is created based on the VT cell, we used d bas as baseline density. We selected those regions for which d kde > d bas in the RA×Dec projected density map. Afterwards, we compared the position of the density peak of each region against the centroids of the 122 GSyF systems. We found that 93 GSyF systems with N j ≥ 10 (76%) match density peaks above 3 d bas . The remaining 29 GSyF systems (24%) are identified with density peaks in the range (1 − 3) d bas . Moreover, we observe that the filament edges connect these density peaks forming chains of overdensity regions. In Figure 10 (top panel) we show the systems detected by GSyF represented by circles of r = R vir over the galaxy density distribution as measured using KDE in a RA×Dec projection. In the bottom panel of Figure 10 we show the filaments overlaid on the KDE density map for the MSCC 310 volume. The density maps are set in terms of the mean number density.

Main properties of the filaments
In a similar way as described for MSCC 310, we applied the GFiF algorithm to the 46 superclusters of our sample, detecting a total of 144 filaments in 40 superclusters which are listed in Table 7. This table also lists the parameters used or measured by GFiF: column 2 notes the segmentation parameter f , while column 3 presents the number of detected HC groups in the supercluster box. Column 4 shows the linking length (D max ) used to connect HC groups. The process of filtering the connections can be followed through columns 5 to 10, which show, respectively, the number of detected edges, the number of filtered links, the number of trees detected after applying MST and the final number of filaments, N ske , number of isolated bridges, N brid , and tendrils, N tend . Column 11 lists the minimum richness we considered for GSyF systems to be taken as ends of the bridges. Column 12 to 14 present, respectively, the fraction of these systems included in the GFiF filaments, in isolated bridges and the ones not connected by bridges. Finally, columns 15 to 17 show the number of galaxies hosted in the GFiF filaments, N g f il , and the filling factors calculated as V f il /V and N g f il /N. The list of detected filaments for all studied supercluster volumes can be consulted in Table 8, in the same format as the one presented in

Distribution of galaxies along the filaments
In order to evaluate the environment within the filaments, we extracted longitudinal profiles of number density. In Figure 12 we show the longitudinal distribution of galaxies for all bridges, from one extreme to the other (ending systems), in the supercluster MSCC 310. We observe that the density of galaxies is higher near the ends of the bridges, as expected, and decreases through the midpoint between systems. Then we proceed to extract density profiles for bridges, from the systems to the midpoint, by counting the galaxies that lie within a cylinder of radius 1 h −1 70 Mpc with medial skeleton set by the bridge skeleton. The galaxies are counted in slices of size ∆d = 0.5 h −1 70 Mpc along the skeleton axes. We also calculated the longitudinal profiles after excluding the galaxies belonging to systems (considered at 1.5 R vir ) from their bridges, in order to determine pure filament profiles. These profiles allow to evaluate the mean density contrast of the filaments as compared with the background density. In Figure 13 we show the longitudinal number density profile for all filaments detected in our sample. The stacked longitudinal profile including galaxies in systems is depicted by a blue line. The dispersion about the stacked profile is represented by a blue shaded area. The pure profiles (excluding the systems' galaxies) is represented by the red line, and its corresponding dispersion by a red shaded area. As can be seen, the mean density contrast along the filament is ∼ 10, that is, the filament is about 10 times denser than background.

Transversal density profiles
For the calculation of transversal density profiles, we excluded the galaxies located in systems, within a radius of 1.5 R vir . The density profile is calculated as described in section 3.7. The cylinder radius R cy was set from 0 up to 10 h −1 70 Mpc in steps of ∆R cy = 0.5 h −1 70 Mpc.
We computed the galaxy number density profile for filaments in two ways. First we counted the number of galaxies within concentric cylinders and divided them by the volume within the cylinders. We call this the local number density profile. For the second, we employed the number densities calculated using the VT d i , as described in Sec. 3.1. Then we measure the mean VT number density within concentric cylinders. The local number density and VT number density profiles are scaled in density contrast and stacked together. Figure 14 shows the stacked profile for all filaments detected by GFiF, respectively for local densities (top panel) and VT densities (bottom panel). The first aspect to note is that local number density profiles are smoother although, in general, both mean profiles are similar. We can observe, in both local and VT density profiles, that overdensity extends up to 5 h −1 70 Mpc. At about 3 h −1 70 Mpc, the overdensity reaches a value around 3, while the typical characteristic density contrast of 10 is reached closer to 2 h −1 70 Mpc.  Table 8. Finally, we use the density profiles to estimate the mean radius of the filaments R f il . This was achieved by considering the intersection point at which the local density profile crosses the 10 × d line, as indicated in Figure 14 by the black solid line. The mean radius as well as the mean density of each filament is noted in Table 8. Figure 15 (top panel) presents the radii distribution for all the filaments. The filament radii range from 0.6 to 4.5 h −1 70 Mpc with a mean value of 2.4 h −1 70 Mpc. The bottom panel of the figure depicts the filament radius as a function of the filament length. We observe that the filament length does not correlate with the filament radius. However, it is important to note that the radius varies slightly around the mean value along the filament path.

Stellar mass profile
We constructed a galaxy stellar mass profile for all filaments by using the masses from MPA-JHU group (Brinchmann et al., 2004;Kauffmann et al., 2003;Tremonti et al., 2004) described in Section 2.2. First, we weighted the mass by the average mass of the volume under analysis to remove the redshift dependence of the stellar mass (Chen et al., 2017). This weighting is equivalent to a normalization of the stellar mass and allows to carry out a 0 2 4 6 8 10 Distance to system 0 0  Fig. 13: Longitudinal VT density distribution for galaxies in all bridges of filaments detected by GFiF. Profiles are considered from the system center to the middle of the bridge. The thick blue line depicts the mean longitudinal profile for bridges including galaxies in systems. The thick red line corresponds to the mean longitudinal profile for all filaments excluding galaxies belonging to systems within 1.5 R vir . Blue and red shaded areas are the dispersion about the stacked profile.
stacking procedure in order to increase the signal of the profiles. This mass profile is extracted as described in Section 3.7. Figure 16 shows the stacked stellar mass profile for all filaments. The variance of the stacked profile is depicted by the error bars. We observe that, statistically, the stellar masses of the filament galaxies are larger than the average mass up to about 2 h −1 70 Mpc, while beyond 3 h −1 70 Mpc they tend to be 10% smaller. This region farther than 3 h −1 70 Mpc probably represents the dispersed population of the supercluster, associated to the more extended sheet component. Thus, our results indicate that the stellar mass correlates with the distance to the filament skeleton, being larger (up to 25%) near the skeleton than far from it. These results are in good agreement with the results presented by Chen et al. (2017) for MGS sample from DR7 (Abazajian et al., 2009). Our results are also compatible with those presented by Alpaslan et al. (2016) and Kraljic et al. (2018), for the GAMA spectroscopic survey, who find similar trends for the filaments found at redshifts z < 0.09 and 0.03 ≤ z ≤ 0.25, respectively.

Morphological type
In order to analyze if there is some morphological trend in the population of filament galaxies (as may be expected from the morphology-density relation), we also constructed morphology profiles based on morphological classifications by Huertas-Company et al. (2011). They classify the galaxies in four morphological types. For our analysis we used the probability p(Early) = p(E) + p(S0) that classifies galaxies in early type as p(Early) > 0.5 and late type as p(Early) < 0.5. Then we computed the distribution of both galaxy types as a function of the distance to the filament skeleton. The distributions were normalized so they can be compared and stacked for all filaments in our sample in a similar way as a profile extraction, again excluding galaxies in systems. The result is shown in Figure 17.   Our results show that the fraction of early type galaxies is higher than the one for late types near the filament skeleton up to ∼ 2 h −1 70 Mpc. This effect is more notorious when computed as an early to late type ratio (Figure 17, bottom panel). We observe that at distances smaller than 2 h −1 70 Mpc, the fraction of early types reaches almost twice the fraction of late types. At larger distances (that is, towards the dispersed supercluster population) the fractions tend to be similar (E/S ratio ∼ 1). A two sample Kolmogorov-Smirnov test applied to the distributions of early and late types in Fig. 17 reveals that, for the first bins, they are significantly different (p-value lower than 0.1). Our results are consistent with those presented by Kuutma et al. (2017) for the Huertas-Company et al. (2011) sample -they also observe that early type galaxies are more abundant near the filament skeleton.  Table 8.

Activity type
For the analysis with respect to activity type, we used the activity classification from the MPA-JHU group, (Brinchmann et al., 2004;Kauffmann et al., 2003;Tremonti et al., 2004) described  in Section 2.2. We computed the distribution of the different galaxy activity populations as a function of the filament skeleton distance. All distributions are normalized for all filaments and stacked together. Figure 18 divides galaxies into four activity groups: AGNs, SF galaxies, LINERs and non-active galaxies (unclassified). The error bars were not displayed over the lines for clarity -note that they are very large, implying that we have to interpret this figure with caution. Another effect to take into account is that the fractions are averaged over all the filaments (at different redshifts). The most evident tendency we can see in these distributions is a decrease in the activity as long as the galaxies "approach" the filament, although the fractions for the dispersed component are very noisy. Inside the filaments, the tendency is to have more passive galaxies, implying again smaller fractions of AGNs and SF galaxies. However, the fraction of LINERs also increases towards the filament skeletons, possibly indicating a post-activity phase for the galaxies. Deeper analyses are necessary to give a clear picture of the effect of the filament environment on the activity of galaxies.

Conclusions
In this paper we studied the bridges and filaments of galaxies in the environment of superclusters of galaxies. We developed two algorithms, the Galaxy Systems-Finding algorithm, GSyF, and the Galaxy Filaments-Finding algorithm, GFiF, respectively to detect systems of galaxies (clusters and groups), aiming especially to correct for the finger-of-God effect, and to identify the elongated structures just mentioned. These algorithms were applied to a sample of SDSS galaxies with spectroscopic redshifts in rectangular boxes enclosing 46 superclusters of galaxies selected from MSCC catalog in a redshift range from 0.02 to 0.15. GSyF and GFiF employ a set of different classic pattern recognition methods. Both of them are probabilistic in the sense that they define systems and filaments as a function of the relative position and orientation of the Gaussian groups, which are detected with a Hierarchical Clusterization method. For GSyF, the membership of the Gaussian groups is refined by using a virial approximation, allowing to discern gravitationally bound systems of galaxies from misdetections. For GFiF, these measurements are used to define a general tree from which we extract independent structures based on density criteria. Although the HC algorithm needs to be optimized for the number of clustering groups, this can be automatized based on the density function characterizing the survey. The structures are represented by a filament skeleton that allows to measure and trace quantitatively the filament path.
We show (section 7.1) that the systems detected by our methodology are in good agreement with those reported in the literature. Specifically, our comparisons of the systems sample against other cluster and group catalogs (Abell,C4,L14,T11 and MSPM) showed a match rate above 78% for groups with richness above 5 galaxies at redshifts z < 0.11. For systems with richness above 10 galaxies the coincidences were slightly higher for the group catalogues (T11 and MSPM) and slightly lower for the cluster catalogues (Abell and C4). Moreover, the richness, velocity dispersion and virial radius of systems measured by the GSyF algorithm are in good agreement with those reported in other system catalogs. Our GSyF algorithm detected a total of 2 705 systems in the rectangular boxes enclosing the volumes of 45 of the superclusters in our sample. Of these, 159 systems with richness above 10 galaxies were not previously reported in the literature 3 .
We also compared, in section 7.2, the results of our filaments-finding algorithm with those of Tempel et al. (2014) for the same regions. We observe that T14 filaments are shorter, more numerous and describe sparser and finer structures while GFiF detects larger and denser elongated structures that bridge galaxy systems. Our filaments, in some sense, link several T14 threads in one larger structure, providing a broader picture of the filament. The comparison with isolated bridges and tendrils (a sub-product of our algorithm) shows a match of 80% with T14 filaments and comparable filament lengths.
The GFiF algorithm detected a total of 144 filaments and 63 isolated bridges in the rectangular boxes enclosing the volumes of 40 of the superclusters in our sample. The supercluster filaments we detected have lengths from 9 up to 130 h For most of the cases, the numerical density inside the filaments was found to be between 5-15 times the mean density. The radii of the filament skeletons range from 0.6 up to 4.5 h −1 70 Mpc, with most between 2 and 3 h −1 70 Mpc. These values are consistent with the ones found by Cautun et al. (2013) by applying the NEXUS algorithm to an N-body simulation.
We also compared the properties of the galaxies that inhabit the filament as a function of the distance from its skeleton. We conclude the following: i) The transversal local and VT number density profiles for pure filaments show that, at distances up to 5 h −1 70 Mpc, the filaments have a positive overdensity respect to the background density inside de boxes. ii) At distances of about 3 h −1 70 Mpc, the density contrast reaches a value of 3, a limit that matches the range where typically the environmental effects studied in section 9 seem to apply. iii) The mean density contrast of the filaments, 10, is reached closer to 2 h −1 70 Mpc, a limit that we used as reference for estimating the radius of the filaments.
Our analysis regarding the stellar masses, morphological type and activity type show correlations between these galaxy properties and the distance from the filament skeleton. We arrive to the following conclusions: i) Inside 3 h −1 70 Mpc from the filament skeleton the galaxy stellar masses increase up to about 25%. This result leads to two hypotheses: (a) the mass growth of the galaxies is sensitive to the environment, or (b) the dynamical evolution brings massive galaxies into the potential well of the filaments. This result confirms several analyses which suggest that stellar masses are sensitive to the environment (Alpaslan et al., 2015(Alpaslan et al., , 2016Poudel et al., 2016;Chen et al., 2017;Malavasi et al., 2017;Kraljic et al., 2018;Musso et al., 2018). ii) The early to late type galaxy ratio has its maximum at the center of the filament and remains above ratio 1:1 up to a distance of 1.5 h −1 70 Mpc. This result is in good agreement with a similar work by Kuutma et al. (2017) for the SDSS. iii) Concerning the activity type, we observed that the fraction of AGNs and SF galaxies seem to be higher outside the filaments (in the supercluster dispersed component), showing a decrease as the galaxies approach these structures. Inside the filaments, the fractions of non-active galaxies and LINERs increase, indicating a possible post-activity phase. A similar result for the star-forming galaxies has been observed by Kraljic et al. (2018), for the GAMA spectroscopic survey.