J-PLUS: Support Vector Regression to Measure Stellar Parameters

Context. Stellar parameters are among the most important characteristics in studies of stars and are traditionally derived from spectra with atmosphere models. However, time cost and brightness limits restrain the efficiency of spectral observations. J-PLUS is an observational campaign that aims to obtain photometry in 12 bands. Owing to these characteristics, J-PLUS data have become a valuable resource for studies of stars. Machine learning provides powerful tools to efficiently analyse large data sets such as the one from J-PLUS and enables us to expand the research domain to stellar parameters. Aims. The main goal of this study is to construct a support vector regression (SVR) algorithm to estimate the stellar parameters of the stars in the first data release of the J-PLUS observational campaign. Methods. The training data for the parameter regressions use the 12-waveband photometry of J-PLUS as features and are cross-identified with spectrum-based catalogs from LAMOST, APOGEE, and SEGUE. We label the stars with the stellar effective temperature, the surface gravity, and the metallicity. Ten percent of the sample is held out for a blind test. We develop a new multi-model approach to fully take into account the uncertainties of both the magnitudes and the stellar parameters, utilizing more than two hundred models for the uncertainty analysis. Results. We present a catalog of 2,493,424 stars with a root mean square error of 160 K in the effective temperature regression, 0.35 dex in the surface gravity regression, and 0.25 dex in the metallicity regression. We also discuss the advantages of this multi-model approach and compare it to other machine-learning methods.


Introduction
In modern astronomy, the large amount of raw data produced by the newest surveys is far beyond our traditional processing capacity, and this insufficient computing power has become a bottleneck to the rapid development of astrophysics. Fortunately, the development of computer science has provided us with new ways to understand and gain knowledge from these raw data. Machine learning in particular has achieved remarkable success in offering novel solutions to complex problems based on applications of loss functions, optimization methods (Ruder 2017), and statistical models (MacKay 2003). Owing to algorithms that do not rely on explicit functional forms (Cortes & Vapnik 1995; Boser et al. 1992; Cristianini & Shawe-Taylor 2000; Cover & Hart 1967; Stone 1977; Quinlan 1986; Breiman 2001), machine learning can reveal potential patterns and important parameters that are indistinguishable using traditional scientific or statistical methods.
Stars are the cornerstone of astronomy, and stellar parameters are among the most crucial characteristics for understanding and characterizing stars. Among the most powerful and accurate methods of determining stellar parameters is spectral analysis (Wu et al. 2011; Boeche et al. 2018; Anguiano et al. 2018).
However, spectral observations can be costly. The most powerful spectroscopic survey, the Large Sky Area Multi-Object Fiber Spectroscopy Telescope (LAMOST), has taken spectra of about ten million stars at low resolution and about three million at medium resolution with a four-meter telescope over the past 8 years. In contrast to spectral observations, photometric observations have a much higher observational efficiency. Compared to LAMOST, the Javalambre Photometric Local Universe Survey (J-PLUS, Cenarro et al. 2019) observed 223.6 million objects in its five-year observational campaign. Therefore, many studies have been working on estimating stellar parameters from photometry-based data by constructing models of photometric observation results (Bailer-Jones 2011; Sichevskij 2012; Sichevskij et al. 2014). Their modeling requires highly precise photometric data from a few surveys, while another powerful tool, machine learning, can reveal the distribution or pattern hidden inside coarser photometric data.
Machine learning, a cross-disciplinary subject of statistics, optimization, and computer science, creates algorithms that can process data based on a well-chosen sample set with features. Different algorithms assign different models and reveal potential patterns in the sample (MacKay 2003;Shalev-Shwartz & Ben-David 2014). The algorithms select features from samples and optimize the parameters based on a given loss function. The loss function evaluates the difference between the predicted result and the true value. Machine-learning algorithms have been applied in many different disciplines, including finance, medical science, and computer vision.
Machine-learning technology has shown the ability to obtain valuable information from multi-band photometric data. Several studies have made efforts to determine stellar parameters from photometric surveys. Bai et al. (2019) derived stellar effective temperatures from the Gaia second data release using a random forest (RF) algorithm. Bai et al. (2018) presented RF models to categorize objects such as stars, galaxies, and quasi-stellar objects (QSOs), and classified stars and obtained their effective temperatures. Lu & Li (2015) developed a scheme to derive stellar parameters from the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm and support vector regression (SVR) models. Yang et al. (2021) designed a cost-sensitive artificial neural network and derived stellar parameters for two million stars from the J-PLUS first data release (DR1). Galarza et al. (2022) developed the stellar parameters estimation based on the ensemble methods (SPEEM) pipeline, a stack of feature searching, normalization, and a multioutput regressor; SPEEM is based on RF and extreme gradient boosting (XGB). These works are all based on machine-learning algorithms, which indicates their powerful ability to reveal potential patterns within data.
J-PLUS has also been used to gain knowledge of objects ranging from our Solar System to the deep universe, such as the Coma cluster (Jiménez-Teja et al. 2019), low-metallicity stars (Whitten et al. 2019; Galarza et al. 2022), and galaxy formation (Nogueira-Cavalcante et al. 2019). J-PLUS has 12-waveband photometry, which makes the survey ideal for the application of machine learning.
In this study, we adopt the SVR algorithm to obtain the stellar parameters, namely the effective temperature (T eff), the surface gravity (log g), and the metallicity ([Fe/H]), of the stars in the J-PLUS DR1 catalog. To construct the training sample, we use data from LAMOST, the Apache Point Observatory Galactic Evolution Experiment (APOGEE; Zasowski et al. 2013), and the Sloan Extension for Galactic Understanding and Exploration (SEGUE; Yanny et al. 2009), as described in Sect. 2.
We adopt the SVR (Awad & Khanna 2015;Drucker et al. 1997;Cortes & Vapnik 1995) algorithm and determine the kernel scales (Sect. 3.1). We construct 80 training sets based on the uncertainties of the stellar parameters. Each parameter of any object in the 80 sets obeys a Gaussian distribution (Sect. 3.2). In all, we construct more than two hundred different models. The blind test is presented in Sect. 3.3.
The result of applying our method to the entire J-PLUS DR1 catalog is presented in Sect. 4. In Sects. 5.1 and 5.2 we compare different methods of constructing the training sample, and in Sect. 5.3 we discuss the inconsistency of stellar parameters among different pipelines. We discuss the distribution of the regressed stellar parameters in Sect. 5.4 and compare our results with Yang et al. (2021) in Sect. 5.5. We conclude in Sect. 6.

J-PLUS
J-PLUS is conducted by the Observatorio Astrofísico de Javalambre (OAJ, Teruel, Spain; Cenarro et al. 2014). It uses the 83 cm Javalambre Auxiliary Survey Telescope (JAST80) and T80Cam, a panoramic camera of 9.2k × 9.2k pixels that provides a 2 deg² field of view (FoV) with a pixel scale of 0.55 arcsec pix⁻¹ (Marín-Franch et al. 2015). J-PLUS has a 12-passband filter system that comprises five broad (u, g, r, i, z) and seven medium bands from 3000 to 9000 Å. Cenarro et al. (2019) describe the scientific goals and the observational and image-reduction strategy of J-PLUS. J-PLUS DR1 covers an area of 1,022 deg² on the sky with a magnitude limit of ∼ 21.5 for an S/N of ∼ 3. These 12 bands provide a large sample for us to characterize the spectral energy distribution of the detected sources (Cenarro et al. 2019). The 12-band magnitudes, namely u, J0378, J0395, J0410, J0430, g, J0515, r, J0660, i, J0861, and z, are adopted as our training features. We name them mag1 to mag12 for simplicity. Yuan (2021) recalibrated the J-PLUS DR1 catalog using stellar color regression; the method is described in detail in Yuan et al. (2015). The catalog in Yuan (2021) contains 13,265,168 objects, including 4,126,928 objects with all 12 magnitudes. In Wang et al. (2021), the objects in the recalibrated J-PLUS catalog are classified with the support vector machine (SVM) algorithm into the classes STAR, GALAXY, and QSO. They chose the 12 J-PLUS magnitudes as features and used the corresponding uncertainties as weights for the SVM. They provide two catalogs based on 12 density contours built on the 12 magnitudes of the training sample: objects that fall inside all 12 contours are assigned as interpolations, and the others as extrapolations. Interpolations have a better classification accuracy than extrapolations.
In this study, we use all objects classified as STAR in these two catalogs for the stellar parameter regressions.

LAMOST spectra
LAMOST is a northern spectroscopic survey situated at Xinglong Observatory, China. LAMOST is able to observe 4,000 objects simultaneously with a 20 deg² FoV. The main scientific project of LAMOST aims to understand the structure of the Milky Way (Deng et al. 2012) and external galaxies. We used the A, F, G, and K catalogs in LAMOST Data Release 7 (DR7), both low resolution spectra (LRS) and medium resolution spectra (MRS).
LAMOST LRS have a limiting magnitude of about 20 in the g band, and their S/N is higher than 6 on dark nights or 15 on bright nights. In MRS, the S/N is always larger than 10. The parameters are given by the LAMOST Stellar Parameter pipeline (LASP), and their uncertainties mainly come from the stellar S/N and the chi-square of the best-matched theoretical spectrum (Wu et al. 2014; Liu et al. 2020). The parameter differences between the LRS data and MRS data are very small, on average 17.6 K for T eff, 0.028 dex for log g (g in cm s⁻²), and 0.084 dex for [Fe/H], according to our test. The internal uncertainties of LASP are estimated to be ∆T eff ∼ 80 K, ∆log g ∼ 0.15 dex, and ∆[Fe/H] ∼ 0.09 dex (Wang et al. 2020). We cross-matched these catalogs with J-PLUS DR1 to within one arcsec using the Tool for OPerations on Catalogues And Tables (TOPCAT, Taylor 2005). There are 216,114 and 25,170 cross-matched stars in the LRS and MRS catalogs, respectively.
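The one arcsec cross-match above was carried out with TOPCAT. Purely as an illustration of the operation, a brute-force matcher using the haversine separation can be sketched in numpy (the function name and interface are ours, not from the paper; a real workflow would use TOPCAT or astropy for large catalogs):

```python
import numpy as np

def crossmatch(ra1, dec1, ra2, dec2, tol_arcsec=1.0):
    """Return index pairs (i, j) whose angular separation is below
    tol_arcsec, computed with the haversine formula (brute force)."""
    ra1, dec1 = np.radians(ra1), np.radians(dec1)
    ra2, dec2 = np.radians(ra2), np.radians(dec2)
    # pairwise separations via broadcasting: shape (n1, n2)
    dra = ra1[:, None] - ra2[None, :]
    ddec = dec1[:, None] - dec2[None, :]
    a = (np.sin(ddec / 2.0) ** 2
         + np.cos(dec1)[:, None] * np.cos(dec2)[None, :]
         * np.sin(dra / 2.0) ** 2)
    sep = 2.0 * np.arcsin(np.sqrt(a))            # separation in radians
    tol = np.radians(tol_arcsec / 3600.0)        # arcsec -> radians
    return np.argwhere(sep < tol)
```

The O(n1·n2) broadcast is fine for small lists; million-object catalogs need a spatial index (k-d tree or HEALPix), which is what dedicated tools provide.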
APOGEE

We adopted the APOGEE catalog to enlarge our training sample. Using the Sloan Digital Sky Survey (SDSS) Catalog Archive Server Jobs, we extracted the table aspcapStar and found 12,931 stars that satisfy our one arcsec cross-match tolerance.

SEGUE
SEGUE (Yanny et al. 2009) is designed to obtain images in the u, g, r, i, and z wavebands for 3,500 square degrees of the sky located primarily at low galactic latitudes (|b| < 35°). It delivers observations of about 240,000 stars with a g-band magnitude between 14.0 and 20.3 mag and moderate-resolution spectra from 3,900 to 9,000 Å. The stellar parameters are presented in SDSS Data Release 7 for these Milky Way stars with S/N greater than ten. The SEGUE Stellar Parameter Pipeline (SSPP; Lee et al. 2008a,b; Allende Prieto et al. 2008) developed a multi-method technique to calculate the stellar parameters, which includes non-linear regression models, minimal distances between observed spectra and grids of synthetic spectra, and correlations between spectral lines and colors. The average uncertainties of the SSPP are ∆T eff ∼ 130 K, ∆log g ∼ 0.21 dex, and ∆[Fe/H] ∼ 0.11 dex. A one arcsec tolerance cross-match yields 25,487 stars.

Normalization
The training sample was constructed with J-PLUS DR1 and three spectral surveys: 12,931 stars from APOGEE, 216,114 from LAMOST LRS, 25,170 from LAMOST MRS, and 25,487 from the SSPP. After removing the stars with missing photometric observations in one or more bands, 279,702 stars remain.
Prior to training our model, we centered all parameters x_i in our training data (both the targets and input magnitudes) by subtracting their mean: x*_{i,t} = x_{i,t} − x̄_i, where x̄_i is the mean of parameter i over the training sample. This normalization reduces the upper and lower bounds of the parameter distribution and makes all parameters zero-mean distributed, which accelerates the training (Shalev-Shwartz & Ben-David 2014).
During the prediction procedure, the query input magnitudes are centered in the same fashion, using the respective means from the training data, before being fed to the model. The model output x*_{i,p} is then returned to the true target parameter space following x_{i,p} = x*_{i,p} + x̄_i.

We kept the parameters of the same stars from different spectroscopy surveys. For example, if star A appears in both LAMOST and APOGEE, with corresponding effective temperatures Teff 1 and Teff 2, our sample set contains a star A 1 with effective temperature Teff 1 and another star A 2 with Teff 2. The stellar parameters are only merged when all parameters are the same.
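The centering and de-centering steps above can be sketched as a minimal numpy helper (the class name is ours, for illustration only):

```python
import numpy as np

class Centerer:
    """Subtract training-set means from features/targets before
    training, and add them back after prediction."""

    def fit(self, X):
        # per-column means over the training sample
        self.mean_ = np.asarray(X).mean(axis=0)
        return self

    def transform(self, X):
        # center: x* = x - mean (training means, also for query data)
        return np.asarray(X) - self.mean_

    def inverse_transform(self, Xc):
        # return to the original parameter space: x = x* + mean
        return np.asarray(Xc) + self.mean_
```

Crucially, the query data are centered with the *training* means, never with their own, so that training and prediction share the same coordinate origin.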

Contours
The prior distributions of the parameters in the training sample set are shown in Appendix A. The effective temperature is mostly distributed from 5000 ∼ 6000 K, with log g of about 4 ∼ 4.8 and a metallicity similar to that of our Sun, which means that most stars are in their main-sequence stage. There are also some giants with lower surface gravity (2 ∼ 3), lower metallicity (<-1), and lower effective temperature (4000 ∼ 5000 K).
However, the LAMOST stellar parameters are extracted from spectra whose S/N is higher than 10. This criterion makes our LAMOST sample brighter than the limiting magnitude of the J-PLUS stars, and such disagreement would decrease the prediction accuracy. To address this problem, we applied the method of Wang et al. (2021) to control the interpolation and extrapolation, which increases the prediction precision. To quantify the density of the sample space, we considered 12 sub-space combinations, namely (mag1, mag2, mag3), ... , (mag10, mag11, mag12), (mag11, mag12, mag1), and (mag12, mag1, mag2), and calculated the 95% density contour of each so that the triplets cycle through the full parameter space. The sample-dense space can be approximated by the intersection of the spaces inside all twelve density contours. Stars situated inside all contours are assigned as interpolations.
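One way to realize such 95% density contours is with a kernel density estimate, thresholded at the density level that encloses 95% of the training stars in each magnitude triplet. This is an assumption-laden sketch (the paper follows Wang et al. 2021, whose exact contour construction may differ):

```python
import numpy as np
from scipy.stats import gaussian_kde

def interpolation_flags(train_mags, query_mags, triplets, level=0.95):
    """Flag query stars as interpolation if they lie inside the
    `level` density contour for every magnitude triplet."""
    inside = np.ones(len(query_mags), dtype=bool)
    for cols in triplets:
        kde = gaussian_kde(train_mags[:, cols].T)
        # density threshold enclosing `level` of the training stars
        dens_train = kde(train_mags[:, cols].T)
        thresh = np.percentile(dens_train, 100.0 * (1.0 - level))
        # a star must exceed the threshold in ALL triplets
        inside &= kde(query_mags[:, cols].T) >= thresh
    return inside
```

With the paper's scheme one would pass the twelve cyclic triplets (mag1, mag2, mag3), ..., (mag12, mag1, mag2) as `triplets`.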

Support vector regression
The SVR (Awad & Khanna 2015; Drucker et al. 1997; Cortes & Vapnik 1995) algorithm is a regression method based on the support vector machine (SVM; Cortes & Vapnik 1995; Boser et al. 1992; Cristianini & Shawe-Taylor 2000; Shalev-Shwartz & Ben-David 2014). The data located inside the margin given by the SVR algorithm are not involved in the calculation; these data give us the flexibility to define the tolerance of the regression. For a nonlinear regression problem, we first embed the sample space into a feature space of higher dimension, transforming it into a linear regression problem. To work in the higher-dimensional feature space, the SVR algorithm makes use of the kernel trick, which represents the inner product of the images of objects mapped from the sample space to the feature space with a kernel function. This accelerates the calculation by performing it in the lower dimension. In the feature space, the algorithm then fits the data linearly.
Similar to the SVM algorithm, the SVR also has a strip area (called a tube). The difference between the two algorithms is that the SVM maximizes the minimal distance from each sample to the strip, while the SVR minimizes the width of the strip needed to contain all samples. The width of the strip area is called the margin. The vectors to the samples that finally determine the strip area are called the support vectors. Details on the algorithm are given in Smola & Schölkopf (2004).
The root mean square error (RMSE) is one of the most useful metrics for evaluating a regression. For an estimator θ̂ of θ, the RMSE is given by sqrt(E((θ̂ − θ)²)). In our regressions,

RMSE = sqrt( (1/N) Σ_{i=1}^{N} (ŷ_i − y_i)² ),

where y is the stellar parameter and ŷ its prediction.
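The RMSE above, together with the normalized RMSE (NRMSE) used later for the blind test, translate to a few lines of numpy. We assume here that the NRMSE normalizes by the range of the true parameter, a common convention:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error: sqrt(mean((y_pred - y_true)^2))."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def nrmse(y_true, y_pred):
    """RMSE normalized by the range of the true parameter, which makes
    errors comparable across T_eff (K), log g (dex), and [Fe/H] (dex)."""
    y_true = np.asarray(y_true)
    return rmse(y_true, y_pred) / float(y_true.max() - y_true.min())
```

The normalization matters because the three stellar parameters live on very different scales, so raw RMSEs (160 K vs. 0.35 dex) cannot be compared directly.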
In the pre-training, we tested both magnitudes and colors as input features with the same preprocessing. The RMSE with magnitudes was lower than that with the mag(n-1)−mag(n) colors under the default SVR training settings (Gaussian kernel scale 0.83). We then constructed independent models with different Gaussian kernel scales for each stellar parameter. We present the RMSE as a function of the kernel scale in Fig. 1 and used the kernel scales with the lowest RMSEs for the training. We also tested the mag(n-1)−mag(n) and mag1−mag(n) colors (Yang et al. 2021) and optimized the kernel scale with a Bayesian optimizer. Table 1 presents the adopted kernel scales and the corresponding RMSEs, which are based on ten-fold validation. We adopted a kernel scale of 0.8 for T eff, 0.325 for log g, and 0.45 for [Fe/H].
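A kernel-scale scan of this kind can be sketched with scikit-learn. We map a Gaussian kernel scale s to the RBF parameter gamma = 1/s² (this mapping, and the helper itself, are our assumptions; the paper does not specify its SVR implementation):

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

def scan_kernel_scales(X, y, scales, cv=10):
    """Cross-validated RMSE for each candidate Gaussian kernel scale s,
    using scikit-learn's RBF kernel with gamma = 1 / s**2."""
    rmses = []
    for s in scales:
        model = SVR(kernel="rbf", gamma=1.0 / s ** 2)
        # negated scorer: scikit-learn maximizes, so RMSE is sign-flipped
        scores = cross_val_score(model, X, y, cv=cv,
                                 scoring="neg_root_mean_squared_error")
        rmses.append(-scores.mean())
    return np.array(rmses)
```

Picking the scale with the smallest entry of the returned array reproduces the "lowest RMSE" criterion; a Bayesian optimizer would simply replace the grid with an adaptive search.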

Data enhancement
We duplicated the sample into 80 sets to enhance the data cardinality for each stellar parameter. In image recognition, data enhancement is usually carried out by rotating or mirroring images to produce more training samples. We applied the idea here by generating random training sets that fully use the uncertainties of both the magnitudes and the stellar parameters.
For each star in our training sample, after the centering process, we generated 80 stars from its 12 magnitudes, 3 stellar parameters, and 15 corresponding uncertainties. These 80 stars have all their magnitudes and stellar parameters drawn from Gaussian distributions following x'_{i,t} = x_{i,t} + σ_i e_{i,t}, where e_{i,t} is the uncertainty of x_{i,t} and σ_i ∼ N(0, 1). For example, the mean temperature is 6,000 K and the σ is 100 K for a star with T eff = 6000 ± 100 K. The simulation process did not change the original distribution or introduce new errors (Appendix A).
Each of the 80 constructed samples has 251,732 stars. We trained 80 models using the SVR algorithm with the same kernel scale as in Sect. 3.1 for each stellar parameter, which resulted in a total of 240 different models (same scheme, different sample sets) for T eff, log g, and [Fe/H]. We then used the corresponding 80 regression models to obtain a distribution of 80 predictions for a given query observation, and fit Gaussian functions to these predicted distributions to obtain their centers and standard deviations.
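The multi-model scheme, one SVR per Gaussian-perturbed replica of the training set, can be sketched as follows. Here the Gaussian fit to the ensemble of predictions is approximated by the sample mean and standard deviation, and the hyperparameters are placeholders rather than the paper's values:

```python
import numpy as np
from sklearn.svm import SVR

def train_perturbed_models(X, y, X_err, y_err, n_models=80,
                           gamma=1.0, seed=0):
    """Train one SVR per perturbed replica of the training set:
    x' = x + sigma * e, with sigma ~ N(0, 1) and e the uncertainty."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        Xp = X + rng.standard_normal(X.shape) * X_err
        yp = y + rng.standard_normal(y.shape) * y_err
        models.append(SVR(kernel="rbf", gamma=gamma).fit(Xp, yp))
    return models

def predict_with_uncertainty(models, X_query):
    """Center and spread of the ensemble predictions (a Gaussian fit
    reduces to the sample mean and standard deviation here)."""
    preds = np.stack([m.predict(X_query) for m in models])
    return preds.mean(axis=0), preds.std(axis=0)
```

The spread of the 80 predictions is what propagates both the photometric and the spectroscopic-pipeline uncertainties into the final parameter estimate.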
The data enhancement method takes the uncertainties of stellar parameters into account, which are not considered in other studies. The absence of these uncertainties could cause unexpected errors in the prediction since the spectral precision is not involved in the model construction.

Model validation
We applied model validation to illustrate the effectiveness of the method and to avoid potential overfitting. A widespread method for model validation is the blind test, which can reveal potential overfitting and quantify a model's ability to generalize to new, previously unseen data. We reserved 10% (27,970 stars) as our blind test sample, producing 80 sets of tests for each stellar parameter according to our data enhancement procedure. As in the construction of the sample set, we did not use any criterion to preselect the data, in order to avoid selection effects.
We provide the RMSE and normalized RMSE (NRMSE) of our blind test for both interpolation and extrapolation in Table 2, where NRMSE = RMSE/(y_max − y_min), with y_max and y_min the maximum and minimum of the parameter in the blind test sample. The prediction performs best for the effective temperature and worst for the surface gravity. The extrapolation contains 445 stars, which suffered from a large deviation in the blind test. The result shows that the training model is better constrained in the space where the samples are densely distributed. Figure 2 shows the T eff distribution of one star (26109-16130) in our blind tests. After applying the Lilliefors test (Lilliefors 1967), the p-value of the distribution is 0.375, greater than 0.05, implying that it may follow a Gaussian distribution. However, this may not generally be the case (Sect. 5.4). The results of the blind tests for log g and [Fe/H] are presented in Appendix B.
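The Lilliefors test checks normality against a normal distribution whose parameters are estimated from the data itself, which is why ordinary Kolmogorov-Smirnov critical values do not apply. A Monte Carlo version, simulating the null distribution instead of using Lilliefors' tables, can be sketched with numpy alone (an illustrative stand-in for the tabulated test used in the paper):

```python
import math
import numpy as np

def _norm_cdf(z):
    # standard normal CDF via the error function
    return np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in z])

def lilliefors_mc(x, n_sims=1000, seed=0):
    """Lilliefors normality test: KS distance to a normal with mean and
    std estimated from the sample; p-value from Monte Carlo simulation."""
    x = np.asarray(x, dtype=float)
    n = len(x)

    def ks_stat(sample):
        s = np.sort(sample)
        z = (s - s.mean()) / s.std(ddof=1)   # standardize with estimates
        cdf = _norm_cdf(z)
        ecdf_hi = np.arange(1, n + 1) / n
        ecdf_lo = np.arange(0, n) / n
        return max(np.max(ecdf_hi - cdf), np.max(cdf - ecdf_lo))

    d_obs = ks_stat(x)
    rng = np.random.default_rng(seed)
    # null distribution: same statistic on true Gaussian samples
    d_null = np.array([ks_stat(rng.standard_normal(n))
                       for _ in range(n_sims)])
    return d_obs, float(np.mean(d_null >= d_obs))
```

With only 80 predictions per star, as in our distributions, such a test has limited power, which motivates the sample-size check discussed in Sect. 5.4.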

Result
We then applied the 240 stellar parameter models to the interpolation and extrapolation catalogs from Wang et al. (2021), separating our predictions into two catalogs. There are 2,493,424 objects in the interpolation catalog and 233,924 objects in the extrapolation catalog. In the classification, the authors restrained the 12-band magnitudes and constructed 12 contours based on the sample set distribution. All classified objects were assigned to the area inside or outside the contours and categorized as either interpolation or extrapolation. The distribution of the inside objects is similar to that of the samples, and they therefore have a higher accuracy. In short, the stars in the interpolation catalog have a higher classification confidence than those in the extrapolation catalog.

Notes (Table 1). The "Magnitude" column lists the RMSE obtained using magnitudes as features, "Color1" refers to the mag(n-1)−mag(n) colors, and "Color2" to the optimized mag1−mag(n) colors. The "Kernel Scale" columns correspond to the magnitudes, Color1, and Color2 features, respectively.
We also applied this method in the prediction. There are 1,898,154 stars situated inside our training set contours, and 595,270 stars categorized as extrapolation within the classification interpolation catalog. In the classification extrapolation catalog, there are 13,274 interpolations and 220,650 extrapolations. We then input the 12-band magnitudes of each star into the 3 × 80 models and obtained the stellar parameters of the interpolation and extrapolation catalogs. An example of the Gaussian fit is given in the right panel of Fig. 2. Table 3 presents the stellar parameter catalog for the interpolation catalog.
Both median and mean values of the uncertainties are presented in Table 4. The difference between mean and median reveals a few stars with large residuals that raise the mean in both the interpolation and extrapolation catalogs. These uncertainties decreased by about 3% ∼ 5% when the contours were applied to constrain the prediction sample. The uncertainties of the prediction are smaller than the residuals among different pipelines, implying that our regressions have fairly good precision (see Sect. 5.3). This also implies that the uncertainties are mainly generated by the pipeline differences and observational uncertainties.

Training and blind test
We tested three methods to regress the parameters. In the first method, we trained a single SVR model using the magnitude uncertainties as weights, but excluding the stellar parameter uncertainties; this method serves as a comparison with our multi-model simulation. The weight of each star is computed from e_ij, the uncertainty of mag j for star i, and we adopted a logarithmic scale to reduce potential steep cliffs in the feature space (Wang et al. 2021). In the second and third tests, we randomly constructed 80 training sets for each stellar parameter. The difference between these two tests is whether LAMOST MRS was included in the training sample; we constructed the second test to show the generalization ability for extrapolated data. The first and second tests use LAMOST MRS as a blind test, while the third one includes all the stars in the training sample and holds out 10% for the blind test. The third method is the final model we applied; it has lower RMSEs than the single model. We present the results of these three tests in Table 5.
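The weighted single-model variant can be sketched with scikit-learn's per-sample weights. The exact weight formula is not reproduced in this text, so the inverse-summed-log expression below is a hypothetical stand-in, not the paper's expression:

```python
import numpy as np
from sklearn.svm import SVR

def fit_weighted_svr(X, y, mag_err, gamma=1.0):
    """Single SVR with per-star weights built from the magnitude
    uncertainties e_ij. The weight formula (inverse of the summed
    log-scaled uncertainties) is a hypothetical stand-in; the log
    scale damps steep cliffs from occasional large uncertainties."""
    w = 1.0 / np.sum(np.log1p(mag_err), axis=1)
    model = SVR(kernel="rbf", gamma=gamma)
    model.fit(X, y, sample_weight=w)   # stars with noisy photometry count less
    return model
```

Note the limitation this construction shares with any per-star weight: one noisy band downweights the whole star, whereas the multi-model perturbation scheme keeps the per-band uncertainties separate.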
Notes (Table 3). ID stands for the J-PLUS ID. RA and Dec. are the right ascension and the declination in degrees (J2000). The extrapolation stars have an asterisk on their ID, which indicates that their stellar parameters are less reliable. The ∆ columns give the uncertainties.

[...] dex, and 0.09 ± 0.14 dex, respectively. More information is presented in Table 6. The precision of all these single models is similar to our results.
In our study, the RMSE of a single model is similar to the RMSEs of the 80 models, indicating that the multi-model approach performs similarly to the weight-controlled model. The main advantage of our approach is that we can incorporate the uncertainties of both the features and the target parameters. Furthermore, in a weight-based single model, if a certain photometric measurement has a higher uncertainty, the star's contribution to the regression decreases significantly, independently of the uncertainties of the remaining 11 measurements. This could introduce biases into the models and result in inaccurate predictions. Such bias is avoided in our results, since the uncertainties of all 12 band magnitudes are included individually in the model construction.
We also find that the uncertainties of the stellar parameters are larger than the blind test's RMSEs. If these uncertainties are not considered in the calculation, the error from the spectral fitting pipeline would not propagate to the final regression. The uncertainties of the prediction would be highly underestimated in this case.

Pipelines
Different pipelines result in different stellar parameters (Fig. 3; see more in Appendix C), and we adopted all parameters from all pipelines. We selected the stars that have parameters from multiple spectroscopy surveys and show the differences and variances in Table 7. As seen in Tables 6 and 7, the differences in the blind test are smaller than those caused by the pipelines. When the stellar parameters from different pipelines are included in our training, the regression models are unbiased among these pipelines and can thus provide general predictions across surveys. Including a larger variety of surveys in the training data will improve the generalization ability.
Other studies usually adopt data from one survey to avoid biases among different pipelines (Bu et al. 2020;Lu & Li 2015). Another method is to set a priority for each catalog. Some studies have concluded that a larger sample size can decrease the bias caused by pipelines, for example Bai et al. (2018).
Although a single catalog and priority-based methods do not suffer the systematic error caused by pipelines, the models built from them probably do not perform well for other surveys. Our training sample contains different surveys, and their biases propagate to the final regression models. Therefore, our models would have a wider application.
A one-catalog-based method may be good for regression within the same catalog, but J-PLUS is a more general catalog with a large sky coverage and many photometric wavebands. J-PLUS requires models built from hybrid samples for general applications. There may not exist a regressor that can properly fit the data of all surveys; this is analogous to the no-free-lunch theorem, which says that a machine-learning algorithm that can solve every problem does not exist (Shalev-Shwartz & Ben-David 2014). This implies that the prediction would be biased if we used single-catalog or priority-based methods to regress the data of another catalog. In machine learning, a larger amount of data does not always help, but higher diversity does (Wang et al. 2021).

Notes (Table 6). The columns refer to Bai et al. 2018, Yang et al. 2021, and Galarza et al. 2022, respectively. Bai et al. 2018 only contains the effective temperature regression. µ is the center of the Gaussian fit and ∆ is the standard deviation. We present the median values of the training uncertainties.

Distribution of stellar parameters
While our data augmentation process employs Gaussian-based uncertainty sampling, the predicted stellar parameters do not necessarily follow a Gaussian distribution. The SVR embeds the samples into a higher-dimensional feature space using a nonlinear mapping, and this nonlinear mapping may not preserve the shape of the sample distribution in the feature space.
To quantify whether our approach preserves the shape of the distribution, we applied the Lilliefors test (Lilliefors 1967) to the predicted distributions. Given the low number statistics in our predicted distributions of only 80 samples, it is possible that this test underestimates the true fraction of Gaussian distributed predictions. To estimate the influence of the number of samples, we constructed 27,970 sets (the same size as the blind test) of random numbers drawn from N(100, 25) and found that about 94.87% passed the test at a significance level of 0.05. When we changed the number of sets to 800, the fraction increased to 95.11%. The difference is less than 1%. Therefore, there is no evidence that the resulting stellar parameters follow a Gaussian distribution.

Comparison with Yang et al. (2021)

To determine the robustness of our predictions, we performed a comparison with the stellar parameter catalog of Yang et al. (2021). We selected the reliable stars in both catalogs, namely the stars in our interpolation catalog and the stars with stellar parameter FLAG=0 in Yang et al. (2021). The cross-match yielded 2,008,654 stars. The average stellar parameter differences are -247.9764 K for T eff, -0.1984 dex for log g, and -0.0998 dex for [Fe/H]. There are some extreme values in Yang et al. (2021), and some of them are unreliable. For example, the maximum and minimum surface gravities are 882.44 dex and -237.24 dex, and the corresponding extremes for [Fe/H] are 460.39 dex and -567.16 dex. These stars may be situated at the edge of the feature space and, thus, subject to overfitting by the ANN model.
The training sample ranges of Yang et al. (2021) are about 4000 ∼ 7500 K for T eff, 0 ∼ 5 dex for log g, and -3 ∼ 1 dex for [Fe/H]. We restricted their results to these ranges and obtained 1,663,053 stars. The average differences decrease to -63.6817 K for T eff, 0.0886 dex for log g, and -0.1069 dex for [Fe/H] (Fig. 4). These differences are similar to our blind test uncertainties. The main difference between our work and Yang et al. (2021) is the control of the extrapolations: these extreme values might fall into the extrapolation category and suffer from overfitting. The training sample of Yang et al. (2021) is not an official LAMOST release, and the training-sample difference may also be a minor reason for the stellar parameter differences.

Conclusion
In this work, we predicted the stellar parameters, namely the effective temperature T eff, the surface gravity log g, and the metallicity [Fe/H], for J-PLUS DR1 using 3 × 80 SVR models. We chose stars from spectrum-based surveys (LAMOST, APOGEE, and SEGUE) to construct our training sample in order to improve the reliability of the sample, and obtained 279,702 stars from the cross-match. We normalized the features of these stars by subtracting their averages in order to accelerate the calculation. We held out ten percent of the set for the blind test, and the remaining 251,732 stars were used for training. We used 12 three-dimensional density contours to distinguish the prediction reliability for a given star. We present two catalogs that correspond to the classification catalogs of Wang et al. (2021). For the classification interpolation catalog, we present 1,898,154 stars with interpolated stellar parameters and 595,270 stars with extrapolations in our regression. For their classification extrapolation catalog, on the other hand, we present 13,274 interpolated and 220,650 extrapolated predictions with our SVR. Regarding the construction of the models, the multi-model simulations give better results than the single model. The RMSEs of our validations are 159.6 K for T eff, 0.3453 dex for log g, and 0.2502 dex for [Fe/H]. Using different catalogs as a sample set may increase the generalization ability of our model. Lastly, we compared our results to Yang et al. (2021) and found decent agreement with their work, with discrepancies comparable to the RMSEs of our blind test.

Appendix C: Pipeline differences
We present the differences among the pipelines. The residuals are fitted by Gaussian functions, and the means and variances are shown in Table 7.
Article number, page 15 of 18 A&A proofs: manuscript no. main