Questioning the spatial origin of complex organic molecules in young protostars with the CALYPSO survey

Complex organic molecules (COMs) have been detected in a few Class 0 protostars but their origin is not well understood. Going beyond studies of individual objects, we want to investigate the origin of COMs in young protostars on a statistical basis. We use the CALYPSO survey performed with the IRAM PdBI to search for COMs at high angular resolution in a sample of 26 solar-type protostars, including 22 Class 0 and four Class I objects. Methanol is detected in 12 sources and tentatively in one source, which represents half of the sample. Eight sources (30%) have detections of at least three COMs. We find a strong chemical differentiation in multiple systems, with five systems having one component with at least three COMs detected but the other component devoid of COM emission. The internal luminosity is found to be the source parameter that most strongly affects the COM chemical composition of the sources, while there is no obvious correlation between the detection of COM emission and that of a disk-like structure. A canonical hot-corino origin may explain the COM emission in four sources, an accretion-shock origin in two or possibly three sources, and an outflow origin in three sources. The CALYPSO sources with COM detections can be classified into three groups on the basis of the abundances of oxygen-bearing molecules, cyanides, and CHO-bearing molecules. These chemical groups correlate neither with the COM origin scenarios, nor with the evolutionary status of the sources if we take the ratio of envelope mass to internal luminosity as an evolutionary tracer. We find strong correlations between molecules that are a priori not related chemically (for instance methanol and methyl cyanide), which shows that the existence of a correlation does not imply a chemical link. [abridged]


Introduction
Class 0 protostars are key objects along the evolutionary path that leads to the formation of Sun-like stars. They represent the earliest stage of the main accretion phase when a stellar embryo has just been formed but most of the mass is still stored in the collapsing protostellar envelope (André et al. 1993, 2000). Class 0 protostars retain memory of the initial physical and chemical conditions of star formation that prevailed during the prestellar phase. The collapse of their envelope also initiates the formation of circumstellar disks (Maury et al. 2019) which will turn into protoplanetary disks and set the initial conditions for planet formation in the subsequent stages of star formation. While the formation of Sun-like stars is broadly understood, several issues remain. In particular, how young stars get rid of most of the angular momentum initially stored in their protostellar envelopes remains unknown. This has been formulated as the angular momentum problem of star formation (Bodenheimer 1995).
Based on observations carried out with the IRAM Plateau de Bure Interferometer. IRAM is supported by INSU/CNRS (France), MPG (Germany), and IGN (Spain). The CALYPSO calibrated visibility tables and maps are publicly available at http://www.iram-institute.org/EN/content-page-317-7-158-240-317-0.html
The Continuum And Lines in Young ProtoStellar Objects (CALYPSO¹) Large Program of the Institut de Radioastronomie Millimétrique (IRAM) has been set up to tackle this angular momentum problem. CALYPSO is a survey of 16 nearby (d < 500 pc) Class 0 protostellar systems carried out at high angular resolution (∼0.5″) with the IRAM Plateau de Bure interferometer (PdBI, now called Northern Extended Millimeter Array, NOEMA). This interferometric survey is complemented with observations with the IRAM 30 m single-dish telescope that provide short-spacing information. The observations were performed at three frequencies (94, 219, and 231 GHz) with both narrow- and broad-band spectrometers. In addition to the continuum emission used to probe circumstellar disks and protostellar multiplicity (Maury et al. 2019), the frequencies were selected in order to include in particular tracers of envelope rotation (Maret et al. 2014; Gaudel et al. 2020), disk rotation (Maret et al. 2020), jets and outflows (Codella et al. 2014; Santangelo et al. 2015; Podio et al. 2016; Lefèvre et al. 2017), and snow lines (Anderl et al. 2016). Here, we take advantage of the large frequency coverage of the survey (∼11 GHz in total) to probe the chemical composition of the targets, focusing on complex organic molecules (COMs), which are molecules with at least six atoms according to the definition adopted in astrochemistry (Herbst & van Dishoeck 2009). Maury et al. (2014) reported the detection of resolved emission of numerous COMs toward NGC 1333-IRAS2A on the basis of CALYPSO, and a CALYPSO study specific to glycolaldehyde was published in De Simone et al. (2017).
Complex organic molecules have been detected for more than two decades in several Class 0 protostars such as IRAS 16293-2422, NGC 1333-IRAS4A, and NGC 1333-IRAS2A (see, e.g., van Dishoeck et al. 1995; Cazaux et al. 2003; Bottinelli et al. 2004, 2007). This COM emission was found to be compact, and confined to regions with temperatures higher than ∼100 K. It is usually interpreted as resulting from the sublimation of the ice mantles of dust grains in the hot, inner parts of the envelope heated by the stellar embryo. This sublimation process is thought to either release COMs previously formed in the solid phase directly into the gas phase or trigger a hot gas-phase chemistry that subsequently forms COMs. These regions with compact COM emission were called hot corinos by Ceccarelli (2004), in analogy to hot cores, their counterparts around young high-mass stars (e.g., Walmsley 1992; van Dishoeck & Blake 1998; Kurtz et al. 2000).
Unsaturated carbon chain molecules, some of them being COMs, were detected a decade ago in the Class 0 protostar L1527, a source that was not known to harbor a hot corino. This led to the definition of a new class of protostars, the so-called Warm Carbon Chain Chemistry (WCCC) sources (see, e.g., the review of Sakai & Yamamoto 2013). While unsaturated carbon chain molecules are known to form in the early stages of the prestellar phase, before carbon atoms get locked up into CO, the presence of these molecules in the warm (>25 K) region of the protostellar envelope was interpreted as resulting from the desorption of CH₄ from the grain surfaces and subsequent gas-phase chemistry involving C⁺. Sakai et al. (2008, 2009) proposed that hot corinos and WCCC sources are distinct classes of protostars, maybe related to different ice compositions of the grain mantles resulting from different conditions during the prestellar phase. However, the detection of methanol at small scales in L1527 and the recent detection of a hot corino in L483, a candidate WCCC source, call for a revision of this interpretation (Sakai et al. 2014a; Oya et al. 2017).
¹ See http://irfu.cea.fr/Projets/Calypso/
Saturated COMs have also been detected at even earlier stages in prestellar cores (e.g., Bacmann et al. 2012), albeit with lower abundances compared to hot corinos. A fraction of the COMs detected in hot corinos may therefore have been formed during the prestellar phase. However, the origin of the COM emission in hot corinos (and hot cores) is not fully established. The sublimation of the ice mantles of dust grains could also occur through shocks generated by jets or direct UV irradiation by the protostar on the walls of the cavity excavated by these jets. Enhancements of COM emission have indeed been reported in shocked regions of outflows for more than two decades (Avery & Chiao 1996; Jørgensen et al. 2004; Arce et al. 2008; Sugimura et al. 2011; Lefloch et al. 2017). Numerical simulations have shown that an enhancement of COM abundances in irradiated cavity walls of outflows may be relevant and should strongly depend on the protostellar luminosity (Drozdovskaya et al. 2015). However, this process does not seem to dominate the COM emission detected along the outflow cavity of the high-mass protostar IRAS 20126+4104, which was instead interpreted as produced by shocks (Palau et al. 2017).
The COM emission could also be enhanced by accretion shocks at the centrifugal barrier in the interaction region between the disk and the collapsing envelope. Such an interpretation was proposed by Csengeri et al. (2018) to explain the presence of two hot spots detected in COM emission at small offsets from the high-mass protostar G328.2551-0.5321. However, the methanol emission detected with the Atacama Large Millimeter/submillimeter Array (ALMA) toward the Class 0 protostar B335 was found to be more extended than the centrifugal barrier (Imai et al. 2019). In the evolved Class 0 protostar L1527, Sakai et al. (2014b) reported a change of chemistry with an enhancement of SO abundance at the centrifugal barrier but, because of a lack of sensitivity, how COMs like methanol behave at this scale in this source is currently unclear (Sakai et al. 2014a).
High-angular-resolution observations with ALMA have recently revealed that the COM emission in the Class 0 protostar HH212, previously interpreted as a bona-fide hot corino in lower-angular-resolution observations (Codella et al. 2016a), actually originates near the centrifugal barrier from a rotating ring in the warm atmosphere above and below the disk detected around this source (Lee et al. 2017). The rotational temperatures derived from the COM emission in HH212 are on the order of 150 K (Lee et al. 2019). Lee et al. (2017) argue that the COMs were formed in the disk rather than in the rapidly infalling inner envelope and that their release in the disk atmosphere may result from the irradiation of the flared disk by the protostar. A disk wind may also play a role in their formation or their release into the gas phase (Leurini et al. 2016;Lee et al. 2019).
This introduction shows that no consensus exists about the origin of the COM emission in Class 0 protostars. Do all protostars show COM emission? Is this emission consistent with the usual picture of a hot corino or do disks, outflows, or disk winds dominate the COM emission? When does this emission appear? Can it be used as an evolutionary tracer? Does it depend on the envelope mass? Is the COM chemical composition of Class 0 protostars universal? In order to start addressing these questions on a statistical basis, we take advantage of the unprecedentedly large sample of young embedded protostars and the subarcsecond angular resolution provided by the CALYPSO survey to investigate the origin of small-scale COM emission in Class 0 protostars, as well as a few Class I protostars that happened to be in the covered fields. We use the CALYPSO data set to search for COMs in the targeted sources and derive their COM chemical composition. We provide a short description of the observations in Sect. 2. The chemical composition of the sources derived from the observations is presented in Sect. 3. An analysis of the correlations between the chemical abundances of the detected COMs and the correlations between these abundances and various source properties is given in Sect. 4. These results are discussed in Sect. 5 and our conclusions are stated in Sect. 6.
A198, page 2 of 81. A. Belloche et al.: Questioning the spatial origin of complex organic molecules in young protostars with the CALYPSO survey

Observations
The observations of the CALYPSO survey were performed with PdBI at wavelengths of 1.29, 1.37, and 3.18 mm. The interferometric data were obtained by combining observations carried out with two distinct PdBI array configurations (A and C), providing baselines ranging from 16 to 760 m and subarcsecond resolution. General information about the observations (dates of observation, conditions for each track) and the overall data calibration strategy have been published in Maury et al. (2019). The phase self-calibration corrections derived from the continuum emission gain curves were also applied to the line visibility data presented here (when applicable, that is for all sources with a 1.3 mm PdBI integrated flux density higher than 100 mJy in Table 4 of Maury et al. 2019). Observations of the molecular line emission analyzed in the present paper were obtained in the form of three spectral setups (setup S1 around 231 GHz, setup S2 around 219 GHz, and setup S3 around 94 GHz), each containing two 1.8-GHz-wide spectral windows covered with the WideX backends with a spectral resolution of 1.95 MHz. The frequency ranges covered by the survey are the following: 229.242-231.033 and 231.0422-232.8338 GHz in setup S1, 216.8700-218.6550 and 218.6720-220.4550 GHz in setup S2, and 91.8560-93.6728 and 93.6753-95.5460 GHz in setup S3. The sources IRAM04191, NGC 1333-IRAS2A, L1448-NB, L1448-2A, L1448-C, and L1521F were already observed with configuration A in setup S1 by the beginning of our CALYPSO program (see, e.g., the pilot observations presented in Maury et al. 2010), hence configuration-A WideX data around 231 GHz were not obtained for these sources.
The continuum built from line-free channels was subtracted directly from the spectral visibility data sets, for each of the 30 continuum sources detected in the 16 fields targeted by the CALYPSO survey (see Table 3 in Maury et al. 2019). Details about this procedure are available on the CALYPSO data release webpage². Here we focus our analysis on 26 of these sources, ignoring SerpM-S68Nc, L1448-NW, and SVS13C, which are located well outside the primary beam at 1.3 mm, and VLA3, the nature of which is unknown. These 26 sources are listed in Table 1 along with some of their properties. For readability reasons, we use in Table 1 and in the rest of the article the short names IRAS2A, IRAS4A, and IRAS4B for the IRAS sources located in the NGC 1333 molecular cloud.
We built spectral maps for each source covering the six frequency windows. The maps were obtained using a robust weighting scheme, resulting in synthesized beam sizes and rms noise levels reported in Table A.1.

Basic spectral features
The continuum-subtracted WideX spectra obtained in setups S1, S2, and S3 toward the main peak and some of the secondary continuum emission peaks found by Maury et al. (2019) in the CALYPSO sample are shown in Figs. 1, B.1, and B.2, respectively. Only a few lines are detected in the 3 mm spectra, but several sources show many spectral lines in the 1.4 and 1.3 mm bands. Qualitatively, the sources with a high density of detected spectral lines are IRAS2A1, IRAS4A2, IRAS4B, L1448-C, SVS13A, and SerpS-MM18a. The first three sources were already known to harbor a hot corino before the CALYPSO survey started (Bottinelli et al. 2004, 2007). Maret et al. (2004) reported a jump of H₂CO abundance by three orders of magnitude in the region above 100 K in the envelope of L1448-C, suggesting the presence of a hot corino in this source as well. The presence of a hot corino in SVS13A was reported based on the Astrochemical Surveys At IRAM (ASAI) Large Program (Codella et al. 2016b) and on CALYPSO data (Lefèvre et al. 2017).

Channel count maps and line counts
We constructed maps of channel counts in order to search in a systematic way for hot-corino-like emission in each source of the sample. In the continuum-subtracted spectrum of each position in the field of view of a source, we counted the channels that have a flux density higher than six times the rms noise level. The noise level was derived as the median of the dispersions of the intensity distributions of all channel maps. This value was obtained with the task "go noise" of the GREG software³. Because the S3 data cubes have a lower angular resolution, the counting was performed on the S2 and S1 data cubes only. In order to avoid counting contributions from transitions of diatomic or triatomic molecules that may be dominated by emission produced by outflows and/or molecular jets, we excluded the following frequency ranges when counting the channels: 230.400-230.680 GHz (CO 2-1), 220.340-220.460 GHz (¹³CO 2-1), 219.550-219.570 GHz (C¹⁸O 2-1), 217.040-217.205 GHz (SiO 5-4), 219.890-220.045 GHz (SO 5₆-4₅), 218.[...] GHz (OCS 18-17), and 231.045-231.075 GHz (OCS 19-18). The excluded frequency ranges cover 0.8 GHz in total, meaning that the channel counting was performed over 6.4 GHz only. The resulting maps are shown in Fig. 2 and the values of the strongest peaks are listed in Table 2. Maps over a larger field of view are displayed in Fig. C.1. Given that this article is later focused on COMs, we would like to emphasize that molecules such as DCN, c-C₃H₂, and H₂CO may contribute, depending on the source, with a handful of channels to the maps shown in Figs. 2 and C.1.
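The counting procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GREG-based implementation used for the survey: the function names, the array layout `(n_chan, ny, nx)`, and the use of the per-channel standard deviation as the "dispersion" are all hypothetical choices. The OCS 18-17 range is omitted because its limits are not available here, and the OCS 19-18 label follows the list of excluded transitions in the caption of Fig. 2.

```python
import numpy as np

# Excluded frequency ranges in GHz, taken from the text. The OCS 18-17
# range is left out because its limits are not recoverable here.
EXCLUDED_GHZ = [
    (230.400, 230.680),  # CO 2-1
    (220.340, 220.460),  # 13CO 2-1
    (219.550, 219.570),  # C18O 2-1
    (217.040, 217.205),  # SiO 5-4
    (219.890, 220.045),  # SO 5_6-4_5
    (231.045, 231.075),  # OCS 19-18
]


def estimate_rms(cube):
    """Median over channels of the per-channel standard deviation.

    Assumption: the 'dispersion of the intensity distribution' of a
    channel map is approximated by its plain standard deviation.
    """
    return float(np.median(cube.std(axis=(1, 2))))


def channel_count_map(cube, freq_ghz, rms, nsigma=6.0, excluded=EXCLUDED_GHZ):
    """Count, for each pixel, the channels brighter than nsigma * rms.

    cube     : array (n_chan, ny, nx), continuum-subtracted flux densities
    freq_ghz : array (n_chan,), channel frequencies in GHz
    rms      : scalar noise level, same unit as cube
    """
    keep = np.ones(freq_ghz.shape, dtype=bool)
    for lo, hi in excluded:
        # Drop every channel that falls inside an excluded range.
        keep &= ~((freq_ghz >= lo) & (freq_ghz <= hi))
    return np.sum(cube[keep] > nsigma * rms, axis=0)
```

In practice the function would be applied to the S1 and S2 cubes separately and the two resulting maps summed, since only these two setups enter the counting.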
Table 1 notes (fragment): [...] André et al. 2010), except for GF9-2 for which we use Wiesemeyer et al. (1997). The luminosity has been rescaled to the distance given in Col. 4. (e) Peak flux densities at 1.3 and 3 mm measured with PdBI by Maury et al. (2019) with the uncertainties in parentheses when available. (f) Integrated flux density at 1.3 mm over the source size given in Table 4, when the latter is larger than the beam. (g) This source is located outside the primary beam of the CALYPSO survey at 1.3 mm but is detected at 3 mm. References. References for the distance (Col. [...]
The inspection of Fig. 2 reveals six sources with a clear peak of channel counts associated with one of the continuum peaks, and a value at the peak, N_c^peak, higher than 20: SerpS-MM18a (N_c^peak = 171), L1448-C (90), IRAS2A1 (250), IRAS4A2 (423), IRAS4B (139), and SVS13A (394). All other CALYPSO sources have maximum channel counts below 20. We also counted by eye the number of spectral lines detected with a peak flux density above 6σ toward the continuum peak positions of the six line-rich sources, excluding the transitions listed in the previous paragraph. These line counts translate into spectral line densities between 5 lines per GHz for L1448-C and 34 lines per GHz for IRAS4A2 (Table 2). From the channel and line counts listed in Table 2, we deduce that the lines detected with a peak flux density above 6σ have flux densities above this threshold over two to three channels on average (∼5-8 km s⁻¹). The maps of channel counts displayed in Fig. 2 can therefore be roughly converted into maps of spectral line counts by dividing them by this average number of channel counts per line (fourth column of Table 2).
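The arithmetic behind these conversions is simple enough to write down. The sketch below shows how a channel width in MHz maps to a velocity width, and how channel counts and line counts relate to a line density over the 6.4 GHz that were searched. The function names are hypothetical, and the reference frequency of 225 GHz is an assumption chosen as a representative value between setups S1 and S2.

```python
C_KM_S = 299792.458  # speed of light in km/s


def channel_width_kms(dnu_mhz, nu_ghz):
    """Velocity width of a channel of width dnu_mhz at frequency nu_ghz."""
    return C_KM_S * (dnu_mhz * 1e-3) / nu_ghz


def line_density(n_lines, bandwidth_ghz=6.4):
    """Number of detected lines per GHz over the searched bandwidth."""
    return n_lines / bandwidth_ghz


def channels_to_lines(channel_count, channels_per_line):
    """Rough conversion of a channel count into a line count."""
    return channel_count / channels_per_line


# One 1.95 MHz channel at an assumed 225 GHz is about 2.6 km/s wide, so
# two to three channels span roughly 5 to 8 km/s, as quoted in the text.
dv = channel_width_kms(1.95, 225.0)
print(f"{dv:.2f} km/s per channel, {2 * dv:.1f}-{3 * dv:.1f} km/s per line")
```

With a made-up count of 64 lines, `line_density(64)` gives 10 lines per GHz; the actual counts are those of Table 2.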

Associations between channel count peaks and continuum sources, and relation to outflows
The positions of the channel count peaks higher than 9 were derived from Gaussian fits to the maps shown in Fig. 2. The results are listed in Cols. 2 and 3 of Table 3. This table also gives the name (Col. 1) and coordinates (Cols. 4 and 5) of the nearest continuum source. As seen below (Sect. 3.7), COMs (at least methanol) are detected toward all sources listed in Table 3 except for L1448-NB2. The lines contributing to the channel count toward L1448-NB2 are from H₂CO and c-C₃H₂. The angular separation between the channel count peak and its nearest continuum source, and its ratio to the width (HPBW) of the beam along the same position angle are listed in Cols. 8 and 11, respectively, of Table 3. For most sources, the positions of the channel count peak and its nearest continuum source agree to better than one-fifth of the beam width (HPBW). There are four exceptions: L1448-2A, L1448-NB2, SVS13A, and SerpM-SMM4b. For both L1448-2A and L1448-N, the contours of channel counts in Figs. 2a and b enclose a close binary, and the peak is located at roughly equal distance from each binary component. In both cases, the binary separation is roughly one beam, and therefore we cannot exclude that a map of channel counts at higher angular resolution would reveal two peaks coinciding with the binary components. In the case of SerpM-SMM4b, the contour map of channel counts is asymmetric and its actual peak is located half-way between the peak derived from the Gaussian fit and the continuum peak, that is, less than one-fifth of the beam from the continuum peak (panel l of Fig. 2). The offset may therefore not be significant.
The only source for which there seems to be a significant offset between the continuum peak and the channel count peak is SVS13A. The separation is 0.11″, which corresponds to one fourth of the beam. However, SVS13A is a tight binary, with the components VLA4A and VLA4B located west and east of the CALYPSO continuum peak, respectively (Lefèvre et al. 2017, see orange crosses in Fig. 2e). The channel count peak is located at an angular distance of 0.08″ from VLA4A with a position angle of 77°, which is nearly perpendicular to the outflow axis. This offset is less than one-fifth of the beam. VLA4A thus seems to be associated with the peak of molecular richness in SVS13A, as already noted by Lefèvre et al. (2017), and there is no obvious link between the channel count peak and the outflow. Higher-angular-resolution ALMA data on SVS13A are currently being analyzed to clarify the distribution of COMs in this protostellar system (C. Lefèvre, priv. comm.).
Figure 3 displays the position angle of the offset between the channel count peak and its nearest continuum peak, along with the position angles of the blueshifted and redshifted lobes of the outflow for the sources listed in Table 3, except for L1448-NB2 which lacks outflow information. There is no systematic alignment of the channel count peaks with the direction of the outflows. For the three sources with an offset larger than one-fifth of the beam, that is the ones with an a priori more reliable, measured offset, the channel count peak tends to be close to the outflow direction (∼30-50°). However, as discussed above, the position angle of the offset for each of these sources may not be meaningful. For L1448-2A, this is due to the presence of the close binary that is barely resolved in the channel count map. The second contour of Fig. 2a actually suggests a channel count peak associated with each component. For SerpM-SMM4b, the asymmetry of the channel count contours leads to an overestimate of the offset. In the case of SVS13A, the channel counts peak very close to VLA4A, one of the tight binary components. If we take this component as reference, the offset becomes roughly aligned with the direction perpendicular to the outflow axis. On the basis of Fig. 3, there is therefore no obvious link between the channel count peaks and the outflows for the sample of sources with a maximum channel count higher than 9.
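The quantities compared in this section (angular separation, position angle counted east from north, and the beam width along that position angle) can be computed as in the following sketch. It assumes small offsets, so that a flat-sky approximation holds, and an elliptical Gaussian beam whose width along a given direction is that of the cut through its center. The function names and this definition of the beam width are assumptions made for illustration.

```python
import math


def offset_and_pa(ra_peak_deg, dec_peak_deg, ra_cont_deg, dec_cont_deg):
    """Separation (arcsec) and position angle (deg, east from north) of
    the channel count peak relative to the continuum peak.

    Flat-sky approximation, valid for sub-arcsecond offsets.
    """
    cos_dec = math.cos(math.radians(dec_cont_deg))
    d_east = (ra_peak_deg - ra_cont_deg) * cos_dec * 3600.0
    d_north = (dec_peak_deg - dec_cont_deg) * 3600.0
    sep = math.hypot(d_east, d_north)
    pa = math.degrees(math.atan2(d_east, d_north)) % 360.0
    return sep, pa


def beam_width_along(pa_deg, bmaj, bmin, bpa_deg):
    """HPBW of an elliptical Gaussian beam along position angle pa_deg.

    bmaj, bmin : major and minor HPBW (same unit as the result)
    bpa_deg    : position angle of the beam major axis
    """
    phi = math.radians(pa_deg - bpa_deg)
    return bmaj * bmin / math.hypot(bmin * math.cos(phi), bmaj * math.sin(phi))


def offset_is_significant(sep, pa_deg, bmaj, bmin, bpa_deg, fraction=0.2):
    """True if the separation exceeds `fraction` of the beam along pa_deg."""
    return sep > fraction * beam_width_along(pa_deg, bmaj, bmin, bpa_deg)
```

The threshold of one-fifth of the beam is the one used in the text; the beam parameters would come from Table A.1.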

Sizes of the COM emission
For each source, we selected the spectral lines that have a peak flux density above 6σ in the spectrum of the continuum peak that has the highest number of detected lines among all continuum peaks in the field of view. We determined the size of the emission of each selected spectral line by fitting a two-dimensional Gaussian to the emission of the peak channel in the uv plane. The deconvolved sizes derived in this way are displayed in Figs. D.1-D.15. The uncertainties are significant, but the high number of detected lines in some sources allows estimating a representative size of their COM spectral line emission. The COM emission of the six line-rich CALYPSO sources has a size (HPBW) ranging between 0.3″ for SVS13A and 0.5″ for L1448-C and SerpS-MM18a (Table 4). The uncertainty is likely on the order of 0.1″. For the sources listed in Table 4 but not shown in Figs. D.1-D.15, we assumed by default the same source size as (one of) the other source(s) present in the field of view.
A&A 635, A198 (2020)
Fig. 2. Maps of number of channels with signal detected above six times the rms noise level in the continuum-removed WideX spectra at 1.3 and 1.4 mm, excluding CO 2-1, ¹³CO 2-1, C¹⁸O 2-1, SiO 5-4, SO 5₆-4₅, OCS 18-17, and OCS 19-18. In each panel, the red ellipses show the synthesized beam sizes at 219 and 231 GHz. The pink crosses indicate the positions of the continuum emission peaks derived by Maury et al. (2019), and the orange crosses in panel e mark the positions of the binary components VLA4A and VLA4B of SVS13A determined by Lefèvre et al. (2017). The peak count is given in the bottom right corner. For sources with a peak count higher than 9, the blue cross and ellipse show the position of the peak and the width (FWHM) of a Gaussian fit to the map, respectively. The contour levels are indicated above each panel. The coordinates at the origin are listed in Table C.

Elongations of the COM emission
Table 2. Notes. (a) Peak number of channels with a continuum-subtracted flux density above 6σ in setups S1 and S2, excluding channels covering CO 2-1, ¹³CO 2-1, C¹⁸O 2-1, SiO 5-4, SO 5₆-4₅, OCS 18-17, and OCS 19-18. This value corresponds to the peak of the contour map shown in Fig. 2. The total bandwidth is 6.4 GHz. (b) Number of spectral lines with a peak flux density above 6σ in setups S1 and S2 toward the continuum peak, excluding the transitions listed above. (c) Average number of channels with a flux density higher than 6σ per detected line with a peak flux density above the same threshold. (d) Spectral density of lines detected above 6σ toward the continuum peak.
We used the two-dimensional Gaussian fits performed in Sect. 3.4 to search for a correlation between the position angle of the ellipses fitted to the COM emission and the position angles of the outflows. Here we also included molecules with five atoms (CH₂CO, HC₃N, c-C₃H₂, NH₂CN, and t-HCOOH). Figure 4 displays the fit results for the 1.3 and 1.4 mm transitions that have an error on the fitted position angle smaller than 10°. The molecules with fewer than five atoms are ignored for this investigation, as well as the unidentified transitions. The mean absolute deviation of the fitted COM position angles with respect to the mean position angle of the outflow lobes is less than 20° for two out of six sources, IRAS2A1 and SerpS-MM18a. This is also true for the mean algebraic (signed) deviations, which are −3° and 8°, respectively, both with a small dispersion of ∼15°. SVS13A and IRAS4A2 have a mean algebraic deviation smaller than 20° as well, but with large dispersions (∼50° and ∼30°, respectively), and therefore we cannot conclude that the COM emission is preferentially elongated along the direction of their outflows. However, most (80%) transitions fitted for IRAS4B fall within ±20° of the mean position angle of the outflow and the mean deviation is 14° ± 24°, and so we can consider that IRAS4B behaves like IRAS2A1 and SerpS-MM18a as well. It is therefore tempting to conclude that the COM emission is preferentially elongated along the direction of the outflow in IRAS2A1, IRAS4B, and SerpS-MM18a.
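The statistics quoted above (mean signed deviation, mean absolute deviation, dispersion, and fraction of transitions within ±20° of the outflow axis) can be illustrated with the sketch below. It rests on two assumptions that the text does not spell out: the orientation of a fitted ellipse is treated as defined modulo 180°, so deviations are wrapped into [−90°, 90°), and the means are plain arithmetic means of the wrapped deviations. The function names are hypothetical.

```python
import numpy as np


def wrap_axial(delta_deg):
    """Wrap angle differences into [-90, 90) degrees.

    Assumption: ellipse orientations are axial, i.e. defined modulo 180.
    """
    return (np.asarray(delta_deg, dtype=float) + 90.0) % 180.0 - 90.0


def pa_deviation_stats(pa_com_deg, pa_outflow_deg, limit_deg=20.0):
    """Deviation statistics of COM position angles from the outflow axis."""
    d = wrap_axial(np.asarray(pa_com_deg, dtype=float) - pa_outflow_deg)
    return {
        "mean": float(d.mean()),                  # signed (algebraic) mean
        "mean_abs": float(np.abs(d).mean()),      # mean absolute deviation
        "dispersion": float(d.std(ddof=1)) if d.size > 1 else 0.0,
        "frac_within": float(np.mean(np.abs(d) <= limit_deg)),
    }


# Made-up position angles for three transitions and an outflow at PA = 0 deg.
stats = pa_deviation_stats([10.0, 170.0, 30.0], 0.0)
```

In this made-up example the position angle of 170° is wrapped to −10°, so the signed mean is 10° while the mean absolute deviation is larger; this is why both quantities are reported in the text.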

Spectral line modeling
Fig. 3. Comparison of the position angle PA_c of the vector going from the continuum peak to the channel count peak (black plus symbol) and the position angles of the blueshifted (blue cross) and redshifted (red cross) outflow lobes for the sources listed in Table 3, except for L1448-NB2 which lacks outflow information. The black plus symbol is enclosed by a circle when the angular separation between the continuum peak and the channel count peak is larger than one-fifth of the beam. For the other ones, PA_c may not be reliable.
We modeled the spectral line emission detected toward the continuum peaks of the CALYPSO sources assuming local thermodynamic equilibrium (LTE) with the Weeds package (Maret et al. 2011) of the CLASS software³. Weeds takes into account the finite angular resolution of the telescope and the opacity of the transitions. We corrected the spectra for primary beam attenuation. We followed the same modeling strategy as applied for Sgr B2 by Belloche et al. (2016). For each continuum peak position, each species is modeled separately and its synthetic spectrum is computed over the same frequency ranges as covered by setups S1 to S3. Each species is modeled with five free parameters: source size, column density, rotational temperature, line width, and velocity offset with respect to the systemic velocity of the source. For most species, we assumed the (Gaussian) source sizes listed in Table 4. After a first iteration of modeling all identified molecules, we constructed population diagrams of the detected complex organic molecules following the method described in Sect. 3 of Belloche et al. (2016). The diagrams are corrected for optical depth and contamination from species included in the full model that have overlapping transitions. These diagrams are presented in Figs. E.1-E.74. For each molecule, the rotational temperature is derived from a linear fit to the corrected population diagram in linear-logarithmic space. The derived temperatures are listed in Table E.1 for all sources and positions that are detected in several transitions of at least methanol. They are also plotted in a synthetic way for each source and position in Figs. E.75-E.86. Appendix E explains in detail how the temperatures for the radiative transfer modeling were then chosen to derive the column densities.
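The linear fit in linear-logarithmic space mentioned above relies on the standard LTE relation ln(N_u/g_u) = ln(N_tot/Q(T_rot)) − E_u/(k T_rot), so that with upper-level energies expressed in kelvin the slope of the population diagram is −1/T_rot. The sketch below shows an unweighted version of such a fit. It is not the procedure of Belloche et al. (2016), which also handles the opacity and contamination corrections and the uncertainties; the function name and the synthetic input values are hypothetical.

```python
import numpy as np


def rotational_temperature(eu_k, ln_nu_over_gu):
    """Unweighted linear fit of a population diagram.

    eu_k          : upper-level energies E_u/k in K
    ln_nu_over_gu : ln(N_u/g_u), assumed already corrected for optical
                    depth and contamination
    Returns (T_rot in K, intercept ln(N_tot/Q)).
    """
    slope, intercept = np.polyfit(np.asarray(eu_k, dtype=float),
                                  np.asarray(ln_nu_over_gu, dtype=float), 1)
    return -1.0 / slope, intercept


# Synthetic, noise-free example: four transitions drawn from a 150 K
# population with an arbitrary intercept of 30.
eu = np.array([50.0, 100.0, 200.0, 400.0])
y = 30.0 - eu / 150.0
t_rot, ln_n_over_q = rotational_temperature(eu, y)
```

A real application would weight the points by their uncertainties; the fit is only well constrained when the detected transitions span a wide range of upper-level energies.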
Most lines are barely spectrally resolved and were modeled assuming line widths in the range 3-6 km s⁻¹. With the source size, rotational temperature, and line widths set as described above, and the velocity offset derived directly from the spectra, the only remaining free parameter for each molecule is its column density. This parameter was adjusted by eye until the synthetic spectrum matched the observed spectrum reasonably well. For each molecule, we tried as hard as possible not to exceed the peak temperature of any of the detected lines (or the upper limit of any of the undetected transitions). The parameters resulting from these fits are listed in Table F.1 for a set of 12 (complex) organic molecules. Table F.1 also indicates the number of detected lines per molecule, as well as the detection status of each species (detected, tentatively detected, or not detected, depending on the number of detected lines and/or the strength of these lines). A species is considered as only tentatively detected when either its lines are all weak (below ∼3σ) or it has too few lines just above ∼3σ. Column density upper limits at the ∼3σ level are given for the nondetections.

Chemical composition
The chemical composition of all continuum peaks is displayed in terms of column densities in Fig. 5 for a selection of ten COMs and two simpler organic molecules, HNCO and NH 2 CN. Twelve of the 26 analyzed CALYPSO sources are detected in methanol, plus one (L1448-NA) tentatively. Among these 12 sources, nine have at least two COMs detected (CH 3 OH and CH 3 CN). We also show in Fig Table 3. Coordinates of channel count peaks and angular distance to the nearest continuum peak, for the sources with a maximum channel count higher than 9.
Source (a) Channel count peak (b) Cont. peak (c) α (hh:mm:ss) δ (dd:mm:ss) α (ss) δ (ss) Notes. (a) Continuum source nearest to channel count peak. (b) J2000 equatorial coordinates of channel count peak obtained from Gaussian fit in Fig. 2. (c) J2000 equatorial coordinates of nearest continuum peak from Maury et al. (2019). The hours, minutes, degrees, and arcminutes are not displayed. They are the same as in Cols. 2 and 3. (d) Equatorial offsets of channel count peak with respect to nearest continuum peak. (e) Angular distance between channel count peak and nearest continuum peak. ( f ) Position angle of vector going from nearest continuum peak to channel count peak, counted east from north. (g) Beam diameter (HPBW) along PA c . This is an average value for setups S1 and S2. (h) Position angle of the blueshifted and redshifted lobes of the outflow/jet from the CALYPSO survey (Podio et al., in prep.), except for L1448-2A for which we take the tentative detection of Kwon et al. (2019).  Table 3. Only the transitions with an error on the position angle smaller than 10 • were selected. The unidentified transitions were not used and the following molecules with less than five atoms were ignored: CO, 13 CO, C 18 O, SO, SO 2 , H 2 CO, H 2 13 CO, D 2 CO, OCS, O 13 CS, SiO, DCN, C 13 S, and HNCO. L1448-2A, SerpM-SMM4b, and L1157 have no transitions fulfilling the selection criteria. For each source, the olive circle and the two green squares indicate the angles ∆PA mean and ±|∆PA| mean with ∆PA mean and |∆PA| mean the mean deviation and mean absolute deviation, respectively, of the COM position angles with respect to the mean position angle of the outflow. The dispersion around the mean deviation is also displayed. The vertical dashed lines mark a deviation of ±20 • from the mean position angle of the outflow. Only IRAS2A1 and SerpS-MM18a fall within these limits with a small dispersion. 
methanol differs by up to at least four orders of magnitude, with L483 and IRAS16293B at the level of 10^19 cm^-2, and upper limits as low as 10^15 cm^-2 for a number of sources where no COMs are detected (e.g., L1521F, GF9-2). The CALYPSO sources with the highest column densities of methanol are IRAS2A1, SVS13A, and IRAS4A2 (∼10^18 cm^-2), followed by L1448C, IRAS4B, SerpM-SMM4b, SerpS-MM18a, and L1157 (∼10^17 cm^-2), and L1448-2A, L1448-2Ab, SerpM-S68N, and SerpS-MM18b (∼10^16 cm^-2).
Because the size of the molecular emission varies from source to source and their H 2 column densities at these small scales may vary too, we renormalized the column densities of the CALYPSO sources to obtain a quantity that may better reflect the abundances of the molecules with respect to H 2 . We used the ratio of the continuum flux density to the measured or assumed solid angle of the COM emission as a proxy for the H 2 column density at the scale of the COM emission. We normalized the molecular column densities with this proxy. By doing this, we neglected possible differences in terms of temperature and dust properties between the sources. We computed the normalization using the continuum emission at 1.3 mm or 3 mm. At 1.3 mm (from setups S1 and S2, see Maury et al. 2019), we used either the peak flux density of the continuum emission if the solid angle of the COM emission is smaller than the beam, or the continuum flux density integrated over the COM emission size if it is larger than the beam. At 3 mm, we used the peak flux density of the continuum emission.

Figure G.1 shows the column densities renormalized to the 1.3 mm continuum emission as described in the previous paragraph. Among the CALYPSO sources detected in methanol, two stand out with a high "abundance" of methanol (∼4 × 10^4 cm^-2 sr mJy^-1): IRAS2A1 and SVS13A. In contrast, the following sources have an abundance of methanol that is lower by more than one order of magnitude (∼10^3 cm^-2 sr mJy^-1): L1448-NA (tentative), IRAS4B, SerpS-MM18b, and L1157. Seven sources lie in between: L1448-2A, L1448-2Ab, L1448-C, IRAS4A2, SerpM-S68N, SerpM-SMM4b, and SerpS-MM18a. Among the sources not detected in methanol, several stand out with a low upper limit of methanol abundance (<3 × 10^2 cm^-2 sr mJy^-1): L1448-NB1, L1448-NB2, IRAS4A1, IRAS4B2, L1527, and SerpM-SMM4a.
The remaining sources with a 1.3 mm continuum detection have an upper limit in the range ∼3 × 10^2-10^4 cm^-2 sr mJy^-1 (L1448-CS, SVS13B, IRAM04191, L1521F, SerpS-MM22, GF9-2). However, we remind the reader that these abundance upper limits are sensitive to the emission size assumed to derive the upper limits of the methanol column density. The sources remain grouped in the same way when we consider their column densities renormalized to the 3 mm continuum emission (see Fig. G.2). This suggests that the 1.3 and 3 mm continuum emissions trace to first order the same dust (and gas) reservoir. SerpM-S68Nb is plotted in Fig. G.2: it falls into the category of sources with a high upper limit of methanol abundance.

A. Belloche et al.: Questioning the spatial origin of complex organic molecules in young protostars with the CALYPSO survey

Because the previous normalizations may not represent robust proxies of the molecular abundances relative to H 2 , we plot in Fig. 6 the abundances relative to methanol. These relative abundances are summarized in Table 5. Three groups of sources stand out on the basis of the abundances of oxygen-bearing molecules, cyanides, and CHO-bearing molecules relative to methanol. Group 1 (highlighted in blue) includes IRAS2A1, L483, and IRAS 16293B and is characterized by low abundances of COMs relative to methanol, in particular C 2 H 5 OH, CH 3 OCH 3 , CH 3 OCHO, and CH 3 CHO (<0.02 on average). Group 2 (in orange) includes SVS13A, IRAS4A2, IRAS4B, and SerpM-S68N and has abundances of these four oxygen-bearing molecules relative to methanol higher by a factor of approximately six with respect to group 1. The other molecules are enhanced by a smaller factor (∼2-3 on average), except for C 2 H 5 CN which is enhanced by a factor of approximately ten with respect to group 1. Finally, group 3 (in magenta) includes L1448-2A, L1448C, SerpS-MM18a, and L1157.
It is similar to group 2, but it has average abundances of CH 3 CN and C 2 H 5 CN relative to methanol enhanced by a further factor of approximately two and average abundances of CH 3 OCHO, CH 3 CHO, and NH 2 CHO relative to methanol reduced by a factor of approximately two with respect to group 2.
We assigned L1448-2A to group 3 because of its high CH 3 CN abundance relative to methanol. As the CH 3 CN detection is only tentative, the assignment of L1448-2A to this group is also tentative. No other COM is detected toward this source, but the upper limits are consistent with the average abundances relative to methanol derived for this group.
Both HNCO and CH 3 CN are detected toward SerpS-MM18b, with abundances relative to methanol of 0.029 and 0.028, respectively. The abundance of HNCO is similar to the average abundances of both groups 2 and 3, and the abundance of CH 3 CN lies in between these groups. Without other COM detections, it is therefore difficult to tell if this source belongs to group 2 or group 3. HNCO is detected toward L1448-2Ab and SerpM-SMM4b with abundances relative to methanol of 0.020 and 0.019, respectively, which are similar to the average abundances of both groups 2 and 3 and a factor of approximately three higher than the average abundance of group 1. However, no COM apart from methanol is detected toward these sources, which prevents us from distinguishing between groups 2 and 3. The status of L1448-NA is unclear, with CH 3 OCHO tentatively as abundant as methanol but no other COM detected.
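The two normalizations used in this section can be illustrated with a minimal sketch. It computes the column density renormalized to the continuum proxy (column density divided by the ratio of continuum flux density to COM solid angle, that is N × Ω / S) and the abundance relative to methanol. The function names and all numerical inputs are hypothetical and chosen only for illustration; they are not measurements from the survey.

```python
def continuum_flux_for_proxy(peak_mjy, integrated_mjy, omega_com_sr, omega_beam_sr):
    """Select the 1.3 mm flux density entering the proxy.

    Peak flux density if the COM emission is smaller than the beam,
    otherwise the flux density integrated over the COM emission size.
    """
    return peak_mjy if omega_com_sr < omega_beam_sr else integrated_mjy


def renormalized_column_density(n_col_cm2, omega_com_sr, flux_mjy):
    """Column density divided by the proxy S/Omega, in cm^-2 sr mJy^-1."""
    if flux_mjy <= 0 or omega_com_sr <= 0:
        raise ValueError("flux density and solid angle must be positive")
    return n_col_cm2 * omega_com_sr / flux_mjy


def abundance_relative_to_methanol(n_mol_cm2, n_ch3oh_cm2):
    """Ratio of a molecule's column density to that of CH3OH."""
    return n_mol_cm2 / n_ch3oh_cm2


# Hypothetical inputs (illustration only, not survey values).
n_ch3oh = 1.0e18        # cm^-2
n_other = 2.0e16        # cm^-2, some other COM
omega_com = 2.0e-12     # sr, solid angle of the COM emission
omega_beam = 5.0e-12    # sr, beam solid angle
flux = continuum_flux_for_proxy(50.0, 80.0, omega_com, omega_beam)

print(renormalized_column_density(n_ch3oh, omega_com, flux))
print(abundance_relative_to_methanol(n_other, n_ch3oh))
```

Because the proxy ignores differences in temperature and dust properties between sources, the first quantity is only a rough stand-in for an abundance relative to H 2 , which is why the ratio to methanol is used to define the groups.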

Correlations between COM abundances
To investigate the chemical composition of the CALYPSO sources one step further, we show in Figs. H.1-H.6 correlation plots for all pairs of COMs that have at least four detections, except for CH 2 (OH)CHO. For this analysis, we considered the 20 sources that have an envelope mass and an internal luminosity listed in Table 1. We normalized the column densities in a different way in each figure, as stated at the bottom of the figure. The Pearson correlation coefficients and their 95% confidence intervals are indicated in each correlation plot and displayed in the lower left panel of each figure (see, e.g., Edwards 1976, for the calculation of confidence levels of correlation coefficients via a Fisher transformation); they are also summarized for all normalizations in Fig. 7 and in Table H.1.
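As an illustration of the statistic described above, the following sketch computes a Pearson correlation coefficient and its 95% confidence interval via the Fisher transformation. It is a generic textbook implementation, not the code used for the figures; the function names and the sample values are hypothetical, and whether the inputs should be column densities or their logarithms is left to the caller.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_confidence_interval(r, n, z_crit=1.959964):
    """Confidence interval of r from the Fisher z-transformation.

    z = atanh(r) is approximately normal with standard error
    1 / sqrt(n - 3), hence the need for at least four data points.
    z_crit = 1.959964 corresponds to a 95% two-sided interval.
    """
    if n < 4:
        raise ValueError("at least four data points are required")
    if abs(r) >= 1.0:
        raise ValueError("|r| must be smaller than 1")
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Hypothetical data for five detections (illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
r = pearson_r(x, y)
low, high = fisher_confidence_interval(r, len(x))
print(r, low, high)
```

With only five points, the interval is wide enough to include zero even for a fairly high coefficient, which is the situation encountered for many pairs of COMs in this sample.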
Most pairs of COMs (25 out of 28) have a Pearson correlation coefficient higher than 0.6 when we consider their column densities, suggesting that they are to some degree all correlated with each other. However, the low number of detections implies a large 95% confidence interval in many cases, meaning that an absence of correlation is not ruled out in these cases. A more robust correlation is found for 11 pairs out of 28, with a 95% confidence interval that does not extend below 0.3 (numbers in bold face in Table H.1). The results are similar when we normalize the column densities by the 1.3 or 3 mm proxies for H 2 introduced in Sect. 3.7: 23 out of 28 pairs and 21 out of 21 pairs have a Pearson correlation coefficient higher than 0.6 for the 1.3 and 3 mm normalizations, respectively, and 8 out of 28 pairs and 8 out of 21 pairs, respectively, have a 95% confidence interval that does not extend below 0.3. The degree of correlation is overall poorer when we normalize the abundances with the internal luminosity while it is higher when the normalization is done with the envelope mass or the ratio of internal luminosity to envelope mass: the previous numbers (23/8 out of 28) fall to 19/5 in the first case but increase to 28/9 and 28/10 in the latter cases. Given that both normalization factors, namely envelope mass and ratio of internal luminosity to envelope mass, have a dynamical range of two orders of magnitude, they may introduce a bias that increases the degree of correlation of the normalized abundances.

A&A 635, A198 (2020)
The best correlations with a narrow 95% confidence interval (i.e., an interval ending not lower than ∼0.6), whatever the type of normalization, occur for the following pairs of COMs: CH 3 CN/CH 3 OH, NH 2 CHO/CH 3 OH, and CH 3 OCH 3 / C 2 H 5 OH, followed with a lower degree of confidence by CH 3 CN/ CH 3 OCH 3 , CH 3 CHO/CH 3 OCHO, CH 3 OCH 3 / CH 3 OCHO, CH 3 OCH 3 /CH 3 OH, and CH 3 OCHO/C 2 H 5 OH.
Coming back now to the individual correlation plots, for instance in Fig. H.2, we consider the nondetections that were not taken into account in the correlation analysis. None of the nondetections in the correlation plots of CH 3 CN/CH 3 OH and NH 2 CHO/CH 3 OH are inconsistent with the correlations noted above. In the correlation plot of CH 3 OCH 3 /C 2 H 5 OH, the data point corresponding to L1448-C with a detection of CH 3 OCH 3 but an upper limit for C 2 H 5 OH is marginally consistent with the correlation. In the correlation plot of CH 3 CN/CH 3 OCH 3 , the data point corresponding to L1157 with a detection of CH 3 CN but an upper limit for CH 3 OCH 3 is marginally consistent with the correlation. In the correlation plot of CH 3 OCH 3 /CH 3 OH, the data point corresponding to L1448-CS with a tentative detection of CH 3 OCH 3 but an upper limit for CH 3 OH is largely inconsistent with the correlation. This casts some doubt on the tentative detection of CH 3 OCH 3 in this source, which relies on only one line just below the 3σ level. For the three other pairs mentioned in the previous paragraph, the upper limits are not inconsistent with the correlations.

Correlations between molecules and source properties
We now search for correlations between the COM column densities and the following source properties: internal luminosity, envelope mass, and ratio of envelope mass to internal luminosity, which has been proposed as an evolutionary indicator that decreases with time (André et al. 2000). We test four different normalizations of the column densities: column densities  such obvious threshold in envelope mass for the range explored with the CALYPSO survey (0.2-20 M⊙). We notice also that no COMs are detected for sources with M env /L int higher than 1.5 M⊙/L⊙. However, the opposite is not true: there are sources with internal luminosities higher than 2 L⊙ or with M env /L int lower than 1.5 M⊙/L⊙ that have no COM detection.
Because of the low number of detections, most plots in Figs. I.1-I.3 and 8 are consistent with the absence of a correlation between (normalized) column densities and source properties (Fig. 9 and Table 6). However, there is a clear anticorrelation between the abundances of CH 3 CHO and CH 2 (OH)CHO relative to methanol and the internal luminosities of the sources (green bars in the bottom panel of Fig. 9). Although it is less significant, there also seems to be a correlation between the internal luminosities of the sources and the column densities of CH 3 OH normalized to the ratio of the continuum flux density at 3 mm (and, to a lesser degree, 1.3 mm as well) to the COM solid angle (blue and red bars in bottom panel of Fig. 9). The same applies to CH 3 CN and NH 2 CHO, though with even less significance. Finally, there seems to be an anticorrelation between the ratio of envelope mass to internal luminosity and the column densities of C 2 H 5 OH and CH 3 OCH 3 normalized to the ratio of the continuum flux density at 1.3 mm to the COM solid angle (red bars in top panel of Fig. 9), but the significance of this anticorrelation is low. The internal luminosity thus appears to be the parameter with the strongest impact on the COM chemical composition of the sources, followed by the ratio of envelope mass to internal luminosity, while the envelope mass itself does not imprint any obvious signature in the COM chemical composition.
However, we should also pay attention to the nondetections, which were not taken into account in the correlation analysis. Figure 8 indicates that the CH 3 CHO nondetections are consistent with the anticorrelation mentioned above, except for L1157 which is marginally inconsistent. The CH 2 (OH)CHO nondetections are all consistent with the anticorrelation.
For CH 3 OH, two nondetections are inconsistent with the correlation mentioned above (see Fig. I.3). They correspond to SVS13B and SerpM-SMM4a. The other nondetections are consistent or marginally consistent with the correlation. For CH 3 CN, the upper limits are consistent with the (loose) correlation, three of them only marginally. For NH 2 CHO, all nondetections are consistent with the (loose) correlation noted above.
One upper limit (L1448-C) is inconsistent with the loose correlation of C 2 H 5 OH with M env /L int in Fig. I.2, one is marginally consistent (SerpS-MM18b), and the others are consistent. For CH 3 OCH 3 , three upper limits are inconsistent with the loose correlation, five are marginally consistent, and four are consistent.

Correlations between molecules and disk properties
We also search for correlations between the COM column densities and the following properties of the candidate disk-like structures detected in continuum emission in the CALYPSO sample by Maury et al. (2019): disk size (FWHM), disk flux density, and ratio of disk to envelope flux densities (see footnote 4). We test the same four normalizations as in Sect  Table 7. There are no obvious correlations between the COM column densities and the disk properties, whatever the chosen normalization of the column densities. All distributions have a 95% confidence level interval of their Pearson correlation coefficient that is so large that it includes zero, which means that the distributions are consistent with the absence of a correlation. Figure J.1 shows that, out of four sources with a large (FWHM disk > 100 au) candidate disk-like structure, three have COM detections (IRAS4A2, IRAS4B, SVS13A), but the source with the largest one does not (SerpM-SMM4a). However, no sign of Keplerian rotation was found by Maret et al. (2020) in SerpM-SMM4a, and Maury et al. (2019) argued in their Appendix C.12 that the true nature of its candidate disk-like structure is unclear. One source with a small (FWHM disk = 46 au), resolved disk-like structure has COM detections (L1448-C) while both sources with an intermediate-size (FWHM disk ∼ 81 and 54 au), resolved disk-like structure do not (SerpS-MM22 and L1527). Among the 14 sources with a detected (resolved or unresolved) disk-like structure, the seven sources with a methanol column density higher than 8 × 10^16 cm^-2 all have disk flux densities higher than 50 mJy. However, three sources with similarly high disk flux densities do not have methanol detections.

Footnote 4: The kinematical analysis of the CALYPSO data on small scales has revealed Keplerian rotation in only two sources (L1448-C and L1527) while no sign of Keplerian rotation has been found for radii larger than 50 au in the other Class 0 sources of the sample (Maret et al. 2020).

Table 5. Chemical composition relative to methanol of nine CALYPSO sources and two additional Class 0 protostars, and model results of Bergner et al. (2017).
Columns: Source; Column density relative to CH 3 OH.
Notes. The values in bold face are mean values over the group of sources that precedes. X (Y) means X × 10^Y. Tentative detections are indicated with a star symbol. n.a. means not available, that is the column density of the molecule was not reported in the articles we compiled for L483 and IRAS16293B (see references in Sect. 3.7).
Among the nine CALYPSO sources that we classified into groups (1 to 3) on the basis of their COM composition relative to methanol (Sect. 3.7), only two have no detection of a disk-like structure: SerpM-S68N (group 2) and L1448-2A (group 3), with F disk /F env < 4 and 2%, respectively, while the other seven sources have a detection with F disk /F env ranging from 3.5 to 50%. We do not see any obvious differences in terms of disk detection or disk properties between the three groups. One slight difference is that, while 75% of the sources have a disk detection for both groups 2 and 3, all three disk-like structures in group 2 are resolved while only one disk-like structure out of three is resolved in group 3: the disk-like structures in group 2, when detected, are larger than the disk-like structures in group 3. The single CALYPSO source in group 1 has a detected disk-like structure that is unresolved, like two sources of group 3.

Correlations between rotational temperatures and source properties
We use Fig. 12 Table E.1. Figure 12a shows that there is no correlation between the rotational temperatures and the internal luminosities of the sources.
We also compared the rotational temperatures derived from the COMs to the gas kinetic temperature expected at the radius of the COM emission, r COM , equal to half of the size reported in Table E.1. Given the high densities expected at these small scales, we expect the dust and gas to be well coupled thermally, and the rotational temperatures of the COMs to trace the kinetic temperature of the gas. Therefore, we could expect a correlation between the rotational temperatures of the COMs and the dust temperature of the COM emitting region if the COMs trace the region where they desorb from the icy mantles of dust grains as a result of the heating by the central protostar.

Fig. 7. Pearson correlation coefficients with 95% confidence interval (left) and P-value (right) for various pairs of COMs for the CALYPSO source sample. The P-value is the probability of observing a correlation plot under the null hypothesis of no correlation. These parameters were derived from the plots shown in Figs. H.1-H.6. The colors indicate the variables used to evaluate the degree of correlation between molecules: column density (black), column density multiplied by the solid angle Ω of the COM emission and divided by either the 1.3 mm continuum peak flux density or the 1.3 mm continuum flux density integrated over the size of the COM emission, S c1 (red), column density multiplied by the solid angle of the COM emission and divided by the 3 mm continuum peak flux density (blue), column density multiplied by Ω/S c1 and divided by the envelope mass (orange), column density multiplied by Ω/S c1 and divided by the internal luminosity (green), column density multiplied by Ω/S c1 and divided by the ratio of internal luminosity to envelope mass (magenta).
The method that we adopted to estimate the dust temperature at r COM is described in Appendix K. We estimate the uncertainty on this dust temperature to be at least a factor of 1.3. Figure 12b shows no obvious correlation between the rotational temperatures and the expected dust temperatures at the scale of the COM emission. Given the large dispersion (and uncertainties) of rotational temperatures and the large uncertainty on the dust temperatures, it is possible that the rotational temperatures do trace the dust temperatures at the radius of the COM emission, which would be consistent with the fact that the presence of COMs in these sources is directly related to the heating by the nascent protostar. However, the large uncertainties that affect Fig. 12b may mask deviations that would point to other processes (e.g., accretion shocks, shocks in outflows) as the source of the COM emission in these sources. An accurate modeling of the dust radiative transfer of the CALYPSO sources at subarcsecond scale would be necessary to make progress in this area.

Comparison of COM, disk, and thermal heating sizes
For the 21 sources that have an internal luminosity listed in Table 1, Fig. 13 compares the radius of the COM emission (green) when COMs are detected to the disk radius (blue) when a candidate disk-like structure is detected in continuum emission by Maury et al. (2019) in the CALYPSO survey, and the range of radii over which the thermal heating by the protostar is expected to produce temperatures of 100-150 K (red). This range of radii was computed from the luminosities listed in Table 1 using Eq. (K.1) and the correction factor defined in Appendix K. We want to investigate whether the presence of COMs is due to accretion shocks at the edge of the disk, or simply reveals a classical hot corino picture where the COMs become detectable in the gas phase once they have thermally desorbed from the dust grain mantles under the influence of the protostellar luminosity.
There are several systematic uncertainties that are difficult to evaluate properly in Fig. 13: namely (a) the relationship between the size of the disk-like structure derived by Maury et al. (2019) and the location of the centrifugal barrier where accretion shocks may occur or the radial extent of a disk atmosphere such as the one detected in HH212 (Lee et al. 2019); (b) the uncertainty on the correction factor used to compute the dust temperature (see Appendix K); (c) the uncertainties in the derivation of the protostellar luminosities especially in the case of binaries unresolved by Herschel; (d) the uncertainty on the radius of the COM emission, which is most of the time barely resolved in the CALYPSO survey; and (e) the complexity of the small-scale structure, for instance the binarity of SVS13A and L1448-2A.

Fig. 9. Pearson correlation coefficients with 95% confidence interval (left) and P-value (right) between nine COMs and several source properties for the CALYPSO source sample. These parameters were derived from the plots shown in Figs. I.1-I.3 and 8. The colors indicate the variables used to evaluate the degree of correlation between the molecules and the source properties: column density (black), column density multiplied by the solid angle Ω of the COM emission and divided by either the 1.3 mm continuum peak flux density or the 1.3 mm continuum flux density integrated over the size of the COM emission, S c1 (red), column density multiplied by the solid angle Ω of the COM emission and divided by the 3 mm continuum peak flux density, S c3 (blue), column density normalized to that of CH 3 OH (green). The source properties investigated here are the internal luminosity (bottom), the envelope mass (middle), and the ratio envelope mass over internal luminosity (top).
Still, this figure is interesting in a statistical sense; it reveals a diversity of configurations in the CALYPSO sample:
- two sources have a COM radius similar to the radius of the disk-like structure and larger than the 100-150 K region (IRAS4A2, IRAS4B);
- two sources have a COM radius falling in the 100-150 K region with a smaller (IRAS2A1) or larger disk-like structure (SVS13A);
- five sources have COM radii larger than both the 100-150 K radius range and the radius of the disk-like structure or its upper limit (L1448-2A, L1448-C, SerpM-S68N, SerpS-MM18a, L1157). In the case of L1157, the COM radius is in fact only slightly larger than the upper limit on the radius of the disk-like structure and the radius at 100 K;
- four sources have a detected and resolved disk-like structure but no COM emission detected with CALYPSO (L1527, SerpM-SMM4, SerpS-MM22, GF9-2);
- three sources have a detected but unresolved disk-like structure and no COM emission (IRAM04191, L1521F, SVS13B);
- for the remaining five sources, we have no information about their disk, and they have either no COM emission (L1448-NB2, SerpM-S68Nb) or they have only tentative COM detections (L1448-NA, L1448-CS) or COM emission but no good estimate of its size (SerpS-MM18b).

We conclude from this analysis that the detection of a disk-like structure does not imply the detection of COM emission, and vice versa, and that the size of the COM emission, when detected, is not systematically related to the size of the disk-like structure or to the extent of the hot (100-150 K) inner envelope. Nevertheless, Fig. 13 shows that all four sources belonging to group 3 have a COM radius larger than both the radius of the disk-like structure and the extent of the hot inner envelope. No systematic pattern is visible in this figure for group 2, though.

Table 6. Correlations between column densities of ten COMs and properties of the CALYPSO sources.
Columns: Prop. (a); Molecule; N pts.
Notes. (a) Source property: internal luminosity (L int ), envelope mass (M env ), and ratio envelope mass over internal luminosity. (b) Number of CALYPSO sources detected or tentatively detected. Pearson correlations are evaluated for the following quantities, only when at least four sources are detected (N pts > 3): (c) column density; (d) column density times solid angle of COM emission divided by continuum flux density at 1.3 mm; (e) column density times solid angle of COM emission divided by continuum peak flux density at 3 mm; (f) column density normalized to that of CH 3 OH; for each type of correlation, ρ is the Pearson correlation coefficient with its 95% confidence interval, and P is the P-value. X(Y) means X × 10^Y. Pearson coefficients with a confidence interval outside [−0.3, 0.3] are highlighted in bold face.
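The comparison of radii made in this section can be mimicked with a small sketch that sorts a source into one of the configurations described for sources with a measured COM radius. This is a simplified illustration and not the procedure used to build Fig. 13: the function name, the tolerance `tol` that defines when two radii count as "similar", and all input values are hypothetical.

```python
def classify_com_extent(r_com, r_disk, r_150k, r_100k, tol=0.3):
    """Compare the COM radius with the disk radius and the hot region.

    All radii are in au. r_150k and r_100k bound the range of radii
    where the dust temperature is expected to be 100-150 K.
    r_disk is the disk radius or its upper limit.
    tol is an assumed fractional tolerance for "similar" radii.
    """
    if r_com is None:
        return "no COM size"
    if r_150k <= r_com <= r_100k:
        return "within 100-150 K region"
    if r_com > r_100k:
        if abs(r_com - r_disk) <= tol * r_disk:
            return "similar to disk, beyond hot region"
        if r_com > r_disk:
            return "larger than disk and hot region"
    return "other"


# Hypothetical radii in au (illustration only).
print(classify_com_extent(40.0, 20.0, 30.0, 50.0))
print(classify_com_extent(100.0, 90.0, 30.0, 50.0))
print(classify_com_extent(100.0, 40.0, 30.0, 50.0))
```

Given the systematic uncertainties listed above for both the disk sizes and the dust temperatures, any such classification depends strongly on the adopted tolerance and should be read in a statistical sense only.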

Spatial origin of COMs in protostars
This study, which relies on the CALYPSO survey, presents the largest sample of solar-type Class 0 protostars investigated for COM emission on the subarcsecond scale. Compact emission of at least one COM (methanol) was detected for 12 sources out of 26. Even though the angular resolution of the survey is barely sufficient to resolve the COM emission in the sources where COMs are detected, we attempt to draw conclusions about the spatial origin of COMs around Class 0 (and I) protostars. Table 8 summarizes the COM results that we obtained in the previous sections.
An enhanced desorption of COMs in accretion shocks at the centrifugal barrier has been proposed by Oya et al. (2016) and Csengeri et al. (2018) to explain the results of observations performed with ALMA toward IRAS 16293A and the high-mass protostar G328.2551-0.5321, respectively. Theoretical calculations show that this is indeed a possible mechanism if the grain population carrying the adsorbed molecules is dominated by small grains of ∼0.01 µm in size (Miura et al. 2017). If we assume that each protostellar disk has an accretion shock at the radius of its centrifugal barrier, and that this shock should lead to the desorption of the COMs formed in the ice mantles of dust grains, then we would expect a strong correlation between the detection of a disk and the detection of COM emission, as well as a good match between the disk radius and the radius of the COM emission. The analysis presented in Sect. 4.5 shows that this is not the case: although this picture appears to work well for IRAS4A2 and IRAS4B, the detection of COM emission is not systematically correlated with the detection of a disk-like structure in the full CALYPSO sample.
One possibility for this lack of correlation is that the shock parameters (pre-shock density and velocity) or the size distribution of dust grains are not the same in all sources. Miura et al. (2017) showed that the desorption efficiency significantly depends on these parameters. Another possibility could be that sources with a disk radius (or upper limit thereof) smaller than the radius at which COMs would thermally desorb in the envelope under the influence of the protostar luminosity (at temperatures higher than 100-150 K) could have COM emission dominated by this thermal process. This hot corino scenario appears to work for IRAS2A1 (and marginally for L1448-C and L1157), as well as for L483 for which Jacobsen et al. (2019) found a COM emission radius of ∼40 au with ALMA, consistent with their estimate of the sublimation radius in the envelope (∼50 au), while they did not detect a disk and obtained an upper limit to its radius of 15 au. However, this hot corino scenario does not explain the presence of COMs in L1448-2A, SerpM-S68N, and SerpS-MM18a which all have a COM emission radius ∼1.5-4 times larger than the expected radius of the hot (T >100 K) inner envelope. SVS13A has a COM radius consistent with the radius of the hot inner envelope, but its disk-like structure is larger and therefore the accretion shock scenario does not work for this source.
As mentioned in Sect. 1, the presence of COMs in the gas phase of a protostar could also be related to its jet or outflow. We found in Sect. 3.5 that the COM emission tends to be elongated along a direction close to the outflow axis in three sources of the CALYPSO sample: IRAS2A1, IRAS4B, and SerpS-MM18a. This will need to be confirmed with observations at higher angular resolution, for instance with ALMA.

Fig. 11. Pearson correlation coefficients with 95% confidence interval (left) and P-value (right) between nine COMs and several disk properties for the CALYPSO source sample. These parameters were derived from the plots shown in Figs. J.1-J.3 and 10. The colors indicate the variables used to evaluate the degree of correlation between the molecules and the disk properties, with the same meaning as in Fig. 9. The disk properties investigated here are the disk radius (bottom), the disk flux density (middle), and the flux density ratio of the disk to the envelope (top).
If we merge the previous results, we find possible COM origin scenarios for seven out of the nine sources with a measured COM emission size: (1) hot corino origin for IRAS2A1, SVS13A, and, marginally, L1448-C and L1157; (2) accretion shock or disk atmosphere origin for IRAS4A2 and IRAS4B; (3) outflow origin for IRAS2A1, IRAS4B, and SerpS-MM18a. L1157 could also fit into the accretion shock or disk atmosphere scenario if its actual disk size is close to the CALYPSO upper limit. The two remaining sources, L1448-2A and SerpM-S68N, do not fit into any of the three scenarios.
The lack of COM emission in sources with detected and resolved disk-like structures (L1527, SerpM-SMM4, SerpS-MM22, GF9-2) questions the accretion shock scenario as a general mechanism for the release of COMs in the gas phase around protostars. Specific shock parameters may be required for this process to be efficient (see, e.g., Miura et al. 2017). However, methanol emission was detected with a size (FWHM) of about 1″ by Sakai et al. (2014a) toward L1527 with ALMA, which roughly corresponds to the radius of the centrifugal barrier (∼100 au) determined by Sakai et al. (2014b) in this source. Sakai et al. (2014a) derived a methanol column density more than one order of magnitude lower than the upper limit we obtained from the CALYPSO data, even after accounting for the different temperatures assumed in both studies. Therefore, we cannot exclude that the CALYPSO nondetection of COM emission in SerpM-SMM4, SerpS-MM22, and GF9-2 is simply due to a lack of sensitivity, as in L1527.
Given that L1527 has a luminosity of only 0.9 L⊙, the ALMA detection of methanol toward this source also means that the upper limit in internal luminosity of 2 L⊙ below which COMs are not detected with CALYPSO (see Sect. 4.2) may simply be due to a lack of sensitivity and not to an intrinsic property of sources fainter than 2 L⊙. Several COMs were also detected with ALMA by Imai et al. (2016, 2019) toward the isolated Class 0 protostar B335 which is even fainter than L1527 (0.7 L⊙, Evans et al. 2015). The radius of the methanol emission, ∼15 au (Imai et al. 2019; Bjerkeli et al. 2019), is consistent with the hot corino scenario for this source given its luminosity. In addition, the radius of the centrifugal barrier inferred by Imai et al. (2019) for B335 is more than three times smaller than the radius of the methanol emission, suggesting that the accretion shock or disk atmosphere scenario does not work for this source. The lack of COM emission in CALYPSO sources with a detected but unresolved disk-like structure (SVS13B, IRAM04191, L1521F) may be due to a lack of sensitivity if the COM emission is unresolved as well. Similarly, the hot (T > 100 K) region in IRAM04191 and L1521F is expected to be tiny (R < 4 au, Fig. 13) so the CALYPSO survey certainly lacks sensitivity to detect COM emission from a hot corino in both sources. However, the hot region in SVS13B is expected to be as extended as in IRAS4A2 and so the lack of COM emission in SVS13B suggests that an accretion shock origin for the COM emission in IRAS4A2 may indeed be a more likely scenario than a pure hot corino origin.

Table 7. Correlations between column densities of ten COMs and disk properties of the CALYPSO sources.
Columns: Prop. (a); Molecule; N pts.
Notes. Same as Table 6 for the following disk properties: radius (R disk ), flux density (F disk ), and flux density ratio of disk and envelope (F disk /F env ).

Complex organic molecule composition as an evolutionary tracer?
The CALYPSO sample is not homogeneous in terms of COM detection or, when COMs are detected, in terms of chemical composition. Slightly less than half of the sources have COM detections. Our analysis of the COM composition of the nine CALYPSO sources with a sufficient number of detections, plus L483 and IRAS16293B, revealed the existence of three groups (Sect. 3.7). Group 1 has low abundances of COMs (especially the oxygen-bearing ones) relative to methanol compared to groups 2 and 3, and group 3 has enhanced and reduced abundances of cyanides and CHO-bearing molecules relative to methanol, respectively, by a factor of approximately two with respect to group 2. The first question to address is whether these chemical groups reveal an evolutionary sequence. The sources in groups 1, 2, and 3 have a ratio of envelope mass to internal luminosity, M_env/L_int, ranging from 0.14 to 0.26, 0.018 to 1.43, and 0.17 to 0.75 M⊙/L⊙, respectively. For L483 and IRAS 16293 (taken as a system) in group 1, we assumed a luminosity of 13 L⊙ and an envelope mass of 1.8 M⊙ (Shirley et al. 2000), and 21 L⊙ and 5.4 M⊙ (Jørgensen et al. 2005; Schöier et al. 2002), respectively. If M_env/L_int is an evolutionary tracer, then sources in groups 1 and 3 are in similar evolutionary stages, yet with different chemical composition, while group 2 contains sources that span a broad range of evolutionary stages, from Class 0 (IRAS4B, SerpM-S68N) to Class I (SVS13A and, maybe, IRAS4A2), whilst sharing a similar chemical composition. This absence of correlation between the chemical groups and M_env/L_int is consistent with the lack of correlation between this ratio and the COM column densities (Sect. 4.2). Both results tend to imply that the COM chemical composition of protostars is not an evolutionary tracer, if M_env/L_int is such a tracer.
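As a quick arithmetic check of the group 1 range quoted above, the ratios of envelope mass to luminosity of L483 and IRAS 16293 follow directly from the adopted literature values. The sketch below is illustrative only; the function and variable names are hypothetical and are not part of the CALYPSO analysis.

```python
# Illustrative check of M_env/L_int for the two literature sources in group 1.
# The masses and luminosities are the values adopted in the text;
# all names below are hypothetical.

def mass_to_luminosity(m_env_msun: float, l_int_lsun: float) -> float:
    """Return M_env/L_int in units of Msun/Lsun."""
    if l_int_lsun <= 0:
        raise ValueError("luminosity must be positive")
    return m_env_msun / l_int_lsun

# source name -> (envelope mass in Msun, luminosity in Lsun)
adopted = {
    "L483": (1.8, 13.0),
    "IRAS 16293": (5.4, 21.0),
}

ratios = {name: mass_to_luminosity(m, l) for name, (m, l) in adopted.items()}

for name, ratio in ratios.items():
    print(f"{name}: M_env/L_int = {ratio:.2f} Msun/Lsun")
```

Rounded to two decimals, the two ratios are 0.14 and 0.26 M⊙/L⊙, which are the bounds of the range quoted for group 1.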
Episodic accretion, which is known to occur during protostellar evolution (Safron et al. 2015) and is thereby thought to solve the "luminosity problem" (Kenyon et al. 1990; Dunham et al. 2010), may however change the picture as it suggests that M_env/L_int can only be used as a robust evolutionary indicator in a statistical sense and not necessarily in all individual cases. In turn, the presence and extent of COM emission and the COM composition of the sources might be in some way related to (present or past) bursts of accretion (Taquet et al. 2016). Evidence for past accretion bursts has been claimed in a few low-mass protostars and very-low-luminosity objects on the basis of simple molecules like CO and N2H+ (Jørgensen et al. 2015; Hsieh et al. 2018, 2019). However, none of the six CALYPSO sources (L1448-C, IRAS2A, SVS13A, IRAS4A, IRAS4B, and L1527) included in the sample studied by Jørgensen et al. (2015) shows a clear sign of a past accretion burst according to their analysis. No evidence for a past accretion burst was found for IRAS4A, IRAS4B, L1448-C, or L1157 by Anderl et al. (2016) either, on the basis of the C18O and N2H+ data of the CALYPSO survey itself. Therefore, it is unlikely that the sample of CALYPSO sources with COM emission is as a whole strongly affected by episodic accretion. We note, however, that signs of episodic accretion have been inferred for SerpS-MM18 by Plunkett et al. (2015) from their detection with ALMA of 22 knots in the outflow, which they interpret as being the result of episodic ejection events. Given their high luminosities, it could also be that IRAS2A1 and SVS13A are currently experiencing an accretion burst (Hsieh et al. 2019). However, they are not classified in the same COM chemical group (group 1 versus group 2, respectively), and SerpS-MM18 belongs to the third group, and so episodic accretion may not be the determining factor in the COM chemical composition.
Interestingly, L1448-2A, which does not fit in any of the three COM scenarios described in Sect. 5.1, may have experienced an accretion burst less than 10^3 yr ago (Hsieh et al. 2019). This could explain why its COM radius is between four and ten times larger than the size of the current 100-150 K region of its envelope. Hsieh et al. (2019) deduced an accretion luminosity of 8-36 L⊙ during the past burst from their snow line analysis. Such a burst would have pushed the 100-150 K region away by a factor of 2.8-6, in rough agreement with the size of the current COM emission. It would be interesting to investigate if a past accretion burst could also explain the size of the COM emission in SerpM-S68N, the other source that we could not assign to any of the three COM scenarios in Sect. 5.1.
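The factor by which a burst displaces the 100-150 K region can be estimated with a simple scaling. The sketch below assumes that the radius at which optically thin dust reaches a given temperature scales as the square root of the luminosity. This scaling and the reference luminosity of 1 L⊙ are assumptions made here for illustration; they are not values stated in the text.

```python
import math

def radius_growth_factor(l_burst_lsun: float, l_ref_lsun: float) -> float:
    """Factor by which a fixed-temperature radius grows during a burst.

    Assumes R(T) scales as sqrt(L), as expected for optically thin dust
    heated by a central source (an assumption of this sketch).
    """
    if l_burst_lsun <= 0 or l_ref_lsun <= 0:
        raise ValueError("luminosities must be positive")
    return math.sqrt(l_burst_lsun / l_ref_lsun)

# Reference (quiescent) luminosity in Lsun: an assumed value, not from the text.
L_REF_LSUN = 1.0

# Burst luminosities (Lsun) bracketing the range deduced by Hsieh et al. (2019).
for l_burst in (8.0, 36.0):
    factor = radius_growth_factor(l_burst, L_REF_LSUN)
    print(f"L_burst = {l_burst:4.0f} Lsun -> radius larger by a factor {factor:.1f}")
```

With a reference luminosity of 1 L⊙, burst luminosities of 8 and 36 L⊙ give factors of about 2.8 and 6, which is the range quoted above.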
The chemical groups do not seem to correlate with the COM origin scenarios tentatively assigned to the CALYPSO sources in Sect. 5.1. Sources with a possible hot corino origin or a possible outflow origin for their COM emission are found in all three groups. However, those with a possible accretion shock origin of their COM emission are found in group 2 only (and possibly one in group 3). The enhanced abundance of cyanides relative to methanol in group 2 appears to be inconsistent with the chemical differentiation found by Csengeri et al. (2018, 2019) in G328.2551-0.5321 where, according to their interpretation, the accretion shocks at the centrifugal barrier are traced by oxygen-bearing molecules such as methanol while the cyanides trace a more compact region closer to the protostar. It is therefore unclear whether the accretion shock scenario is the correct interpretation for these group 2 sources. Lee et al. (2019) reported the column densities of several COMs associated with the disk atmosphere of the Class 0 protostar HH212. These column densities turn into abundances relative to methanol of 0.021 for C2H5OH, 0.025 for CH3OCHO, 0.012 for CH3CHO, and 0.0012 for NH2CHO, with uncertainties of a factor of a few. Among the three chemical groups identified in the CALYPSO survey, HH212 seems to fit best into group 1. Given that a two-sided disk atmosphere corresponds to an elongation along the outflow axis, it may not be surprising that we found a possible outflow origin for IRAS2A1 in group 1. Higher angular resolution observations would be needed to verify if the COM emission in IRAS2A1 traces a disk atmosphere like in HH212.

Relation between chemical composition of protostars and cloud environment
The three chemical groups identified in this study are not related on a one-to-one basis to the molecular clouds that the sources belong to. Group 3 contains sources in L1448 (L1448-2A, L1448-C), Serpens South (SerpS-MM18a), and Cepheus (L1157), group 2 contains sources in NGC1333 (SVS13A, IRAS4A2, IRAS4B) and Serpens Main (SerpM-S68N), while another source in NGC 1333 (IRAS2A1) belongs to group 1 that also contains sources in Ophiuchus (IRAS 16293B) and the Aquila rift (L483). However, one cloud of our survey stands out: no COM was detected with CALYPSO toward the three protostars located in Taurus (IRAM04191, L1521F, L1527). IRAM04191 and L1521F are the sources with the lowest luminosity in the CALYPSO sample, and the size of their hot (100-150 K), inner envelope is expected to be more than one order of magnitude smaller than the CALYPSO beam (see Fig. 13). It is therefore perhaps not surprising that we did not detect any COM toward these sources. Methanol has been detected toward L1527 with ALMA (Sakai et al. 2014a) and CALYPSO was not sensitive enough to see it. However, the upper limit that we obtained for its column density normalized by the dust continuum emission is about one order of magnitude lower than the normalized column density of methanol in the sources where CALYPSO detected it (see Fig. G.1). This suggests that the methanol abundance in L1527 is significantly lower than in these sources. Yet sources such as IRAS4A1, IRAS4B2, and SerpM-SMM4a have similarly low or even lower upper limits for methanol in Fig. G.1, and so the low abundances of COMs in the Taurus protostars observed with CALYPSO do not necessarily result from the Taurus environment itself.

Fig. 13. Comparison of the COM emission radius (green cross) to the disk radius (blue circle, with error bar when available) and the range of radii over which ices are expected to sublimate from the grain mantles (red bar terminated by stars). The COM emission radius corresponds to half the FWHM size reported in Table 4, when COMs are detected. The disk radius corresponds to half the FWHM size derived by Maury et al. (2019) when they detect a candidate disk-like structure, corrected for the new distance when needed. The radii of the candidate disk-like structures for SVS13A (circumbinary in this case) and IRAS4A2 come from a similar analysis (A. Maury, priv. comm.). A blue dot with an arrow pointing to the left indicates the upper limit on the radius of a detected but unresolved disk. A blue bar with an arrow pointing to the left represents a disk nondetection with an upper limit on the radius that could still be estimated by Maury et al. (2019). All disk parameters were rescaled to the new distances when necessary. For each source, the two red stars mark the temperature range 100-150 K (from right to left) expected from the luminosity given in Table 1. The names of the sources belonging to groups 1, 2, and 3 are color-coded with the color scheme indicated in the bottom right corner, as in the case of the frames in Fig. 6 and as described in Sect. 3.7.
We conclude from all this that the differences in COM chemical composition in the CALYPSO sample do not reflect the global environment in which these sources are embedded. Source-to-source variations in chemical composition within a given cloud rather suggest an evolutionary effect or the influence of local conditions (episodic accretion?).

Correlation between column densities does not imply chemical link
In Sect. 4.1 we reported strong correlations for three pairs of molecules: CH3CN/CH3OH, NH2CHO/CH3OH, and CH3OCH3/C2H5OH. It has been argued in the past that correlations between the column densities or abundances of molecules reveal chemical links between them. For instance, Jaber et al. (2014) found a strong correlation over five orders of magnitude between methyl formate, CH3OCHO, and dimethyl ether, CH3OCH3. These latter authors suggested that this correlation results from the two species having the same precursor, as previously advocated by Brouillet et al. (2013) on the basis of similar arguments, or from one being the precursor of the other. Jaber et al. (2014) drew the same conclusion for methyl formate and formamide, which showed a similar correlation. The results we obtain from the CALYPSO survey shed new light on this matter. To our knowledge, there is no obvious chemical link between methyl cyanide and methanol: the formation of methyl cyanide is dominated by the reaction of CH3+ with HCN in the gas phase in R. Garrod's chemical models for instance (e.g., Belloche et al. 2016), while methanol is known to form efficiently only on the grain surfaces by hydrogenation of CO. Nevertheless, we find a strong correlation between methanol and methyl cyanide. We conclude from this that the existence of a strong correlation between two molecules in a sample of sources does not imply that these molecules are related chemically. A similar conclusion was drawn by Quénard et al. (2018) for the correlation between HNCO and NH2CHO reported in earlier studies. These latter authors deduced from their chemical modelling that this correlation comes from the same response of the two molecules to the temperature rather than from a direct chemical link.

Mottram et al. (2014, 2017) analyzed water emission observed with Herschel in two samples of Class 0 and I protostars (the WISH and WILL samples). These samples have the following sources in common with CALYPSO: L1448-C (L1448-MM), IRAS2A, SVS13 (IRAS 3A), IRAS4A, IRAS4B, L1527, Serp-SMM4 (Ser-SMM4), L1157, L1448-2 (PER01), L1448-N (PER02), and SerpS-MM18 (SER02).
L1448-C, IRAS2A1, IRAS4A2, IRAS4B, and SerpS-MM18a, which all have many COMs detected with CALYPSO, have relatively strong detections of water components corresponding to cavity shocks or spot shocks, whereas L1527, with no COM detected with CALYPSO, has much weaker water emission. This could suggest a link between the detection of COMs and the interaction of the jet or outflow with the envelope. However, L1448-N and SerpM-SMM4, which have no firm COM detection or only a methanol detection, have strong water detections like the former sources, and L1157, which has three COMs detected with CALYPSO, has much weaker water emission. Furthermore, the Herschel surveys did not have the angular resolution needed to separate the individual components of multiple systems, which show very different COM properties in the CALYPSO sample. Overall, it is therefore unclear whether the COM emission in Class 0 protostars is related to their jet or outflow activity. SVS13A, a Class I protostar with many COMs detected with CALYPSO, was observed in only one low-energy water line, and so we cannot compare it to the other sources in a meaningful way.

Taquet et al. (2015) reported results of their observations of IRAS2A and IRAS4A with PdBI at 2 mm with an angular resolution of ∼2″. The column densities that these latter authors derived for C2H5OH, CH3OCH3, CH3OCHO, NH2CHO, CH3CN, C2H5CN, and CH2(OH)CHO all agree within a factor of two with the ones we obtained with CALYPSO, after rescaling the column densities of IRAS2A to account for the different source size that they assumed (0.2″ vs. 0.35″ here). However, their methanol column densities are a factor of 3 and 9 higher than the ones we derived for these sources. This may be due to the larger beam of their observations which may be more contaminated by the outflow. 
This would then imply that the abundances relative to methanol of the COMs listed above are much lower in the outflow than in the inner region traced with CALYPSO. While the agreement between the CALYPSO column densities and the ones derived by Taquet et al. (2015) is good, we find discrepancies larger than a factor of two for most COMs detected by López-Sepulcre et al. (2017) toward IRAS4A2 with ALMA and PdBI at ∼0.5″ and ∼1.2″ angular resolution, respectively, after rescaling their column densities from a source size of 0.3″ to 0.35″. The largest discrepancy, by a factor of seven, occurs for NH2CHO. The reason for these discrepancies is unclear. Like López-Sepulcre et al. (2017), we do not detect any COM toward IRAS4A1.

Graninger et al. (2016) and Bergner et al. (2017) reported the results of a survey of 16 embedded protostars with the IRAM 30 m telescope. Most of their targets are Class I objects and the COM emission they detected is characterized by low rotational temperatures, suggesting that their survey probes the cold and low-density part of the protostellar envelopes rather than the hot, high-density, inner regions traced with CALYPSO. Their sources were all detected in methanol, while CH3CHO and CH3CN were detected toward six and seven sources, respectively. Only two sources were detected in CH3OCH3 and CH3OCHO. All sources but two were detected in HNCO. The HNCO abundances relative to methanol range from ∼0.06 to ∼0.5, with most values around 0.2. This is more than one order of magnitude higher than what we obtained for the CALYPSO sample, which is dominated by Class 0 protostars. This difference is unlikely to result from an evolutionary effect, because the Class I protostar SVS13A in the CALYPSO sample has an HNCO abundance relative to methanol that is also one order of magnitude lower than in the IRAM 30 m sample. 
It probably rather reflects a difference in chemical composition between the (cold) envelope traced with the IRAM 30 m telescope and the small-scale, hot emission traced with PdBI.
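The source-size rescaling applied above to the column densities of Taquet et al. (2015) and López-Sepulcre et al. (2017) can be illustrated as follows. The sketch assumes optically thin emission from a Gaussian source observed with a Gaussian beam, so that the derived column density is inversely proportional to the beam filling factor. The exact procedure used for the rescaling is not described here, and the function names and the 2″ beam used in the example are illustrative assumptions.

```python
def filling_factor(source_fwhm: float, beam_fwhm: float) -> float:
    """Beam filling factor of a Gaussian source in a Gaussian beam (same units)."""
    if source_fwhm <= 0 or beam_fwhm <= 0:
        raise ValueError("sizes must be positive")
    return source_fwhm**2 / (source_fwhm**2 + beam_fwhm**2)

def rescale_column_density(n_col: float, old_size: float, new_size: float,
                           beam_fwhm: float) -> float:
    """Rescale a column density derived for old_size to a source of new_size.

    For a fixed observed intensity and optically thin emission (assumed),
    the source-averaged column density scales as 1 / filling_factor.
    """
    return (n_col * filling_factor(old_size, beam_fwhm)
            / filling_factor(new_size, beam_fwhm))

# Example: a column density derived for a 0.2" source, rescaled to 0.35",
# with an assumed 2" beam. The input is normalized to 1.
scale = rescale_column_density(1.0, old_size=0.2, new_size=0.35, beam_fwhm=2.0)
print(f"scaling factor: {scale:.2f}")
```

When the source is much smaller than the beam, the factor tends to the ratio of source solid angles, (0.2/0.35)^2 ≈ 0.33, so the rescaled column densities are about three times lower than those derived with the smaller source size.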

Comparison to other COM surveys
The two sources of Bergner et al. (2017) with detections of CH3OCH3 and CH3OCHO have abundances relative to methanol of ∼0.09 and ∼0.06 for these two molecules, respectively, which is similar to what we obtain for group 3 for both molecules and group 2 for CH3OCH3, but is between three and five times higher than for group 1 (Table 5). In addition, Bergner et al. (2017) report a median value of CH3CHO abundance relative to methanol similar to the values of groups 2 and 3 but a factor of approximately five higher than group 1. Group 1 may therefore represent sources that have undergone a different type of chemical processing of O-bearing COMs on a small scale compared to the dominant processing in the cold envelope.
The situation is less clear for the N-bearing COMs: the CH3CN abundances relative to methanol derived by Bergner et al. (2017) range from 0.005 to 0.05, which encompasses the values we derive for the three groups of the CALYPSO sample. Bergner et al. (2017) found that methanol is well correlated with CH3CHO and CH3CN in their source sample. We find a good correlation between methanol and CH3CN for the CALYPSO sources as well, but the correlation is much weaker between methanol and CH3CHO. Bergner et al. (2017) also claim to find a correlation between the envelope mass and the column densities of all molecules but this does not seem to really be the case for the COMs shown in their Fig. 8 (CH3CN, CH3OH, and CH3CHO). Their claim of a positive correlation with bolometric luminosity is not convincing either (see their Fig. 9). Furthermore, their CH3CHO abundances relative to methanol do not correlate with the luminosity, while we find an anti-correlation between the internal luminosities and the abundances of CH3CHO and CH2(OH)CHO relative to methanol (Sect. 4.2).
On the basis of a compilation of literature results on low-, intermediate-, and high-mass hot cores combined with their measurements on IRAS4A and IRAS2A, Taquet et al. (2015) found a correlation between the luminosity and the abundances of C2H5CN and CH2(OH)CHO relative to methanol (see their Fig. 8). However, the sample of sources with CH2(OH)CHO is small in both their study and ours (four sources), and their correlation does not hold for the (three) sources with luminosities below 100 L⊙. Ospina-Zamudio et al. (2018) also found a correlation between the abundance of C2H5CN relative to methyl formate and the luminosity for a compilation of eight low-, intermediate-, and high-mass hot cores. However, they did not find any systematic variations of the abundances of O-bearing COMs with luminosity between the low- and intermediate-mass sources which cover nearly two orders of magnitude in luminosity. Nevertheless, their sample does not cover luminosities below 9 L⊙ while our sample of sources with COM detection extends down to 2 L⊙. The luminosity may therefore have a greater impact on the O-bearing COM chemical composition over the range 1-10 L⊙ than over the range 10-500 L⊙.
The ASAI survey performed with the IRAM 30 m telescope has revealed the chemical composition at large scale of a sample of low-mass star forming regions, from prestellar cores to protostars. Lefloch et al. (2018) used the ratio of the number of detected O-bearing species to hydrocarbons to classify the sources into WCCC sources or hot corinos. Four sources are in common with CALYPSO: L1157 and L1527 were classified as WCCC sources, and IRAS4A and SVS13A as hot corinos on the basis of the ASAI results. The former two sources have a low channel count peak in the CALYPSO survey, while the latter two have a high channel count peak. The chemical richness of protostars probed on a small scale with PdBI therefore seems to be related to the chemical composition probed at large scale with the single-dish telescope. However, L1157 has several COMs detected on a small scale with CALYPSO, so its WCCC nature on a large scale does not prevent the existence of a hot corino on a small scale. This is similar to the candidate WCCC source L483 which also harbors a hot corino (Oya et al. 2017). No COMs are detected on a small scale toward L1527 with CALYPSO, but methanol was detected with ALMA (Sakai et al. 2014a). Given that we classified L1157 in the same COM chemical group as SerpS-MM18a, L1448-C, and L1448-2A (group 3), it would be interesting to check whether or not the latter three sources also present a WCCC nature on a large scale.

Comparison to COM chemical models
Bergner et al. (2017) presented results of chemical simulations computed with the three-phase chemical kinetics code MAGICKAL (Garrod 2013). These latter authors used a grid of simulations to map the computed abundances to the physical structure of a low-mass protostellar envelope. They considered two cases, a 1 L⊙ and a 10 L⊙ protostar, and compared the results of these simulations to their single-dish observations of a sample of Class 0 and I protostars, which were also sensitive to the large-scale COM emission of the envelopes. The CALYPSO interferometric data that we have used here probe only the inner parts of the envelopes because of the spatial filtering by the interferometer. Therefore, in order to compare the chemical composition derived from the CALYPSO survey to the one predicted by the models, we considered only the inner regions of the models of Bergner et al. (2017). In practice, we integrated the modeled column densities of the molecules over the inner region where the abundance of methanol is the highest and forms a plateau (see Fig. 12 of Bergner et al. 2017). The radius of this region is about 6 and 20 au for the 1 L⊙ and 10 L⊙ simulations, respectively. The resulting average abundances of several COMs relative to methanol are listed at the bottom of Table 5. Table 5 shows that the modeled COM abundances relative to methanol are relatively insensitive to the adopted luminosity of the protostar. Both models underestimate the abundances derived from the CALYPSO survey by at least a factor of three and up to two orders of magnitude for the COMs, and even three orders of magnitude for HNCO. Nevertheless, we notice that, among the three chemical groups of the CALYPSO sample, group 1 comes the closest to the model predictions, in particular with respect to CH3OCH3 and CH3OCHO, which agree with the model within a factor of three and six, respectively. This can be considered satisfactory given the uncertainties on the reaction rates in the chemical network used for the simulations. We could therefore speculate that group 1 is the most consistent with a hot-corino origin of the COM emission, but this is not in agreement with our conclusion stated in Sect. 5.2 that sources matching a possible hot-corino scenario are found in all three groups. In addition, the discrepancy for the other molecules is about two orders of magnitude for group 1.
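The averaging of the model abundances over the inner methanol plateau described above can be sketched as follows. This is a minimal illustration, assuming a radial integration with the trapezoidal rule; how the integration was actually performed is not described here, and the density and abundance profiles below are hypothetical placeholders, not the models of Bergner et al. (2017).

```python
def radial_column(radii, density, abundance, r_max):
    """Trapezoidal integral of density * abundance from radii[0] out to r_max.

    The result is in units of density times radius; only ratios of such
    columns are used below, so no unit conversion is needed.
    """
    total = 0.0
    for i in range(len(radii) - 1):
        if radii[i + 1] > r_max:
            break  # stop at the edge of the plateau region
        f0 = density[i] * abundance[i]
        f1 = density[i + 1] * abundance[i + 1]
        total += 0.5 * (f0 + f1) * (radii[i + 1] - radii[i])
    return total

def abundance_relative_to_methanol(radii, density, x_com, x_ch3oh, r_max):
    """Ratio of the COM and methanol columns integrated out to r_max."""
    n_ch3oh = radial_column(radii, density, x_ch3oh, r_max)
    if n_ch3oh == 0.0:
        raise ValueError("methanol column is zero within r_max")
    return radial_column(radii, density, x_com, r_max) / n_ch3oh

# Hypothetical profile: radii in au, density in cm^-3, fractional abundances.
radii_au = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0]
density = [1e9 * r ** -1.5 for r in radii_au]
x_ch3oh = [1e-6] * 6 + [1e-8, 1e-9]   # plateau inside 6 au, then a drop
x_com = [2e-8] * 6 + [1e-10, 1e-11]   # placeholder COM abundance profile

ratio = abundance_relative_to_methanol(radii_au, density, x_com, x_ch3oh, r_max=6.0)
print(f"COM abundance relative to methanol inside the plateau: {ratio:.3f}")
```

Restricting the integration to the plateau radius (about 6 au or 20 au in the two models discussed above) excludes the outer envelope, which the interferometer filters out.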
The models of Bergner et al. (2017) underpredict by typically one order of magnitude the abundances of most COMs in their sample of protostars. Only CH3CHO showed a good match. These latter authors argued that the discrepancy may result from cold (large-scale) methanol being overabundant in the model, due to an overactive chemical desorption. However, we find that this discrepancy holds also on the small scales probed by CALYPSO, where thermal desorption is most likely dominant. The reason for the discrepancy between models and observations must lie elsewhere. Higher abundances of COMs such as CH3OCHO and CH3OCH3 relative to methanol were obtained in models that include proton-transfer reactions with ammonia in the gas phase and/or luminosity outbursts (Taquet et al. 2016). In the latter case, the abundances of these two molecules relative to methanol are enhanced after the burst because they recondense more slowly than methanol. However, the timescale for this recondensation is short according to the simulations of Taquet et al. (2016) (<1000 yr), and we have concluded in Sect. 5.2 that the sample of CALYPSO sources with COM detections does not seem, as a whole, to be affected by episodic accretion, and so this is unlikely to be the source of the discrepancy. We do not know the abundance of ammonia in the CALYPSO sources, and therefore it is difficult to conclude whether or not the proton-transfer mechanism proposed by Taquet et al. (2016) can explain the discrepancy between the COM abundances relative to methanol obtained with CALYPSO and the ones predicted by the models of Bergner et al. (2017).

Implications for the formation of COMs
Chuang et al. (2016) investigated the cold surface formation of glycolaldehyde, ethylene glycol, and methyl formate with laboratory experiments that involve the recombination of free radicals formed via H-atom addition and abstraction reactions, starting from ice mixtures of CO, H2CO, and CH3OH. In most of their experiments, glycolaldehyde is found to have a similar abundance as methyl formate, or to be more abundant, while only two experiments (CO+H2CO+H and H2CO+CH3OH+H) produce less glycolaldehyde than methyl formate, with an abundance ratio of about 1:3. In their more recent work where they compared the production of these three molecules under H-atom addition and/or UV irradiation, glycolaldehyde and ethylene glycol are both always more abundant than methyl formate (Chuang et al. 2017). The numerical simulations of Garrod (2013) also produce three to six times more glycolaldehyde than methyl formate in the gas phase after sublimation from the ice mantles of dust grains where they are formed.
These experimental and numerical results are much different from the abundance ratios [CH2(OH)CHO]/[CH3OCHO] that we obtained for the CALYPSO sample: the derived ratios range from 3% (SVS13A) to 11% (IRAS4A2), and the upper limits in sources where methyl formate is detected but not glycolaldehyde range from 9% (SerpS-MM18a) to 80% (L1157). Taquet et al. (2017) reported a similarly low upper limit (<6%) in the cold dark cloud Barnard 5 and concluded that the discrepancy with the experimental results of Chuang et al. (2016) suggests that surface chemistry is not the dominant mechanism for the formation of methyl formate. Comparing their experimental results to a compilation of observational results including IRAS2A and IRAS4A from Taquet et al. (2015), Chuang et al. (2017) also conclude that the overabundance of glycolaldehyde over methyl formate suggests that gas-phase chemistry plays a significant role, either through the destruction of glycolaldehyde or an enhanced production of methyl formate. A similar conclusion could be drawn now from the CALYPSO sample. On the other hand, Skouteris et al. (2018) argued that glycolaldehyde could be formed in the gas phase. It would therefore be interesting to investigate whether or not pure gas-phase chemistry can produce a [CH2(OH)CHO]/[CH3OCHO] ratio consistent with the observed ones.
Drozdovskaya et al. (2019) reported a correlation between the chemical composition of IRAS16293B and comet 67P/Churyumov-Gerasimenko, concluding that the volatile composition of cometesimals and planetesimals is partially inherited from the prestellar and protostellar phases. Our analysis of the CALYPSO sample shows that IRAS 16293B has a similar COM chemical composition to those of IRAS2A1 and L483 (chemical group 1, see Sect. 3.7), suggesting that comet 67P has a similar composition to that of group 1. 
However, groups 2 and 3 are characterized by abundances of O-bearing molecules relative to methanol that are higher than those of group 1 by a factor of approximately 6. This factor is similar to the dispersion of the correlation found by Drozdovskaya et al. (2019). Therefore, on the basis of the limited sample of COMs analyzed here, it is still premature to conclude whether one of the three chemical groups is more correlated to the chemical composition of comet 67P than the others.

Conclusions
We have taken advantage of the CALYPSO survey to explore the presence of COMs in a large sample of 22 Class 0 and four Class I protostars at high angular resolution. Methanol is detected in 12 sources and tentatively detected in one source, which represents half of the sample. Eight sources (30%) have detections of at least three COMs. We derived the column densities of the detected COMs and searched for correlations with various source properties, either collected from the literature or derived from the CALYPSO survey. The main conclusions of this analysis are the following:
1. The high angular resolution of the CALYPSO survey reveals a strong COM chemical differentiation in multiple systems: five systems have at least three COMs detected in one component while the other component is devoid of COM emission. This is markedly different from the prototypical hot-corino source IRAS 16293 where many COMs have been reported towards both components of the binary in the literature. This also raises the question of whether all protostars go through a phase showing COM emission.
2. All CALYPSO sources with an internal luminosity higher than 4 L⊙ have at least one detected COM (methanol). On the contrary, no COM emission is detected in sources with an internal luminosity lower than 2 L⊙. This seems to be due to a lack of sensitivity rather than an intrinsic property of low-luminosity sources.
3. The internal luminosity is the source parameter impacting the COM chemical composition the most. The abundances of CH3CHO and CH2(OH)CHO relative to methanol are anti-correlated with the internal luminosity. There seems to be a correlation between the internal luminosity and the column density of CH3OH normalized to the continuum emission.
4. The detection of a disk-like structure in continuum emission does not imply the detection of COM emission, and vice versa. The size of the COM emission, when detected, is not systematically related to the size of the disk-like structure or to the extent of the hot inner envelope.
5. No single scenario can explain the origin of COMs in all the CALYPSO sources with COM detections. For seven sources out of the nine with a measured COM emission size, we find that the COM emission can be explained by a canonical hot-corino origin in four sources, an accretion-shock origin in two or possibly three sources, and an outflow origin in three sources; three of these seven sources fit into two of the three scenarios. One of the two remaining sources may fit into a hot-corino scenario coupled to a recent accretion burst.
6. The CALYPSO sources with COM detections show different chemical compositions. We identified three groups on the basis of the abundances of oxygen-bearing molecules, cyanides, and CHO-bearing molecules relative to methanol. These chemical groups do not correlate with the three scenarios mentioned above; they do not seem to correlate either with the evolutionary status of the sources if we take the ratio of envelope mass to internal luminosity as an evolutionary tracer. However, the chemical groups do not correlate either with the cloud environment in which the sources are embedded. The source-to-source variations in COM chemical composition may thus rather reflect an evolutionary effect or the influence of local conditions such as episodic accretion.
7. The column densities of several pairs of COMs correlate well with each other although some of these pairs, such as CH3OH and CH3CN, are not linked chemically. Therefore, the existence of a strong correlation between two molecules does not imply that these molecules are related chemically.
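Conclusion 7 can be illustrated with a toy calculation. In the sketch below, two synthetic column densities have no direct link to each other but both scale with a common source property (for instance the amount of warm gas). All numbers are invented for illustration and do not represent CALYPSO measurements; the variable names are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the toy example is reproducible
n_sources = 200         # synthetic sample size, not the CALYPSO sample

# Common driver in dex (e.g. log of the warm gas column), spanning 3 dex.
common = [rng.uniform(0.0, 3.0) for _ in range(n_sources)]

# Two log column densities that each follow the driver with independent
# 0.3 dex scatter; neither depends on the other.
log_n_a = [c + rng.gauss(0.0, 0.3) for c in common]
log_n_b = [c + rng.gauss(0.0, 0.3) for c in common]

r_raw = pearson(log_n_a, log_n_b)

# Removing the common driver leaves only the independent scatter.
resid_a = [a - c for a, c in zip(log_n_a, common)]
resid_b = [b - c for b, c in zip(log_n_b, common)]
r_resid = pearson(resid_a, resid_b)

print(f"correlation of raw column densities:  {r_raw:.2f}")
print(f"correlation after removing the driver: {r_resid:.2f}")
```

The raw column densities correlate strongly even though the two species are independent by construction, while the correlation largely disappears once the common driver is removed. Normalizing column densities by a tracer of the shared property before interpreting correlations chemically is one way to guard against this effect.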
While the CALYPSO survey was initially not designed for an extensive study of the COM emission in young protostars, its high angular resolution and sensitivity have allowed us to start shedding light on the presence of COMs in a more statistical way than has been done before. However, no single scenario that can explain the origin of COMs in the CALYPSO sample emerges from our analysis. Future imaging spectral line surveys of a larger sample of young protostars, at an angular resolution high enough to resolve the expected disk scales (a few tens of au), will be necessary to make further progress in this area. The determination of individual internal luminosities in close binaries and multiple systems will also be necessary in order to search for correlations in a more robust way. Searching for correlations between the COM emission and the jet or outflow properties of the sources may also be promising.
Acknowledgements. We thank Rob Garrod for sending us the abundance profiles of the models of Bergner et al. (2017) in electronic format. We are grateful to Bilal Ladjelate for sending us his estimates of the internal luminosities prior to publication and Benoît Tabone for discussions about the relation between disks and COMs. We thank Sandrine Bottinelli, Benoît Commerçon, Cornelis Dullemond, Patrick Hennebelle, Ralf Klessen, and Ralf Launhardt for their participation in the preparation of the CALYPSO project. Ph.A. and A.M.

Appendix A: Beam sizes and noise levels
Table A.1 lists the beam sizes, position angles, and noise levels of the data cubes used to analyze the COM emission of the CALYPSO sources.

Figures B.1 and B.2 show the S2 and S3 spectra obtained toward some of the continuum peaks of the CALYPSO sources, as in Fig. 1 for setup S1.

L1448-2A. The population diagram of methanol yields a rotational temperature that is consistent with 150 K albeit with a large uncertainty. We adopted this temperature to derive the column density (or upper limit) of all selected organic molecules.
L1448-2Ab. Only two population diagrams could be constructed. Given the large dispersion of the measured integrated intensities of transitions with E_u/k_B < 200 K, the rotational temperature of methanol is not constrained. However, the low peak intensities of one transition with E_u/k_B > 200 K and the nondetection of transitions from within the first torsionally excited state suggest a rotational temperature below 200 K. We assumed a value of 150 K, which we also used for the other complex molecules. The population diagram of HNCO contains only two data points and the uncertainties are too large to constrain the rotational temperature of this molecule.
L1448-NA. We assumed a temperature of 100 K to derive upper limits to the column densities of the selected organic molecules.
L1448-NB1. The nondetection of the 8_{-1}-7_0 transition of methanol at 229758.756 MHz suggests a temperature lower than 100 K, but this is uncertain. We assumed a temperature of 100 K to derive column density upper limits for the other selected organic molecules.
L1448-NB2. We assumed a temperature of 100 K to derive upper limits to the column densities of the selected organic molecules.
L1448-C. The fitted rotational temperatures of CH3OH, CH3CHO, and CH3CN are consistent with a temperature of 100 K within ∼1σ. CH3OCH3 and HNCO have lower temperatures of 75 and 59 K, respectively, but these values are based on three data points only. The lower value derived for CH3OCHO is uncertain due to the narrow range of upper level energies. It is consistent within 3σ with a temperature of 100 K. The temperature is not constrained for NH2CHO due to the low signal-to-noise ratios and the small number of data points. We assumed a temperature of 100 K for all selected organic species.
L1448-CS. We assumed the same temperature as for L1448-C.
IRAS2A1. For all species but NH2CHO, we assumed a rotational temperature that is within 1σ of the value derived from the fit to the population diagram. The fit of NH2CHO relies on only four transitions. The high fitted temperature is strongly biased by the v12 = 1 group of transitions at 218.18 GHz that is most likely blended with a transition of an unidentified species. We did not trust this high value and assumed a temperature of 250 K as for methanol instead.
SVS13B. We assumed a temperature of 150 K to derive upper limits to the column densities of the selected organic molecules.
SVS13A. For all species but C2H5CN, we assumed a rotational temperature that is within 1σ of the value derived from the fit to the population diagram. The transition of C2H5CN with E_u/k_B ∼ 140 K at ∼218390 MHz is most likely severely contaminated by emission from an unidentified species. This strongly biases the population diagram and prevents a reliable estimate of its rotational temperature. We assumed the same temperature as for CH3CN.

IRAS4A1. We assumed a temperature of 150 K to derive upper limits to the column densities of the selected organic molecules.
IRAS4A2. For all species but two, we used a temperature that is consistent within 1σ with the value derived from the population diagram. The temperature used for CH3OCH3 is consistent with the fitted rotational temperature within 2.5σ. The fit to the population diagram of C2H5CN is unconstrained. We assumed the same temperature as for CH3CN.
IRAS4B. The population diagrams of CH3OH and HNCO indicate temperatures on the order of 300 K, which we adopted for the modeling. The population diagrams of CH3OCHO and CH3CHO also suggest temperatures on the order of 300 K, but such high temperatures would overestimate some transitions which are not detected. As a compromise, we adopted a temperature of 200 K for both species. The other population diagrams do not constrain the rotational temperatures well and we assumed a temperature of 150 K for all other complex species. The low rotational temperature derived from the population diagram of C2H5CN is not reliable due to the narrow range of E_u/k_B.
IRAS4B2. We assumed a temperature of 150 K to derive upper limits to the column densities of the selected organic molecules.

IRAM04191. We assumed a temperature of 100 K to derive upper limits to the column densities of the selected organic molecules.
L1521F. We assumed a temperature of 100 K to derive upper limits to the column densities of the selected organic molecules.
L1527. We assumed a temperature of 100 K to derive upper limits to the column densities of the selected organic molecules.
SerpM-S68N. Only three population diagrams could be constructed. The fit to the population diagram of CH3OH indicates a temperature of about 200 K, which we adopted. The population diagram of CH3OCHO does not constrain the temperature because the range of E_u/k_B is too narrow. That of CH3CN is too noisy to constrain the temperature. We assumed a temperature of 150 K for both molecules as well as all other selected organic molecules.
SerpM-S68Nb. We assumed a temperature of 150 K to derive upper limits to the column densities of the selected organic molecules.
SerpM-SMM4a. We assumed a temperature of 150 K to derive upper limits to the column densities of the selected organic molecules.
SerpM-SMM4b. Only two population diagrams could be constructed. The fit to the population diagram of CH3OH indicates a temperature of about 220 K. We adopted a value of 250 K, consistent with the latter within 1σ. The population diagram of HNCO does not constrain the temperature well. For this molecule and all the other selected organic molecules, we assumed a temperature of 250 K.
SerpS-MM18a. The population diagrams of nine molecules could be constructed and fitted. The rotational temperature of CH3OH is well constrained, around 150 K. This temperature is consistent within 2σ with the rotational temperatures derived for most of the other molecules. There are three exceptions, CH3OCH3, CH3CN, and C2H5CN. The fit of C2H5CN is highly uncertain due to the narrow energy range of the covered transitions. Therefore, we also assumed a temperature of 150 K for this molecule. The population diagram of CH3CN suggests a somewhat higher temperature and we adopted 200 K, consistent within 2σ with the fit result. The population diagram of CH3OCH3 suggests a lower temperature, and we used 110 K, consistent within 1σ with the fit result.
SerpS-MM18b. The population diagrams of CH3CN and HNCO have only two points and are unconstrained. The fit to the population diagram of methanol yields a low temperature of ∼70 K. However, fitting the spectrum with such a low temperature predicts that a transition with E_u/k_B = 40 K at 230 027 MHz should be detected while it is not. As a compromise we used a temperature of 120 K for methanol and all other selected organic molecules.
SerpS-MM22. We assumed a temperature of 150 K to derive upper limits to the column densities of the selected organic molecules.
L1157. The population diagram of methanol indicates a temperature on the order of 200 K, which we adopted for this molecule. The other two population diagrams (CH3OCHO and CH3CN) poorly constrain the rotational temperature of these molecules. However, the non-detection of the 12-11 transition of CH3CN at 220.4 GHz is not consistent with a temperature of 200 K or higher. We thus adopted a temperature of 150 K for this molecule as well as for CH3OCHO. The column density upper limits for the other complex organic molecules were derived with this temperature.
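The population-diagram fits referred to in the source notes above can be illustrated with a minimal sketch. It assumes the standard optically thin, LTE relation ln(N_u/g_u) = ln(N/Q) - E_u/(k_B T_rot), so that the rotational temperature follows from the slope of a straight-line fit. The function name and the synthetic data are hypothetical and are not taken from the CALYPSO analysis; the actual fitting procedure used for the survey may differ.

```python
import numpy as np


def fit_rotational_temperature(e_u_k, ln_nu_gu, sigma):
    """Weighted straight-line fit of ln(N_u/g_u) versus E_u/k_B (in K).

    Returns (t_rot, t_rot_err, intercept), where intercept = ln(N/Q).
    Assumes optically thin LTE emission and Gaussian 1-sigma errors `sigma`
    on ln(N_u/g_u); both are assumptions of this sketch.
    """
    x = np.asarray(e_u_k, dtype=float)
    y = np.asarray(ln_nu_gu, dtype=float)
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2

    # Closed-form weighted least squares for y = a + b * x
    s, sx, sy = w.sum(), (w * x).sum(), (w * y).sum()
    sxx, sxy = (w * x * x).sum(), (w * x * y).sum()
    delta = s * sxx - sx ** 2
    slope = (s * sxy - sx * sy) / delta
    intercept = (sxx * sy - sx * sxy) / delta
    slope_err = np.sqrt(s / delta)

    t_rot = -1.0 / slope                 # slope = -1 / T_rot
    t_rot_err = slope_err / slope ** 2   # first-order error propagation
    return t_rot, t_rot_err, intercept


# Synthetic, noise-free example with T_rot = 150 K (hypothetical numbers)
e_wide = np.array([50.0, 100.0, 200.0, 400.0])
e_narrow = np.array([100.0, 105.0, 110.0, 115.0])
err = np.full(4, 0.2)

t_wide, t_wide_err, _ = fit_rotational_temperature(e_wide, 30.0 - e_wide / 150.0, err)
t_narrow, t_narrow_err, _ = fit_rotational_temperature(e_narrow, 30.0 - e_narrow / 150.0, err)
print(f"wide range:   T_rot = {t_wide:.1f} +/- {t_wide_err:.1f} K")
print(f"narrow range: T_rot = {t_narrow:.1f} +/- {t_narrow_err:.1f} K")
```

With identical measurement errors, the narrow energy range yields a much larger formal uncertainty on T_rot, which is the reason several of the fits above are described as unreliable when the covered range of E_u/k_B is small.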
GF9-2. We assumed a temperature of 100 K to derive upper limits to the column densities of the selected organic molecules.

Notes. (a) Number of CALYPSO sources detected in both tracers. Pearson correlations are evaluated for the following quantities: (b) column density; (c) column density multiplied by the solid angle of the COM emission and divided by either the 1.3 mm continuum peak flux density or the 1.3 mm continuum flux density integrated over the size of the COM emission; (d) column density multiplied by the solid angle of the COM emission and divided by the 3 mm continuum peak flux density; (e) column density multiplied by the solid angle of the COM emission, divided by either the 1.3 mm continuum peak flux density or the 1.3 mm continuum flux density integrated over the size of the COM emission, and divided by the internal luminosity; (f) column density multiplied by the solid angle of the COM emission, divided by either the 1.3 mm continuum peak flux density or the 1.3 mm continuum flux density integrated over the size of the COM emission, and divided by the envelope mass; (g) column density multiplied by the solid angle of the COM emission, divided by either the 1.3 mm continuum peak flux density or the 1.3 mm continuum flux density integrated over the size of the COM emission, and divided by the ratio of internal luminosity to the envelope mass. For each type of correlation, ρ is the Pearson correlation coefficient with its 95% confidence interval, and P is the P-value. X(Y) means X × 10^Y. Pearson coefficients with a confidence interval outside
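As an illustration of the quantities named in the notes above, the following sketch computes a Pearson correlation coefficient together with a 95% confidence interval. The interval is obtained here with the Fisher z-transformation, which is an assumption of this sketch: the text does not state how the intervals were derived. The function name and the input values are hypothetical.

```python
import math

import numpy as np


def pearson_with_ci(x, y, z_crit=1.959964):
    """Pearson coefficient and confidence interval via the Fisher z-transform.

    z_crit = 1.959964 corresponds to a two-sided 95% interval.
    Requires at least four points because the standard error of z
    is 1 / sqrt(n - 3).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    if n != y.size or n < 4:
        raise ValueError("need two samples of equal length with n >= 4")

    r = float(np.corrcoef(x, y)[0, 1])
    # Guard against |r| = 1 (or rounding slightly beyond), where atanh diverges
    r_safe = max(min(r, 1.0 - 1e-12), -1.0 + 1e-12)

    z = math.atanh(r_safe)
    half_width = z_crit / math.sqrt(n - 3)
    return r, math.tanh(z - half_width), math.tanh(z + half_width)


# Hypothetical values for six sources detected in both tracers
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_demo = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
rho, rho_lo, rho_hi = pearson_with_ci(x_demo, y_demo)
print(f"rho = {rho:.2f}, 95% interval = [{rho_lo:.2f}, {rho_hi:.2f}]")
```

For samples of only a handful of sources, the interval spans a large fraction of the allowed range even when the coefficient itself is high, so the interval is worth inspecting alongside ρ and the P-value.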

Appendix K: Estimation of dust temperatures
The dust temperature can be computed using Eq. (2) of Motte & André (2001):

$$T_{\rm dust}(r) = 38\,{\rm K} \times \left(\frac{r}{100\,{\rm au}}\right)^{-0.4} \times \left(\frac{L_{\rm int}}{1\,L_\odot}\right)^{0.2}, \qquad {\rm (K.1)}$$

which is derived from Eq. (3) of Terebey et al. (1993). This equation assumes optically thin emission of dust heated by a central protostar in spherical symmetry and a dust opacity index, β, of 1.0. L_int is the internal luminosity given in Table 1. Given that the assumption of optically thin emission may not be valid on the small scales where the COMs emit, we verify the reliability of Eq. (K.1) by comparing it to the temperature profiles derived by Kristensen et al. (2012) with the one-dimensional radiative transfer code DUSTY. These latter authors modeled five of the sources shown in Fig. 12 (L1448-C, IRAS2A, IRAS4A, IRAS4B, and L1157). For this comparison, we assume the same distances and luminosities as Kristensen et al. (2012). We find that the temperature profiles agree relatively well with each other at large radii (beyond an angular radius of 0.5″, 0.7″, 1.0″, 0.3″, and 0.23″, respectively), while the dust temperature profile obtained with DUSTY is much steeper in the inner part, likely because of the dust optical depth (see footnote 6). Because r_COM is smaller than this radius in all five sources, Eq. (K.1) underestimates the dust temperature at r_COM in these cases. The correction factors to apply to Eq. (K.1) are 1.3, 1.7, 2.5, 1.3, and 1.3, respectively. For a given angular radius, the temperature computed with Eq. (K.1) is independent of the distance. We therefore assume that the correction factors derived above are also applicable at the revised distances listed in Table 1.

As a first caveat, we mention that, while the luminosities used by Kristensen et al. (2012) for L1157, L1448-C, and IRAS2A are similar to ours (once rescaled to the same distance), the luminosities we use for IRAS4A and IRAS4B are a factor of approximately three lower. This may have a small impact on the correction factor to apply to Eq. (K.1) for these sources, but we neglect this additional correction because it is not straightforward to evaluate. The second caveat concerns the density profile, which is the key parameter controlling where the dust emission becomes optically thick. This is critical, in particular, in the case of binaries where the individual density profiles of the components are not well known. For instance, the large correction factor obtained for IRAS4A most likely does not apply to IRAS4A2, which likely dominates the luminosity of the system but not the mass. A dedicated radiative transfer simulation of IRAS4A2 with a reduced mass would yield a smaller correction factor. Given these uncertainties on the density profiles of the sources, we decide to use a single correction factor of 1.3 for all sources shown in Fig. 12. On the basis of the discussion above, we think that the dust temperatures derived at r_COM with Eq. (K.1) and this correction factor are uncertain by at least a factor of 1.3.

Footnote 6: Maury et al. (2019) find a ratio of the continuum effective radiation temperature to the dust temperature of 0.16 for L1448-C, 0.13 for IRAS2A, 0.5 for IRAS4B, and 0.20 for L1157, which correspond to optical depths of 0.18, 0.14, 0.7, and 0.22, respectively, meaning that the optically thin assumption no longer holds for the emission on scales smaller than the beam.
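The temperature estimate discussed in this appendix can be sketched numerically. The code below assumes that Eq. (K.1) takes the form T_dust = 38 K × (r / 100 au)^-0.4 × (L_int / 1 L⊙)^0.2, as recalled from Motte & André (2001) for β = 1; this form should be checked against the published equation. The optical-depth helper assumes the relation T_eff / T_dust = 1 - exp(-τ), which approximately reproduces the values quoted from Maury et al. (2019) but is not stated in the text. All function names and the example source parameters are hypothetical.

```python
import math


def dust_temperature(r_au, l_int_lsun, correction=1.0):
    """Dust temperature (K) at radius r_au (au) for luminosity l_int_lsun (Lsun).

    Assumed form of Eq. (K.1), multiplied by an optional correction factor.
    """
    if r_au <= 0 or l_int_lsun <= 0:
        raise ValueError("radius and luminosity must be positive")
    return correction * 38.0 * (r_au / 100.0) ** (-0.4) * l_int_lsun ** 0.2


def dust_temperature_at_angle(theta_arcsec, distance_pc, l_int_lsun, correction=1.3):
    """Same estimate for an angular radius; 1 arcsec at 1 pc subtends 1 au.

    The default correction of 1.3 is the single factor adopted in the text.
    """
    return dust_temperature(theta_arcsec * distance_pc, l_int_lsun, correction)


def optical_depth_from_ratio(ratio):
    """Optical depth from T_eff / T_dust, assuming ratio = 1 - exp(-tau)."""
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must lie in [0, 1)")
    return -math.log(1.0 - ratio)


# Hypothetical source: the luminosity is rescaled as distance squared when
# the distance is revised, so the temperature at a fixed angle is unchanged.
d_old, d_new, lum_old = 235.0, 293.0, 4.0
lum_new = lum_old * (d_new / d_old) ** 2
t_old = dust_temperature_at_angle(0.3, d_old, lum_old)
t_new = dust_temperature_at_angle(0.3, d_new, lum_new)
print(f"T_dust at 0.3 arcsec: {t_old:.1f} K (old distance), {t_new:.1f} K (new distance)")
print(f"tau for a ratio of 0.5: {optical_depth_from_ratio(0.5):.2f}")
```

The two printed temperatures agree because the distance dependence cancels: r scales as d and L_int as d², so (θ d)^-0.4 × (d²)^0.2 depends on θ only.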