Issue 
A&A
Volume 678, October 2023



Article Number  A178  
Number of page(s)  22  
Section  Planets and planetary systems  
DOI  https://doi.org/10.1051/00046361/202245752  
Published online  20 October 2023 
CROCODILE
Incorporating mediumresolution spectroscopy of closein directly imaged exoplanets into atmospheric retrievals via crosscorrelation
^{1}
ETH Zurich, Institute for Particle Physics and Astrophysics,
WolfgangPauliStrasse 27,
8093
Zurich, Switzerland
email: jhayoz@ethz.ch
^{2}
Department of Astronomy, University of Michigan,
Ann Arbor, MI
48109, USA
^{3}
National Center of Competence in Research PlanetS,
Switzerland
^{4}
Max Planck Institute for Intelligent Systems,
MaxPlanckRing 4,
72076
Tübingen, Germany
Received:
21
December
2022
Accepted:
28
August
2023
Context. The investigation of the atmospheres of closely separated, directly imaged gas giant exoplanets is challenging due to the presence of stellar speckles that pollute their spectrum. To remedy this, the analysis of medium to highresolution spectroscopic data via crosscorrelation with spectral templates (crosscorrelation spectroscopy) is emerging as a leading technique.
Aims. We aim to define a robust Bayesian framework combining, for the first time, three widespread directimaging techniques, namely photometry, lowresolution spectroscopy, and mediumresolution crosscorrelation spectroscopy in order to derive the atmospheric properties of closein directly imaged exoplanets. Current atmospheric characterisation frameworks are indeed either not compatible with all three observing techniques or they lack the commitment to efficient sampling strategies that allow highdimensional forward models.
Methods. Our framework CROCODILE (crosscorrelation retrievals of directly imaged selfluminous exoplanets) naturally combines the three techniques by adopting adequate likelihood functions. To validate our routine, we simulated observations of gas giants similar to the wellstudied β Pictoris b planet and we explored the parameter space of their atmospheres to search for potential biases.
Results. We obtain more accurate measurements of atmospheric properties when combining photometry, low and mediumresolution spectroscopy into atmospheric retrievals than when using the techniques separately as is usually done in the literature. Indeed, the combined fit is, on average, 20% more accurate than fitting only mediumresolution crosscorrelation spectroscopy. We find that mediumresolution (R ≈ 4000) Kband crosscorrelation spectroscopy alone is not suitable to constrain the atmospheric properties of our synthetic datasets; however, this problem disappears when simultaneously fitting photometry throughout the Y and M bands and lowresolution (R ≈ 60) spectroscopy between the Y and H bands. Our thorough testing demonstrates that free chemistry is a suitable forward model to retrieve the atmospheric thermal and chemical properties of cloudless gas giants at chemical equilibrium.
Conclusions. CROCODILE provides a robust statistical framework to interpret mediumresolution spectroscopic data of closein directly imaged exoplanets, where speckles originating from stellar stray light render the extraction of the continuum difficult. Our framework allows the atmospheric characterisation of directly imaged exoplanets using the highquality spectral data that will be provided by the new generation of instruments such as the Enhanced Resolution Imager and Spectrograph (ERIS) at the Very Large Telescope, the MidInfrared Instrument (MIRI) aboard the James Webb Space Telescope, and in the future the Midinfrared ELT Imager and Spectrograph (METIS) at the Extremely Large Telescope.
Key words: planets and satellites: atmospheres / methods: data analysis / techniques: imaging spectroscopy / techniques: high angular resolution
© The Authors 2023
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.
1 Introduction
Determining the ratios of elemental abundances in the atmosphere of exoplanets is emerging as a key analysis to unwrap the history of their formation. Nascent planets do indeed inherit some of the chemical properties of the protoplanetary disk at the location of their formation: Öberg et al. (2011) showed that the molecular snowlines are expected to directly affect the atmospheric carbontooxygen (C/O) number ratio. Further processes that modify the chemical composition of the disk can add nuance to this initial picture; for example, the inward drift of pebbles covered with CO ice might enhance the abundance of CO gas within the CO iceline (Öberg & Bergin 2016), or dust transport processes such as grain growth, settling, and vertical mixing can affect the C/O ratio depending on the location of snowlines and pressure bumps in the disk (van der Marel et al. 2021). Furthermore, the chemical composition of a planet can drastically change after formation due to the later accretion of surrounding material while it migrates. For instance, the accretion of icerich planetesimals can lead to the enrichment of metals in the atmosphere of hot Jupiters (Mordasini et al. 2016), or affect the initial atmospheric C/O ratio inherited from core accretion or gravitational collapse and its initial location with respect to the water and CO_{2} icelines (Nowak et al. 2020). Currently, the details of gas giant formation are still mostly unconstrained. Many interconnected physical processes are believed to influence their final bulk and atmospheric composition. This creates model degeneracies that prevent us from unequivocally inverting their full formation history (Mollière et al. 2022). Nevertheless, on a casebycase basis, measurements of atmospheric composition might be leveraged to identify specific aspects of formation processes; for example, a high N/O ratio may indicate a large orbital distance during formation (Piso et al. 2016; Ohno & Fortney 2022). Meanwhile population analyses might identify statistical trends to inform formation models (Hoch et al. 2022).
It is particularly important to develop robust statistical frameworks in light of the new generation of telescopes and instruments that have recently started operating or are coming up this decade: the James Webb Space Telescope (JWST) with its MidInfrared Instrument (MIRI; Wells et al. 2015), capable of integral field spectroscopy at a spectral resolution of R ~ 1500–3500 between 4.9 and 28.3 µm; the new Enhanced Resolution Imager and Spectrograph (ERIS; Davies et al. 2018) at the Very Large Telescope (VLT), which contains the newly refurbished Integral Field Unit (IFU) SPIFFIER (George et al. 2016), upgraded from the SPectrometer for Infrared Faint Field Imaging (SPIFFI) within the old Spectrograph for INtegral Field Observations in the Near Infrared (SINFONI), with a spectral resolution of R ≈ 5000–10 000 in the J, H, and K bands; and finally the Midinfrared ELT Imager and Spectrograph (METIS; Brandl et al. 2021) on the upcoming Extremely Large Telescope (ELT), which allows longslit spectroscopy at low and medium resolution (R ≈ 400–1900), as well as integral field spectroscopy at high resolution (R ≈ 100 000) in the L and M bands.
The first measurement of the atmospheric C/O ratio of an exoplanet was performed by Madhusudhan et al. (2011), who applied an atmospheric fitting code to Spitzer and groundbased photometric data of the transiting hot Jupiter WASP12b. Since then, many atmospheric retrieval codes have been developed for photometric and spectroscopic data of both transiting and directly imaged exoplanets; for example, readers can refer to CHIMERA (Line et al. 2013), NEMESIS (Lee et al. 2012), TauREx (Waldmann et al. 2015), HELIOSRETRIEVAL (Lavie et al. 2017), and petitRADTRANS (Mollière et al. 2019). On the directimaging side, the first proposed strategy to measure the atmospheric C/O ratio has been the combination of photometric and lowresolution (R < 1000) spectroscopic data: for example, Lavie et al. (2017) analysed HR 8799 data from Bonnefoy et al. (2016) and Zurlo et al. (2016) with the VLT SpectroPolarimetric Highcontrast Exoplanet REsearch instrument (SPHERE; Beuzit et al. 2019) and the Gemini Planet Imager (GPI; Macintosh et al. 2014); Mollière et al. (2020) studied HR 8799 e more closely with additional VLTI/GRAVITY spectrointerferometric data; and Nowak et al. (2020) observed β Pictoris b with VLTI/GRAVITY to measure its atmospheric C/O ratio. More recently, Ruffio et al. (2021) observed the HR 8799 system with the OHSuppressing Infrared Integral Field Spectrograph (OSIRIS; Quirrenbach et al. 2003; Larkin et al. 2006) at the Keck Observatory in the K band at medium spectral resolution (R ≈ 4000) and measured stellar C/O ratios for the b, c, and d planets, contradicting the previous findings of Lavie et al. (2017) of the supersolar C/O ratio for the b planet, hinting at a possible discrepancy between measurements at low versus medium spectral resolution. Finally, Xuan et al. (2022) simultaneously analysed photometric and low and highresolution spectroscopic data of the brown dwarf HD 4747 B to derive its C/O ratio, while Wang et al. (2023) additionally considered mediumresolution spectroscopy while applying the same framework based on Wang et al. (2020) to constrain the atmospheric parameters of HR 8799 c.
Several studies have explored how to leverage mediumresolution spectroscopy to investigate the atmospheric composition of directly imaged gas giants, in particular when using crosscorrelation with spectral templates – a method informally called crosscorrelation spectroscopy (CCS). Relying on it, Hoeijmakers et al. (2018) introduced the technique of molecular mapping, which consists in applying a spectral crosscorrelation to the spectrum in each spatial pixel (spaxel) of the datacube of an integral field unit (IFU), thereby unveiling the imprint of each molecule in the atmosphere of the observed exoplanet. Instead of filtering out the starlight as part of data postprocessing, Ruffio et al. (2019) – later improved by Ruffio et al. (2021) – defined a joint likelihood function of the forward model for the planetary atmosphere and stellar point spread function (PSF), allowing to constrain the barycentric radial velocity of the exoplanet as well as the atmospheric parameters using a grid of BTSettl models (Allard 2013) after marginalising over the stellar PSF model.
The molecular mapping technique resulted in the detection of water and carbon monoxide in the atmospheres of β Pic b (Hoeijmakers et al. 2018) and HIP 65426 b Petrus et al. (2021) using Kband mediumresolution (R ≈ 5000) spectroscopic data obtained with VLT/SINFONI (Eisenhauer et al. 2003), as well as in the atmosphere of HR 8799 b (Petit dit de la Roche et al. 2018) using H and Kband mediumresolution (R ≈ 4000) spectroscopic data obtained with Keck/OSIRIS. Cugno et al. (2021) applied the technique to Kband spectroscopic data of the forming planet PDS 70 b taken with VLT/SINFONI, and quantified the extinction of the surrounding dusty environment required to explain the nondetection of H_{2}O and CO. Patapis et al. (2022) examined the feasibility and limiting factors of applying molecular mapping with the MidInfrared Instrument (MIRI; Wells et al. 2015) on the James Webb Space Telescope (JWST) using mock observations of the systems HR 8799 and GJ 504, and predicted the detectability of H_{2}O, CO, CH_{4}, and NH_{3} in the atmospheres of directly imaged gas giants. Using the joint modelling of the stellar PSF and exoplanet atmosphere, Ruffio et al. (2019) inferred the barycentric radial velocity (RV) of the planets HR 8799 b and c using H and Kband mediumresolution (R ≈ 4000) spectroscopic data from Keck/OSIRIS, while Ruffio et al. (2021) constrained the C/O ratios of the HR 8799 b, c, and d planets.
Arguably, the frameworks described in these studies are missing at least one of two key ingredients, namely 1. the simultaneous consideration of archival photometric and lowresolution spectroscopic data on top of the newly published mediumresolution spectroscopic data within the Bayesian framework, or 2. the commitment to a fullon atmospheric retrieval by exploring the whole parameter space using dedicated sampling algorithms to not only find the maximum likelihood estimator, but also derive the full posterior distribution.
With this study, we propose a new framework to combine low resolution spectrophotometric and crosscorrelation spectroscopic data into a full Bayesian statistical framework to derive atmospheric properties of directly imaged exoplanets and to prepare for newly commissioned and future instruments. To that end, we built upon the work of Brogi & Line (2019), who defined a Bayesian statistical framework combining low and highresolution spectroscopy with the original purpose of deriving atmospheric properties of transiting exoplanets. We present our framework CROCODILE (CROssCOrrelation retrievals of Directly Imaged selfLuminous Exoplanets)^{1}, which embeds the radiative transfer routine petitRADTRANS (Mollière et al. 2019) into the mathematical framework of Brogi & Line (2019).
In Sect. 2, we present the main parts of CROCODILE, namely the Bayesian estimator, the modification of the loglikelihood function to include the spectral crosscorrelation between model and data, and our forward model for the atmosphere of gas giant exoplanets. In Sect. 3, we put CROCODILE to a full battery of tests, including a comparison between a regular atmospheric retrieval running on lowresolution spectroscopy (LRS) and photometry and a pure crosscorrelationbased retrieval, as well as a review of the performance of CROCODILE when applied to a large portion of the parameter space of the atmospheres of gas giants. Finally, we discuss the implications of our findings as well as future opportunities for CROCODILE in Sect. 4, before summarising our work in Sect. 5.
2 CROCODILE: A Bayesian framework to combine all directimaging observing techniques into atmospheric retrievals
CROCODILE is composed of a Bayesian estimator coupled to a forward model for the emission spectra of gas giants. More precisely, the routine starts with the prior distributions of the model parameters, from which the Bayesian estimator samples a set of input parameters for the simulator. The forward model computes the spectrum from the sampled parameters, before formatting it to the observing techniques that make up the data, that is photometry, lowresolution spectroscopy, or continuumremoved mediumresolution spectroscopy. These synthetic model spectra are used together with the data to calculate the sum of the three loglikelihood functions associated to each observing technique. Finally, the logprior is added to the loglikelihood to obtain the logposterior which is then used to update the sampled points according to a nested sampling algorithm. The input of CROCODILE is the data, the forward model, the opacity database, and the priors. Its output is the posterior distribution of the forward model, from which the posterior thermal structure, chemistry, and spectrum can be computed. A conceptual sketch of CROCODILE is shown in Fig. 1, while we describe its implementation in the following sections.
2.1 The Bayesian estimator
Atmospheric retrievals are a wellestablished framework within the exoplanet community (for a review, readers can refer to Madhusudhan 2018). Fundamentally, they seek to invert the data, meaning that, for a given model describing the transfer of electromagnetic radiation through the atmosphere of exoplanets, they estimate how well each combination of parameters of said model explains the data. Essentially, they contain three components, namely 1. the data (which are usually obtained via photometry or spectroscopy), 2. the forward model (which maps the parameters of the exoplanetary atmosphere to an emission or transmission spectrum), and 3. a Bayesian inference algorithm that takes in the data as well as prior knowledge and calculates a posterior probability distribution over the parameter space of the model. Bayes' theorem (Bayes 1763) provides the mathematical framework by formulating the problem as the computation of the posterior distribution:
$$p\left(\theta d\right)=\frac{p\left(\theta d\right)p\left(\theta \right)}{p\left(d\right)},$$(1)
where the vector θ denotes the parameters of the forward model (e.g. the equilibrium temperature, surface gravity, or chemical composition), d the data, p(d  θ) = ℒ(θ) the likelihood function, p(θ) the prior distribution, and p(d) = ∫_{Θ}p(d  θ)p(θ) the evidence defined as the integral of the likelihood function multiplied by the prior over the parameter space Θ. Calculating the likelihood function is expensive since it requires to evaluate the forward model including the computationally heavy radiative transfer routine. Therefore, we need an efficient parameter sampling method to reduce the number of evaluations of the forward model. Our code CROCODILE makes use of the software pymultinest (Buchner et al. 2014), which is broadly used in the community of atmospheric retrievals (e.g. Lavie et al. 2017; Mollière et al. 2019, 2020; Konrad et al. 2022), a Python implementation of MultiNest (Feroz et al. 2009, 2019) which utilises the multimodal nested sampling algorithm from Skilling (2006).
The choice of the likelihood function ℒ(θ) depends on the noise statistics of the data, which depend on the observing and postprocessing techniques used. Typically, photometric data d_{PH} obtained by different instruments or in different observing conditions are statistically independent of each other, allowing the decomposition of the likelihood function into a product of likelihood functions for each datapoint. Furthermore, the noise in highcontrast imaging data is usually assumed to be independent identically distributed Gaussian, leading to the following loglikelihood function (ignoring constant factors that only depend on the data):
$$\mathrm{log}{\mathcal{L}}_{\text{PH}}\left(\theta \right)\approx \frac{1}{2}{\displaystyle \sum _{i=1}^{{N}_{\text{PH}}}\frac{{\left({d}_{\text{PH},i}{m}_{\text{PH},i}\left(\theta \right)\right)}^{2}}{{\sigma}_{\text{PH},i}^{2}}},$$(2)
where m_{PH}(θ) is the synthetic photometric data obtained by evaluating the forward model at the parameter θ, and σ_{PH} is the uncertainty associated to the data d_{PH}.
For lowresolution spectroscopic data d_{LRS}, the noise statistics can be more intricate, since different wavelength channels of a spectrum might be correlated (e.g. Todorov et al. 2016; Madhusudhan 2018; Mollière et al. 2019; Nowak et al. 2020). In this case, the likelihood function is often chosen as a multivariate normal distribution (again ignoring constant factors):
$$\mathrm{log}{\mathcal{L}}_{\text{LRS}}\left(\theta \right)\approx \frac{1}{2}{\left({d}_{{}_{\text{LRS}}}{m}_{\text{LRS}}\left(\theta \right)\right)}^{T}{W}_{\text{LRS}}^{1}\left({d}_{\text{LRS}}{m}_{\text{LRS}}\left(\theta \right)\right),$$(3)
where m_{LRS}(θ) is the synthetic lowresolution spectrum that is obtained by evaluating the forward model at the parameter θ (Greco & Brandt 2016). The covariance matrix W_{LRS} describes the correlation of the data across spectral channels and needs to be computed for each observation (e.g. Nowak et al. 2020).
Fig. 1 Schematic of our atmospheric retrieval framework CROCODILE. The input of the routine is the data – which can be photometry, low, or mediumresolution spectroscopy –, the forward model, the opacity database, and the priors. CROCODILE starts by sampling parameters from the prior, from which the forward model is evaluated. It is then compared to the data by computing the sum of the loglikelihood functions corresponding to the three observing techniques, from which new parameters are sampled and the loop restarts. The output is the posterior distribution of the forward model. 
2.2 Incorporating crosscorrelation spectroscopy into atmospheric retrievals
Crosscorrelation spectroscopy (CCS; Sparks & Ford 2002) was introduced to the field of exoplanet directimaging to leverage the full spectral resolution of spectrographs. With the exception of M stars, the high temperatures found in the photosphere of stars prevent the atoms from bonding into molecules. For that reason, stellar spectra only show atomic absorption lines in the infrared range, which differ significantly from the molecular absorption lines found in colder objects such as brown dwarfs and gas giant exoplanets. Therefore, it is possible to filter out the stellar spectrum and crosscorrelate the residuals with molecular spectral templates, thereby picking up the signal of molecules present in the atmosphere of the lowmass companion, for example using highresolution spectral differential imaging (HRSDI; e.g. Hoeijmakers et al. 2018; Haffert et al. 2019; Cugno et al. 2021).
To incorporate HRSDItreated IFU data into atmospheric retrievals, we followed the framework of Brogi & Line (2019), who expanded the conventional Gaussian likelihood function from Eq. (2) by assuming constant noise across all spectroscopic channels and replacing the standard deviation by its maximum likelihood estimator, thereby making the crosscovariance K between model and data appear in the loglikelihood function:
$$\mathrm{log}{\mathcal{L}}_{\text{CCS}}\left(\theta \right)\approx \frac{{N}_{\text{CCS}}}{2}\mathrm{log}\left({s}_{d}^{2}2K\left(\theta \right)+{s}_{m}^{2}\left(\theta \right)\right),$$(4)
with the following definitions
$${s}_{d}^{2}=\frac{1}{{N}_{\text{CCS}}}{\displaystyle \sum _{i=1}^{{N}_{\text{CCS}}}{d}_{\text{CCS,i}}^{2}}$$(5)
$${s}_{m}^{2}\left(\theta \right)=\frac{1}{{N}_{\text{CCS}}}{\displaystyle \sum _{i=1}^{{N}_{\text{CCS}}}{m}_{\text{CCS,i}}^{2}\left(\theta \right)}$$(6)
$$K\left(\theta \right)=\frac{1}{{N}_{\text{CCS}}}{\displaystyle \sum _{i=1}^{{N}_{\text{CCS}}}{d}_{\text{CCS},i}{m}_{\text{CCS},i}\left(\theta \right)}.$$(7)
In Eq. (7), the crosscovariance K is linked to the crosscorrelation C over the relation:
$$K\equiv \sqrt{{s}_{d}^{2}{s}_{m}^{2}},$$(8)
meaning that either the crosscorrelation C or the crosscovariance K can be computed as long as the right variable is used in Eq. (4). For the sake of completeness, we provide the derivation of Eq. (4) used by Brogi & Line (2019) in Appendix A. So far, we have kept the explicit dependency of the model M on the parameter θ for the sake of mathematical rigorousness; however, we have dropped the dependency of the model spectrum on the frame of reference. Whereas the effect is insignificant for photometry and lowresolution spectroscopy of closeby stellar systems, the Doppler shift created by the radial velocity υ_{R} of the target exoplanet with respect to the observatory indeed becomes significant at medium spectral resolution. More precisely, for a spectrum at spectral resolution R and at wavelength λ, and therefore with channel spacing $\text{\Delta}\lambda =\frac{\lambda}{R}$, the Doppler shift $\frac{\text{\Delta}\lambda}{\lambda}=\frac{{\upsilon}_{\text{R}}}{c}$ (with c the speed of light in vacuum) resulting from the radial velocity υ_{R} is equal to the wavelength spacing of the spectrum when ${\upsilon}_{\text{R}}=\frac{c}{R}$. At medium spectral resolution (R ≈ 5000), this effect occurs with a radial velocity of υ_{R} ≈ 60 km s^{−1}, whereas it requires a much higher radial velocity (υ_{R} > 600 km s^{−1}) at low spectral resolution (R < 500). Hence, the forward model for medium and highresolution spectroscopy needs to be Dopplershifted to the reference frame of the observed spectrum. Instead of computing it once and for all before the retrieval or adding it in as a parameter of the model, we incorporated it into the algorithm by taking the maximum of the crosscovariance K over a range of radial velocity
$$K\left(\theta \right)=\underset{{\upsilon}_{0}\le {\upsilon}_{\text{R}}\le {\upsilon}_{\text{l}}}{\mathrm{max}}\text{\hspace{0.17em}}K\left(\theta ,{\upsilon}_{\text{R}}\right),$$(9)
where K (θ, υ_{R}) is the crosscovariance where the model is Doppler shifted by the radial velocity υ_{R}, and υ_{0} and υ_{1} are the lower and higher bounds of the range of radial velocities considered. To Doppler shift the forward model spectrum, we used the methods dopplerShift and crosscorrRV from the Python package PyAstronomy (Czesla et al. 2019), and we selected radial velocities between υ_{0} = −400 and υ_{1} = 400 km s^{−1} in steps of 0.5 km s^{−1}, which corresponds to 0.8% of the wavelength spacing at medium spectral resolution R = 5000. The size of the steps have conservatively been chosen small because Dopplershifting the spectra is a cheap computational operation relative to the cost of computing the forward model.
These algorithms use interpolation to Doppler shift spectra, which can lead to nonconservation of the total flux. To limit this effect, we applied the Doppler shift before downsampling the spectra to a lower spectral resolution when simulating data. On the other hand, the routine crosscorrRV interpolates the already downsampled forward model to the Doppler shifted wavelength before computing the correlation. To understand how the use of interpolation might negatively affect the crosscorrelation, we implemented and tested an algorithm which first applies the Doppler shift before downsampling and computing the correlation. To quantify the improvement, we simulated three spectra using our forward model from Sect. 3.3: one data spectrum with added random scatter, and two template spectra with different model parameters. We then computed the likelihood ratio (Eq. (4)) between the two template spectra using both algorithms. The new algorithm did not significantly change the results: the likelihood ratios differed by 0.03%. However, the new algorithm was slower by a factor of 1000 due to the extra downsampling step, which would become a bottleneck during retrievals. Hence, we did not use our new algorithm in this work and continued evaluating Eq. (4) using the routine crosscorrRV. We note that the issue discussed here might yield a different result with another combination of spectral resolution and wavelength range. Therefore, we recommend caution when using an interpolationbased Doppler shifting algorithm; in particular, the test described above should be repeated when working with alternate datasets.
Finally, it is worth stating explicitly that the spectral data d_{CCS} are assumed to have been reduced using HRSDI, and thus its continuum has been removed. Therefore, the model also needs to be highpass filtered with the same cutoff frequency as the data. We describe our highpass filter in Sect. 3.2.
2.3 Combining data from different instruments
In the two previous sections, we motivated our choice of loglikelihood functions associated to photometric, lowresolution spectroscopic, and crosscorrelation spectroscopic data. To integrate all three techniques into one framework, we need to compute their joint likelihood function:
$$\mathcal{L}\left(\theta \right)=p\left({d}_{\text{PH}},{d}_{\text{LRS}},{d}_{\text{CCS}}\theta \right).$$(10)
In general, the joint likelihood functions of two datasets d_{1} and d_{2} depends on their conditional probability p(d_{1}, d_{2}  θ) = p(d_{1}  d_{2}θ)p(d_{2}  θ). In this work, we assumed that all datasets are independent of each other, that is p(d_{1}  d_{2}θ) = p(d_{1}  θ). This assumption should hold whenever two datasets come from different telescopes or different instruments, or if they were taken on different nights or under different meteorological conditions. Therefore, we can apply this argument recursively to the joint likelihood function given above to obtain
$$p\left({d}_{\text{PH}},{d}_{\text{LRS}},{d}_{\text{CCS}}\theta \right)=p\left({d}_{\text{PH}}\theta \right)p\left({d}_{\text{LRS}}\theta \right)p\left({d}_{\text{CCS}}\theta \right),$$(11)
which translates to the following loglikelihood function for CROCODILE:
$$\begin{array}{l}\mathrm{log}\mathcal{L}\left(\theta \right)=\mathrm{log}{\mathcal{L}}_{\text{PH}}\left(\theta \right)+\mathrm{log}{\mathcal{L}}_{\text{LRS}}\left(\theta \right)+\mathrm{log}{\mathcal{L}}_{\text{CCS}}\left(\theta \right)\hfill \\ \text{\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}}\approx \frac{1}{2}{\displaystyle \sum _{i=1}^{{N}_{\text{PH}}}\frac{{\left({d}_{\text{PH},i}{m}_{\text{PH},i}\left(\theta \right)\right)}^{2}}{{\sigma}_{\text{PH},i}^{2}}}\hfill \\ \text{\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}}\frac{1}{2}{\left({d}_{\text{LRS}}{m}_{\text{LRS}}\left(\theta \right)\right)}^{T}{W}_{\text{LRS}}^{1}\left({d}_{\text{LRS}}{m}_{\text{LRS}}\left(\theta \right)\right)\hfill \\ \text{\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}}\frac{{N}_{\text{CCS}}}{2}\mathrm{log}\left({s}_{d}^{2}2K\left(\theta \right)+{s}_{m}^{2}\left(\theta \right)\right).\hfill \end{array}$$(12)
2.4 The forward model
The forward model of an atmospheric retrieval is composed of two parts, namely the theoretical model describing the chemical and thermal properties of the planetary atmosphere, and a radiative transfer routine that calculates the spectrum escaping from it. For the latter, we utilised petitRADTRANS (Mollière et al. 2019), which solves the radiative transfer equation on a discrete 1D planeparallel atmosphere. It can be used in two spectral modes, namely the highresolution linebyline mode (lbl; λ/Δλ = 10^{6}) or the mediumresolution correlatedk (ck) approximation (λ/Δλ = 10^{3}). We used the lbl mode when calculating spectra of instruments with a spectral resolution higher than 1000, and the other for lower resolution or photometry. We set the number of atmospheric layers on which the radiative transfer equation is solved to 100 between a top and bottom pressure of 10^{−6} and 10^{2} bar.
For the physical model of the atmosphere, two main approaches are used in the literature, namely selfconsistent (Seager et al. 2005; Burrows et al. 2008; Fortney et al. 2008; Moses et al. 2011; Gandhi & Madhusudhan 2017) and free models (Todorov et al. 2016; Lavie et al. 2017; Nowak et al. 2020; Mollière et al. 2020). Selfconsistent models aim to account for the known relevant physical processes that might take place in a planetary atmosphere and therefore implement physical constraints such as hydrodynamical, thermodynamical, and chemical equilibria, and sometimes even disequilibria due to dynamical interactions such as, for example, winds (Gandhi & Madhusudhan 2018) or photochemistry (Moses et al. 2011). Evidently, the risk of this approach is to falsely account for the relevant physical processes, for example by wrongly assuming chemical equilibrium (Madhusudhan 2018). The solution to that problem is to leave out physical consistency and adopt a parameterised model that is flexible enough to simultaneously account for both wellknown and unknown physical processes. The disadvantage of that approach is that it often requires a higherdimensional model and that it might return physically impossible solutions. In this work, we selected a mixed approach with a physically motivated analytical thermal structure combined with free chemistry. Indeed, we are mainly interested in constraining the abundances of specific molecules, therefore requiring to fit for each of them independently, whereas investigating the diversity of thermal structures that can be constrained by atmospheric retrievals is beyond the scope of this work.
Specifically, our free chemistry model was parameterised by the mass fractions X_{i} of the molecular abundances which are assumed to be constant vertically across all atmospheric layers (e.g. Todorov et al. 2016; Line et al. 2017). The rest of the atmospheric mass was made of 75% molecular hydrogen and 25% helium (Line et al. 2012). We considered the line opacities of H_{2}O, CO, CO_{2}, CH_{4}, H_{2}S, FeH, TiO, K, and VO, at low (ck) and high resolution (lbl) depending on the spectral resolution of the instrument for which the spectrum was calculated. We included Rayleigh scattering of molecular hydrogen and helium together with the CollisionInduced Absorption (CIA) crosssections of H_{2}H_{2} and H_{2}He. The line lists and opacities used in this study were downloaded at the online documentation of petitRADTRANS^{2} and are listed in Table 1 for reference.
We parameterised the thermal structure of the atmosphere with the analytic semigrey model derived by Guillot (2010). It is parameterised by six quantities, of which the equilibrium temperature T_{equ} and surface gravity g are free parameters of our forward model, whereas the other four are fixed: the average opacity in the infrared κ_{IR} = 0.01 cm^{2}/g, the ratio of the average opacity in the optical and infrared γ = κ_{V}/κ_{IR} = 0.4, the interior temperature T_{int} = 200 K, and the bottom pressure P_{0} = 100 bar.
The last parameter was the planet radius R which, together with the distance from the Earth d, served to scale the atmospheric spectrum down by a multiplicative factor of R^{2}/d^{2}, since petitRADTRANS calculates the flux escaping the top of the atmosphere of the planet per unit surface.
In this study, we purposefully ignored clouds, which are known to be present in the atmospheres of gas giant exoplanets (Nowak et al. 2020; Mollière et al. 2020). However, considering clouds would add too much complexity to our model and would dilute our message. Indeed, the modelling of clouds is an issue that all atmospheric retrieval frameworks face and therefore need to be investigated separately.
Finally, we ignored rotational broadening and limb darkening. Indeed, PalmaBifani et al. (2023) included both parameters into their retrieval on VLT/SINFONI J, H, and Kband spectroscopic observations of AB Pic b and reported an erratic behaviour of their posteriors, which was probably due to the insufficient spectral resolution of VLT/SINFONI. Furthermore, there was no noticeable correlation between υsin(i) and any other parameter in their published 2D cornerplots (figures 7 and 8). Similarly, Bryan et al. (2020) measured the rotational broadening of the planetarymass companion 2MASS J01225093–2439505 using highresolution (R ≈ 25 000) spectroscopic observations taken with Keck/NIRSPEC and reported no significant dependence on the C/O ratio of the atmospheric models used to crosscorrelate with the data.
Line lists and opacities included.
3 Tests on simulated data
In the following, we aim to demonstrate the capability of CROCODILE to recover the thermal and chemical properties of the atmospheres of gas giants using a common combination of groundbased techniques in the near and mid infrared such as photometry, lowresolution spectroscopy (LRS), and mediumresolution crosscorrelation spectroscopy (MRCCS).
We therefore submit it to a full series of tests that we describe and justify in the following.
As a preliminary test of CROCODILE, we checked our statistical framework in Sect. 3.3 for potential mistakes or biases. For this, we controlled our implementation of the crosscorrelationbased likelihood given by Eq. (4) by running a retrieval with MRCCS data alone. Then, we tested our implementation of the regular retrieval framework on LRS and photometric (LRSP) data. Finally, we tested the implementation of CROCODILE on the combination of the three techniques.
To test CROCODILE in a meaningful way, we considered a realistic study case for groundbased instruments. This had the following two consequences, namely first that we chose a wellknown target to simulate for which some atmospheric properties have been derived already, and second we assumed a realistic list of instruments to simulate spectroscopic and photometric observations with. To that end, we selected β Pictoris b (Lagrange et al. 2010) as a bright and young exoplanet that was observed by many instruments and was the object of several atmospheric studies (e.g. Hoeijmakers et al. 2018; Stolker et al. 2020; Nowak et al. 2020). We describe this target and the used instruments in detail in Sects. 3.1 and 3.2, and show the results of that test in Sect. 3.4.
Furthermore, we show that combining MRCCS with LRSP generally allows to better constrain the forward model as compared to using only LRSP data as it has been done until now, by applying our framework to different values of the parameter space of the atmospheres of gas giants (readers can refer to Sect. 3.1 for details). The results of that test are presented in Sect. 3.5.
3.1 Simulated targets
To test our method, we need to know the ground truth values of the parameters of an exoplanetary atmosphere and see if we can retrieve them truthfully. However, it would not be realistic (namely: too optimistic) to use the same chemical model to simulate such an atmosphere that we use as our forward model. In particular, the underlying chemical model of our target should have a more complex chemistry than our simple forward model. Therefore, whereas our forward model assumed free chemistry as we described in Sect. 2.4, for our simulated targets we chose the chemical equilibrium model that was computed by easyCHEM from Mollière et al. (2017)^{3} as a first approximation of the chemistry of an exoplanetary atmosphere, not accounting for clouds (Burrows & Sharp 1999), vertical mixing (Griffith & Yelle 2000), or other processes causing disequilibrium chemistry. Concretely, it was calculated in the following way: we set the pressuretemperature profile as well as the carbontooxygen (C/O) ratio and metallicity (Fe/H) and the code calculated the resulting steadystate molecular abundances at chemical equilibrium in each layer of the atmosphere by minimising the Gibbs free energy. We considered the opacities of H_{2}O, CO, CH_{4}, CO_{2}, H_{2}S, NH_{3}, FeH, HCN, TiO, PH_{3}, K, VO, and Na (i.e. HCN in addition to the opacities included in Nowak et al. 2020). We included more opacities here than in our forward model described in Sect. 2.4. Moreover, we also included Rayleigh scattering of molecular hydrogen and helium together with the CollisionInduced Absorption (CIA) crosssections of H_{2}H_{2} and H_{2}He. Finally, we used the same pT profile as our forward model (Guillot 2010). The spectra of our synthetic targets were also generated with petitRADTRANS (Mollière et al. 2019).
As mentioned above, we aim to test CROCODILE on a meaningful subspace of exoplanetary atmospheres. In particular, we considered the three parameters that control the chemical equilibrium model, namely the equilibrium temperature, the carbontooxygen ratio, and the metallicity, and we selected a baseline equal to the values derived for β Pic b by Nowak et al. (2020), namely T_{equ} = 1742 K, C/O = 0.44, and Fe/H = 0.66 dex, and subsequently varied each of these parameters. Table 2 summarises the full list of exoplanetary atmospheres simulated. We point out that the values given in the table did not exactly match the input values used for easyCHEM (which were 0.3, 0.44, and 1 for C/O and −1, 0.66, and 2 for Fe/H), since we only considered the gaseous content of the atmosphere and not the liquid and solid reactants available with easyCHEM. The values given in Table 2 are the mean C/O ratio and metallicity along the vertical extent of the simulated atmospheres after removing all opacities from nongaseous reactants.
Additionally, our simulated targets had a planetary radius R_{p} = 1.36 R_{J} (Nowak et al. 2020) and an arbitrary radial velocity of υ_{R} = 31 km s^{−1}, the former being a parameter of the forward model while the latter only plays a role at medium (or higher) spectral resolution and gets determined at every computation of the likelihood function using Eq. (9). Finally, our model of the targets implicitly included six more parameters which were kept fixed between the data and the forward model. Five of these describe the pT profile and are the same as given in Sect. 2.4. The remaining parameter was the distance d = 19.75 pc to β Pictoris b as measured by Gaia Collaboration (2018).
Parameter values assumed for the simulated datasets.
3.2 Synthetic data
For each of the seven synthetic targets described above, as well as for each evaluation of our forward model, we recreated synthetic photometry, LRS, and MRCCS by matching the instrumental characteristics of the archival datasets of β Pic b listed in Table 3. Considering real observations of β Pic b enabled us to simulate realistic measurement uncertainties and random noise for each instrument in a straightforward way, thereby creating a realistic synthetic dataset. To simulate the format recorded by each of these instruments, we started by computing a spectrum with petitRADTRANS using the atmospheric model described in Sect. 3.1 in either the full spectral resolution linebyline mode (R ≈ 10^{6}) for VLT/SINFONI and VLTI/GRAVITY or the correlatedk approximation (R ≈ 1000) for VLT/SPHERE and photometry.
Photometry was calculated by integrating the spectrum multiplied by the normalised filter transmission function, which we downloaded with the Python package species (Stolker et al. 2020)^{4}. We estimated the uncertainty of the photometric flux by taking the same relative error as the real measurement, and added normally distributed random scatter with standard deviation equal to the estimated uncertainty.
For the VLTI/GRAVITY instrument, the resampling from the petitRADTRANS to the observation bins was done using the Python package spectRES (Carnall 2017) which preserves the total flux. We then estimated the uncertainty of the synthetic fluxes by scaling the covariance matrix of the real measurement to the simulated spectrum. To create realistic flux measurements, we added a random (multivariate normally distributed) scatter to each flux bin with a covariance matrix equal to the simulated covariance matrix. For VLT/SPHERE, where we had no access to a spectrum of β Pic b, we chose a simpler approach by setting the uncertainty to 10% (which is an optimistic estimation of the errorbars found in Fig. 11 of Langlois et al. 2021) and added normally distributed random scatter in the same way as for photometry.
Concerning the VLT/SINFONI MRCCS data, we first applied a Doppler shift with a radial velocity of 31 km s^{−1} to the spectrum before resampling the spectrum using spectRES. To match the treatment of HRSDI, we applied a lowpass filter consisting of a Gaussian filter with standard deviation of 4.9 nm to compute the continuum, which we subsequently subtracted from the spectrum. We found the value of 4.9 nm by repeating our preliminary test from Sect. 3.3 for different filter sizes (readers can refer to Appendix C for details); however, we note that the optimal filter size should be investigated separately when using a real observation. To add random scatter to our MRCCS data, we aimed to reproduce the signaltonoise ratio (S/N) of the real observations that we considered. Therefore, we generated a continuumremoved spectrum using the values retrieved by Nowak et al. (2020) for β Pic b and crosscorrelated it with the SINFONI spectrum from Hoeijmakers et al. (2018). We then took the ratio of the peak of the crosscorrelation function to its standard deviation 200 km s^{−1} away from the peak (similar to Cugno et al. 2021) and obtained a value of 14.4. Thus, when we simulated SINFONI spectra, we added Gaussian noise with a standard deviation such that the CCF between noisy and clean spectra also yielded the same ratio of the peak to standard deviation. This made the synthetic SINFONI planet spectra similar (in terms of S/N) to the β Pic b data presented in Hoeijmakers et al. (2018).
3.3 Verifying the framework of CROCODILE
To verify the statistical framework used in CROCODILE, we submitted it to a preliminary test in an idealised setup. To that end, we created yet another synthetic target with the same free chemistry as in our forward model. More precisely, we used the same pT profile as in our simulated atmosphere 1 from Table 2 but changed the chemical composition to only contain H_{2}O and CO with vertically constant mass fractions equal to −1.8 dex and −1.6dex respectively. Moreover, we did not add any random scatter contained in our lowresolution datasets to ensure that, if the retrieval did not converge to the input values, it was not because of the noise but rather because of a mistake in our framework. Finally, we reduced the amount of random scatter by a factor of three in the MRCCS data with respect to the synthetic observations introduced in Sect. 3.2 instead of removing it completely because otherwise the argument of the logarithm could become zero if the forward model matches the simulated data.
We first ran CROCODILE on MRCCS data alone to test the implementation of the likelihood function based on the crosscorrelation function (Eq. (4)). We then tested the implementation of the regular likelihood function based on the residuals between data and model (Eq. (3)) using only LRSP. Finally, we checked that the combination of MRCCS together with LRSP via the addition of their loglikelihood functions was also valid. We show the resulting 2D posterior distributions of these three evaluations of CROCODILE in Fig. 2.
The three tests yielded educative results: the retrieval on LRSP data alone yielded perfectly centred posteriors with small uncertainty as can be expected from the lack of random scatter in the data. In particular, the abundances of water and carbon monoxide were well constrained due to the good coverage of their spectral features by the VLT/SPHERE/IFS/YH and VLTI/GRAVITY/K lowresolution spectroscopic data. On the other hand, the retrieval on MRCCS data yielded a systematic overestimation of the temperature, surface gravity, radius, and H_{2}O abundance, while the abundance of CO was constrained very accurately. Furthermore, the radius seemed correlated with the surface gravity and the abundances of H_{2}O and CO, and to a lesser extent to T_{equ}.
These correlations can be tentatively explained by the removal of the continuum in the treatment of the MRCCS data in the following way. Without continuum, the radius only has the role of scaling the absorption lines, which is counteracted on the one hand by the surface gravity, which in the pT model from Guillot (2010) brings the same temperatures at higher pressures, which reduces the thermal gradient between the top of the atmosphere and the region at optical depth of unity and hence decreases the strength of the line. The molecular abundances also counteract the scaling effect of the radius by increasing the height at which the optical depth reaches unity, thereby also reducing the thermal gradient between the top of the atmosphere and the emitting layer. Interestingly, if the radius is already known and fixed to a previously measured value during the retrieval, then the results improve greatly as can be seen in Fig. D.1. Indeed, the abundances of H_{2}O and CO and the surface gravity were all retrieved within 1σ with a smaller uncertainty; however, the temperature was still equally overestimated. This test indirectly hints that if the radius – or any other flux scaling parameter – is set as fixed parameter to a wrong value, then there is a high risk to bias the results when using MRCCS data only.
Finally, when combining LRSP and MRCCS data, the results were slightly more accurate than when used on LRSP data alone. In particular the abundance of CO was retrieved with a 25% smaller 1σ spread. In summary, this test verified the statistical framework of CROCODILE and already showed the systematics that appear when using only MRCCS data.
List of observations of β Pic b used for our simulations.
Fig. 2 Resulting posterior distributions for our preliminary test that controls our implementation of CROCODILE using the same chemistry as input as our forward model, namely vertically constant abundances of H_{2}O and CO, and reduced random scatter. The orange posterior shows the results of the retrieval using only MRCCS data to test the implementation of the crosscorrelation likelihood function, the blue posterior is for the regular framework using only LRSP, and the green posterior shows the result for the combination of MRCCS and LRSP data. The dashed red line shows the ground truth used as input. 
Fig. 3 Results of the three retrievals of the same simulated atmosphere using different combinations of the synthetic spectra of β Pic b: the SINFONI MRCCSonly retrieval in orange, the LRSPonly retrieval in blue, and the retrieval using all techniques in green. The offaxis panels show the projected 2D posterior distributions of the main model parameters (readers can refer to the description in Sect. 2.4). The indicated values above each histogram on the diagonal refer to the median and 1sigma uncertainty for CROCODILE. For the molecular abundances, we omit the logarithm in front of the names of the molecules for the sake of readability. The dashed red line represents the true parameter value, and it is not shown for the abundances given the different chemical treatment between simulated observations and forward model. 
3.4 Case study: A synthetic β Pictoris b
After validating our statistical framework, we investigated possible biases by applying it to the synthetic targets described in Sect. 3.1, which span across a portion of the parameter space around the gas giant β Pic b. First, we give a more indepth report for our baseline in this subsection while we describe all our results in the next. Figure 3 compares the posterior distributions obtained with MRCCS data only, with LRSP, or with MRCCS and LRSP simultaneously; the posterior distributions of the molecular abundances are shown in Fig. 4; and the retrieved thermal profiles, C/O ratio, and metallicity Fe/H are presented in Fig. 5. Our retrieval setup and priors are described in Appendix B.
The results obtained with MRCCS data only (in orange in Fig. 3) showed that the surface gravity, planet radius, and abundances of water and carbon monoxide all yielded Gaussian marginal distributions with the simulated values outside of 1σ of the mean. The equilibrium temperature was overestimated by 1σ resulting in a pT profile that was about 150 K too hot at all altitudes relative to the ground truth, similarly to the planetary radius which was also overestimated. This behaviour is identical to what we observed in the previous section, with similar correlations between the same parameters, once again demonstrating the effects of the removal of the continuum. The posterior distributions of the molecular abundances resulted in a good constraint of CO with a large standard deviation of 0.4 dex. The abundance of H_{2}O was overestimated by 1 dex, while we obtained correct upper bounds (i.e. the true value was indeed lower than the bound) for CH_{4} and CO_{2}, and uniform posteriors for FeH, TiO, H_{2}S, K, and VO, which is expected from the weak spectral signatures of these molecules in the K band. The C/O ratio and metallicity did not get retrieved within 1σ, despite large standard deviations of 0.14 and 0.5 dex respectively.
When we applied CROCODILE to LRSP data only (in blue in Fig. 3), the equilibrium temperature, surface gravity, and planetary radius were retrieved within 1σ with a precision of 10 K, 0.05 dex, and 0.015 R_{J} respectively. The pT profile was well constrained, with the true profile less than 20 K away from the posterior median. Contrary to the abundances of H_{2}O and CO which were well constrained with Gaussian posterior distributions, the abundances of FeH, K, and VO resulted in an upper bound, where the posterior maximum correctly identified an upper bound and lower values were not totally ruled out.
This can be attributed to the strong features produced by those molecules at short wavelengths, identified by the lowresolution SPHERE/IFS data. For the other molecules, we obtained correct upper bounds with the true abundances at lower values. The C/O ratio and metallicity were accurately retrieved within 1σ with a spread of 0.06 and 0.09 dex respectively.
Finally, when we applied CROCODILE to all three techniques (in green in Fig. 3), we obtained equivalent constraints on all parameters, with a negligible improvement on the spread of the C/O ratio and the accuracy of the metallicty. Interestingly, the lower C/O ratio constrained by MRCCS relative to LRSP is similar to the inconsistent C/O ratios that have been reported for the HR 8799 b planet: Lavie et al. (2017) used photometric and lowresolution H and Kband spectroscopic data from KECK/OSIRIS (binned down to a resolution of 60 to increase the S/N) and obtained a value close to unity, while Ruffio et al. (2021) derived a much lower value of 0.578 on mediumresolution (R ≈ 4000) H and K band crosscorrelation spectroscopic observations taken with the same instrument but processed differently.
In summary, our test seemed to further confirm the systematics of the retrieval on Kband MRCCS data observed in the previous section, while these systematics disappear when combined with LRSP data. In the next section, we investigate whether that is still the case when changing the input atmosphere.
Fig. 4 Retrieved posterior distributions of all the molecular abundances included in the three retrievals of the experiment described in 3.4. The solid red lines represent the profile (in log scale) of each molecular abundance along the vertical extent of the atmosphere between 100 bar at the bottom and 10^{−6} bar at the top. 
Fig. 5 Results of the experiment described in 3.4. Top panel: thermal profiles retrieved for each technique (MRCCS: orange, LRSP: blue, CROCODILE: green). The two levels of opacity correspond to the 16 to 84 and the 2 to 98 percentile envelopes of the thermal profile. The legend reports the median and 64 % confidence interval retrieved for the equilibrium temperature. Middle and bottom panel: posterior distributions of the C/O ratio and metallicity Fe/H inferred from the retrieved abundances for each technique. The dashed lines represent the posterior median. The dashed red line shows the ground truth in every panel. 
3.5 Exploration of the parameter space
Since our results – derived for a simple synthetic model of ß Pic b – might depend on the input atmosphere, we next explored how our framework fares in the parameter space around it as described in 3.1. The model setup and priors used for our retrievals were the same as in the previous section (readers can refer to Appendix B for details). Figure 6 reports the scores of three metrics applied to our retrievals in order to compare the results of the three techniques. Figure 7 focuses on the fifth atmospheric model as it is discussed below, while the fitted SEDs and posterior pT profiles are shown in Figs. 8 and 9 respectively. Figure E.1 summarises all the resulting posterior distributions of the seven synthetic targets defined in Sect. 3.1, while the tables F.2, F.3, and F.4 report the numerical values.
In order to quantitatively compare the results obtained by LRSP, MRCCS, or LRSP+MRCCS, we computed a slightly modified Mahalanobis distance d_{θ} (Eq. (13)), the relative error ϵ_{θ} (Eq. (14)), and the mean squared error MSE_{θ} (Eq. (15)) from the marginalised posterior distribution of each parameter θ and for each simulated atmosphere:
$${d}_{\theta}=\left(\widehat{\theta}{\theta}_{\text{T}}\right)/{\sigma}_{\theta}$$(13)
$${\u03f5}_{\theta}=\left(\widehat{\theta}{\theta}_{\text{T}}\right)/{\theta}_{\text{T}}$$(14)
$${\text{MSE}}_{\theta}=\frac{1}{{N}_{\text{S}}}{\displaystyle \sum _{i=1}^{{N}_{\text{S}}}{\left({\theta}_{i}{\theta}_{\text{T}}\right)}^{2}}.$$(15)
In the equations above, $\widehat{\theta}$ is the posterior median, θ_{T} is the ground truth of parameter θ, σ_{θ} is the distance of the median from the closest bound of the 68th percentile confidence interval (i.e. if $\theta <\widehat{\theta}$, then σ_{θ} is equal to the difference between the median and the 16th percentile, and the difference between the median and the 84th percentile otherwise), and θ_{i} denotes the N_{s} points sampled by the nested algorithm. We choose these three metrics for the following reasons. First, the Mahalanobis metric measures the distance from the ground truth in terms of standard deviations (more precisely here in terms of 68th percentile confidence interval), meaning that a bad score is obtained when the uncertainty is underestimated. Second, the relative error measures the accuracy with which the parameters are retrieved. Finally, the mean squared error measures the quality of the median posterior as an estimator for the ground truth as it incorporates both its variance and its bias. Since the true molecular abundances have nonconstant vertical profiles, θ_{T} represents the point of the true profile closest to the retrieved value, that is ($\widehat{\theta}{\theta}_{\text{T}}$) is zero if there is an atmospheric layer where the retrieved abundance matches the true value. Figure 6 summarises the results of these three metrics for each retrieval case and for each technique (MRCCS, LRSP, or MRCCS+LRSP, i.e. CROCODILE), and by averaging over three groups of parameters: the molecular abundances, the physical parameters T_{equ}, log(g), and R, and the C/O ratio and metallicity.
The retrieval using MRCCS data alone resulted in the worst overall score with respect to every metric and for every subgroup of parameters, except for the molecular abundances with the Mahalanobis metric. This can be explained by the broad posterior distributions obtained with the technique, meaning that the ground truth was more often contained within the 68th percentile confidence interval although the retrieved value was significantly far from the true value, especially for T_{equ}, log(g), and R. Combining MRCCS with LRSP data lead to the lowest relative error and mean squared error for all groups of parameters, and yielded approximately equal Mahalanobis distance than the retrievals on LRSPonly data. Overall, that is across all the cases, the combined fit on LRSP and MRCCS data achieved to retrieve 68% of the parameters within 0.7σ, against 0.6σ for the LRSPonly fit and 1.1σ for the MRCCSonly fit, while the same proportion of the parameters was retrieved with a relative error of less than 6% for LRSP+MRCCS, against 8% for LRSP and 28% for MRCCS. Hence, the combined fit on MRCCS and LRSP data was overall slightly more accurate and resulted in less underestimation of the uncertainty than the fit on LRSP data only, while the fit on MRCCS data only was largely inaccurate.
It is worth noting which molecule could be constrained and in what scenario. While the abundances of H_{2}O and CO resulted in narrow posterior distributions, those of CO_{2}, H_{2}S, TiO, and K almost were never retrieved accurately, with at most upper bounds and only a good constraint of K at high C/O ratio. This can be explained by the strong absorption features of H_{2}O and CO included in the spectral range covered by our synthetic data, while this is not the case for CO_{2}, H_{2}S, TiO, and K. This can be seen in Fig. 10, where we illustrate the spectral signatures of the molecules included in our retrieval by computing the absolute difference in flux between the full chemical model at equilibrium and when each molecule is removed successively. The abundance of CH_{4} seemed to be retrievable when it was significantly higher in the input atmosphere, namely at low temperature and at high C/O ratio. Finally, FeH, and VO obtained similar results with mostly correct upper bounds and a few good constraints of FeH at high C/O ratio and low metallicity when LRSP data were included, which can be explained by the spectral features of FeH and VO that are covered by the VLT/SPHERE/IFS Y – Hband LRS data but were not included in the MRCCS data.
Looking at Fig. 6 case by case, it seems that there is one case that yielded relatively worst scores than the others when including LRSP data (i.e. LRSP alone or combined with MRCCS), namely at high C/O ratio (case 5). Indeed, all three subgroups of parameters resulted in relatively higher Mahalanobis distance and relative error than the other cases. This can be understood better in Fig. E.1: the retrieved surface gravity, the abundances of CO and K as well as the metallicity all overestimated the ground truth by a few σ. However, this is more due to the small spread of the posterior distributions (as is the case for the surface gravity), quickly resulting in large Mahalanobis distance, or to the fact that the ground truth itself is a small number (for example the metallicity), also resulting in a large relative error. Looking at the mean squared error, the fifth case actually obtained fairly similar scores to the others.
Finally, the quality of an atmospheric retrieval is usually estimated from the fit to the input spectrum, which we provide in Fig. 8. Remembering that all our datasets contained random scatter corresponding to their errorbar, the spectral fit – computed using the median of the posterior distributions of each parameter – showed no significant deviation from the LRSP data when it was included in the fit. Nonsurprisingly, the retrieval using only MRCCS data did not match the LRSP data due to its inability to constrain the continuum and thus the thermal profile of the atmosphere. Indeed, as shown in Fig. 9, the temperaturepressure profile was largely unconstrained by the MRCCS data with a 1σ envelope of 200 K mostly overestimating the input profile except in the cases 6 and 7. Both the fit on LRSP data only and the combined fit were able to precisely constrain the thermal profile.
Fig. 6 Scores of the three metrics summarising the results of our exploration of the parameter space of gas giants using LRSP, MRCCS, or LRSP+MRCCS (CROCODILE). Each column of panels represents a different metric, and the three rows refer to different parameters grouped together. Within each panel, the score is shown for each technique and for each simulated atmosphere (cf. Table 2 for the input parameters), and averaged over the group of parameters. The dotted lines show the average score over the seven cases. 
Fig. 7 Posterior distributions obtained for the retrievals of our simulated atmosphere 5 (cf. Table 2) using CROCODILE on MRCCS data alone (orange), on LRSP data (blue), and on both MRCCS+LRSP combined (green). The input values are shown as dashed red lines. 
4 Discussion
4.1 Applicability of CROCODILE
The results presented in the last section show the robustness of the framework of CROCODILE to retrieve the parameters of a diverse population of gas giant atmospheres across a range of temperatures and chemical composition. However, our analysis also showed that the addition of Kband SINFONI MRCCS data along the list of LRSP data in the retrieval did not significantly improve our results: the posterior distributions mostly matched, with a slight gain of accuracy and better estimation of the uncertainty as reported in Sect. 3.5. We argue that this is because of the highquality Kband VLTI/GRAVITY spectrum at a spectral resolution of R ≈ 500 which can already constrain the abundance of CO on its own, and hence the mediumresolution spectrum provided by VLT/SINFONI does not add much information that is not already there. To demonstrate this point, we repeated the retrievals on case 1, however this time we excluded the GRAVITY spectrum from the fit. The results of this final test are presented in Fig. 11 and in Table F.5. Indeed, the lowresolution Y to Hband VLT/SPHERE spectrum together with the photometric points were not enough on their own to constrain the molecular abundance of CO, leading to a mostly undetermined C/O ratio and a poorly constrained metallicity. When the Kband SINFONI data were added to the fit, the difference was significant: the abundance of CO became well constrained, leading to a good estimation of the C/O ratio ($\text{C}/\text{O}={0.41}_{0.16}^{+0.17}$ against the input value of 0.49) and accurate constraint of the metallicity ($\text{Fe}/\text{H}={0.61}_{0.23}^{+0.30}$ against the input value of 0.61).
While Sect. 3.5 demonstrated the robustness of CROCODILE against different input atmospheres, this last test showed its actual strength. Namely, it allows the combination of low signaltonoise crosscorrelation spectroscopic data together with a few photometric points and lowresolution spectroscopic data to measure the molecular abundances of closely separated directly imaged lowmass companions, without the need for highquality spectroscopic data such as measured by VLTI/GRAVITY.
4.2 Future opportunities
The major disadvantage of removing the continuum to perform crosscorrelation with spectral templates is made obvious in Fig. E.1: the information carrying the temperature is mostly lost with the continuum, rendering its determination almost impossible using only crosscorrelation and further degrading the determination of the other parameters such as surface gravity, C/O ratio, and metallicity. It therefore seems that the continuum should not be removed. If it is true that molecular lines present in substellar companions differ enough from the atomic lines found in stellar spectra, then this fact stays true if the continuum of the planet can be retrieved. Several studies are indeed able to retrieve the continuum of directly imaged lowmass companions using mediumresolution spectroscopy: for example Daemgen et al. (2017) and Petrus et al. (2021) were able to measure the spectrum of the planetarymass objects HD 106906 b and HIP 65426 b using VLT/SINFONI, while most recently Miles et al. (2022) released the spectrum of the young brown dwarf VHS 1256 b taken with the NIRSpec IFU and MIRI MRS onboard JWST, the highest fidelity spectrum to date of a planetarymass object, which will undoubtedly yield the most robust derivation of the atmospheric properties of a planetarymass object to date.
The point of our study is not that the continuum of those spectra could have been removed with similar constraints for the atmosphere of the exoplanet, but rather to enable the study of closein lowmass companions where the continuum cannot be extracted at all (or at least not within reasonable integration times) due to contamination by stellar speckles, as is the case in Hoeijmakers et al. (2018), Cugno et al. (2021), Ruffio et al. (2021), and Wang et al. (2021). Indeed, future directimaging searches for exoplanets will focus more and more on the innermost regions of planetary systems, where radial velocity searches indicate that the occurrence rate of planets is higher (Cumming et al. 2008; Fulton et al. 2021). However, at such close separations, the location of a planet is only at a few resolution elements from the central star. There, the stellar PSF is still so intense that extracting medium and high spectral resolution spectra for such planets will come at the cost of the continuum – at least initially. That is when CROCODILE – and similar statistical frameworks based on the crosscorrelation of continuumremoved spectra with spectral models (e.g. Ruffio et al. 2021) – comes into play by enabling a full Bayesian analysis to characterise more and more challenging directly imaged exoplanets.
Fig. 8 Comparison of the fitted spectra to our simulated datasets 1 to 7 (cf. Table 2) using CROCODILE on MRCCS data alone (orange), on LRSP data (blue), and on both MRCCS+LRSP combined (green). The fitted spectra were calculated using our forward model from the posterior median of the retrieved distributions shown in Fig. E.1. The 11 colourful crosses represent the simulated photometric values and their corresponding uncertainty and equivalent width; the orange and blue error bars show the simulated YHband VLT/SPHERE/IFS and Kband VLTI/GRAVITY spectra and corresponding uncertainties; and the contiuumsubtracted light green spectrum shows the simulated VLT/SINFONI MRCCS data. 
Fig. 9 Posterior thermal profiles obtained for the retrievals of our simulated atmospheres 1 to 7 (cf. Table 2) using CROCODILE on MRCCS data alone (orange), on LRSP data (blue), and on both MRCCS+LRSP combined (green). The different envelopes of each colour represent the 68th, 95th, and 99th percentile credible intervals – corresponding to 1σ 2σ and 3σ of a normal distribution – of the posterior distributions of the retrieved thermal profiles. The input pT profiles are shown as dashed red lines. 
5 Summary and conclusion
The results of this work are summarised in the following: – We adapted the statistical framework of Brogi & Line (2019) from exoplanet transit spectroscopy to the different observational techniques of directly imaged exoplanets, including photometry, lowresolution spectroscopy, and medium (and higher) resolution crosscorrelation spectroscopy, which can constrain the atmospheric properties of gas giant exoplanets such as temperaturepressure profile and chemical composition. Our Python routine, called CROCODILE, is freely available for download on GitHub^{5}.
We thoroughly tested our framework, first on a synthetic atmosphere created with the same input as our forward model to verify the framework, and then on a grid of atmospheres at chemical equilibrium to test the ability of our free forward model to robustly constrain atmospheric properties in a realistic scenario.
We find that using mediumresolution crosscorrelation spectroscopy alone leads to a generally poor constraint of the atmospheric parameters due to the loss of the continuum. The results improved a lot when fixing the planetary radius to its ground truth. However, we noticed strong correlations between the radius and the surface gravity and the molecular abundances of H_{2}O and CO This means that fixing the radius to a wrong value might strongly bias the retrieved values.
We further find that combining mediumresolution crosscorrelation spectroscopy to photometry and lowresolution spectroscopy leads to an accurate determination of the atmospheric parameters. Indeed, 68% of the parameters were retrieved within 0.9σ of the ground truth, with no systematical bias despite the different chemical model used in the forward model and in the simulated data.
Our last test shows that lowquality (i.e. high noise) mediumresolution (R ≈ 4000) Kband crosscorrelation spectroscopy, when combined with photometry, leads to an equivalently accurate constraint of the abundance of CO than highquality (i.e. low noise) lowresolution (R ≈ 500) Kband spectroscopy. To put into perspective, the actual observations that our synthetic data are based on was acquired in 2.5 h of integration time for VLT/SINFONI on one telescope (Hoeijmakers et al. 2018) versus 1.4 hr for VLTI/GRAVITY on four telescopes (Nowak et al. 2020), that is the mediumresolution crosscorrelation spectroscopy required less than half of the total telescope time that was required for the lowresolution spectrointerferometry.
Our work shows that mediumresolution crosscorrelation spectroscopy in the K band is best interpreted alongside Y – Mband photometry and, if available, lowresolution spectroscopy in the Y – H bands, to robustly infer the atmospheric properties of gas giant exoplanets.
In summary, CROCODILE provides the statistical framework to interpret medium (and possibly higher) resolution spectroscopic data alongside photometric and lowresolution spectroscopic data of directly imaged gas giant exoplanets at close separations to their host star, where the continuum of their spectrum can only be extracted reliably at low spectral resolution due to the pollution by the stellar PSF. CROCODILE allows the measurement of the atmospheric properties of gas giants such as the pressuretemperature profile and the abundances of individual molecules. In the future, we plan to add the option to choose between different thermal, chemical, and cloud models, in order to allow model selection via computation of Bayes’ factor between competing models.
With the arrival of the next generation of exoplanet directimaging instruments such as ERIS at the VLT, MIRI onboard JWST, and the upcoming METIS at the ELT, it is crucial to define robust statistical frameworks that take into account the intricacies of the different observational and postprocessing techniques, while acknowledging the risk of model degeneracies. In particular, such frameworks should be validated on a representative subset of the parameter space of exoplanetary atmospheres to identify the limits of applicability of the forward model before being carried out on real data.
Fig. 10 Absolute difference between the spectrum calculated with the full chemical model used for our simulated atmosphere 1 (as described in Table 2) and spectra where each molecule included in our framework was removed one after the other. For example water shows a large difference due to its absorption features throughout the whole range. The black lines indicate the spectral coverage of the different spectroscopic instruments included in our tests. 
Fig. 11 Posterior distributions obtained when removing the synthetic VLTI/GRAVITY data from the atmospheric fit. The input atmosphere is the same as case 1 (cf. Table 2), and as in Fig. 7 we also compare the results when applying CROCODILE to MRCCS data alone (orange), LRSP data (blue), and on both MRCCS+LRSP combined (green). However, this time the lowresolution spectroscopy only included synthetic VLT/SPHERE data in the Y – H bands. The input values are shown as dashed red lines. 
Acknowledgements
The authors thank the anonymous referee for their constructive insights which helped to improve the quality of this work. JH and SPQ gratefully acknowledge the financial support from the Swiss National Science Foundation (SNSF) under project grant number 200020_200399. GC, PP, EOG thank the SNSF for financial support under grant numbers P500PT_206785 and 200020_200399. MJB acknowledges the financial support from ETH Zurich. This work has been carried out within the framework of the NCCR PlanetS supported by the Swiss National Science Foundation under grants 51NF40_182901 and 51NF40_205606. TDG acknowledges funding from the Max Planck ETH Center for Learning Systems. Some of our plots were made using the Python package corner.py developed by ForemanMackey (2016). Contributions: the authors listed in this article are ordered according to the following: JH built the framework, did the analysis, and wrote the paper; GC, SPQ, and PP provided feedback and supervision during the whole project; while the rest of the coauthors, who are ordered alphabetically, provided feedback in later stages of the project.
Appendix A Derivation of the loglikelihood function for crosscorrelation spectroscopy
In the following, we provide the derivation of the loglikelihood function log ℒ_{CCS} given in Eq. 4 originally developed by Brogi & Line (2019), but adapted to our notation. Let d = (d_{1},…, d_{N}) be the measured continuumremoved spectrum of an observed exoplanet with N spectral channels and m (θ, v_{R}) =: m = (m_{1},…, m_{N}) be the associated forward model. For the sake of readability, we drop the dependency on the model parameters θ and the radial velocity V_{R} in the following. We assume that the data are of the form d = am + n, with a scaling factor a and noise n = (n_{1},…, n_{N}), where the components obey a normal distribution with a common standard deviation σ, that is n_{i} ~ N (0, σ). Brogi & Line (2019) choose a = 1 and scale their model spectrum by the stellar flux and the area ratio between the planet and the star. In our study, the data and the forward model were created with the same scaling factor, therefore a = 1 also holds here. However, if the observed spectrum is provided in arbitrary units, then one could replace the multiplicative factor R^{2}/d^{2} used in our forward model by a and treat is as a nuisance parameter.
The likelihood function associated to the noise statistics is given by
$$\mathcal{L}={\left(\frac{1}{\sqrt{2\pi {\sigma}^{2}}}\right)}^{N}\mathrm{exp}\left({\displaystyle \sum _{i=1}^{N}\frac{{\left({d}_{i}{m}_{i}\right)}^{2}}{2{\sigma}^{2}}}\right),$$(A.1)
which leads to the following loglikelihood function (ignoring constant additive terms):
$$\mathrm{log}\mathcal{L}\approx N\mathrm{log}\sigma \frac{1}{2{\sigma}^{2}}{\displaystyle \sum _{i=1}^{N}{\left({d}_{i}{m}_{i}\right)}^{2}}.$$(A.2)
The variable σ represents the uncertainty associated to each spectral channel and is assumed to be constant for all spectral channels. It is challenging to compute due to the pollution by telluric absorption lines or stellar light. Therefore, we replace it by its maximum likelihood estimator $\widehat{\sigma}$:
$$\begin{array}{l}\text{\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}}0=\frac{\partial \mathrm{log}\mathcal{L}}{\partial \sigma}=\frac{N}{\sigma}+\frac{1}{{\sigma}^{3}}{\displaystyle \sum _{i=1}^{N}{\left({d}_{i}{m}_{i}\right)}^{2}}\hfill \\ {\widehat{\sigma}}^{2}=\frac{1}{N}{\displaystyle \sum _{i=1}^{N}{\left({d}_{i}{m}_{i}\right)}^{2}}.\hfill \end{array}$$(A.3)
By inserting $\widehat{\sigma}$ back into Eq. A.2 and expanding the product (d_{i} – m_{i})^{2} = ${d}_{i}^{2}2{d}_{i}{m}_{i}+{m}_{i}^{2}$, we finally obtain:
$$\begin{array}{l}\mathrm{log}\mathcal{L}\approx \frac{N}{2}\mathrm{log}\left(\frac{1}{N}{\displaystyle \sum _{i=1}^{N}\left({d}_{i}^{2}2{d}_{i}{m}_{i}+{m}_{i}^{2}\right)}\right)\frac{N}{2}\hfill \\ \text{\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}}\approx \frac{N}{2}\mathrm{log}\left({s}_{d}^{2}2K+{s}_{m}^{2}\right),\hfill \end{array}$$(A.4)
with the definitions
$${s}_{d}^{2}=\frac{1}{N}{\displaystyle \sum _{i=1}^{N}{d}_{i}^{2}}$$(A.5)
$${s}_{m}^{2}=\frac{1}{N}{\displaystyle \sum _{i=1}^{N}{m}_{i}^{2}}$$(A.6)
$$K=\frac{1}{N}{\displaystyle \sum _{i=1}^{N}{d}_{i}{m}_{i}},$$(A.7)
which corresponds to Eq. 4. For the sake of clarity, we note here again that instead of the crosscovariance K, one can use the normalised crosscorrelation C by using Eq. 8 and inserting it into the loglikelihood function:
$$\mathrm{log}\mathcal{L}\approx \frac{N}{2}\mathrm{log}\left({s}_{d}^{2}2K+{s}_{m}^{2}\right)$$(A.8)
$$\approx \frac{N}{2}\mathrm{log}\left({s}_{d}^{2}2C\sqrt{{s}_{d}^{2}{s}_{m}^{2}}+{s}_{m}^{2}\right).$$(A.9)
Appendix B The retrieval setup
Our atmospheric retrieval was setup in the following configuration. We set the equilibrium temperature T_{equ} and logarithm of the surface gravity log ɡ as free retrievable parameters governing the thermal structure. Implicitly, the thermal structure from Guillot (2010) contains four more parameters which were kept fixed in the simulated datasets and in the forward model: the average opacity in the infrared κ_{IR} = 0.01 cm^{2}/g, the ratio of the average opacity in the optical and infrared γ = κ_{V}/κ_{IR} = 0.4, the interior temperature T_{int} = 200 K, and the bottom pressure P_{0} = 100 bar. For the free parameters governing the chemical composition, we included the logarithms of the mass fractions (assumed to be vertically constant) of the following molecules: H_{2}O, CO, CO_{2}, CH_{4}, H_{2}S, FeH, TiO, K, and VO. Finally, we set the abundances of H_{2} and He equal to 75 % and 25 % of the remaining mass fraction in each layer. The last free parameter of the model was the radius of the planet R. The priors used in this study are given in Table B.1.
Priors used.
Appendix C Derivation of the optimal size of the Gaussian filter
We derived the optimal size of the Gaussian filter used to remove the continuum of our synthetic mediumresolution spectroscopic data with the following test. We repeated the setup of Sect. 3.3 on MRCCS data alone for filter sizes of 1.2, 2.4, 4.9, 7.4, 14.7, 22.1, 29.4, 36.8, and 49 nm (corresponding to 5, 10, 20, 30, 60, 90, 120, 150, and 200 spectral bins at the spectral resolution of VLT/SINFONI), and computed the same three metrics as in Sect. 3.5. To obtain an overall score for each of the three metrics, we multiplied the results for each parameter together, and divided by the worst score to give the results as percentages. The result of this analysis is shown in Fig. C.1. The filter size at 4.9 nm outperformed all other filters with respect to the Mahalanobis and relative distance, and was a close second in terms of mean squared error (7.0% against 5.4%).
Fig. C.1 Scores for different sizes of the Gaussian filter used to remove the continuum of the synthetic mediumresolution spectroscopic data used in this work. 
Appendix D Results: Verification of the framework
Fig. D.1 Resulting posterior distributions of our preliminary test on MRCCS data when fixing the radius to its simulated value (orange), compared to the previous results with the radius included as free parameter (grey: old results on MRCCS data, blue: only LRSP, green: MRCCS and LRSP). The dashed red line shows the ground truth used as input. 
Appendix E Results: Exploration of the parameter space
Fig. E.1 Comparison of the posterior distributions obtained for the retrievals of our simulated atmospheres 1 to 7 (cf. Table 2) using CROCODILE on MRCCS data alone (orange), on LRSP data (blue), and on both MRCCS+LRSP combined (green). The posterior distributions are shown as histograms of the sampled parameters and are to be read row by row for each model parameter and column by column to compare to the other input atmospheres. The input values are shown as dashed red lines. The input molecular abundances – which were calculated at chemical equilibrium and therefore vary vertically with the pressure and the temperature – are shown as functions of the atmospheric layers between 10^{2} and 10^{−6} bar in logarithmic scale. The posterior distributions of the C/O ratio and metallicity Fe/H are inferred from the posterior distributions of the molecular abundances. 
Appendix F Detailed numerical results of our study
List of posterior median and 68th percentile credible interval for the retrievals based only on MRCCS data.
List of posterior median and 68th percentile credible interval for the retrievals based only on LRSP data.
List of posterior median and 68th percentile credible interval for the retrievals based on both MRCCS and LRSP data.
References
 Allard, F. 2013, Proc. Int. Astron., 8, 271 [NASA ADS] [CrossRef] [Google Scholar]
 Bayes, T. 1763, Phil. Trans. R. Soc., 53, 370 [CrossRef] [Google Scholar]
 Beuzit, J.L., Vigan, A., Mouillet, D., et al. 2019, A & A, 631, A155 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Bonnefoy, M., Lagrange, A. M., Boccaletti, A., et al. 2011, A & A, 528, L15 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Bonnefoy, M., Zurlo, A., Baudino, J. L., et al. 2016, A & A, 587, A58 [CrossRef] [EDP Sciences] [Google Scholar]
 Borysow, A. 2002, A & A, 390, 779 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Borysow, A., & Frommhold, L. 1989, ApJ, 341, 549 [NASA ADS] [CrossRef] [Google Scholar]
 Borysow, J., Frommhold, L., & Birnbaum, G. 1988, ApJ, 326, 509 [NASA ADS] [CrossRef] [Google Scholar]
 Borysow, A., Frommhold, L., & Moraldi, M. 1989, ApJ, 336, 495 [NASA ADS] [CrossRef] [Google Scholar]
 Borysow, A., Jørgensen, U. G., & Fu, Y. 2001, J. Quant. Spec. Radiat. Transf., 68, 235 [NASA ADS] [CrossRef] [Google Scholar]
 Brandl, B., Bettonvil, F., Van Boekel, R., et al. 2021, The Messenger, 182, 22 [NASA ADS] [Google Scholar]
 Brogi, M., & Line, M. R. 2019, ApJ, 157, 114 [Google Scholar]
 Bryan, M. L., Chiang, E., Bowler, B. P., et al. 2020, AJ, 159, 181 [Google Scholar]
 Buchner, J., Georgakakis, A., Nandra, K., et al. 2014, A & A, 564, A125 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Burrows, A., & Sharp, C. M. 1999, ApJ, 512, 843 [NASA ADS] [CrossRef] [Google Scholar]
 Burrows, A., Budaj, J., & Hubeny, I. 2008, ApJ, 678, 1436 [Google Scholar]
 Carnall, A. C. 2017, arXiv eprints [arXiv:1705.05165] [Google Scholar]
 Chan, Y. M., & Dalgarno, A. 1965, Proc. Phys. Soc., 85, 227 [NASA ADS] [CrossRef] [Google Scholar]
 Chubb, K. L., Rocchetto, M., Yurchenko, S. N., et al. 2021, A & A, 646, A21 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Cugno, G., Patapis, P., Stolker, T., et al. 2021, A & A, 653, A12 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Cumming, A., Butler, R. P., Marcy, G. W., et al. 2008, PASP, 120, 531 [CrossRef] [Google Scholar]
 Currie, T., Burrows, A., Madhusudhan, N., et al. 2013, ApJ, 776, 15 [NASA ADS] [CrossRef] [Google Scholar]
 Czesla, S., Schröter, S., Schneider, C. P., et al. 2019, PyA: Python astronomyrelated packages, Astrophysics Source Code Library [record ascl:1906.010] [Google Scholar]
 Daemgen, S., Todorov, K., Quanz, S. P., et al. 2017, A & A, 608, A71 [CrossRef] [EDP Sciences] [Google Scholar]
 Dalgarno, A., & Williams, D. A. 1962, ApJ, 136, 690 [NASA ADS] [CrossRef] [Google Scholar]
 Davies, R., Esposito, S., Schmid, H. M., et al. 2018, in Groundbased and Airborne Instrumentation for Astronomy VII (SPIE) [Google Scholar]
 Eisenhauer, F., Abuter, R., Bickert, K., et al. 2003, in SPIE Proceedings, 4841: Instrument Design and Performance for Optical/Infrared Groundbased Telescopes (SPIE) [Google Scholar]
 Feroz, F., Hobson, M. P., & Bridges, M. 2009, MNRAS, 398, 1601 [NASA ADS] [CrossRef] [Google Scholar]
 Feroz, F., Hobson, M. P., Cameron, E., & Pettitt, A. N. 2019, OJAp, 2, 10 [Google Scholar]
 ForemanMackey, D. 2016, JOSS, 1, 24 [NASA ADS] [CrossRef] [Google Scholar]
 Fortney, J. J., Lodders, K., Marley, M. S., et al. 2008, ApJ, 678, 1419 [CrossRef] [Google Scholar]
 Fulton, B. J., Rosenthal, L. J., Hirsch, L. A., et al. 2021, ApJS, 255, 14 [NASA ADS] [CrossRef] [Google Scholar]
 Gaia Collaboration 2018, VizieR Online Data Catalog: I/345 [Google Scholar]
 Gandhi, S., & Madhusudhan, N. 2017, MNRAS, 472, 2334 [Google Scholar]
 Gandhi, S., & Madhusudhan, N. 2018, MNRAS, 474, 271 [Google Scholar]
 George, E. M., Gräff, D., Feuchtgruber, H., et al. 2016, in SPIE Proceedings, 9908: Groundbased and Airborne Instrumentation for Astronomy VI (SPIE) [Google Scholar]
 Greco, J. P., & Brandt, T. D. 2016, ApJ, 833, 134 [NASA ADS] [CrossRef] [Google Scholar]
 Griffith, C. A., & Yelle, R. V. 2000, ApJ, 532, L59 [NASA ADS] [CrossRef] [Google Scholar]
 Guillot, T. 2010, A & A, 520, A27 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Haffert, S. Y., Bohn, A. J., de Boer, J., et al. 2019, Nat. Astron., 3 [Google Scholar]
 Hoch, K. K. W., Konopacky, Q. M., Theissen, C. A., et al. 2022, 2023, AJ, 166, 85 [NASA ADS] [CrossRef] [Google Scholar]
 Hoeijmakers, H. J., Schwarz, H., Snellen, I. A. G., et al. 2018, A & A, 617, A144 [CrossRef] [EDP Sciences] [Google Scholar]
 Konrad, B. S., Alei, E., Quanz, S. P., et al. 2022, A & A, 664, A23 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Lagrange, A.M., Bonnefoy, M., Chauvin, G., et al. 2010, Science, 329, 57 [Google Scholar]
 Lagrange, A.M., Rubini, P., Nowak, M., et al. 2020, A & A, 642, A18 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Langlois, M., Gratton, R., Lagrange, A.M., et al. 2021, A & A, 651, A71 [Google Scholar]
 Larkin, J., Barczys, M., Krabbe, A., et al. 2006, in SPIE Proc., 6269: Groundbased and Airborne Instrumentation for Astronomy (SPIE) [Google Scholar]
 Lavie, B., Mendonça, J. M., Mordasini, C., et al. 2017, ApJ, 154, 91 [CrossRef] [Google Scholar]
 Lee, J.M., Fletcher, L. N., & Irwin, P. G. J. 2012, MNRAS, 420, 170 [NASA ADS] [CrossRef] [Google Scholar]
 Line, M. R., Zhang, X., Vasisht, G., et al. 2012, ApJ, 749, 93 [Google Scholar]
 Line, M. R., Wolf, A. S., Zhang, X., et al. 2013, ApJ, 775, 137 [NASA ADS] [CrossRef] [Google Scholar]
 Line, M. R., Marley, M. S., Liu, M. C., et al. 2017, ApJ, 848, 83 [NASA ADS] [CrossRef] [Google Scholar]
 Macintosh, B., Graham, J. R., Ingraham, P., et al. 2014, PNAS, 111, 12661 [NASA ADS] [CrossRef] [Google Scholar]
 Madhusudhan, N. 2018, in Handbook of Exoplanets (Springer International Publishing), 2153 [CrossRef] [Google Scholar]
 Madhusudhan, N., Harrington, J., Stevenson, K. B., et al. 2011, Nature, 469, 64 [NASA ADS] [CrossRef] [Google Scholar]
 Males, J. R., Close, L. M., Morzinski, K. M., et al. 2014, ApJ, 786 [Google Scholar]
 Miles, B. E., Biller, B. A., Patapis, P., et al. 2022, 2023, ApJ, 946, L6 [NASA ADS] [CrossRef] [Google Scholar]
 Mollière, P., Van Boekel, R., Bouwman, J., et al. 2017, A & A, 600, A10 [CrossRef] [EDP Sciences] [Google Scholar]
 Mollière, P., Wardenier, J. P., Van Boekel, R., et al. 2019, A & A, 627, A67 [CrossRef] [EDP Sciences] [Google Scholar]
 Mollière, P., Stolker, T., Lacour, S., et al. 2020, A & A, 640, 131 [Google Scholar]
 Mollière, P., Molyarova, T., Bitsch, B., et al. 2022, ApJ, 934, 74 [CrossRef] [Google Scholar]
 Mordasini, C., van Boekel, R., Mollière, P., et al. 2016, ApJ, 832, 41 [NASA ADS] [CrossRef] [Google Scholar]
 Moses, J. I., Visscher, C., Fortney, J. J., et al. 2011, ApJ, 737, 15 [Google Scholar]
 Nowak, M., Lacour, S., Mollière, P., et al. 2020, A & A, 633, A110 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Öberg, K. I., & Bergin, E. A. 2016, ApJ, 831, L19 [Google Scholar]
 Öberg, K. I., MurrayClay, R., & Bergin, E. A. 2011, ApJ, 743, L16 [Google Scholar]
 Ohno, K., & Fortney, J. J. 2022, AAS J., submitted, [arXiv:2211.16877] [Google Scholar]
 PalmaBifani, P., Chauvin, G., Bonnefoy, M., et al. 2023, A & A, 670, A90 [CrossRef] [EDP Sciences] [Google Scholar]
 Patapis, P., Nasedkin, E., Cugno, G., et al. 2022, A & A, 658, A72 [CrossRef] [EDP Sciences] [Google Scholar]
 Petit dit de la Roche, D. J. M., Hoeijmakers, H. J., & Snellen, I. A. G. 2018, A & A, 616, A146 [Google Scholar]
 Petrus, S., Bonnefoy, M., Chauvin, G., et al. 2021, A & A, 648, A59 [CrossRef] [EDP Sciences] [Google Scholar]
 Piskunov, N. E., Kupka, F., Ryabchikova, T. A., Weiss, W. W., & Jeffery, C. S. 1995, A & AS, 112, 525 [NASA ADS] [Google Scholar]
 Piso, A.M. A., Pegues, J., & Öberg, K. I. 2016, ApJ, 833, 203 [Google Scholar]
 Quirrenbach, A., Larkin, J. E., Krabbe, A., Barczys, M., & LaFreniere, D. 2003, in SPIE Proc., 4841: Instrument Design and Performance for Optical/Infrared Groundbased Telescopes, eds. M. Iye, & A. F. M. Moorwood, 1493 [NASA ADS] [CrossRef] [Google Scholar]
 Richard, C., Gordon, I., Rothman, L., et al. 2012, J. Quant. Spec. Radiat. Transf., 113, 1276 [NASA ADS] [CrossRef] [Google Scholar]
 Rothman, L., Gordon, I., Barber, R., et al. 2010, J. Quant. Spec. Radiat. Transf., 111, 2139 [NASA ADS] [CrossRef] [Google Scholar]
 Rothman, L., Gordon, I., Babikov, Y., et al. 2013, J. Quant. Spec. Radiat. Transf., 130, 4 [NASA ADS] [CrossRef] [Google Scholar]
 Ruffio, J.B., Macintosh, B., Konopacky, Q. M., et al. 2019, AJ, 158, 200 [Google Scholar]
 Ruffio, J.B., Konopacky, Q. M., Barman, T., et al. 2021, AJ, 162, 290 [NASA ADS] [CrossRef] [Google Scholar]
 Seager, S., Richardson, L. J., Hansen, B. M. S., et al. 2005, ApJ, 632, 1122 [NASA ADS] [CrossRef] [Google Scholar]
 Skilling, J. 2006, Bayesian Anal., 1, 833 [Google Scholar]
 Sparks, W. B., & Ford, H. C. 2002, ApJ, 578, 543 [Google Scholar]
 Stolker, T., Bonse, M. J., Quanz, S. P., et al. 2019, A & A, 621, A59 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Stolker, T., Quanz, S. P., Todorov, K. O., et al. 2020, A & A, 635, A182 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
 Todorov, K. O., Line, M. R., Pineda, J. E., et al. 2016, ApJ, 823, 14 [CrossRef] [Google Scholar]
 van der Marel, N., Bosman, A. D., Krijt, S., et al. 2021, A & A, 653, A9 [Google Scholar]
 Waldmann, I. P., Tinetti, G., Rocchetto, M., et al. 2015, ApJ, 802, 107 [CrossRef] [Google Scholar]
 Wang, J., Wang, J. J., Ma, B., et al. 2020, AJ, 160, 150 [NASA ADS] [CrossRef] [Google Scholar]
 Wang, J. J., Ruffio, J.B., Morris, E., et al. 2021, AJ, 162, 148 [NASA ADS] [CrossRef] [Google Scholar]
 Wang, J., Wang, J. J., Ruffio, J.B., et al. 2023, AJ, 165, 4 [NASA ADS] [CrossRef] [Google Scholar]
 Wells, M., Pel, J.W., Glasse, A., et al. 2015, PASP, 127, 646 [NASA ADS] [CrossRef] [Google Scholar]
 Xuan, J. W., Wang, J., Ruffio, J.B., et al. 2022, ApJ, 937, 54 [NASA ADS] [CrossRef] [Google Scholar]
 Yurchenko, S. N., & Tennyson, J. 2014, MNRAS, 440, 1649 [Google Scholar]
 Zurlo, A., Vigan, A., Galicher, R., et al. 2016, A & A, 587, A57 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
CROCODILE is available as a Python package at https://github.com/JHayoz/CROCODILE
The documentation of petitRADTRANS is available at https://petitRADTRANS.readthedocs.io/en/latest/content/available_opacities.html
A grid of molecular abundances precomputed by easy CHEM can be found at https://petitradtrans.readthedocs.io/en/latest/content/notebooks/poor_man.html
All Tables
List of posterior median and 68th percentile credible interval for the retrievals based only on MRCCS data.
List of posterior median and 68th percentile credible interval for the retrievals based only on LRSP data.
List of posterior median and 68th percentile credible interval for the retrievals based on both MRCCS and LRSP data.
All Figures
Fig. 1 Schematic of our atmospheric retrieval framework CROCODILE. The input of the routine is the data – which can be photometry, low, or mediumresolution spectroscopy –, the forward model, the opacity database, and the priors. CROCODILE starts by sampling parameters from the prior, from which the forward model is evaluated. It is then compared to the data by computing the sum of the loglikelihood functions corresponding to the three observing techniques, from which new parameters are sampled and the loop restarts. The output is the posterior distribution of the forward model. 

In the text 
Fig. 2 Resulting posterior distributions for our preliminary test that controls our implementation of CROCODILE using the same chemistry as input as our forward model, namely vertically constant abundances of H_{2}O and CO, and reduced random scatter. The orange posterior shows the results of the retrieval using only MRCCS data to test the implementation of the crosscorrelation likelihood function, the blue posterior is for the regular framework using only LRSP, and the green posterior shows the result for the combination of MRCCS and LRSP data. The dashed red line shows the ground truth used as input. 

In the text 
Fig. 3 Results of the three retrievals of the same simulated atmosphere using different combinations of the synthetic spectra of β Pic b: the SINFONI MRCCSonly retrieval in orange, the LRSPonly retrieval in blue, and the retrieval using all techniques in green. The offaxis panels show the projected 2D posterior distributions of the main model parameters (readers can refer to the description in Sect. 2.4). The indicated values above each histogram on the diagonal refer to the median and 1sigma uncertainty for CROCODILE. For the molecular abundances, we omit the logarithm in front of the names of the molecules for the sake of readability. The dashed red line represents the true parameter value, and it is not shown for the abundances given the different chemical treatment between simulated observations and forward model. 

In the text 
Fig. 4 Retrieved posterior distributions of all the molecular abundances included in the three retrievals of the experiment described in 3.4. The solid red lines represent the profile (in log scale) of each molecular abundance along the vertical extent of the atmosphere between 100 bar at the bottom and 10^{−6} bar at the top. 

In the text 
Fig. 5 Results of the experiment described in 3.4. Top panel: thermal profiles retrieved for each technique (MRCCS: orange, LRSP: blue, CROCODILE: green). The two levels of opacity correspond to the 16 to 84 and the 2 to 98 percentile envelopes of the thermal profile. The legend reports the median and 64 % confidence interval retrieved for the equilibrium temperature. Middle and bottom panel: posterior distributions of the C/O ratio and metallicity Fe/H inferred from the retrieved abundances for each technique. The dashed lines represent the posterior median. The dashed red line shows the ground truth in every panel. 

In the text 
Fig. 6 Scores of the three metrics summarising the results of our exploration of the parameter space of gas giants using LRSP, MRCCS, or LRSP+MRCCS (CROCODILE). Each column of panels represents a different metric, and the three rows refer to different parameters grouped together. Within each panel, the score is shown for each technique and for each simulated atmosphere (cf. Table 2 for the input parameters), and averaged over the group of parameters. The dotted lines show the average score over the seven cases. 

In the text 
Fig. 7 Posterior distributions obtained for the retrievals of our simulated atmosphere 5 (cf. Table 2) using CROCODILE on MRCCS data alone (orange), on LRSP data (blue), and on both MRCCS+LRSP combined (green). The input values are shown as dashed red lines. 

In the text 
Fig. 8 Comparison of the fitted spectra to our simulated datasets 1 to 7 (cf. Table 2) using CROCODILE on MRCCS data alone (orange), on LRSP data (blue), and on both MRCCS+LRSP combined (green). The fitted spectra were calculated using our forward model from the posterior median of the retrieved distributions shown in Fig. E.1. The 11 colourful crosses represent the simulated photometric values and their corresponding uncertainty and equivalent width; the orange and blue error bars show the simulated YHband VLT/SPHERE/IFS and Kband VLTI/GRAVITY spectra and corresponding uncertainties; and the contiuumsubtracted light green spectrum shows the simulated VLT/SINFONI MRCCS data. 

In the text 
Fig. 9 Posterior thermal profiles obtained for the retrievals of our simulated atmospheres 1 to 7 (cf. Table 2) using CROCODILE on MRCCS data alone (orange), on LRSP data (blue), and on both MRCCS+LRSP combined (green). The different envelopes of each colour represent the 68th, 95th, and 99th percentile credible intervals – corresponding to 1σ 2σ and 3σ of a normal distribution – of the posterior distributions of the retrieved thermal profiles. The input pT profiles are shown as dashed red lines. 

In the text 
Fig. 10 Absolute difference between the spectrum calculated with the full chemical model used for our simulated atmosphere 1 (as described in Table 2) and spectra where each molecule included in our framework was removed one after the other. For example water shows a large difference due to its absorption features throughout the whole range. The black lines indicate the spectral coverage of the different spectroscopic instruments included in our tests. 

In the text 
Fig. 11 Posterior distributions obtained when removing the synthetic VLTI/GRAVITY data from the atmospheric fit. The input atmosphere is the same as case 1 (cf. Table 2), and as in Fig. 7 we also compare the results when applying CROCODILE to MRCCS data alone (orange), LRSP data (blue), and on both MRCCS+LRSP combined (green). However, this time the lowresolution spectroscopy only included synthetic VLT/SPHERE data in the Y – H bands. The input values are shown as dashed red lines. 

In the text 
Fig. C.1 Scores for different sizes of the Gaussian filter used to remove the continuum of the synthetic mediumresolution spectroscopic data used in this work. 

In the text 
Fig. D.1 Resulting posterior distributions of our preliminary test on MRCCS data when fixing the radius to its simulated value (orange), compared to the previous results with the radius included as free parameter (grey: old results on MRCCS data, blue: only LRSP, green: MRCCS and LRSP). The dashed red line shows the ground truth used as input. 

In the text 
Fig. E.1 Comparison of the posterior distributions obtained for the retrievals of our simulated atmospheres 1 to 7 (cf. Table 2) using CROCODILE on MRCCS data alone (orange), on LRSP data (blue), and on both MRCCS+LRSP combined (green). The posterior distributions are shown as histograms of the sampled parameters and are to be read row by row for each model parameter and column by column to compare to the other input atmospheres. The input values are shown as dashed red lines. The input molecular abundances – which were calculated at chemical equilibrium and therefore vary vertically with the pressure and the temperature – are shown as functions of the atmospheric layers between 10^{2} and 10^{−6} bar in logarithmic scale. The posterior distributions of the C/O ratio and metallicity Fe/H are inferred from the posterior distributions of the molecular abundances. 

In the text 
Current usage metrics show cumulative count of Article Views (fulltext article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 4896 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.