Open Access

Table D.1

List of the hyperparameters tuned in the grid-search optimisation. Hyperparameters not listed kept their default values.

CatBoost hyperparameter Description
learning_rate Controls the step size at each iteration of the gradient-boosting process.

depth Specifies the maximum depth of each decision tree in the ensemble.

reg_lambda Alias of l2_leaf_reg in CatBoost: the coefficient of the L2 regularisation term, which adds a penalty to the loss function to prevent overfitting.

l2_leaf_reg Coefficient of the L2 regularisation term in the cost function, applied to the leaf values of the trees. reg_lambda is an alias for the same parameter.

iterations Determines the number of boosting iterations or the number of decision trees to be built in the ensemble.

random_strength Controls the amount of randomness added to the split scores when the tree structure is selected, which helps to limit overfitting.

rsm Stands for random subspace method. Sets the fraction of features considered at each split selection, with the features drawn again at random each time.

subsample Sampling rate for bagging: the fraction of training objects randomly sampled when building each tree. Applies to the Bernoulli, Poisson, and MVS bootstrap types.

border_count Sets the number of splits (borders) used to discretise each numerical feature. Larger values allow finer splits at a higher computational cost.

bagging_temperature Controls the intensity of the Bayesian bootstrap used for bagging. A value of 0 gives all objects equal weight, and larger values make the random weights more variable.
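
As an illustration of how a grid over such hyperparameters can be enumerated, the following minimal sketch expands a dictionary of candidate values into every combination. The values shown are hypothetical placeholders, not those used in this work, and the helper name expand_grid is introduced here for illustration only.

```python
from itertools import product

# Hypothetical grid: these values are illustrative placeholders,
# not the values actually explored for Table D.1.
param_grid = {
    "learning_rate": [0.03, 0.1],
    "depth": [4, 6, 8],
    "l2_leaf_reg": [1.0, 3.0],
    "iterations": [500, 1000],
}


def expand_grid(grid):
    """Return one dict per combination of the listed hyperparameter values."""
    names = list(grid)
    return [
        dict(zip(names, values))
        for values in product(*(grid[name] for name in names))
    ]


candidates = expand_grid(param_grid)
print(len(candidates))  # 2 * 3 * 2 * 2 = 24 combinations

# Each candidate can then be passed to a CatBoost estimator as keyword
# arguments (for example CatBoostRegressor(**candidates[0]); the choice of
# a regressor is an assumption) and scored on a validation set.
# Hyperparameters absent from the grid keep their default values.
```

Because reg_lambda and l2_leaf_reg name the same setting, only one of them should appear in such a grid. Similarly, subsample and bagging_temperature apply to different bootstrap types, so they would normally be tuned in separate configurations.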
