Table 2.

Functions that can be computed on N-dimensional grids and exploit the fast binning algorithm, all of which are readily available in vaex.

Statistic Description
count Counts the number of rows, or non-missing values of an expression.
sum Sum of non-missing values of an expression.
mean The sample mean of an expression.
var The sample variance of an expression, using a numerically non-stable algorithm.
std The sample standard deviation of an expression, using a numerically non-stable algorithm.
min The minimum value of an expression.
max The maximum value of an expression.
minmax The minimum and maximum value of an expression (faster than computing min and max separately).
covar The sample covariance between two expressions.
correlation The sample correlation coefficient between two expressions, i.e. $\mathrm{cov}[x, y] / (\mathrm{std}[x]\,\mathrm{std}[y])$.
cov The full covariance matrix for a list of expressions.
percentile_approx Estimates the percentile of an expression. Since the true value requires sorting of values, we implement an approximation by interpolation over a cumulative histogram.
median_approx Approximation of the median, based on the percentile statistic.
mode Estimates the mode of an expression by calculating the peak of its histogram.
mutual_information Calculates the mutual information for two or more expressions, see Sect. 3.5.3 for details.
nearest Finds the nearest row to a particular point for a given metric.

Notes. All statistics can be computed for the full dataset, for a subset defined by a selection, or for multiple selections at the same time. In all calculations, missing values and NaNs are ignored.
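
The percentile_approx entry describes interpolation over a cumulative histogram instead of a full sort. The following is a minimal pure-Python sketch of that idea on a 1-dimensional grid. It is not vaex's implementation: the function name is reused only for readability, and the `bins` parameter and its default of 256 are assumptions made for this illustration.

```python
import math


def percentile_approx(values, percentage, bins=256):
    """Approximate a percentile from a cumulative histogram (no sorting).

    Illustrative sketch only; `bins` is a hypothetical parameter.
    """
    # Ignore NaN values, as the table notes describe.
    data = [v for v in values if not math.isnan(v)]
    if not data:
        return math.nan
    lo, hi = min(data), max(data)
    if lo == hi:
        return lo
    width = (hi - lo) / bins

    # Bin the data on a regular 1-d grid.
    counts = [0] * bins
    for v in data:
        i = min(int((v - lo) / width), bins - 1)  # put v == hi in the last bin
        counts[i] += 1

    # Walk the cumulative histogram until the target count is reached,
    # then interpolate linearly inside that bin.
    target = percentage / 100.0 * len(data)
    cumulative = 0
    for i, c in enumerate(counts):
        if c > 0 and cumulative + c >= target:
            fraction = (target - cumulative) / c
            return lo + (i + fraction) * width
        cumulative += c
    return hi
```

The error of such an estimate is bounded by the bin width, so a finer grid gives a closer approximation at the cost of more memory. For example, `percentile_approx([float(i) for i in range(1001)], 50)` returns a value within one bin width (about 3.9) of the exact median, 500.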
