Measures of dispersion

Two summary statistics are provided, both of which are generalizations of the sample variance to multivariate data.

The first is the generalized variance of Wilks (1960), a scalar measure of multidimensional scatter. For a random vector $X$, the generalized variance is the determinant of the sample variance-covariance matrix of $X$, i.e. $|\text{Cov}(X)|$. It is 0 whenever the columns of the matrix of deviations (the data with the column means subtracted) are linearly dependent.
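
The definition can be reproduced with Julia's standard libraries alone. The following is a minimal sketch, not the package's implementation; the data matrix `X` is made up for illustration, with rows as observations and columns as variables.

```julia
using LinearAlgebra  # det
using Statistics     # cov

# Hypothetical data: 5 observations (rows) of 2 variables (columns).
X = [1.0 2.0;
     2.0 1.0;
     3.0 5.0;
     4.0 3.0;
     5.0 4.0]

# Sample variance-covariance matrix (columns are variables, n - 1 denominator).
S = cov(X)

# Generalized variance: the determinant of the covariance matrix.
generalized = det(S)
```

For this particular matrix both columns have sample variance 2.5 and covariance 1.5, so the determinant is 2.5 × 2.5 − 1.5 × 1.5 = 4.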

The second is the total variance. For a random vector $X$, the total variance is the trace of the sample variance-covariance matrix of $X$, i.e. the sum of the sample variances of the columns of $X$. It is 0 only when every column is constant, that is, when all observations are identical.
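
The two descriptions of the total variance (trace of the covariance matrix, and sum of the column variances) can be checked against each other. This is a sketch using only the standard libraries, with a made-up data matrix `X`; it does not call the package itself.

```julia
using LinearAlgebra  # tr
using Statistics     # cov, var

# Hypothetical data: 5 observations (rows) of 2 variables (columns).
X = [1.0 2.0;
     2.0 1.0;
     3.0 5.0;
     4.0 3.0;
     5.0 4.0]

# Total variance as the trace of the sample covariance matrix.
total_from_trace = tr(cov(X))

# The same quantity as the sum of the per-column sample variances.
total_from_columns = sum(var(X, dims=1))
```

Both expressions evaluate to 5 here, since each column has sample variance 2.5. Unlike the determinant, the trace ignores the off-diagonal covariances entirely.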

When $X$ has only one column, both measures reduce to the sample variance of that column.
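
In the one-column case the covariance matrix is 1×1, so its determinant and its trace are both its single entry. A small standard-library sketch with illustrative values:

```julia
using LinearAlgebra  # det, tr
using Statistics     # cov, var

# Hypothetical univariate sample, also viewed as a 4×1 matrix.
x = [2.0, 4.0, 4.0, 6.0]
X1 = reshape(x, :, 1)

S1 = cov(X1)                  # 1×1 sample covariance matrix

sample_variance = var(x)      # ordinary sample variance
generalized = det(S1)         # determinant of a 1×1 matrix is its only entry
total = tr(S1)                # trace of a 1×1 matrix is its only entry
```

All three quantities coincide, which is the equivalence stated above.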

API

MultivariateTests.genvar — Function.

genvar(X)

Compute the generalized sample variance of X. If X is a vector, one-column matrix, or other one-dimensional iterable, this is equivalent to the sample variance. Otherwise, if X is a matrix, this is equivalent to the determinant of the covariance matrix of X.

Note

The generalized sample variance will be 0 if the columns of the matrix of deviations are linearly dependent.
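
To illustrate the note, the sketch below builds a made-up matrix whose third column is the sum of the first two, then computes the determinant of its covariance matrix with the standard libraries. It is an illustration of the degenerate case, not the package's code.

```julia
using LinearAlgebra  # det, rank
using Statistics     # cov

# Hypothetical data: two free columns plus a third that is their sum.
A = [1.0 2.0;
     2.0 1.0;
     3.0 5.0;
     4.0 3.0;
     5.0 4.0]
X = hcat(A, A[:, 1] .+ A[:, 2])

S = cov(X)                    # 3×3 sample covariance matrix
generalized = det(S)          # zero up to floating-point rounding
covariance_rank = rank(S)     # 2: the covariance matrix is singular
```

In floating-point arithmetic the determinant may come out as a tiny nonzero number rather than exactly 0, so a comparison against a tolerance is safer than testing for equality with zero.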

MultivariateTests.totalvar — Function.

totalvar(X)

Compute the total sample variance of X. If X is a vector, one-column matrix, or other one-dimensional iterable, this is equivalent to the sample variance. Otherwise, if X is a matrix, this is equivalent to the sum of the diagonal elements of the covariance matrix of X.

References

Wilks, S.S. (1960). "Multidimensional Statistical Scatter." In Contributions to Probability and Statistics, I. Olkin et al., ed. Stanford University Press, Stanford, CA, pp. 486-503.