Kybernetika 34 no. 5, 515-534, 1998

Detection of influential points by convex hull volume minimization

Petr Tichavský and Pavel Boček

Abstract:

A method for the geometrical characterization of multidimensional data sets is described, including construction of the convex hull of the data and calculation of its volume. This technique, together with the concept of minimum convex hull volume, can be used to detect influential points or outliers in multiple linear regression. The minimum-volume concept is approximated by ordering the data into a linear sequence such that the volume of the convex hull of the first $n$ terms in the sequence grows as slowly as possible with $n$. The performance of the method is demonstrated on four well-known data sets. The average computational complexity of the ordering is estimated as $O(N^{2+(p-1)/(p+1)})$ for large $N$, where $N$ is the number of observations and $p$ is the data dimension, i.e., the number of predictors plus 1.
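
The ordering idea can be illustrated with a minimal sketch, restricted to two dimensions ($p = 2$) so that it needs only the Python standard library. This is a naive greedy illustration, not the authors' algorithm, and it does not attain the complexity estimate quoted above. The function names (`convex_hull`, `hull_area`, `greedy_order`), the choice of the smallest-area triple as the starting set, and the sample data are all assumptions made for this example.

```python
from itertools import combinations


def cross(o, a, b):
    """Signed area of the parallelogram spanned by (a - o) and (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hull_area(points):
    """Area (the 2-D 'volume') of the convex hull, via the shoelace formula."""
    hull = convex_hull(points)
    if len(hull) < 3:
        return 0.0
    total = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def greedy_order(points):
    """Order points so that the hull area of each prefix grows slowly.

    Assumption for this sketch: start from the triple of smallest area,
    then repeatedly append the point that enlarges the hull the least.
    Returns the index order and the hull area after each step.
    """
    pts = [tuple(p) for p in points]
    start = min(combinations(range(len(pts)), 3),
                key=lambda idx: hull_area([pts[i] for i in idx]))
    order = list(start)
    remaining = [i for i in range(len(pts)) if i not in start]
    areas = [hull_area([pts[i] for i in order])]
    while remaining:
        current = [pts[i] for i in order]
        best = min(remaining, key=lambda j: hull_area(current + [pts[j]]))
        order.append(best)
        remaining.remove(best)
        areas.append(hull_area(current + [pts[best]]))
    return order, areas


# Hypothetical data: a small cluster plus one distant point (index 5).
sample = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.4), (10, 7)]
order, areas = greedy_order(sample)
```

In this sketch the distant point is placed last in the ordering, and the hull area jumps sharply at the final step; a jump of this kind in the area sequence is what would flag a candidate outlier or influential point.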