Kybernetika 34 no. 4, 393-398, 1998

Fuzzy clustering of spatial binary data

Mô Dang and Gérard Govaert

Abstract:

An iterative fuzzy clustering method is proposed to partition a set of multivariate binary observation vectors located at neighboring geographic sites. The method applies, in a binary setting, a recently proposed algorithm called Neighborhood EM, which seeks a partition that is both well clustered in the feature space and spatially regular. This approach is derived from the EM algorithm applied to mixture models [A.P. Dempster, N.M. Laird and D.B. Rubin: Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. 39 (1977), 1-38.], viewed as an alternating optimization method [R.J. Hathaway: Another interpretation of the EM algorithm for mixture distributions. Statist. Probab. Lett. 4 (1986), 53-56.]. The criterion optimized by EM is penalized by a spatial smoothing term that favors classes having many neighbors. The resulting algorithm has a structure similar to EM, with an unchanged M-step and an iterative E-step. The criterion optimized by Neighborhood EM is closely related to a posterior distribution with a multilevel logistic Markov random field as prior. The application of this approach to binary data relies on a mixture of multivariate Bernoulli distributions. Experiments on simulated spatial binary data yield encouraging results.
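The structure described above (an unchanged M-step followed by an iterative E-step) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state the update formulas, so the fixed-point E-step below, in which each membership is proportional to the mixture term multiplied by exp(beta times the sum of neighboring memberships), is an assumption based on the usual form of Neighborhood EM. The names nem_bernoulli, V, beta, n_iter and n_inner are hypothetical.

```python
import numpy as np

def nem_bernoulli(X, V, n_classes, beta=1.0, n_iter=50, n_inner=5, seed=0):
    """Sketch of Neighborhood EM for a multivariate Bernoulli mixture.

    X : (n, d) array of 0/1 observations, one row per site.
    V : (n, n) symmetric neighborhood weights with zero diagonal (assumed form).
    beta : weight of the spatial smoothing term (beta = 0 gives plain EM).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    eps = 1e-6
    # Random fuzzy memberships; each row sums to 1.
    C = rng.dirichlet(np.ones(n_classes), size=n)
    for _ in range(n_iter):
        # M-step, same as in EM: proportions and Bernoulli parameters.
        nk = C.sum(axis=0) + eps
        pi = nk / nk.sum()
        theta = np.clip((C.T @ X) / nk[:, None], eps, 1 - eps)
        # log(pi_k) + log f_k(x_i) for every site i and class k.
        log_f = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T + np.log(pi)
        # Iterative E-step: assumed fixed-point update with spatial term.
        for _ in range(n_inner):
            logits = log_f + beta * (V @ C)
            logits -= logits.max(axis=1, keepdims=True)  # numerical stability
            C = np.exp(logits)
            C /= C.sum(axis=1, keepdims=True)
    return C, pi, theta
```

As a usage example under the same assumptions, sites on a one-dimensional chain can be given a neighborhood matrix such as `V = np.eye(n, k=1) + np.eye(n, k=-1)`, and a hard partition can be read off with `C.argmax(axis=1)`.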