Kybernetika 48 no. 4, 690-713, 2012

phi-divergences, sufficiency, Bayes sufficiency, and deficiency

Friedrich Liese


The paper studies the relations between $\phi$-divergences and fundamental concepts of decision theory such as sufficiency, Bayes sufficiency, and LeCam's deficiency. A new and considerably simplified approach is given to the spectral representation of $\phi$-divergences already established in Österreicher and Feldman \cite{OestFeld} under restrictive conditions and in Liese and Vajda \cite{LiV06}, \cite{LiV08} in the general form. The simplification is achieved by a new integral representation of convex functions in terms of elementary convex functions which are strictly convex at one point only. Bayes sufficiency is characterized with the help of a binary model that consists of the joint distribution and the product of the marginal distributions of the observation and the parameter, respectively. LeCam's deficiency is expressed in terms of $\phi$-divergences where $\phi$ belongs to a class of convex functions whose curvature measures are finite and satisfy a normalization condition.


divergences, sufficiency, Bayes sufficiency, deficiency


62B05, 62B10, 62B15, 62G10


  1. M. S. Ali and D. Silvey: A general class of coefficients of divergence of one distribution from another. J. Roy. Statist. Soc. Ser. B 28 (1966), 131-140.
  2. S. Arimoto: Information-theoretical considerations on estimation problems. Inform. Control. 19 (1971), 181-194.
  3. A. R. Barron, L. Györfi and E. C. van der Meulen: Distribution estimates consistent in total variation and two types of information divergence. IEEE Trans. Inform. Theory 38 (1992), 1437-1454.
  4. A. Berlinet, I. Vajda and E. C. van der Meulen: About the asymptotic accuracy of Barron density estimates. IEEE Trans. Inform. Theory 44 (1998), 999-1009.
  5. A. Bhattacharyya: On some analogues to the amount of information and their uses in statistical estimation. Sankhya 8 (1946), 1-14.
  6. H. Chernoff: A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Ann. Math. Statist. 23 (1952), 493-507.
  7. B. S. Clarke and A. R. Barron: Information-theoretic asymptotics of Bayes methods. IEEE Trans. Inform. Theory 36 (1990), 453-471.
  8. I. Csiszár: Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten. Publ. Math. Inst. Hungar. Acad. Sci. 8 (1963), 84-108.
  9. I. Csiszár: Information-type measures of difference of probability distributions and indirect observations. Studia Sci. Math. Hungar. 2 (1967), 299-318.
  10. T. Cover and J. Thomas: Elements of Information Theory. Wiley, New York 1991.
  11. M. H. De Groot: Optimal Statistical Decisions. McGraw Hill, New York 1970.
  12. D. Feldman and F. Österreicher: A note on $f$-divergences. Studia Sci. Math. Hungar. 24 (1989), 191-200.
  13. A. Guntuboyina: Lower bounds for the minimax risk using $f$-divergences, and applications. IEEE Trans. Inform. Theory 57 (2011), 2386-2399.
  14. C. Guttenbrunner: On applications of the representation of $f$-divergences as averaged minimal Bayesian risk. In: Trans. 11th Prague Conf. Inform. Theory, Statist. Dec. Funct., Random Processes A, 1992, pp. 449-456.
  15. L. Jager and J. A. Wellner: Goodness-of-fit tests via phi-divergences. Ann. Statist. 35 (2007), 2018-2053.
  16. T. Kailath: The divergence and Bhattacharyya distance measures in signal selection. IEEE Trans. Commun. Technol. 15 (1967), 52-60.
  17. S. Kakutani: On equivalence of infinite product measures. Ann. Math. 49 (1948), 214-224.
  18. S. Kullback and R. Leibler: On information and sufficiency. Ann. Math. Statist. 22 (1951), 79-86.
  19. L. LeCam: Locally asymptotically normal families of distributions. Univ. Calif. Publ. 3 (1960), 37-98.
  20. L. LeCam: Asymptotic Methods in Statistical Decision Theory. Springer, Berlin 1986.
  21. F. Liese and I. Vajda: Convex Statistical Distances. Teubner, Leipzig 1987.
  22. F. Liese and I. Vajda: On divergence and informations in statistics and information theory. IEEE Trans. Inform. Theory 52 (2006), 4394-4412.
  23. F. Liese and I. Vajda: $f$-divergences: Sufficiency, deficiency and testing of hypotheses. In: Advances in Inequalities from Probability Theory and Statistics. (N. S. Barnett and S. S. Dragomir, eds.), Nova Science Publisher, Inc., New York 2008, pp. 113-149.
  24. F. Liese and K. J. Miescke: Statistical Decision Theory, Estimation, Testing and Selection. Springer, New York 2008.
  25. K. Matusita: Decision rules based on the distance, for problems of fit, two samples and estimation. Ann. Math. Statist. 26 (1955), 613-640.
  26. D. Mussmann: Sufficiency and $f$-divergences. Studia Sci. Math. Hungar. 14 (1979), 37-41.
  27. X. Nguyen, M. J. Wainwright and M. I. Jordan: On surrogate loss functions and $f$-divergences. Ann. Statist. 37 (2009), 876-904.
  28. F. Österreicher and D. Feldman: Divergenzen von Wahrscheinlichkeitsverteilungen - integralgeometrisch betrachtet. Acta Math. Sci. Hungar. 37 (1981), 329-337.
  29. F. Österreicher and I. Vajda: Statistical information and discrimination. IEEE Trans. Inform. Theory 39 (1993), 1036-1039.
  30. J. Pfanzagl: A characterization of sufficiency by power functions. Metrika 21 (1974), 197-199.
  31. H. V. Poor: Robust decision design using a distance criterion. IEEE Trans. Inform. Theory 26 (1980), 578-587.
  32. T. R. C. Read and N. A. C. Cressie: Goodness-of-Fit Statistics for Discrete Multivariate Data. Springer, Berlin 1988.
  33. A. Rényi: On measures of entropy and information. In: Proc. 4th Berkeley Symp. on Probab. Theory and Math. Statist. Berkeley Univ. Press, Berkeley 1961, pp. 547-561.
  34. A. W. Roberts and D. E. Varberg: Convex Functions. Academic Press, New York 1973.
  35. M. J. Schervish: Theory of Statistics. Springer, New York 1995.
  36. C. E. Shannon: A mathematical theory of communication. Bell Syst. Tech. J. 27 (1948), 379-423, 623-656.
  37. H. Strasser: Mathematical Theory of Statistics. De Gruyter, Berlin 1985.
  38. F. Topsøe: Information-theoretical optimization techniques. Kybernetika 15 (1979), 7-17.
  39. F. Topsøe: Some inequalities for information divergence and related measures of discrimination. IEEE Trans. Inform. Theory 46 (2000), 1602-1609.
  40. E. Torgersen: Comparison of Statistical Experiments. Cambridge Univ. Press, Cambridge 1991.
  41. I. Vajda: On the $f$-divergence and singularity of probability measures. Periodica Math. Hungar. 2 (1972), 223-234.
  42. I. Vajda: Theory of Statistical Inference and Information. Kluwer Academic Publishers, Dordrecht - Boston - London 1989.
  43. I. Vajda: On metric divergences of probability measures. Kybernetika 45 (2009), 885-900.
  44. I. Vajda: On convergence of information contained in quantized observations. IEEE Trans. Inform. Theory 48 (2002), 2163-2172.
  45. I. Vincze: On the concept and measure of information contained in an observation. In: Contribution to Probability. (J. Gani and V. F. Rohatgi, eds.) Academic Press, New York 1981, pp. 207-214.