We propose a simple method for constructing new families of ϕ-divergences. This method, called convex standardization, is applicable to convex and concave functions ψ(t) twice continuously differentiable in a neighborhood of t=1 with nonzero second derivative at the point t=1. Using this method we introduce several extensions of the Le Cam, power, χ^a and Matusita divergences. The extended families are shown to connect these divergences smoothly with the Kullback divergence, or to connect various pairs of these particular divergences with one another. We also investigate the metric properties of divergences from these extended families.
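A natural way to read the standardization (a sketch of one plausible normalization, not necessarily the authors' exact construction) is to subtract the tangent of ψ at t=1 and rescale by the curvature, which yields a convex ϕ generating a ϕ-divergence comparable with the classical ones:
\[
\phi(t) \;=\; \frac{\psi(t)-\psi(1)-\psi'(1)\,(t-1)}{\psi''(1)},
\qquad
D_{\phi}(P,Q)\;=\;\int q\,\phi\!\left(\frac{p}{q}\right)\mathrm{d}\mu ,
\]
so that ϕ(1)=ϕ'(1)=0 and ϕ''(1)=1 regardless of whether ψ is convex or concave.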
This paper deals with Bayesian models given by statistical experiments and standard loss functions. The Bayes probability of error and the Bayes risk are estimated by means of classical and generalized information criteria applicable to the experiment. The accuracy of the estimation is studied. Among the information criteria studied in the paper is the class of posterior power entropies, which includes the Shannon entropy as a special case for the power α=1. It is shown that the most accurate estimate in this class is achieved by the quadratic posterior entropy of power α=2. The paper also introduces and studies a new class of alternative power entropies which in general estimate the Bayes errors and risks more tightly than the classical power entropies. Concrete examples, tables and figures illustrate the results obtained.
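For orientation, the posterior power entropies referred to here are of the Havrda–Charvát (Tsallis) type; up to the paper's exact normalization, for a posterior distribution p=(p1,…,pm) they take the form
\[
H_{\alpha}(p)\;=\;\frac{1}{\alpha-1}\Bigl(1-\sum_{i=1}^{m}p_i^{\alpha}\Bigr),\qquad \alpha\neq 1,
\qquad
H_{1}(p)\;=\;-\sum_{i=1}^{m}p_i\ln p_i ,
\]
so that the quadratic case α=2 reduces to H2(p)=1−∑ p_i².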
Point estimators based on minimization of information-theoretic divergences between the empirical and the hypothetical distribution run into a problem when working with continuous families which are measure-theoretically orthogonal to the family of empirical distributions. In this case the ϕ-divergence is always equal to its upper bound, and the minimum ϕ-divergence estimates are trivial. Broniatowski and Vajda \cite{IV09} proposed several modifications of the minimum divergence rule to provide a solution to this problem. We examine these new estimation methods with respect to consistency, robustness and efficiency through an extended simulation study. We focus on the well-known family of power divergences parametrized by α∈R in the Gaussian model, and we perform a comparative computer simulation for several randomly selected contaminated and uncontaminated data sets, different sample sizes and different ϕ-divergence parameters.
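For reference, the power divergences used in the simulations are generated, up to the normalization adopted in the paper, by the convex functions of the Cressie–Read type
\[
\phi_{\alpha}(t)\;=\;\frac{t^{\alpha}-\alpha t+\alpha-1}{\alpha(\alpha-1)},\qquad \alpha\notin\{0,1\},
\qquad
D_{\alpha}(P,Q)\;=\;\int q\,\phi_{\alpha}\!\left(\frac{p}{q}\right)\mathrm{d}\mu ,
\]
with the Kullback divergence and the reversed Kullback divergence obtained as the limiting cases α→1 and α→0.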
The paper summarizes and extends the theory of generalized ϕ-entropies Hϕ(X) of random variables X obtained as ϕ-informations Iϕ(X;Y) about X maximized over random variables Y. Among the new results is a proof of the fact that these entropies need not be concave functions of the distributions pX. An extended class of power entropies Hα(X) is introduced, parametrized by α∈R, where Hα(X) is concave in pX for α≥0 and convex for α<0. It is proved that all power entropies with α≤2 are maximal ϕ-informations Iϕ(X;X) for appropriate ϕ depending on α. Prominent members of this subclass of power entropies are the Shannon entropy H1(X) and the quadratic entropy H2(X). The paper also investigates the tightness of practically important, previously established relations between these two entropies and the errors e(X) of Bayesian decisions about possible realizations of X. The quadratic entropy is shown to provide estimates which are on average more than 100 % tighter than those based on the Shannon entropy, and this tightness is shown to increase even further when α increases beyond α=2. Finally, the paper studies various measures of statistical diversity and introduces a general measure of anisotony between them. This measure is numerically evaluated for the entropic measures of diversity H1(X) and H2(X).
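Recall that the ϕ-information about X contained in Y is the ϕ-divergence between the joint distribution and the product of the marginals, so that the entropies discussed here can be written (modulo the regularity conditions imposed in the paper) as
\[
I_{\phi}(X;Y)\;=\;D_{\phi}\bigl(P_{XY},\,P_{X}\otimes P_{Y}\bigr),
\qquad
H_{\phi}(X)\;=\;\sup_{Y}\,I_{\phi}(X;Y),
\]
the supremum being attained at Y=X, i.e. Hϕ(X)=Iϕ(X;X), for the power subclass with α≤2 mentioned above.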
Standard properties of ϕ-divergences of probability measures are widely applied in various areas of information processing. Among the desirable supplementary properties facilitating the employment of mathematical methods is the metricity of ϕ-divergences, or the metricity of their powers. This paper extends the previously known family of ϕ-divergences with these properties. The extension consists of a continuum of ϕ-divergences which are squared metric distances and which are mostly new, but include also some classical cases such as the Le Cam squared distance. The paper also establishes basic properties of the ϕ-divergences from the extended class, including the range of values and the upper and lower bounds attained under a fixed total variation.
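As an illustration of the squared-metric members, the classical Le Cam case included in the extension can be written, for probability measures P, Q with densities p, q with respect to a dominating measure μ, as
\[
\mathrm{LC}(P,Q)\;=\;\frac12\int\frac{(p-q)^{2}}{p+q}\,\mathrm{d}\mu ,
\qquad\text{generated by}\qquad
\phi(t)\;=\;\frac{(t-1)^{2}}{2\,(t+1)} ,
\]
and its square root is a metric on the space of probability measures.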
The paper investigates generalized linear models (GLMs) with binary responses such as the logistic, probit, log-log, complementary log-log, scobit and power logit models. It introduces a median estimator of the underlying structural parameters of these models based on statistically smoothed binary responses. Consistency and asymptotic normality of this estimator are proved. Examples of derivation of the asymptotic covariance matrix under the above-mentioned models are presented. Finally, some comments are given concerning a method called enhancement and concerning the robustness of the median estimator, and results of a simulation experiment comparing the behavior of the median estimator with other robust estimators for GLMs known from the literature are reported.
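For concreteness, the binary-response models named above differ only in the choice of the response function F in P(Y=1∣x)=F(xᵀβ); the four classical choices are
\[
F_{\text{logit}}(s)=\frac{1}{1+e^{-s}},\qquad
F_{\text{probit}}(s)=\Phi(s),\qquad
F_{\text{log-log}}(s)=e^{-e^{-s}},\qquad
F_{\text{cloglog}}(s)=1-e^{-e^{s}},
\]
while the scobit and power logit models add a shape (power) parameter to the logistic response; the precise parametrizations used in the paper are not reproduced here.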
This paper deals with four types of point estimators based on minimization of information-theoretic divergences between hypothetical and empirical distributions. These were introduced
\begin{enumerate} \item[(i)] by Liese and Vajda \cite{9} and independently by Broniatowski and Keziou \cite{3}, called here \textsl{power superdivergence estimators}, \item[(ii)] by Broniatowski and Keziou \cite{4}, called here \textsl{power subdivergence estimators}, \item[(iii)] by Basu et al. \cite{2}, called here \textsl{power pseudodistance estimators}, and \item[(iv)] by Vajda \cite{18}, called here \textsl{Rényi pseudodistance estimators}. \end{enumerate}
These various criteria have in common that they eliminate all need for grouping or smoothing in statistical inference. The paper studies and compares general properties of these estimators, such as Fisher consistency and influence curves, and illustrates these properties by a detailed analysis of their application to the estimation of normal location and scale.
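As one concrete example of the criteria listed above, the power pseudodistances of Basu et al. \cite{2} coincide, up to normalization, with the density power divergences: for a data density g, a model density fθ and a tuning parameter α>0,
\[
d_{\alpha}(g,f_{\theta})\;=\;\int\Bigl\{f_{\theta}^{\,1+\alpha}-\Bigl(1+\tfrac{1}{\alpha}\Bigr)\,g\,f_{\theta}^{\,\alpha}+\tfrac{1}{\alpha}\,g^{\,1+\alpha}\Bigr\}\,\mathrm{d}\mu ,
\]
whose minimization in θ needs neither grouping nor smoothing: the only term coupling g with θ is linear in g, so the empirical distribution can be substituted for g directly.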
The paper solves the problem of minimizing the Kullback divergence between a partially known and a completely known probability distribution. It considers two probability distributions of a random vector (u1,x1,...,uT,xT) on a sample space of 2T dimensions. One of the distributions is known, the other only partially. Namely, only the conditional probability distributions of xτ given u1,x1,...,uτ−1,xτ−1,uτ are known for τ=1,...,T. Our objective is to determine the remaining conditional probability distributions of uτ given u1,x1,...,uτ−1,xτ−1 such that the Kullback divergence of the partially known distribution with respect to the completely known distribution is minimal. An explicit solution of this problem was found previously for Markovian systems by Kárný \cite{Karny:96a}. The general solution is given in this paper.
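Writing both joint distributions through their chain-rule factorizations makes the structure of the problem explicit; with densities assumed to exist and with the conditioning histories abbreviated by "·", the criterion minimized over the unknown factors F(uτ∣u1,x1,…,uτ−1,xτ−1) is (a sketch, in the notation of the abstract)
\[
D(F\,\|\,G)\;=\;\int F(u_{1},x_{1},\dots,u_{T},x_{T})\,
\ln\frac{\prod_{\tau=1}^{T}F(u_{\tau}\mid\cdot)\,F(x_{\tau}\mid\cdot)}
{\prod_{\tau=1}^{T}G(u_{\tau}\mid\cdot)\,G(x_{\tau}\mid\cdot)}\;
\mathrm{d}(u_{1},x_{1},\dots,u_{T},x_{T}),
\]
where the factors F(xτ∣u1,x1,…,xτ−1,uτ) are fixed and given.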