In this paper we present a result that relates the merging of closed convex sets of discrete probability functions by the squared Euclidean distance and by the Kullback-Leibler divergence, drawing inspiration from the Rényi entropy. While selecting the probability function with the highest Shannon entropy is a convincingly justified way of representing a single closed convex set of probability functions, the discussion on how to represent several closed convex sets of probability functions is still ongoing. The presented result provides a perspective on this discussion. Furthermore, for those who prefer the standard minimisation based on the squared Euclidean distance, it provides a connection to a probabilistic merging operator based on the Kullback-Leibler divergence, which is closely related to the Shannon entropy.
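As a rough illustration of the objects involved (not the merging operator of the paper), the sketch below contrasts the two projections for a single closed convex set of probability vectors given by a hypothetical linear constraint: the KL-projection of the uniform distribution onto the set, which coincides with its maximum Shannon entropy element, and the squared-Euclidean projection of the uniform distribution, which is in general a different point. The constraint and all names are illustrative assumptions.

# Illustrative sketch only; the convex set here is a hypothetical example.
import numpy as np
from scipy.optimize import minimize

n = 3
uniform = np.full(n, 1.0 / n)

# Convex set: probability vectors p on {1, 2, 3} with expected value 2.5,
# i.e. 1*p1 + 2*p2 + 3*p3 = 2.5, together with the simplex constraints.
constraints = [
    {"type": "eq", "fun": lambda p: p.sum() - 1.0},
    {"type": "eq", "fun": lambda p: p @ np.array([1.0, 2.0, 3.0]) - 2.5},
]
bounds = [(1e-9, 1.0)] * n

def kl(p, q):
    return np.sum(p * np.log(p / q))

# KL-projection of the uniform distribution = maximum Shannon entropy element,
# since D(p || uniform) = log(n) - H(p).
p_kl = minimize(lambda p: kl(p, uniform), uniform,
                bounds=bounds, constraints=constraints).x

# Squared-Euclidean projection of the uniform distribution onto the same set.
p_l2 = minimize(lambda p: np.sum((p - uniform) ** 2), uniform,
                bounds=bounds, constraints=constraints).x

print("maximum-entropy (KL) projection:", p_kl)
print("squared-Euclidean projection:   ", p_l2)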
This work studies standard exponential families of probability measures on Euclidean spaces that have finite supports. In such a family parameterized by means, the mean is assumed to move along a segment inside the convex support towards an endpoint on the boundary of the support. The limit behavior of several quantities related to the exponential family is described explicitly. In particular, the variance functions and information divergences are studied around the boundary.
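A minimal numerical sketch of this setting, under my own illustrative choices (a one-dimensional family over the support {0, 1, 2} with a uniform base measure, not the paper's general setup): as the natural parameter grows, the mean parameter moves along the segment (0, 2) towards the boundary point 2 of the convex support, and the variance function tends to zero there.

import numpy as np

support = np.array([0.0, 1.0, 2.0])
base = np.full(3, 1.0 / 3.0)          # reference (base) measure, an assumption

def log_laplace(t):
    # log-Laplace transform (cumulant function) of the base measure
    return np.log(np.sum(base * np.exp(t * support)))

def mean_and_variance(t):
    q = base * np.exp(t * support - log_laplace(t))   # family member Q_t
    m = np.sum(q * support)                           # mean parameter
    v = np.sum(q * (support - m) ** 2)                # variance function at m
    return m, v

for t in [0.0, 2.0, 5.0, 10.0]:
    m, v = mean_and_variance(t)
    print(f"t = {t:5.1f}   mean = {m:.6f}   variance = {v:.3e}")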
The information divergence of a probability measure P from an exponential family E over a finite set is defined as the infimum of the divergences of P from Q, subject to Q ∈ E. All directional derivatives of the divergence from E are found explicitly. To this end, the behaviour of the conjugate of a log-Laplace transform on the boundary of its domain is analysed. First-order conditions for P to be a maximizer of the divergence from E are presented, including new ones for the case when P is not projectable to E.
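The defining infimum can be approximated numerically. The sketch below, under the same illustrative one-parameter family over {0, 1, 2} as above (an assumption, not the paper's family), minimises D(P || Q_t) over the natural parameter t; for a full exponential family the minimiser matches the mean of P (moment matching).

import numpy as np
from scipy.optimize import minimize_scalar

support = np.array([0.0, 1.0, 2.0])
base = np.full(3, 1.0 / 3.0)

def member(t):
    # Q_t proportional to base * exp(t * x)
    w = base * np.exp(t * support)
    return w / w.sum()

def kl(p, q):
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

P = np.array([0.7, 0.1, 0.2])          # an arbitrary example measure

# D(P || E) = inf over Q in E of D(P || Q), approximated over t in [-20, 20].
res = minimize_scalar(lambda t: kl(P, member(t)),
                      bounds=(-20, 20), method="bounded")
print("divergence from the family:", res.fun)
print("mean of P:", P @ support, "  mean of the projection:", member(res.x) @ support)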
Several algorithms have been developed for time series forecasting. In this paper, we develop an algorithm that uses numerical methods to optimize an objective function, namely the Kullback-Leibler divergence between the joint probability density function of a time series X1, X2, ..., Xn and the product of their marginal distributions. The Gram-Charlier expansion is used for estimating these distributions.
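A simplified sketch of this objective only: the KL divergence between the joint distribution of lag-1 pairs (X_t, X_{t+1}) of a toy series and the product of their marginals, estimated here with plain two-dimensional histograms rather than the Gram-Charlier expansion used in the paper. The series, the lag-1 choice, and all names are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(size=1000))          # toy time series
pairs = np.column_stack([x[:-1], x[1:]])      # lag-1 pairs (X_t, X_{t+1})

bins = 20
joint, _, _ = np.histogram2d(pairs[:, 0], pairs[:, 1], bins=bins)
joint = joint / joint.sum()                   # empirical joint distribution
px = joint.sum(axis=1, keepdims=True)         # marginal of X_t
py = joint.sum(axis=0, keepdims=True)         # marginal of X_{t+1}
prod = px * py                                # product of the marginals

mask = joint > 0
kl = np.sum(joint[mask] * np.log(joint[mask] / prod[mask]))
print("estimated KL(joint || product of marginals):", kl)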
Using the weights obtained by the neural network, and adding to them the Kullback-Leibler divergence of these weights, we obtain new weights that are used for forecasting the new value of Xn+k.