Issue No. 12 - December (1995 vol. 17)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/34.476512
<p><it>Abstract</it>: In this paper, we apply the leaving-one-out concept to the estimation of 'small' probabilities, i.e., the case where the number of training samples is much smaller than the number of possible classes. After deriving the Turing-Good formula in this framework, we introduce several specific models that avoid the problems of the original Turing-Good formula: the constrained model, the absolute discounting model, and the linear discounting model. These models are then applied to the problem of bigram-based stochastic language modeling. Experimental results are presented for a German and an English corpus.</p>
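The discounting idea named in the abstract can be illustrated with a small sketch. The abstract itself gives no formulas, so everything below is an assumption based on commonly quoted forms: the Turing-Good mass for unseen events is taken as n_1 / N, the absolute discount as the closed-form approximation b = n_1 / (n_1 + 2 n_2), and the freed mass is spread uniformly over unseen classes. The function names are hypothetical and the code is not claimed to match the paper's own derivation.

```python
from collections import Counter


def count_of_counts(counts):
    """Map r -> n_r, the number of distinct events observed exactly r times."""
    return Counter(counts.values())


def turing_good_unseen_mass(counts):
    """Total probability assigned to unseen events, assumed to be n_1 / N."""
    n = count_of_counts(counts)
    total = sum(counts.values())
    return n[1] / total


def absolute_discount(counts):
    """Assumed closed-form approximation b = n_1 / (n_1 + 2 * n_2)."""
    n = count_of_counts(counts)
    denom = n[1] + 2 * n[2]
    if denom == 0:
        raise ValueError("need at least one event seen once or twice")
    return n[1] / denom


def absolute_discounting_probs(counts, num_classes):
    """Subtract b from every observed count; share the freed mass
    uniformly among the classes that were never observed."""
    total = sum(counts.values())
    seen = len(counts)
    unseen = num_classes - seen
    if unseen <= 0:
        raise ValueError("this sketch assumes at least one unseen class")
    b = absolute_discount(counts)
    probs = {event: (c - b) / total for event, c in counts.items()}
    p_unseen = (b * seen / total) / unseen  # probability of each unseen class
    return probs, p_unseen


# Toy data: 7 samples, 4 observed classes, 6 possible classes (hypothetical).
counts = Counter("a a a b b c d".split())
probs, p_unseen = absolute_discounting_probs(counts, num_classes=6)
```

In this toy setting the seen-event probabilities plus the two unseen-class probabilities sum to one, which is the basic property any such discounting scheme should preserve.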
Keywords: Stochastic language modeling, leaving-one-out, zero-frequency problem, maximum likelihood estimation, generalization capability.
R. Kneser, U. Essen and H. Ney, "On the Estimation of 'Small' Probabilities by Leaving-One-Out," in IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 17, no. 12, pp. 1202-1212, 1995.