Recognizing Sources of Random Strings
April 1991 (vol. 13 no. 4)
pp. 386-394

The identification of a source given a sequence of random strings is discussed. Two modes of random string generation are analyzed. In the first mode, arbitrary strings are generated; in the second, the individual symbols occur exactly once in each random string. The latter case corresponds to the situation in which the sources generate random permutations. In both cases, the best match to the distribution being used by each source can be obtained by maintaining an exponential number of statistics. Since this is infeasible, a simple parameterization of the distributions is proposed. For arbitrary strings, the simple unigram-based model (U-model) is proposed. For the case of permutations, a new model called the S-model is proposed, and it is used to analyze and/or approximate unknown distributions of permutations. The relevant estimation procedures, together with the applications to source recognition, are presented. The method presents a unique blend of syntactic and statistical pattern recognition.
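
The abstract does not spell out the U-model, so the following is only a minimal sketch of what unigram-based source recognition could look like, not the authors' procedure. It assumes that each source emits symbols independently from a fixed per-symbol distribution, that this distribution is estimated from sample strings by smoothed relative frequencies, and that an unknown string is attributed to the source with the highest log-likelihood. The function names, the smoothing constant `alpha`, and the toy sources "A" and "B" are hypothetical.

```python
from collections import Counter
from math import log


def estimate_unigram(strings, alphabet, alpha=1.0):
    """Estimate per-symbol probabilities from sample strings.

    Assumption: add-alpha smoothing, so no symbol gets probability zero.
    """
    counts = Counter(ch for s in strings for ch in s)
    total = sum(counts[a] for a in alphabet) + alpha * len(alphabet)
    return {a: (counts[a] + alpha) / total for a in alphabet}


def log_likelihood(string, probs):
    """Log-probability of a string under independent symbol draws."""
    return sum(log(probs[ch]) for ch in string)


def recognize_source(string, models):
    """Return the name of the source whose model best explains the string."""
    return max(models, key=lambda name: log_likelihood(string, models[name]))


# Toy example: two hypothetical sources over the alphabet {a, b}.
alphabet = "ab"
models = {
    "A": estimate_unigram(["aaab", "aaba"], alphabet),
    "B": estimate_unigram(["bbba", "babb"], alphabet),
}
print(recognize_source("aab", models))
```

A model of this kind keeps one parameter per symbol rather than one per possible string, which is the sort of saving the abstract refers to when it replaces an exponential number of statistics with a simple parameterization. It does not apply to the permutation case, where every string contains each symbol exactly once and unigram counts cannot tell sources apart.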

Index Terms:
random string sources recognition; syntactic pattern recognition; estimation theory; identification; permutations; statistics; unigram-based model; U-model; S-model; statistical pattern recognition; pattern recognition; statistical analysis
Citation:
R.S. Valiveti, B.J. Oommen, "Recognizing Sources of Random Strings," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 4, pp. 386-394, April 1991, doi:10.1109/34.88575