Issue No.01 - January/March (2009 vol.31)
pp: 8-31
R. Mahesh K. Sinha, Indian Institute of Technology, Kanpur
ABSTRACT
This overview examines the historical development of mechanizing Indian scripts and the computer processing of Indian languages. While examining possible solutions, the author describes the challenges involved in their design and in exploiting their structural similarity, which leads to a unified solution. The focus is on the Devanagari script and Hindi language, and on the technological solutions for processing them.
Keywords: Indian language keyboarding, Indian language coding, Indian language display, Indian language processing
CITATION
R. Mahesh K. Sinha, "A Journey from Indian Scripts Processing to Indian Language Processing", IEEE Annals of the History of Computing, vol.31, no. 1, pp. 8-31, January/March 2009, doi:10.1109/MAHC.2009.1
REFERENCES
1. J. Beames, A Comparative Grammar of the Modern Aryan Languages of India: Hindi, Panjabi, Sindhi, Gujarati, Marathi, Oriya, and Bangali, 3 vols., Trübner, 1872–1879.
2. M.B. Emeneau, Language and Linguistic Area, Stanford Univ. Press, 1980.
3. S.R. Hill and P.G. Harrison, Dhatu-Patha: The Roots of Language, Munshiram Manoharlal Publishers Pvt. Ltd., 1997.
4. S.K. Chatterji, The Origin and Development of Bengali Language, Rupa & Co., 2002.
5. B. Kachru, The Alchemy of English: The Spread, Functions and Models of Non-Native Englishes, Pergamon Press, 1986.
6. J. Baldridge, "Linguistic and Social Characteristics of Indian English," Language In India, vol. 2, no. 4, June-July 2002.
7. Devanagari through the Ages, pub. no. 8/67, Central Hindi Directorate, New Delhi, 1967.
8. H. Scharfe, "Kharoṣṭī and Brāhmī," J. Am. Oriental Soc., vol. 122, no. 2, 2002, pp. 391-393.
9. A.S. Mahmud, "Crisis and Need: Information and Communication Technology in Development Initiatives Runs through a Paradox," ITU Document WSIS/PC-2/CONTR/17-E, World Summit on Information Society, Int'l Telecommunication Union (ITU), Geneva, 2003.
10. R.M.K. Sinha, "Multilinguality and Global Digital Divide," presented at the Joint IAMCR/ICA Int'l Symp.: The Digital Divide, 2001.
11. "ITU's Asia-Pacific Telecommunication and ICT Indicators Report Focuses on Broadband Connectivity: Too Much or Too Little?"; 1 Sept. 2008; http://www.itu.int/newsroom/press_releases/200825.html.
12. M. Schwartz, "Fastap Hindi Language Platform Slated to Revolutionise India Mobile Market," 7 Mar. 2008; http://www.developingtelecoms.com/content/view/116526/.
13. C.A. Arnaldo, "A Holistic Approach to the Computerization of Asian Scripts," Computer Processing of Asian Languages: CPAL-2 Proc., R.M.K. Sinha ed., Tata McGraw Hill, 1992, pp. 1-24.
14. Hanzix Work Group, "Open Systems Environment for Hanzi Input Methods," Computer Processing of Asian Languages: CPAL-2 Proc., R.M.K. Sinha ed., Tata McGraw Hill, 1992, pp. 49-58.
15. P. Lofting et al., "Handwriting: From Bamboo to Laser," Computer Processing of Asian Languages: CPAL-2 Proc., R.M.K. Sinha ed., Tata McGraw Hill, 1992, pp. 93-112.
16. T.C. Chen, "Hanzi Characters and Their Computerizations," Computer Processing of Asian Languages: CPAL-2 Proc., R.M.K. Sinha ed., Tata McGraw Hill, 1992, pp. 34-48.
17. Y.S. Moon, "Digital Fonts for Oriental Ideographical Languages," Proc. Workshop Computer Processing of Asian Languages, Asian Inst. of Technology, Bangkok, Thailand (AIT), 1989, pp. 168-174.
18. P.H. Noncarrow, "48,000 Characters in Search of a System," presented at Symp. Linguistic Implications of Computer-based Information Systems, New Delhi, 1978.
19. K. Hensch, "IBM History of Far Eastern Languages in Computing, Part 1: Requirements and Initial Phonetic Product Solutions in the 1960s," IEEE Annals of the History of Computing, vol. 27, no. 1, 2005, pp. 17-26.
20. K. Hensch, "IBM History of Far Eastern Languages in Computing, Part 2: Initial Efforts for Full Kanji Solutions, 1970s," IEEE Annals of the History of Computing, vol. 27, no. 1, 2005, pp. 27-37.
21. K. Hensch, "IBM History of Far Eastern Languages in Computing, Part 3: IBM Japan Taking the Lead, Accomplishments through the 1990s," IEEE Annals of the History of Computing, vol. 27, no. 1, 2005, pp. 38-55.
22. J. Stevens, Sacred Calligraphy of the East, 3rd ed., Shambhala, 1996.
23. L.S. Wakankar, Ganesh Vidya: The Traditional Indian Approach to Phonetic Writing, Tata Press, 1968.
24. In Unicode this halant symbol has been incorrectly called a viram. Viram actually represents a full stop, and Devanagari uses a vertical line (a danda) for this.
25. R.M.K. Sinha, "Computer Processing of Indian Languages and Scripts—Potentialities and Problems," J. Institution of Electronics and Telecommunication Engineers, vol. 30, no. 6, 1984, pp. 133-149.
26. R.M.K. Sinha, "Rule Based Contextual Post-Processing for Devanagari Text Recognition," Pattern Recognition, vol. 20, no. 5, 1987, pp. 475-485.
27. S.K. Das, A History of Indian Literature: 1800–1910, Sahitya Akademi, New Delhi, 1991.
28. J. Gilchrist, Grammar of the Hindoostanee Language, or Part Third of Volume First, of a System of Hindoostanee Philology, Chronicle Press, Calcutta, 1796.
29. W. Franklin, Introduction to The Bhǎgvǎt-Gēētā; The HěětŌpǎdēs of Veěshnōō-Sǎrmā, C. Wilkins trans., Ganesha, 2001, pp. xxiv-xxv.
30. B.S. Naik, Typography of Devanagari, Bombay, Directorate of Languages, government of Maharashtra, 1971, vol. 2, pp. 636-639; http://listserv.linguistlist.org/cgi-bin/wa?A2=ind0001&L=indology&D=1&P=20160. While researching the history of the Devanagari typewriter, I found information on the history of Rudraprayag (currently part of the state of Uttarakhand) noting that the king of Rudraprayag, Kirti Shah, invented a typewriter for Hindi around 1892 and gave the copyright to an unnamed company (http://rudraprayag.nic.in/history.htm); further information is not traceable thus far.
31. Vrunda (archivist), Godrej Archives; http://www.archives.godrej.com, personal communication, Sept. 2008.
32. P.V.H.M.L. Narasimham, B. Prasad, and V. Rajaraman, "Code Based Keyboard for Indian Languages," J. Computer Soc. of India, vol. 2, 1971, pp. 33-37.
33. R.M.K. Sinha and H.N. Mahabala, "Machine Oriented Devanagari Script," J. Institution of Electronics and Telecommunication Engineers, vol. 19, 1973, pp. 623-628.
34. R.M.K. Sinha, "Syntactic Pattern Analysis and its application to Devanagari Script Recognition," PhD dissertation, Electrical Eng. Dept., IIT Kanpur, 1973.
35. R.M.K. Sinha, principal investigator, Integrated Devanagari Computer (IDC), tech. report IDC-84-1, Dept. of Electrical Eng., IIT Kanpur, 1984.
36. R.M.K. Sinha, "Teaching Script on a Digital Computer," J. Institution of Electronics and Telecommunication Engineers, vol. 22, 1976, pp. 720-722.
37. R.M.K. Sinha, "Computer Processing of Indian Languages," presented at 4th Int'l Conf. Computing in Humanities, 1979.
38. R.M.K. Sinha and A. Raman, "A Modular Indian Language Data Terminal," Computer Graphics (ACM SIGGRAPH), vol. 14, ACM Press, 1980, pp. 39-72.
39. R.M.K. Sinha, "Computers for Indian Languages," Proc. Ann. Convention of Computer Soc. of India, Computer Soc. of India, Madras, 1982, pp. 163-174.
40. R.M.K. Sinha and B. Srinivasan, "Machine Transliteration from Roman to Devanagari and Devanagari to Roman," J. Institution of Electronics and Telecommunication Engineers, vol. 30, no. 6, 1984, pp. 243-245.
41. R.M.K. Sinha and K.S. Singh, "A Program for Correction of Single Errors in Hindi Words," J. Institution of Electronics and Telecommunication Engineers, vol. 30, no. 6, 1984, pp. 249-251.
42. R.M.K. Sinha, Data Representations for Indian Language Databases, tech. report TRCS-84-22, Dept. of Computer Science, IIT Kanpur, 1984.
43. R.M.K. Sinha, "Non-Latin Information Systems: Some Basic Issues," Proc. Conf. Information Processing, H. Kugler ed., Elsevier Science, 1986.
44. R.M.K. Sinha, "A Sanskrit Based Word-expert Model for Machine Translation among Indian Languages," Proc. Workshop Computer Processing of Asian Languages, AIT, 1989, pp. 82-91.
45. R.M.K. Sinha et al., "AnglaBharti: A Multi-lingual Machine Aided Translation Project on Translation from English to Hindi," IEEE Int'l Conf. Systems, Man and Cybernetics, IEEE Press, 1995, pp. 1609-1614.
46. R.M.K. Sinha, "Hybridizing Rule-Based and Example-Based Approaches in Machine Aided Translation System," 2000 Int'l Conf. Artificial Intelligence (IC-AI 2000), CSREA Press, 2000, pp. 1247-1252.
47. R.M.K. Sinha, "An Engineering Perspective of Machine Translation: AnglaBharti-II and AnuBharti-II Architectures," Proc. Int'l Symp. Machine Translation, NLP and Translation Support System (iSTRANS-2004), Tata McGraw Hill, 2004, pp. 10-17.
48. R.M.K. Sinha and A. Thakur, "Machine Translation of Bi-lingual Hindi-English (Hinglish) Text," MT Summit X, Proc.: The Tenth Machine Translation Summit, Phuket, Thailand, 2005, pp. 149-156.
49. R.M.K. Sinha, "A Hybridized EBMT System for Hindi to English Translation," CSI J., vol. 37, no. 4, 2007, pp. 3-9.
50. M. Kulkarni, personal communication; http://www.cdac.in/html/gistabout.asp.
51. K. Sivaraman, "AnglaBharati: A Machine Aided Translation System from English to Indian Languages—English to Tamil Version," M.Tech thesis, Dept. of Computer Science & Eng., IIT Kanpur, 1993.
52. P.V.H.M.L. Narasimham et al., Design Information Report on Text Composition in Telugu, Computer Maintenance Corp., Secunderabad, 1981.
53. O. Vikas, "Summary Report of the Symposium on Linguistic Implications of Computer Based Information Systems," Electronics Information & Planning, New Delhi, government of India, 1978, pp. 801-804.
54. A. Bharati et al., Anusaaraka: Machine Translation in Stages, report no. IIIT/TR/1997/1, IIIT Hyderabad, 1997.
55. A.K. Pathak, "An Input/Output Terminal for Indian Languages," M.Tech thesis, Dept. of Electrical Eng., IIT Kanpur, 1978.
56. M.P. Sastri, "A Universal Script Generator for Indian Languages," M.Tech thesis, Dept. of Electrical Eng., IIT Kanpur, 1978.
57. J. Institution of Electronics and Telecommunication Engineers, vol. 30, no. 6, 1984, special issue on computer processing of Indian languages and scripts, R.M.K. Sinha guest ed.
58. A. Mathur and F. Fowler, "Design of a Dynamically Reconfigurable Keyboard," Proc. Int'l Conf. Chinese and Oriental Language Computing, IEEE CS Press, 1987, pp. 20-23.
59. B. Nag, "Information Technology for Indian Scripts: Problems and Prospects," Proc. Workshop Computer Processing of Asian Languages, AIT, 1989, pp. ks-2-15.
60. K.P.S. Menon, "High Speed, Visual Direct Indian Language Data Entry," Indian Linguistics, 1974, vol. 35, pp. 97-111.
61. N. Mate, "Keyboard Overview—An Accommodative Approach for Devanagari Keyboard," Computer Processing of Asian Languages: CPAL-2 Proc., R.M.K. Sinha ed., Tata McGraw Hill, 1992, pp. 291-298.
62. S.P. Mudur, An Alphabetization Procedure for Devanagari Words, tech. report, Nat'l Centre for Software Development and Computing Technology, 1978.
63. J.N. Tripathi, "Statistical Studies of Printed Devanagari Text (Hindi)," J. Institution of Telecommunication Engineers, 1971.
64. O. Vikas, "Standardizing Representation of Indian Languages for Information Processing," Proc. Int'l Symp. Machine Translation, NLP and Translation Support System (iSTRANS-2004), Tata McGraw Hill, 2004, pp. 313-314.
65. Standardisation of Indian Script Code for Information Interchange and Keyboard Layout, Dept. of Electronics, government of India, 1983.
66. "Report of the Committee on Standardization for Indian Scripts and Keyboard Layout," IPAG J., Ministry of Communication and Information Technology, New Delhi, Oct. 1986.
67. R.M.K. Sinha, "Standardizing Linguistic Information—An Overview," Computer Processing of Asian Languages: CPAL-2 Proc., R.M.K. Sinha ed., Tata McGraw Hill, 1992, pp. 272-290.
68. M. Chandrasekaran, "Machine Recognition of the Modern Tamil Script," PhD dissertation, Univ. of Madras, India, 1982.
69. R. Chandrasekaran, "Computer Recognition of Certain Ancient and Modern Indian Scripts," PhD dissertation, Univ. of Madras, India, 1982.
70. U. Pal and B.B. Chaudhuri, "Indian Script Character Recognition: A Survey," Pattern Recognition, vol. 37, 2004, pp. 243-245.
71. V. Bansal, "Role of Knowledge in Document Recognition—A Case Study for Devanagari Script," PhD dissertation, Dept. of Computer Science and Eng., IIT Kanpur, 1999.
72. R.M.K. Sinha, "Methodology for Computer Recognition of Devanagari Scripts," IEEE-SMC Int'l Conf., IEEE Press, 1984, pp. 1220-1224.
73. H. Nomura, "Role of AI in Natural Language Processing for Asian Languages," Computer Processing of Asian Languages: CPAL-2 Proc., R.M.K. Sinha ed., Tata McGraw Hill, 1992, pp. 147-152.
74. R. Narasimhan, "Technology Support for Asian Language Studies and Applications," Computer Processing of Asian Languages: CPAL-2 Proc., R.M.K. Sinha ed., Tata McGraw Hill, 1992, pp. 25-33.
75. R.M.K. Sinha and G.C. Pathak, "A Heuristic Based Question Answering System in Natural Hindi," IEEE-SMC Int'l Conf., IEEE Press, 1984, pp. 1009-1013.
76. P.C. Ganeshsundaram, "The P-Structure C-Structure Grammar (PCG) for the Contrastive Study of Two or More Languages," J. Indian Inst. of Science, 1978, pp. 167-191.
77. Proc. Workshop on Computer Processing of Asian Languages, AIT, 1989.
78. R.M.K. Sinha ed., Second Regional Workshop on Computer Processing of Asian Languages: CPAL-2 Proc., Tata McGraw Hill, 1992.
79. Report and Recommendations of Second Regional Workshop on Computer Processing of Asian Languages: CPAL-2, IIT Kanpur, India, pp. 19-21.
80. S. Dave, J. Parikh, and P. Bhattacharyya, "Interlingua Based English Hindi Machine Translation and Language Divergence," J. Machine Translation (JMT), vol. 17, Sept. 2002, pp. 251-304.
81. P. Bhattacharyya, "Knowledge Extraction into Universal Networking Language Expressions," Proc. Universal Networking Language Workshop, 2001.
82. D. Narayan et al., "An Experience in Building the Indo WordNet—A WordNet for Hindi," Proc. Int'l Conf. Global WordNet (GWC 02), 2002.