Semantic Modeling and Knowledge Representation in Multimedia Databases
January/February 1999 (vol. 11 no. 1)
pp. 64-80

Abstract: In this paper, we present the current state of the art in semantic data modeling of multimedia data. Semantic conceptualization can be performed at several levels of information granularity, leading to multilevel indexing and searching mechanisms. We compare various models at different levels of granularity. At the finest level, multimedia data can be indexed by image content, for example by identifying objects and faces. At a coarser level, indexing can focus on events and episodes, which are higher-level abstractions. Against this background, we also examine modeling and indexing techniques for multimedia documents.

Index Terms:
Multimedia data modeling, multimedia document modeling, semantic modeling, knowledge representation, content-based retrieval, image databases, video databases, video indexing, query formulation, query processing.
Wasfi Al-Khatib, Y. Francis Day, Arif Ghafoor, P. Bruce Berra, "Semantic Modeling and Knowledge Representation in Multimedia Databases," IEEE Transactions on Knowledge and Data Engineering, vol. 11, no. 1, pp. 64-80, Jan.-Feb. 1999, doi:10.1109/69.755616