Sudeep Sarkar

University of South Florida
4202 E Fowler Ave., ENB 118
Computer Science and Engineering
Tampa, Florida 33620
Phone: (813) 974 2113
Fax: (813) 974 5456
Email: sarkar@cse.usf.edu
URL: http://www.cse.usf.edu/~sarkar/


DVP term expires December 2012

Sudeep Sarkar (Senior Member, IEEE) received the B.Tech degree in Electrical Engineering from the Indian Institute of Technology, Kanpur, in 1988. He received the M.S. and Ph.D. degrees in Electrical Engineering, on a University Presidential Fellowship, from The Ohio State University, Columbus, in 1990 and 1993, respectively. Since 1993, he has been with the Computer Science and Engineering Department at the University of South Florida, Tampa, where he is currently a Professor and the Research Administration Faculty Fellow in the university's Office of Research and Innovation. His research interests include perceptual organization in single images and multiple image sequences, automated sign language recognition, biometrics and nano-computing.
He is the co-author of the book "Computing Perceptual Organization in Computer Vision," published by World Scientific. He is also the co-editor of the book "Perceptual Organization for Artificial Vision Systems," published by Kluwer Publishers. He is the recipient of the National Science Foundation CAREER award in 1994, the USF Teaching Incentive Program Award for undergraduate teaching excellence in 1997, the Outstanding Undergraduate Teaching Award in 1998, and the Ashford Distinguished Scholar Award in 2004. He served on the editorial boards of the IEEE Transactions on Pattern Analysis and Machine Intelligence (1999-2003) and the Pattern Analysis & Applications Journal (2000-2001). He is currently serving on the editorial boards of the Pattern Recognition journal, IEEE Transactions on Systems, Man, and Cybernetics, Part-B, Image and Vision Computing, and IET Computer Vision. He is a fellow of the IAPR.


Human Computer Communication Using Sign Language
Sign languages are complex, abstract linguistic systems with their own grammars. This talk will introduce you to automated algorithms that can take sign language video and recognize the signs performed. This ability would be useful in facilitating communication between Deaf and hearing persons, mediated by a computing device coupled with cameras. The scientific goal is ultimately to go beyond the recognition of isolated signs or continuous signs in short sentences based on video, without the use of special equipment such as data gloves or magnetic markers. The focus of the talk will be on the design of scalable formalisms for representation, model learning, and matching methods that are robust to image segmentation errors.

Guided by audience interest, the talk will explore a subset of the representations and approaches that:

1.    Capture the global (Gestalt) configuration of hand and face relationships using relational distributions. This representation is somewhat robust to segmentation errors and does not require part tracking. See figure above.

2.    Learn, without supervision, sign models from examples through automated common motif extraction based on Markov Chain Monte Carlo methods. We formulated an unsupervised approach to both extract and learn models for continuous basic units of signs, which we term signemes, from continuous sentences. Given a set of sentences with a common sign, we can automatically learn the model for the part of the sign, or signeme, that is least affected by movement epenthesis effects.

3.    Distinguish true signs from the transitional movements, called Movement Epenthesis, that the signer makes while moving from one sign to the next. Movement Epenthesis (ME) is a serious hurdle in the design of continuous sign language recognition systems. This problem is further compounded by the ambiguity of feature detection and by occlusions, which result in the propagation of errors to higher levels. We have formulated a novel framework that can address both of these problems in an overall dynamic programming (DP) approach.

4.    Automatically segment an ASL sentence into signs using Conditional Random Fields.

5.    Match signs and gestures in the presence of segmentation noise using fragment-Hidden Markov Models (frag-HMM).
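
To give a feel for the first item in the list above, the following is a minimal sketch of a relational distribution, assuming (as an illustration only) that the low-level features are 2D edge-pixel coordinates and that the pairwise relationship is the binned horizontal and vertical offset between them. The function name relational_distribution and the bin_size parameter are hypothetical and are not taken from the talk. Because only pairwise offsets are counted, no hand or face part needs to be tracked.

```python
from collections import Counter
from itertools import combinations


def relational_distribution(points, bin_size=4):
    """Normalized histogram of pairwise (|dx|, |dy|) offsets between points.

    Assumption for illustration: `points` are integer (x, y) pixel
    coordinates of low-level features such as edge pixels.
    """
    counts = Counter()
    for (x1, y1), (x2, y2) in combinations(points, 2):
        dx, dy = abs(x2 - x1), abs(y2 - y1)
        # Quantize the offset so that nearby relationships share a bin.
        counts[(dx // bin_size, dy // bin_size)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Divide by the number of pairs so the histogram sums to one.
    return {bin_key: n / total for bin_key, n in counts.items()}


if __name__ == "__main__":
    pts = [(0, 0), (3, 0), (0, 8), (10, 10)]
    for bin_key, prob in sorted(relational_distribution(pts).items()):
        print(bin_key, round(prob, 3))
```

Since the histogram depends only on offsets between features, translating the whole configuration in the image leaves it unchanged, which is one reason a representation of this kind can tolerate imperfect localization.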

Perceptual Organization: The Search for Structure and Organization in Images
Humans and animals use their visual abilities to navigate the world, forage for food, and survive. Is it possible to replicate some of these abilities on computers so that they can assist us and enhance our quality of life by being active components of our day-to-day lives? Within this context, object recognition is concerned with recognizing objects in images. The source of its high combinatorics is twofold. First, an object can appear in different poses (viewpoints) and under different illumination conditions; the possible appearances of any one given object are many. Second, the object (foreground) has to be separated from the surrounding scene (background) even before it can be submitted for recognition. This is further complicated by the fact that the surrounding context of an object can vary widely in different situations. To control the combinatorics, and there is evidence that animal vision also has similar solutions, the process involves three basic sub-processes: (i) extraction of robust, illumination and pose invariant features from the given image, (ii) grouping and organization of these low-level features into possible object hypotheses, and (iii) matching of these groups to object models. In this talk we will explore and analyze graph based solutions to the second of these sub-processes, i.e., grouping and organization of the image features. This is one of the largely unsolved, fundamental problems in vision and is central to the design of scalable artificial vision systems. It has been shown that the combinatorics of the recognition process in cluttered environments using constrained search reduces from exponential to low order polynomial if we use an intermediate grouping process. What is remarkable is that, unlike for the indexing case, this grouping process need not be perfect!
Perceptual organization offers an elegant framework to group low-level features that are likely to come from a single object. In recent years, one of the effective engines for perceptual organization of low-level image features has been the partitioning of a graph representation that captures local structures inspired by human perception, such as similarity, proximity, continuity, parallelism, and perpendicularity, over the low-level image features. In this talk I will summarize our experience with approaches based on graphs, graph spectra, and machine learning to solve this problem.
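
As a rough illustration of grouping by graph partitioning, the sketch below builds an affinity graph over 2D feature locations and splits it in two using the sign of the Fiedler vector (the eigenvector of the graph Laplacian with the second-smallest eigenvalue). This is a generic textbook spectral bipartition, not the specific method presented in the talk. It assumes that proximity is the only grouping cue and that a Gaussian of the distance is a suitable affinity; the names affinity and spectral_bipartition and the parameter sigma are hypothetical.

```python
import math
import random


def affinity(p, q, sigma=1.0):
    """Proximity-based affinity: close features get a weight near 1."""
    d2 = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2))


def spectral_bipartition(points, sigma=1.0, iters=500, seed=0):
    """Split points into two groups by the sign of the Fiedler vector.

    The Fiedler vector of L = D - W is found by power iteration on
    M = c*I - L, projecting out the constant vector at every step.
    """
    n = len(points)
    W = [[0.0 if i == j else affinity(points[i], points[j], sigma)
          for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in W]
    # Eigenvalues of L are at most 2 * max degree, so M is positive semidefinite.
    c = 2.0 * max(deg) + 1e-9
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]  # remove the trivial constant eigenvector
        Mv = [c * v[i] - (deg[i] * v[i]
                          - sum(W[i][j] * v[j] for j in range(n)))
              for i in range(n)]
        norm = math.sqrt(sum(x * x for x in Mv)) or 1.0
        v = [x / norm for x in Mv]
    return [0 if x < 0 else 1 for x in v]


if __name__ == "__main__":
    # Two well-separated clumps of features (made-up coordinates).
    pts = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
    print(spectral_bipartition(pts))
```

Which group receives label 0 or 1 is arbitrary, because an eigenvector is defined only up to sign. Cues other than proximity, such as continuity or parallelism, would enter through the affinity function.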

Gait Recognition: I Know You from the Way You Walk

It has been folklore that humans can identify others from a distance based on their biological movement. This observation was somewhat bolstered by experiments with light point displays by human perception researchers in the 70s and has been confirmed by recent human perception experiments. However, it is only recently that computer vision based gait biometrics has received much attention. Recent research on this topic, much of it facilitated by the structure of the DARPA HumanID Gait Challenge Problem, has brought to light interesting capabilities and limits of this modality. Recognition is possible from gait. In this talk, I will start by describing how this challenge framework, consisting of data sets, challenge experiments, and a baseline performance, has helped jump start the gait recognition area. I will also summarize some of the lessons learnt about which sources of gait variation are easy to overcome and which are still outstanding. Perhaps one of the important observations from a vision point of view, made by some researchers, is that gait shapes offer more stable cues for recognition, across different covariates, than gait dynamics. Building on these observations, I will summarize an approach that first performs gait dynamics normalization using a population HMM and then computes distances between gait shapes in a space that maximizes differences between individuals. This algorithm statistically improves recognition over all covariates in the DARPA HumanID Gait Challenge Problem. I will end the talk with some ideas about future data collection that will help move gait recognition technology forward to long distances, say 300 m, and 24/7 operational scenarios. The developed technology can also be used for making coarser distinctions other than identity, such as gender and race, from a distance.