Displaying 1-50 out of 90 total
Compositional Boosting for Computing Hierarchical Image Structures
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Tian-Fu Wu, Gui-Song Xia, Song-Chun Zhu
Issue Date:June 2007
pp. 1-8
In this paper, we present a compositional boosting algorithm for detecting and recognizing 17 common image structures in low-middle level vision tasks. These structures, called
 
Learning Generic Prior Models for Visual Computation
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Song Chun Zhu, David Mumford
Issue Date:June 1997
pp. 463
This paper presents a novel theory for learning generic prior models from a set of observed natural images based on a minimax entropy theory that the authors studied in texture modeling. We start by studying the statistics of natural images including the s...
 
Learning reconfigurable scene representation by tangram model
Found in: Applications of Computer Vision, IEEE Workshop on
By Jun Zhu, Tianfu Wu, Song-Chun Zhu, Xiaokang Yang, Wenjun Zhang
Issue Date:January 2012
pp. 449-456
This paper proposes a method to learn reconfigurable and sparse scene representation in the joint space of spatial configuration and appearance in a principled way. We call it the tangram model, which has three properties: (1) Unlike fixed structure of the...
 
Joint Video and Text Parsing for Understanding Events and Answering Queries
Found in: IEEE MultiMedia
By Kewei Tu, Meng Meng, Mun Wai Lee, Tae Eun Choe, Song-Chun Zhu
Issue Date:April 2014
pp. 42-70
This article proposes a multimedia analysis framework to process video and text jointly for understanding events and answering user queries. The framework produces a parse graph that represents the compositional structures of spatial information (objects a...
 
Inferring "Dark Matter" and "Dark Energy" from Videos
Found in: 2013 IEEE International Conference on Computer Vision (ICCV)
By Dan Xie, Sinisa Todorovic, Song-Chun Zhu
Issue Date:December 2013
pp. 2224-2231
This paper presents an approach to localizing functional objects in surveillance videos without domain knowledge about semantic object classes that may appear in the scene. Functional objects do not have discriminative appearance and shape, but they affect...
 
Cosegmentation and Cosketch by Unsupervised Learning
Found in: 2013 IEEE International Conference on Computer Vision (ICCV)
By Jifeng Dai, Ying Nian Wu, Jie Zhou, Song-Chun Zhu
Issue Date:December 2013
pp. 1305-1312
Cosegmentation refers to the problem of segmenting multiple images simultaneously by exploiting the similarities between the foreground and background regions in these images. The key issue in cosegmentation is to align common objects between these image...
 
Human Attribute Recognition by Rich Appearance Dictionary
Found in: 2013 IEEE International Conference on Computer Vision (ICCV)
By Jungseock Joo, Shuo Wang, Song-Chun Zhu
Issue Date:December 2013
pp. 721-728
We present a part-based approach to the problem of human attribute recognition from a single image of a human body. To recognize the attributes of human from the body parts, it is important to reliably detect the parts. This is a challenging task due to th...
 
Learning Near-Optimal Cost-Sensitive Decision Policy for Object Detection
Found in: 2013 IEEE International Conference on Computer Vision (ICCV)
By Tianfu Wu, Song-Chun Zhu
Issue Date:December 2013
pp. 753-760
Many object detectors, such as AdaBoost, SVM and deformable part-based models (DPM), compute additive scoring functions at a large number of windows scanned over image pyramid, thus computational efficiency is an important consideration beside accuracy per...
 
Modeling 4D Human-Object Interactions for Event and Object Recognition
Found in: 2013 IEEE International Conference on Computer Vision (ICCV)
By Ping Wei, Yibiao Zhao, Nanning Zheng, Song-Chun Zhu
Issue Date:December 2013
pp. 3272-3279
Recognizing the events and objects in the video sequence are two challenging tasks due to the complex temporal structures and the large appearance variations. In this paper, we propose a 4D human-object interaction model, where the two tasks jointly boost ...
 
Monte Carlo Tree Search for Scheduling Activity Recognition
Found in: 2013 IEEE International Conference on Computer Vision (ICCV)
By Mohamed R. Amer, Sinisa Todorovic, Alan Fern, Song-Chun Zhu
Issue Date:December 2013
pp. 1353-1360
This paper addresses recognition of human activities with stochastic structure, characterized by variable space-time arrangements of primitive actions, and conducted by a variable number of actors. Our approach classifies the activity of interest as well a...
 
Modeling Occlusion by Discriminative AND-OR Structures
Found in: 2013 IEEE International Conference on Computer Vision (ICCV)
By Bo Li, Wenze Hu, Tianfu Wu, Song-Chun Zhu
Issue Date:December 2013
pp. 2560-2567
Occlusion presents a challenge for detecting objects in real world applications. To address this issue, this paper models object occlusion with an AND-OR structure which (i) represents occlusion at semantic part level, and (ii) captures the regularities of...
 
Concurrent Action Detection with Structural Prediction
Found in: 2013 IEEE International Conference on Computer Vision (ICCV)
By Ping Wei, Nanning Zheng, Yibiao Zhao, Song-Chun Zhu
Issue Date:December 2013
pp. 3136-3143
Action recognition has often been posed as a classification problem, which assumes that a video sequence has only one action class label and that different actions are independent. However, a single human body can perform multiple concurrent actions at the sam...
 
Learning AND-OR Templates for Object Recognition and Detection
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Zhangzhang Si, Song-Chun Zhu
Issue Date:September 2013
pp. 2189-2205
This paper presents a framework for unsupervised learning of a hierarchical reconfigurable image template - the AND-OR Template (AOT) for visual objects. The AOT includes: 1) hierarchical composition as
 
Reconfigurable templates for robust vehicle detection and classification
Found in: Applications of Computer Vision, IEEE Workshop on
By Yang Lv, Benjamin Yao, Yongtian Wang, Song-Chun Zhu
Issue Date:January 2012
pp. 321-328
In this paper, we learn a reconfigurable template for detecting vehicles and classifying their types. We adopt a popular design for the part based model that has one coarse template covering entire object window and several small high-resolution templates ...
 
Learning Hybrid Image Templates (HIT) by Information Projection
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Zhangzhang Si, Song-Chun Zhu
Issue Date:July 2012
pp. 1354-1367
This paper presents a novel framework for learning a generative image representation, the hybrid image template (HIT), from a small number (i.e., 3 to 20) of image examples. Each learned template is composed of, typically, 50 to 500 image patches whose g...
 
Image representation by active curves
Found in: Computer Vision, IEEE International Conference on
By Wenze Hu, Ying Nian Wu, Song-Chun Zhu
Issue Date:November 2011
pp. 1808-1815
This paper proposes a sparse image representation using deformable templates of simple geometric structures that are commonly observed in images of natural scenes. These deformable templates include active curve templates and active corner templates. An ac...
 
Video Primal Sketch: A generic middle-level representation of video
Found in: Computer Vision, IEEE International Conference on
By Zhi Han, Zongben Xu, Song-Chun Zhu
Issue Date:November 2011
pp. 1283-1290
This paper presents a middle-level video representation named Video Primal Sketch (VPS), which integrates two regimes of models: i) sparse coding model using static or moving primitives to explicitly represent moving corners, lines, feature points, etc., i...
 
Parsing video events with goal inference and intent prediction
Found in: Computer Vision, IEEE International Conference on
By Mingtao Pei, Yunde Jia, Song-Chun Zhu
Issue Date:November 2011
pp. 487-494
In this paper, we present an event parsing algorithm based on Stochastic Context Sensitive Grammar (SCSG) for understanding events, inferring the goal of agents, and predicting their plausible intended actions. The SCSG represents the hierarchical composit...
 
Unsupervised learning of event AND-OR grammar and semantics from video
Found in: Computer Vision, IEEE International Conference on
By Zhangzhang Si, Mingtao Pei, Benjamin Yao, Song-Chun Zhu
Issue Date:November 2011
pp. 41-48
We study the problem of automatically learning event AND-OR grammar from videos of a certain environment, e.g. an office where students conduct daily activities. We propose to learn the event grammar under the information projection and minimum description...
 
C^4: Exploring Multiple Solutions in Graphical Models by Cluster Sampling
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Jake Porway, Song-Chun Zhu
Issue Date:September 2011
pp. 1713-1727
This paper presents a novel Markov Chain Monte Carlo (MCMC) inference algorithm called C^4—Clustering with Cooperative and Competitive Constraints—for computing multiple solutions from posterior probabilities defined on graphical models, including Markov r...
 
Discovering scene categories by information projection and cluster sampling
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Dengxin Dai, Tianfu Wu, Song-Chun Zhu
Issue Date:June 2010
pp. 3455-3462
This paper presents a method for unsupervised scene categorization. Our method aims at two objectives: (1) automatic feature selection for different scene categories. We represent images in a heterogeneous feature space to account for the large variabiliti...
 
Learning a probabilistic model mixing 3D and 2D primitives for view invariant object recognition
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Wenze Hu, Song-Chun Zhu
Issue Date:June 2010
pp. 2273-2280
This paper presents a method for learning mixed templates for view invariant object recognition. The template is composed of 3D and 2D primitives which are stick-like elements defined in 3D and 2D spaces respectively. The primitives are allowed to perturb with...
 
Layered Graph Matching with Composite Cluster Sampling
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Liang Lin, Xiaobai Liu, Song-Chun Zhu
Issue Date:August 2010
pp. 1426-1442
This paper presents a framework of layered graph matching for integrating graph partition and matching. The objective is to find an unknown number of corresponding graph structures in two images. We extract discriminative local primitives from both images ...
 
Program chairs' introduction to the first international workshop on stochastic image grammars (SIG-09) in conjunction with IEEE CVPR 2009
Found in: Computer Vision and Pattern Recognition Workshop
By S. Todorovic, Song-Chun Zhu
Issue Date:June 2009
pp. xviii-xx
The major theme of the program is to identify challenges facing the work toward a unifying theoretical foundation of stochastic image grammars. The program has research topics that deal with stochastic image grammars. The topics include learning vocabular...
   
Flow mosaicking: Real-time pedestrian counting without scene-specific learning
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Yang Cong, Haifeng Gong, Song-Chun Zhu, Yandong Tang
Issue Date:June 2009
pp. 1093-1100
In this paper, we present a novel algorithm based on flow velocity field estimation to count the number of pedestrians across a detection line or inside a specified region. We regard pedestrians across the line as fluid flow, and design a novel model to es...
 
Learning mixed templates for object recognition
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Zhangzhang Si, Haifeng Gong, Ying Nian Wu, Song-Chun Zhu
Issue Date:June 2009
pp. 272-279
This article proposes a method for learning object templates composed of local sketches and local textures, and investigates the relative importance of the sketches and textures for different object categories. Local sketches and local textures in the obje...
 
Trajectory parsing by cluster sampling in spatio-temporal graph
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Xiaobai Liu, Liang Lin, Song-Chun Zhu, Hai Jin
Issue Date:June 2009
pp. 739-746
The objective of this paper is to parse object trajectories in surveillance video against occlusion, interruption, and background clutter. We present a spatio-temporal graph (ST-Graph) representation and a cluster sampling algorithm via deferred inference....
 
Layered graph matching by composite cluster sampling with collaborative and competitive interactions
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Liang Lin, Kun Zeng, Xiaobai Liu, Song-Chun Zhu
Issue Date:June 2009
pp. 1351-1358
This paper studies a framework for matching an unknown number of corresponding structures in two images (shapes), motivated by detecting objects in cluttered background and learning parts from articulated motion. Due to the large distortion between shapes ...
 
A Compositional and Dynamic Model for Face Aging
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Jinli Suo, Song-Chun Zhu, Shiguang Shan, Xilin Chen
Issue Date:March 2010
pp. 385-401
In this paper, we present a compositional and dynamic model for face aging. The compositional model represents faces in each age group by a hierarchical And-Or graph, in which And nodes decompose a face into parts to describe details (e.g., hair, wrinkles,...
 
SAVE: A framework for semantic annotation of visual events
Found in: Computer Vision and Pattern Recognition Workshop
By Mun Wai Lee, Asaad Hakeem, Niels Haering, Song-Chun Zhu
Issue Date:June 2008
pp. 1-8
In this paper we propose a framework that performs automatic semantic annotation of visual events (SAVE). This is an enabling technology for content-based video annotation, query and retrieval with applications in Internet video search and video data minin...
 
A hierarchical and contextual model for aerial image understanding
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Jake Porway, Kristy Wang, Benjamin Yao, Song Chun Zhu
Issue Date:June 2008
pp. 1-8
In this paper we present a novel method for parsing aerial images with a hierarchical and contextual model learned in a statistical framework. We learn hierarchies at the scene and object levels to handle the difficult task of representing scene elements a...
 
An integrated background model for video surveillance based on primal sketch and 3D scene geometry
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Wenze Hu, Haifeng Gong, Song-Chun Zhu, Yongtian Wang
Issue Date:June 2008
pp. 1-8
This paper presents a novel integrated background model for video surveillance. Our model uses a primal sketch representation for image appearance and 3D scene geometry to capture the ground plane and major surfaces in the scene. The primal sketch model di...
 
A Hierarchical Compositional Model for Face Representation and Sketching
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Zijian Xu, Hong Chen, Song-Chun Zhu, Jiebo Luo
Issue Date:June 2008
pp. 955-969
We present a hierarchical-compositional face model as a three-layer And-Or graph to account for the structural variabilities over multiple resolutions. In the And-Or graph, an And-node represents a decomposition of certain graphical structure expanding to ...
 
Deformable Template As Active Basis
Found in: Computer Vision, IEEE International Conference on
By Ying Nian Wu, Zhangzhang Si, Chuck Fleming, Song-Chun Zhu
Issue Date:October 2007
pp. 1-8
This article proposes an active basis model and a shared pursuit algorithm for learning deformable templates from image patches of various object categories. In our generative model, a deformable template is in the form of an active basis, which consists o...
 
An Empirical Study of Object Category Recognition: Sequential Testing with Generalized Samples
Found in: Computer Vision, IEEE International Conference on
By Liang Lin, Shaowu Peng, Jake Porway, Song-Chun Zhu, Yongtian Wang
Issue Date:October 2007
pp. 1-8
In this paper we present an empirical study of object category recognition using generalized samples and a set of sequential tests. We study 33 categories, each consisting of a small data set of 30 instances. To increase the amount of training data we have...
 
A Two-Level Generative Model for Cloth Representation and Shape from Shading
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Feng Han, Song-Chun Zhu
Issue Date:July 2007
pp. 1230-1243
In this paper, we present a two-level generative model for representing the images and surface depth maps of drapery and clothes. The upper level consists of a number of folds which will generate the high contrast (ridge) areas with a dictionary of shading...
 
Mapping Natural Image Patches by Explicit and Implicit Manifolds
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Kent Shi, Song-Chun Zhu
Issue Date:June 2007
pp. 1-7
Image patches are fundamental elements for object modeling and recognition. However, there has not been a panoramic study of the structures of the whole ensemble of natural image patches in the literature. In this article, we study the structures of this e...
 
Layered Graph Match with Graph Editing
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Liang Lin, Song-Chun Zhu, Yongtian Wang
Issue Date:June 2007
pp. 1-8
Many vision tasks are posed as either graph partitioning (coloring) or graph matching (correspondence) problems. The former include segmentation and grouping, and the latter include wide baseline stereo, large motion, object tracking and recognition. In th...
 
Composite Templates for Cloth Modeling and Sketching
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Hong Chen, Zi Jian Xu, Zi Qiang Liu, Song Chun Zhu
Issue Date:June 2006
pp. 943-950
Cloth modeling and recognition is an important and challenging problem in both vision and graphics tasks, such as dressed human recognition and tracking, human sketch and portrait. In this paper, we present a context sensitive grammar in an And-Or graph re...
 
Perceptual Scale Space and its Applications
Found in: Computer Vision, IEEE International Conference on
By Yizhou Wang, Siavosh Bahrami, Song-Chun Zhu
Issue Date:October 2005
pp. 58-65
In this paper, we study a perceptual scale space by constructing a so-called sketch pyramid which augments the Gaussian and Laplacian pyramid representations in traditional image scale space theory. Each level of this sketch pyramid is a generic attributed...
 
Bottom-up/Top-Down Image Parsing by Attribute Graph Grammar
Found in: Computer Vision, IEEE International Conference on
By Feng Han, Song-Chun Zhu
Issue Date:October 2005
pp. 1778-1785
In this paper, we present an attribute graph grammar for image parsing on scenes with man-made objects, such as buildings, hallways, kitchens, and living rooms. We choose one class of primitives — 3D planar rectangles projected on images, and six graph gra...
 
Incorporating Visual Knowledge Representation in Stereo Reconstruction
Found in: Computer Vision, IEEE International Conference on
By Adrian Barbu, Song-Chun Zhu
Issue Date:October 2005
pp. 572-579
In this paper, we present a two-layer generative model that incorporates generic middle-level visual knowledge for dense stereo reconstruction. The visual knowledge is represented by a dictionary of surface primitives including various categories of bounda...
 
Generalizing Swendsen-Wang to Sampling Arbitrary Posterior Probabilities
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Adrian Barbu, Song-Chun Zhu
Issue Date:August 2005
pp. 1239-1253
Many vision tasks can be formulated as graph partition problems that minimize energy functions. For such problems, the Gibbs sampler [9] provides a general solution but is very slow, while other methods, such as Ncut [24] and graph cuts [4], [22], are comp...
 
A Generative Model of Human Hair for Hair Sketching
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Hong Chen, Song Chun Zhu
Issue Date:June 2005
pp. 74-81
Human hair is a very complex visual pattern whose representation is rarely studied in the vision literature despite its important role in human recognition. In this paper, we propose a generative model for hair representation and hair sketching, which is f...
 
Cloth Representation by Shape from Shading with Shading Primitives
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Feng Han, Song-Chun Zhu
Issue Date:June 2005
pp. 1203-1210
Cloth is a complex visual pattern with flexible 3D shape and illumination variations. Computing the 3D shape of cloth from a single image is of great interest to both computer graphics and vision researches. However, the acquisition of 3D cloth shape by Sh...
 
Analysis and Synthesis of Textured Motion: Particles and Waves
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Yizhou Wang, Song-Chun Zhu
Issue Date:October 2004
pp. 1348-1363
Natural scenes contain a wide range of textured motion phenomena which are characterized by the movement of a large amount of particle and wave elements, such as falling snow, wavy water, and dancing grass. In this paper, we present a generative model for ...
 
Range Image Segmentation by an Effective Jump-Diffusion Method
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Feng Han, Zhuowen Tu, Song-Chun Zhu
Issue Date:September 2004
pp. 1138-1153
This paper presents an effective jump-diffusion method for segmenting a range image and its associated reflectance image in the Bayesian framework. The algorithm works on complex real-world scenes (indoor and outdoor), which consist of an unknown number of...
 
Multigrid and Multi-Level Swendsen-Wang Cuts for Hierarchic Graph Partition
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Adrian Barbu, Song-Chun Zhu
Issue Date:July 2004
pp. 731-738
Many vision tasks can be formulated as partitioning an adjacency graph through optimizing a Bayesian posterior probability p defined on the partition-space. In this paper two approaches are proposed to generalize the Swendsen-Wang cut algorithm [1] for sam...
 
Modeling Complex Motion by Tracking and Editing Hidden Markov Graphs
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Yizhou Wang, Song Chun Zhu
Issue Date:July 2004
pp. 856-863
In this paper, we propose a generative model for representing complex motion, such as wavy river, dancing fire and dangling cloth. Our generative method consists of four components: (1) A photometric model using primal sketch[8] which transfers an image in...
 
Automatic Single View Building Reconstruction by Integrating Segmentation
Found in: Computer Vision and Pattern Recognition Workshop
By Feng Han, Song-Chun Zhu
Issue Date:July 2004
pp. 53
In this paper, we propose a stochastic algorithm using Markov chain Monte Carlo (MCMC) to automatically reconstruct buildings from a single image of architectural scenes by integrating segmentation and reconstruction. Buildings are modelled by two families...
 