Social Image Tagging by Mining Sparse Tag Patterns from Auxiliary Data
Found in: 2012 IEEE International Conference on Multimedia and Expo (ICME)
By Jie Lin, Junsong Yuan, Ling-Yu Duan, Siwei Luo, Wen Gao
Issue Date:July 2012
pp. 7-12
User-given tags associated with social images from photo-sharing websites (e.g., Flickr) are valuable auxiliary resources for the image tagging task. However, social images often suffer from noisy and incomplete tags, heavily degrading the effectiveness of...
 
Randomized visual phrases for object search
Found in: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
By Yuning Jiang, Jingjing Meng, Junsong Yuan
Issue Date:June 2012
pp. 3100-3107
Accurate matching of local features plays an essential role in visual object search. Instead of matching individual features separately, using the spatial context, e.g., bundling a group of co-located features into a visual phrase, has shown to enable more...
 
Salient region detection and its application to video retargeting
Found in: Multimedia and Expo, IEEE International Conference on
By Ye Luo, Junsong Yuan, Ping Xue, Qi Tian
Issue Date:July 2011
pp. 1-6
In spite of extensive studies on visual saliency, e.g., generating a saliency map from an image, less work has been addressed how to crop salient regions from saliency maps. We present a new approach to detect salient regions with maximum saliency density ...
 
Discovering the Thematic Object in Commercial Videos
Found in: IEEE Multimedia
By Gangqiang Zhao, Junsong Yuan, Jiang Xu, Ying Wu
Issue Date:July 2011
pp. 56-65
The thematic object in a commercial video is representative of its content. The authors propose a data-mining method for thematic object discovery in commercials by finding spatially collocated visual features.
 
Efficient search of Top-K video subvolumes for multi-instance action detection
Found in: Multimedia and Expo, IEEE International Conference on
By Norberto A. Goussies, Zicheng Liu, Junsong Yuan
Issue Date:July 2010
pp. 328-333
Action detection was formulated as a subvolume mutual information maximization problem in [8], where each subvolume identifies where and when the action occurs in the video. Despite the fact that the proposed branch-and-bound algorithm can find the best su...
 
Bipolar grouping
Found in: Multimedia and Expo, IEEE International Conference on
By Jiang Xu, Junsong Yuan, Ying Wu
Issue Date:July 2010
pp. 54-59
Most affinity-based grouping methods only model the inclusive relation among the data. When the data set contains a significant amount of noise data that should not be included in any clusters, these methods are likely to lead to undesired results. To addr...
 
Learning Actionlet Ensemble for 3D Human Action Recognition
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Jiang Wang, Zicheng Liu, Ying Wu, Junsong Yuan
Issue Date:May 2014
pp. 1-1
Human action recognition is an important yet challenging task. Human actions usually involve human-object interactions, highly articulated motions, high intra-class variations, and complicated temporal structures. The recently developed commodity depth sen...
 
Thematic Saliency Detection Using Spatial-Temporal Context
Found in: 2013 IEEE International Conference on Computer Vision Workshops (ICCVW)
By Ye Luo, Gangqiang Zhao, Junsong Yuan
Issue Date:December 2013
pp. 347-353
We propose a new measurement of video saliency termed thematic video saliency. Video saliency is detected in terms of finding the thematic objects that frequently appear at the salient positions in the video scenes. By representing all image segments in t...
 
Mining actionlet ensemble for action recognition with depth cameras
Found in: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
By Jiang Wang, Zicheng Liu, Ying Wu, Junsong Yuan
Issue Date:June 2012
pp. 1290-1297
Human action recognition is an important yet challenging task. The recently developed commodity depth sensors open up new possibilities of dealing with this problem but also present some unique challenges. The depth maps captured by the depth cameras are v...
 
Combining Feature Context and Spatial Context for Image Pattern Discovery
Found in: Data Mining, IEEE International Conference on
By Hongxing Wang, Junsong Yuan, Yap-Peng Tan
Issue Date:December 2011
pp. 764-773
Once an image is decomposed into a number of visual primitives, e.g., local interest points or salient image regions, it is of great interests to discover meaningful visual patterns from them. Conventional clustering (e.g., k-means) of visual primitives, h...
 
Discovering Thematic Patterns in Videos via Cohesive Sub-graph Mining
Found in: Data Mining, IEEE International Conference on
By Gangqiang Zhao, Junsong Yuan
Issue Date:December 2011
pp. 1260-1265
One category of videos usually contains the same thematic pattern, e.g., the spin action in skating videos. The discovery of the thematic pattern is essential to understand and summarize the video contents. This paper addresses two critical issues in minin...
 
Minimum near-convex decomposition for robust shape representation
Found in: Computer Vision, IEEE International Conference on
By Zhou Ren, Junsong Yuan, Chunyuan Li, Wenyu Liu
Issue Date:November 2011
pp. 303-310
Shape decomposition is a fundamental problem for part-based shape representation. We propose a novel shape decomposition method called Minimum Near-Convex Decomposition (MNCD), which decomposes 2D and 3D arbitrary shapes into minimum number of
 
Grassmann Hashing for approximate nearest neighbor search in high dimensional space
Found in: Multimedia and Expo, IEEE International Conference on
By Xinchao Wang, Zhu Li, Lei Zhang, Junsong Yuan
Issue Date:July 2011
pp. 1-6
Locality-Sensitive Hashing (LSH) approximates nearest neighbors in high dimensions by projecting original data into low-dimensional subspaces. The basic idea is to hash data samples to ensure that the probability of collision is much higher for samples tha...
 
Optimal spatio-temporal path discovery for video event detection
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Du Tran, Junsong Yuan
Issue Date:June 2011
pp. 3321-3328
We propose a novel algorithm for video event detection and localization as the optimal path discovery problem in spatio-temporal video space. By finding the optimal spatio-temporal path, our method not only detects the starting and ending points of the eve...
 
Mining discriminative co-occurrence patterns for visual recognition
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Junsong Yuan, Ming Yang, Ying Wu
Issue Date:June 2011
pp. 2777-2784
The co-occurrence pattern, a combination of binary or local features, is more discriminative than individual features and has shown its advantages in object, scene, and action recognition. We discuss two types of co-occurrence patterns that are complementa...
 
Unsupervised random forest indexing for fast action search
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Gang Yu, Junsong Yuan, Zicheng Liu
Issue Date:June 2011
pp. 865-872
Despite recent successes of searching small object in images, it remains a challenging problem to search and locate actions in crowded videos because of (1) the large variations of human actions and (2) the intensive computational cost of searching the vid...
 
Sparse reconstruction cost for abnormal event detection
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Yang Cong, Junsong Yuan, Ji Liu
Issue Date:June 2011
pp. 3449-3456
We propose to detect abnormal events via a sparse reconstruction over the normal bases. Given an over-complete normal basis set (e.g., an image sequence or a collection of local spatio-temporal patches), we introduce the sparse reconstruction cost (SRC) ov...
 
Discriminative Video Pattern Search for Efficient Action Detection
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Junsong Yuan, Zicheng Liu, Ying Wu
Issue Date:September 2011
pp. 1728-1743
Actions are spatiotemporal patterns. Similar to the sliding window-based object detection, action detection finds the reoccurrences of such spatiotemporal patterns through pattern matching, by handling cluttered and dynamic backgrounds and other types of a...
 
Discriminative subvolume search for efficient action detection
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Junsong Yuan, Zicheng Liu, Ying Wu
Issue Date:June 2009
pp. 2442-2449
Actions are spatio-temporal patterns which can be characterized by collections of spatio-temporal invariant features. Detection of actions is to find the re-occurrences (e.g. through pattern matching) of such spatio-temporal patterns. This paper addresses ...
 
Context-aware clustering
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Junsong Yuan, Ying Wu
Issue Date:June 2008
pp. 1-8
Most existing methods of semi-supervised clustering introduce supervision from outside, e.g., manually label some data samples or introduce constrains into clustering results. This paper studies an interesting problem: can the supervision come from inside,...
 
Mining compositional features for boosting
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Junsong Yuan, Jiebo Luo, Ying Wu
Issue Date:June 2008
pp. 1-8
The selection of weak classifiers is critical to the success of boosting techniques. Poor weak classifiers do not perform better than random guess, thus cannot help decrease the training error during the boosting process. Therefore, when constructing the w...
 
Spatial Random Partition for Common Visual Pattern Discovery
Found in: Computer Vision, IEEE International Conference on
By Junsong Yuan, Ying Wu
Issue Date:October 2007
pp. 1-8
Automatically discovering common visual patterns from a collection of images is an interesting but yet challenging task, in part because it is computationally prohibiting. Although representing images as visual documents based on discrete visual words offe...
 
Spatial selection for attentional visual tracking
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Ming Yang, Junsong Yuan, Ying Wu
Issue Date:June 2007
pp. 1-8
Long-duration tracking of general targets is quite challenging for computer vision, because in practice target may undergo large uncertainties in its visual appearance and the unconstrained environments may be cluttered and distractive, although tracking h...
 
Discovery of Collocation Patterns: from Visual Words to Visual Phrases
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Junsong Yuan, Ying Wu, Ming Yang
Issue Date:June 2007
pp. 1-8
A visual word lexicon can be constructed by clustering primitive visual features, and a visual object can be described by a set of visual words. Such a
 
Fast and Robust Search Method for Short Video Clips from Large Video Collection
Found in: Pattern Recognition, International Conference on
By Junsong Yuan, Qi Tian, Surendra Ranganath
Issue Date:August 2004
pp. 866-869
In this paper a fast and robust method is proposed to search a large video collection for given short clips. Compared with existing video searching methods which use visual features only, our scheme performs a two-phase hierarchical matching technique usin...
 
Video Event Detection: From Subvolume Localization to Spatiotemporal Path Search
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Du Tran, Junsong Yuan, David Forsyth
Issue Date:February 2014
pp. 404-416
Although sliding window-based approaches have been quite successful in detecting objects in images, it is not a trivial problem to extend them to detecting events in videos. We propose to search for spatiotemporal paths for video event detection. This new ...
 
Minimum Near-Convex Shape Decomposition
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Zhou Ren, Junsong Yuan, Wenyu Liu
Issue Date:October 2013
pp. 2546-2552
Shape decomposition is a fundamental problem for part-based shape representation. We propose the minimum near-convex decomposition (MNCD) to decompose arbitrary shapes into minimum number of
 
Learning Actionlet Ensemble for 3D Human Action Recognition
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Jiang Wang, Zicheng Liu, Ying Wu, Junsong Yuan
Issue Date:October 2013
pp. 1
Human action recognition is an important yet challenging task. Human actions usually involve human-object interactions, highly articulated motions, high intra-class variations and complicated temporal structures. The recently developed commodity depth sens...
 
Topical Video Object Discovery from Key Frames by Modeling Word Co-occurrence Prior
Found in: 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
By Gangqiang Zhao, Junsong Yuan, Gang Hua
Issue Date:June 2013
pp. 1602-1609
A topical video object refers to an object that is frequently highlighted in a video. It could be, e.g., the product logo and the leading actor/actress in a TV commercial. We propose a topic model that incorporates a word co-occurrence prior for efficient d...
 
Salient object detection in videos by optimal spatio-temporal path discovery
Found in: Proceedings of the 21st ACM international conference on Multimedia (MM '13)
By Junsong Yuan, Ye Luo
Issue Date:October 2013
pp. 509-512
Many consumer videos focus on and follow salient objects in a scene. Detecting such salient objects is thus of great interests to video analytics and search. Instead of detecting salient object in individual frames separately, we propose to detect and trac...
     
Human-virtual human interaction by upper body gesture understanding
Found in: Proceedings of the 19th ACM Symposium on Virtual Reality Software and Technology (VRST '13)
By Daniel Thalmann, Junsong Yuan, Yang Xiao
Issue Date:October 2013
pp. 133-142
In this paper, a novel human-virtual human interaction system is proposed. This system supports a real human to communicate with a virtual human using natural body language. Meanwhile, the virtual human is capable of understanding the meaning of human uppe...
     
Robust hand gesture recognition with kinect sensor
Found in: Proceedings of the 19th ACM international conference on Multimedia (MM '11)
By Jingjing Meng, Junsong Yuan, Zhengyou Zhang, Zhou Ren
Issue Date:November 2011
pp. 759-760
Hand gesture based Human-Computer-Interaction (HCI) is one of the most natural and intuitive ways to communicate between people and machines, since it closely mimics how human interact with each other. In this demo, we present a hand gesture recognition sy...
     
Real-time human action search using random forest based hough voting
Found in: Proceedings of the 19th ACM international conference on Multimedia (MM '11)
By Gang Yu, Junsong Yuan, Zicheng Liu
Issue Date:November 2011
pp. 1149-1152
Many existing techniques in content based video retrieval treat a video sequence as a whole to match it against a query video or to assign a text label. Such an approach has serious limitations when applied to human action retrieval because an action may o...
     
Robust hand gesture recognition based on finger-earth mover's distance with a commodity depth camera
Found in: Proceedings of the 19th ACM international conference on Multimedia (MM '11)
By Junsong Yuan, Zhengyou Zhang, Zhou Ren
Issue Date:November 2011
pp. 1093-1096
The recently developed depth sensors, e.g., the Kinect sensor, have provided new opportunities for human-computer interaction (HCI). Although great progress has been made by leveraging the Kinect sensor, e.g. in human body tracking and body gesture recogni...
     
KPB-SIFT: a compact local feature descriptor
Found in: Proceedings of the international conference on Multimedia (MM '10)
By Gangqiang Zhao, Gencai Chen, Junsong Yuan, Ling Chen
Issue Date:October 2010
pp. 1175-1178
Invariant feature descriptors such as SIFT and GLOH have been demonstrated to be very robust for image matching and object recognition. However, such descriptors are typically of high dimensionality, e.g. 128-dimension in the case of SIFT. This limits the ...
     
Interactive visual object search through mutual information maximization
Found in: Proceedings of the international conference on Multimedia (MM '10)
By Jingjing Meng, Junsong Yuan, Nitya Narasimhan, Venu Vasudevan, Ying Wu, Yuning Jiang
Issue Date:October 2010
pp. 1147-1150
Searching for small objects (e.g., logos) in images is a critical yet challenging problem. It becomes more difficult when target objects differ significantly from the query object due to changes in scale, viewpoint or style, not to mention partial occlusio...
     
Mining and cropping common objects from images
Found in: Proceedings of the international conference on Multimedia (MM '10)
By Gangqiang Zhao, Junsong Yuan
Issue Date:October 2010
pp. 975-978
Discovering common objects that appear frequently in a number of images is a challenging problem, due to (1) the appearance variations of the same common object and (2) the enormous computational cost involved in exploring the huge solution space, includin...
     
Speeding up spatio-temporal sliding-window search for efficient event detection in crowded videos
Found in: Proceedings of the 1st ACM international workshop on Events in multimedia (EiMM '09)
By Junsong Yuan, Ying Wu, Zhengyou Zhang, Zicheng Liu
Issue Date:October 2009
pp. 3-8
Despite previous successes of sliding window-based object detection in images, searching desired events in the volumetric video space is still a challenging problem, partially because the pattern search in spatio-temporal video space is much more complicat...
     
Mining GPS traces and visual words for event classification
Found in: Proceeding of the 1st ACM international conference on Multimedia information retrieval (MIR '08)
By Henry Kautz, Jiebo Luo, Junsong Yuan, Ying Wu
Issue Date:October 2008
pp. 1-1
It is of great interest to recognize semantic events (e.g., hiking, skiing, party), in particular when given a collection of personal photos, where each photo is tagged with a timestamp and GPS (Global Positioning System) information at the capture. We add...
     
From frequent itemsets to semantically meaningful visual patterns
Found in: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '07)
By Junsong Yuan, Ming Yang, Ying Wu
Issue Date:August 2007
pp. 864-873
Data mining techniques that are successful in transaction and text data may not be simply applied to image data that contain high-dimensional features and have spatial structures. It is not a trivial task to discover meaningful visual patterns in image dat...
     
Fast and robust short video clip search using an index structure
Found in: Proceedings of the 6th ACM SIGMM international workshop on Multimedia information retrieval (MIR '04)
By Changsheng Xu, Junsong Yuan, Ling-Yu Duan, Qi Tian
Issue Date:October 2004
pp. 61-68
In this paper, we present an index structure-based method to fast and robustly search short video clips in large video collections. First we temporally segment a given long video stream into overlapped matching windows, then map extracted features from the...
     