A Structured Probabilistic Model for Recognition
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Cordelia Schmid
Issue Date:June 1999
pp. 2485
In this paper we derive a probabilistic model for recognition based on local descriptors and spatial relations between these descriptors. Our model takes into account the variability of local descriptors, their saliency as well as the probability of spatia...
 
Stable Hyper-pooling and Query Expansion for Event Detection
Found in: 2013 IEEE International Conference on Computer Vision (ICCV)
By Matthijs Douze,Jerome Revaud,Cordelia Schmid,Herve Jegou
Issue Date:December 2013
pp. 1825-1832
This paper makes two complementary contributions to event retrieval in large collections of videos. First, we propose hyper-pooling strategies that encode the frame descriptors into a representation of the video sequence in a stable manner. Our best choice...
 
DeepFlow: Large Displacement Optical Flow with Deep Matching
Found in: 2013 IEEE International Conference on Computer Vision (ICCV)
By Philippe Weinzaepfel,Jerome Revaud,Zaid Harchaoui,Cordelia Schmid
Issue Date:December 2013
pp. 1385-1392
Optical flow computation is a key component in many computer vision systems designed for tasks such as action detection or activity recognition. However, despite several major advances over the last decade, handling large displacement in optical flow remai...
 
Action Recognition with Improved Trajectories
Found in: 2013 IEEE International Conference on Computer Vision (ICCV)
By Heng Wang,Cordelia Schmid
Issue Date:December 2013
pp. 3551-3558
Recently dense trajectories were shown to be an efficient video representation for action recognition and achieved state-of-the-art results on a variety of datasets. This paper improves their performance by taking into account camera motion to correct them...
 
Estimating Human Pose with Flowing Puppets
Found in: 2013 IEEE International Conference on Computer Vision (ICCV)
By Silvia Zuffi,Javier Romero,Cordelia Schmid,Michael J. Black
Issue Date:December 2013
pp. 3312-3319
We address the problem of upper-body human pose estimation in uncontrolled monocular video sequences, without manual initialization. Most current methods focus on isolated video frames and often fail to correctly localize arms and hands. Inferring pose ove...
 
Segmentation Driven Object Detection with Fisher Vectors
Found in: 2013 IEEE International Conference on Computer Vision (ICCV)
By Ramazan Gokberk Cinbis,Jakob Verbeek,Cordelia Schmid
Issue Date:December 2013
pp. 2968-2975
We present an object detection system based on the Fisher vector (FV) image representation computed over SIFT and color descriptors. For computational and storage efficiency, we use a recent segmentation-based method to generate class-independent object de...
 
Action and Event Recognition with Fisher Vectors on a Compact Feature Set
Found in: 2013 IEEE International Conference on Computer Vision (ICCV)
By Dan Oneata,Jakob Verbeek,Cordelia Schmid
Issue Date:December 2013
pp. 1817-1824
Action recognition in uncontrolled video is an important and challenging computer vision problem. Recent progress in this area is due to new local features and models that capture spatio-temporal structure between local features, or human-object interactio...
 
Towards Understanding Action Recognition
Found in: 2013 IEEE International Conference on Computer Vision (ICCV)
By Hueihan Jhuang,Juergen Gall,Silvia Zuffi,Cordelia Schmid,Michael J. Black
Issue Date:December 2013
pp. 3192-3199
Although action recognition in videos is widely studied, current methods often fail on real-world datasets. Many recent approaches improve accuracy and robustness to cope with challenging video sequences, but it is often unclear what affects the results mo...
 
Unsupervised metric learning for face identification in TV video
Found in: Computer Vision, IEEE International Conference on
By Ramazan Gokberk Cinbis,Jakob Verbeek,Cordelia Schmid
Issue Date:November 2011
pp. 1559-1566
The goal of face identification is to decide whether two faces depict the same person or not. This paper addresses the identification problem for face-tracks that are automatically collected from uncontrolled TV video data. Face-track identification is an ...
 
Multi-view object class detection with a 3D geometric model
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Joerg Liebelt, Cordelia Schmid
Issue Date:June 2010
pp. 1688-1695
This paper presents a new approach for multi-view object class detection. Appearance and geometry are treated as separate learning tasks with different training data. Our approach uses a part model which discriminatively learns the object appearance with s...
 
Aggregating local descriptors into a compact image representation
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Herve Jegou, Matthijs Douze, Cordelia Schmid, Patrick Perez
Issue Date:June 2010
pp. 3304-3311
We address the problem of image search on a very large scale, where three constraints have to be considered jointly: the accuracy of the search, its efficiency, and the memory usage of the representation. We first propose a simple yet efficient way of aggr...
 
Multimodal semi-supervised learning for image classification
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Matthieu Guillaumin, Jakob Verbeek, Cordelia Schmid
Issue Date:June 2010
pp. 902-909
In image categorization the goal is to decide if an image belongs to a certain category or not. A binary classifier can be learned from manually labeled images; while using more labeled examples improves performance, obtaining the image labels is a time co...
 
Product Quantization for Nearest Neighbor Search
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Hervé Jégou, Matthijs Douze, Cordelia Schmid
Issue Date:January 2011
pp. 117-128
This paper introduces a product quantization-based approach for approximate nearest neighbor search. The idea is to decompose the space into a Cartesian product of low-dimensional subspaces and to quantize each subspace separately. A vector is represented ...
 
Accurate Image Search Using the Contextual Dissimilarity Measure
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Hervé Jégou, Cordelia Schmid, Hedi Harzallah, Jakob Verbeek
Issue Date:January 2010
pp. 2-11
This paper introduces the contextual dissimilarity measure, which significantly improves the accuracy of bag-of-features-based image search. Our measure takes into account the local distribution of the vectors and iteratively estimates distance update term...
 
Viewpoint-independent object class detection using 3D Feature Maps
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Joerg Liebelt, Cordelia Schmid, Klaus Schertler
Issue Date:June 2008
pp. 1-8
This paper presents a 3D approach to multi-view object class detection. Most existing approaches recognize object classes for a particular viewpoint or combine classifiers for a few discrete views. We propose instead to build 3D representations of object c...
 
Learning realistic human actions from movies
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Ivan Laptev, Marcin Marszalek, Cordelia Schmid, Benjamin Rozenfeld
Issue Date:June 2008
pp. 1-8
The aim of this paper is to address recognition of natural human actions in diverse and realistic video settings. This challenging but important subject has mostly been ignored in the past due to several problems one of which is the lack of realistic and a...
 
Automatic face naming with caption-based supervision
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Matthieu Guillaumin, Thomas Mensink, Jakob Verbeek, Cordelia Schmid
Issue Date:June 2008
pp. 1-8
We consider two scenarios of naming people in databases of news photos with captions: (i) finding faces of a single person, and (ii) assigning names to all faces. We combine an initial text-based step, that restricts the name assigned to a face to the set ...
 
Vector Quantizing Feature Space with a Regular Lattice
Found in: Computer Vision, IEEE International Conference on
By Tinne Tuytelaars, Cordelia Schmid
Issue Date:October 2007
pp. 1-8
Most recent class-level object recognition systems work with visual words, i.e., vector quantized local descriptors. In this paper we examine the feasibility of a data-independent approach to construct such a visual vocabulary, where the feature space is d...
 
Using High-Level Visual Information for Color Constancy
Found in: Computer Vision, IEEE International Conference on
By Joost van de Weijer, Cordelia Schmid, Jakob Verbeek
Issue Date:October 2007
pp. 1-8
We propose to use high-level visual information to improve illuminant estimation. Several illuminant estimation approaches are applied to compute a set of possible illuminants. For each of them an illuminant color corrected image is evaluated on the likeli...
 
Accurate Object Localization with Shape Masks
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Marcin Marszalek, Cordelia Schmid
Issue Date:June 2007
pp. 1-8
This paper proposes an approach for object class localization which goes beyond bounding boxes, as it also determines the outline of the object. Unlike most current localization methods, our approach does not require any hypothesis parameter space to be de...
 
Learning Color Names from Real-World Images
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Joost van de Weijer, Cordelia Schmid, Jakob Verbeek
Issue Date:June 2007
pp. 1-8
Within a computer vision context color naming is the action of assigning linguistic color labels to image pixels. In general, research on color naming applies the following paradigm: a collection of color chips is labelled with color names within a well-de...
 
Accurate Object Detection with Deformable Shape Models Learnt from Images
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Vittorio Ferrari, Frederic Jurie, Cordelia Schmid
Issue Date:June 2007
pp. 1-8
We present an object class detection approach which fully integrates the complementary strengths offered by shape matchers. Like an object detector, it can learn class models directly from images, and localize novel instances in the presence of intra-class...
 
Semantic Hierarchies for Visual Object Recognition
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Marcin Marszalek, Cordelia Schmid
Issue Date:June 2007
pp. 1-7
In this paper we propose to use lexical semantic networks to extend the state-of-the-art object recognition techniques. We use the semantics of image labels to integrate prior knowledge about inter-class relationships into the visual appearance learning. W...
 
Flexible Object Models for Category-Level 3D Object Recognition
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Akash Kushal, Cordelia Schmid, Jean Ponce
Issue Date:June 2007
pp. 1-8
Today's category-level object recognition systems largely focus on fronto-parallel views of objects with characteristic texture patterns. To overcome these limitations, we propose a novel framework for visual object recognition where object classes are rep...
 
A contextual dissimilarity measure for accurate and efficient image search
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Herve Jegou, Hedi Harzallah, Cordelia Schmid
Issue Date:June 2007
pp. 1-8
In this paper we present two contributions to improve accuracy and speed of an image search system based on bag-of-features: a contextual dissimilarity measure (CDM) and an efficient search structure for visual word vectors. Our measure (CDM) takes into ac...
 
Segmenting, Modeling, and Matching Video Clips Containing Multiple Moving Objects
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Fred Rothganger, Svetlana Lazebnik, Cordelia Schmid, Jean Ponce
Issue Date:March 2007
pp. 477-491
This paper presents a novel representation for dynamic scenes composed of multiple rigid objects that may undergo different motions and are observed by a moving camera. Multiview constraints associated with groups of affine-covariant scene patches and a no...
 
Spatial Weighting for Bag-of-Features
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Marcin Marszałek, Cordelia Schmid
Issue Date:June 2006
pp. 2118-2125
This paper presents an extension to category classification with bag-of-features, which represents an image as an orderless distribution of features. We propose a method to exploit spatial relations between features by utilizing object boundaries provided ...
 
Combining Regions and Patches for Object Class Localization
Found in: Computer Vision and Pattern Recognition Workshop
By Caroline Pantofaru, Gyuri Dorko, Cordelia Schmid, Martial Hebert
Issue Date:June 2006
pp. 23
We introduce a method for object class detection and localization which combines regions generated by image segmentation with local patches. Region-based descriptors can model and match regular textures reliably, but fail on parts of the object which are t...
 
Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Svetlana Lazebnik, Cordelia Schmid, Jean Ponce
Issue Date:June 2006
pp. 2169-2178
This paper presents a method for recognizing scene categories based on approximate global geometric correspondence. This technique works by partitioning the image into increasingly fine sub-regions and computing histograms of local features found inside ea...
 
Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study
Found in: Computer Vision and Pattern Recognition Workshop
By Jianguo Zhang, Marcin Marszalek, Svetlana Lazebnik, Cordelia Schmid
Issue Date:June 2006
pp. 13
Recently, methods based on local image features have shown promise for texture and object recognition tasks. This paper presents a large-scale evaluation of an approach that represents images as distributions (signatures or histograms) of features extracte...
 
A Performance Evaluation of Local Descriptors
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Krystian Mikolajczyk, Cordelia Schmid
Issue Date:October 2005
pp. 1615-1630
In this paper, we compare the performance of descriptors computed for local interest regions, as, for example, extracted by the Harris-Affine detector [32]. Many different descriptors have been proposed in the literature. It is unclear which descriptors ar...
 
A Maximum Entropy Framework for Part-Based Texture and Object Recognition
Found in: Computer Vision, IEEE International Conference on
By Svetlana Lazebnik, Cordelia Schmid, Jean Ponce
Issue Date:October 2005
pp. 832-838
This paper presents a probabilistic part-based approach for texture and object recognition. Textures are represented using a part dictionary found by quantizing the appearance of scale- or affine-invariant keypoints. Object classes are represented using a ...
 
A Sparse Texture Representation Using Local Affine Regions
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Svetlana Lazebnik, Cordelia Schmid, Jean Ponce
Issue Date:August 2005
pp. 1265-1278
This paper introduces a texture representation suitable for recognizing images of textured surfaces under a wide range of transformations, including viewpoint changes and nonrigid deformations. At the feature extraction stage, a sparse set of affine Harris...
 
Segmenting, Modeling, and Matching Video Clips Containing Multiple Moving Objects
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Fred Rothganger, Svetlana Lazebnik, Cordelia Schmid, Jean Ponce
Issue Date:July 2004
pp. 914-921
This paper presents a novel representation for dynamic scenes composed of multiple rigid objects that may undergo different motions and be observed by a moving camera. Multi-view constraints associated with groups of affine-invariant scene patches and a no...
 
Scale-Invariant Shape Features for Recognition of Object Categories
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Frédéric Jurie, Cordelia Schmid
Issue Date:July 2004
pp. 90-96
We introduce a new class of distinguished regions based on detecting the most salient convex local arrangements of contours in the image. The regions are used in a similar way to the local interest points extracted from gray-level images, but they capture ...
 
Face Detection and Tracking in a Video by Propagating Detection Probabilities
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Ragini Choudhury Verma, Cordelia Schmid, Krystian Mikolajczyk
Issue Date:October 2003
pp. 1215-1228
This paper presents a new probabilistic method for detecting and tracking multiple faces in a video sequence. The proposed method integrates the information of face probabilities provided by the detector and the tempora...
 
Affine-Invariant Local Descriptors and Neighborhood Statistics for Texture Recognition
Found in: Computer Vision, IEEE International Conference on
By Svetlana Lazebnik, Cordelia Schmid, Jean Ponce
Issue Date:October 2003
pp. 649
This paper presents a framework for texture recognition based on local affine-invariant descriptors and their spatial layout. At modeling time, a generative model of local descriptors is learned from sample images using the EM algorithm. The EM framework a...
 
A Sparse Texture Representation Using Affine-Invariant Regions
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Svetlana Lazebnik, Cordelia Schmid, Jean Ponce
Issue Date:June 2003
pp. 319
This paper introduces a texture representation suitable for recognizing images of textured surfaces under a wide range of transformations, including viewpoint changes and non-rigid deformations. At the feature extraction stage, a sparse set of affine-invar...
 
3D Object Modeling and Recognition Using Affine-Invariant Patches and Multi-View Spatial Constraints
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Fredrick Rothganger, Svetlana Lazebnik, Cordelia Schmid, Jean Ponce
Issue Date:June 2003
pp. 272
This paper presents a novel representation for three-dimensional objects in terms of affine-invariant image patches and their spatial relationships. Multi-view constraints associated with groups of patches are combined with a normalized representation of t...
 
Constructing models for content-based image retrieval
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Cordelia Schmid
Issue Date:December 2001
pp. 39
This paper presents a new method for constructing models from a set of positive and negative sample images; the method requires no manual extraction of significant objects or features. Our model representation is based on two layers. The first one consists...
 
Indexing based on scale invariant interest points
Found in: Computer Vision, IEEE International Conference on
By Krystian Mikolajczyk, Cordelia Schmid
Issue Date:July 2001
pp. 525
This paper presents a new method for detecting scale invariant interest points. The method is based on two recent results on scale space: 1) Interest points can be adapted to scale and give repeatable results (geometrically stable). 2) Local extrema over s...
 
Face Detection Based on Generic Local Descriptors and Spatial Constraints
Found in: Pattern Recognition, International Conference on
By Veronika Vogelhuber, Cordelia Schmid
Issue Date:September 2000
pp. 5084
In this paper, we present an algorithm for face detection that is based on generic local descriptors (e.g. eyes). A generic descriptor captures the distribution of individual descriptors over a set of samples (training images). This distribution is assumed...
 
Matching Images with Different Resolutions
Found in: Computer Vision and Pattern Recognition, IEEE Computer Society Conference on
By Yves Dufournaud, Cordelia Schmid, Radu Horaud
Issue Date:June 2000
pp. 1612
In this paper, we address the problem of matching two images with two different resolutions: a high-resolution image and a low-resolution one. On the premise that changes in resolution act as a smoothing equivalent to changes in scale, a scale-space repres...
 
Building and Using Hypervideos
Found in: Applications of Computer Vision, IEEE Workshop on
By Pascal Bertolino, Roger Mohr, Cordelia Schmid, Patrick Bouthemy, Marc Gelgon, Fabien Spindler, Serge Benayoun, Helene Bernard
Issue Date:October 1998
pp. 276
This paper presents the first version of our platform for automatically building the structure of a video sequence. The first application uses semi-automatic tools based only on image analysis for building interactive videos: decomposing the video into sho...
   
Local Grayvalue Invariants for Image Retrieval
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Cordelia Schmid, Roger Mohr
Issue Date:May 1997
pp. 530-535
This paper addresses the problem of retrieving images from large image databases. The method is based on local grayvalue invariants which are computed at automatically detected interest points. A voting algorithm and se...
 
Good Practice in Large-Scale Learning for Image Classification
Found in: IEEE Transactions on Pattern Analysis and Machine Intelligence
By Zeynep Akata,Florent Perronnin,Zaid Harchaoui,Cordelia Schmid
Issue Date:March 2014
pp. 507-520
We benchmark several SVM objective functions for large-scale image classification. We consider one-versus-rest, multiclass, ranking, and weighted approximate ranking SVMs. A comparison of online and batch methods for optimizing the objectives shows that on...
 
Event Retrieval in Large Video Collections with Circulant Temporal Encoding
Found in: 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
By Jerome Revaud,Matthijs Douze,Cordelia Schmid,Herve Jegou
Issue Date:June 2013
pp. 2459-2466
This paper presents an approach for large-scale event retrieval. Given a video clip of a specific event, e.g., the wedding of Prince William and Kate Middleton, the goal is to retrieve other videos representing the same event from a dataset of over 100k vide...
 
Label-Embedding for Attribute-Based Classification
Found in: 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
By Zeynep Akata,Florent Perronnin,Zaid Harchaoui,Cordelia Schmid
Issue Date:June 2013
pp. 819-826
Attributes are an intermediate representation, which enables parameter sharing between classes, a must when training data is scarce. We propose to view attribute-based image classification as a label-embedding problem: each class is embedded in the space o...
 
Expanded Parts Model for Human Attribute and Action Recognition in Still Images
Found in: 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
By Gaurav Sharma,Frederic Jurie,Cordelia Schmid
Issue Date:June 2013
pp. 652-659
We propose a new model for recognizing human attributes (e.g. wearing a suit, sitting, short hair) and actions (e.g. running, riding a horse) in still images. The proposed model relies on a collection of part templates which are learnt discriminatively to ...
 
Comparing and Evaluating Interest Points
Found in: Computer Vision, IEEE International Conference on
By Cordelia Schmid, Roger Mohr, Christian Bauckhage
Issue Date:January 1998
pp. 230
Many computer vision tasks rely on feature extraction. Interest points are such features. This paper shows that interest points are geometrically stable under different transformations and have high information content (distinctiveness). These tw...
 