Towards true 3D object recognition

Jean Ponce 1 Svetlana Lazebnik 1 Fred Rothganger 1, 2 Cordelia Schmid 3, *
* Corresponding author
3 LEAR - Learning and recognition in vision
GRAVIR - IMAG - Graphisme, Vision et Robotique, Inria Grenoble - Rhône-Alpes, CNRS - Centre National de la Recherche Scientifique : FR71
Abstract: This talk addresses the problem of recognizing three-dimensional (3D) objects in photographs and image sequences, revisiting viewpoint invariants as a local representation of shape and appearance. The key insight is that, although smooth surfaces are almost never planar in the large, and thus do not (in general) admit global invariants, they are always planar in the small: sufficiently small surface patches can always be treated as being composed of coplanar points, and thus can be represented locally by planar invariants. This is the basis for a new, unified approach to object recognition in which object models consist of a collection of small (planar) patches, their invariants, and a description of their 3D spatial relationships. Specifically, the local invariants used in this proposal are the affine-invariant descriptions of the image brightness pattern in the neighborhood of salient image features ("interest points") recently developed by Lindeberg and Gårding and by Mikolajczyk and Schmid. These affine-invariant patches provide a normalized representation of the local object appearance, invariant under viewpoint and illumination changes, that can be used as a local measure of image, part, or object similarity. The spatial relationships between local invariants are used to represent the global object structure and to drive the recognition process. I will illustrate our approach with two fundamental instances of the 3D object recognition problem: (1) modeling rigid 3D objects from a small set of unregistered pictures and recognizing them in cluttered photographs taken from unconstrained viewpoints; and (2) representing, learning, and recognizing non-uniform texture patterns under non-rigid transformations. If time permits, I will conclude with a brief discussion of our current work in 3D photography using shape, texture, and motion cues.
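The idea that a small planar patch can be brought to a normalized, viewpoint-independent form can be illustrated with a toy example. The sketch below is not the authors' implementation. It assumes a patch is a 2D point set and uses the point covariance as a stand-in for the image second-moment matrix used in affine adaptation. The names `whitener`, `patch`, `warped`, `A`, and `shift` are hypothetical. It checks that two affinely related patches, once normalized to identity covariance, differ only by an orthogonal transformation.

```python
import math
import random

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def inverse(a):
    d = det(a)
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]

def covariance(points):
    """2x2 covariance of (x, y) points; subtracting the mean removes translation."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    cxx = sum((x - mx) ** 2 for x, _ in points) / n
    cyy = sum((y - my) ** 2 for _, y in points) / n
    cxy = sum((x - mx) * (y - my) for x, y in points) / n
    return [[cxx, cxy], [cxy, cyy]]

def whitener(points):
    """Return W = C^(-1/2), so that the points mapped by W have identity covariance."""
    c = covariance(points)
    # Closed-form square root of a 2x2 symmetric positive-definite matrix:
    # sqrt(C) = (C + s*I) / t, with s = sqrt(det C) and t = sqrt(trace C + 2*s).
    s = math.sqrt(det(c))
    t = math.sqrt(c[0][0] + c[1][1] + 2.0 * s)
    root = [[(c[0][0] + s) / t, c[0][1] / t],
            [c[1][0] / t, (c[1][1] + s) / t]]
    return inverse(root)

random.seed(0)
# Hypothetical patch: 200 points with anisotropic spread.
patch = [(random.gauss(0.0, 2.0), random.gauss(0.0, 0.7)) for _ in range(200)]

# Hypothetical affine distortion between two views of the same patch.
A = [[1.2, 0.5], [-0.4, 0.9]]
shift = (3.0, -1.0)
warped = [(A[0][0] * x + A[0][1] * y + shift[0],
           A[1][0] * x + A[1][1] * y + shift[1]) for x, y in patch]

# Map from the normalized first patch to the normalized second patch.
residual = matmul(matmul(whitener(warped), A), inverse(whitener(patch)))
gram = matmul(residual, transpose(residual))
is_orthogonal = all(abs(gram[i][j] - (1.0 if i == j else 0.0)) < 1e-9
                    for i in range(2) for j in range(2))
```

In this toy setting the normalization removes the stretch and skew introduced by `A`, leaving a rotation between the two normalized patches. A descriptor built on the normalized patches would still need to handle that rotation separately, for example by being rotation-invariant.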
Document type:
Conference paper
14ème Congrès de Reconnaissance des Formes et Intelligence Artificielle (RFIA '04), Jan 2004, Toulouse, France. 2004
Contributor: Thoth Team
Submitted on: Monday, December 20, 2010 - 09:09:28
Last modified on: Thursday, January 11, 2018 - 06:20:04


  • HAL Id : inria-00548535, version 1




Jean Ponce, Svetlana Lazebnik, Fred Rothganger, Cordelia Schmid. Towards true 3D object recognition. 14ème Congrès de Reconnaissance des Formes et Intelligence Artificielle (RFIA '04), Jan 2004, Toulouse, France. 2004. 〈inria-00548535〉


