
Towards true 3D object recognition

Jean Ponce 1 Svetlana Lazebnik 1 Fred Rothganger 1, 2 Cordelia Schmid 3, * 
* Corresponding author
3 LEAR - Learning and recognition in vision
GRAVIR - IMAG - Laboratoire d'informatique GRAphique, VIsion et Robotique de Grenoble, Inria Grenoble - Rhône-Alpes, CNRS - Centre National de la Recherche Scientifique : FR71
Abstract : This talk addresses the problem of recognizing three-dimensional (3D) objects in photographs and image sequences, revisiting viewpoint invariants as a local representation of shape and appearance. The key insight is that, although smooth surfaces are almost never planar in the large, and thus do not (in general) admit global invariants, they are always planar in the small (that is, sufficiently small surface patches can always be thought of as being composed of coplanar points) and thus can be represented locally by planar invariants. This is the basis for a new, unified approach to object recognition where object models consist of a collection of small (planar) patches, their invariants, and a description of their 3D spatial relationship. Specifically, the local invariants used in this proposal are the affine-invariant descriptions of the image brightness pattern in the neighborhood of salient image features ("interest points") recently developed by Lindeberg and Gårding and by Mikolajczyk and Schmid. These affine-invariant patches provide a normalized representation of the local object appearance, invariant under viewpoint and illumination changes, that can be used as a local measure of image, part, or object similarity. The spatial relationship between local invariants is used to represent the global object structure and drive the recognition process. I will illustrate our approach with two fundamental instances of the 3D object recognition problem: (1) modeling rigid 3D objects from a small set of unregistered pictures and recognizing them in cluttered photographs taken from unconstrained viewpoints; and (2) representing, learning, and recognizing non-uniform texture patterns under non-rigid transformations. If time permits, I will conclude with a brief discussion of our current work in 3D photography using shape, texture, and motion cues.
Document type :
Conference papers
Contributor : THOTH Team
Submitted on : Monday, December 20, 2010 - 9:09:28 AM
Last modification on : Wednesday, February 2, 2022 - 3:58:34 PM


  • HAL Id : inria-00548535, version 1



Jean Ponce, Svetlana Lazebnik, Fred Rothganger, Cordelia Schmid. Towards true 3D object recognition. 14ème Congrès de Reconnaissance des Formes et Intelligence Artificielle (RFIA '04), Jan 2004, Toulouse, France. ⟨inria-00548535⟩


