Usage based indexing of web resources with natural language processing

Armelle Brun 1, Anne Boyer 2
1 PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
2 MAIA - Autonomous intelligent machine
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract : Due to the huge amount of information available via the Internet, identifying reliable and interesting items is becoming increasingly difficult and time consuming. This position paper describes our intended work on multimedia information retrieval through browsing techniques within web navigation. The approach relies on a usage-based indexing of resources: we ignore the nature, the content and the structure of resources. We describe a new approach that takes advantage of the similarity between statistical language modeling and document retrieval systems. A syntax of usage is computed that defines a Statistical Grammar of Usage (SGU). An SGU enables resource classification, which in turn supports a personalized navigation assistant tool. It relies both on collaborative filtering, to compute virtual communities of users, and on a new distance-dependent trigger model. The resulting SGU is community dependent.
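The abstract does not specify how the distance-dependent trigger model is computed. As a rough illustration only, the following sketch assumes that each navigation session of a community is an ordered list of resource identifiers, and estimates by relative frequency how likely a resource b is to appear exactly d steps after a resource a. The function names, the `max_distance` parameter and the sample sessions are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def count_triggers(sessions, max_distance=3):
    """Count how often resource b appears exactly d steps after resource a.

    `sessions` is assumed to be a list of navigation sessions, each an
    ordered list of resource identifiers (a hypothetical input format).
    """
    pair_counts = defaultdict(Counter)  # (a, d) -> Counter of b
    for session in sessions:
        for i, a in enumerate(session):
            for d in range(1, max_distance + 1):
                if i + d < len(session):
                    pair_counts[(a, d)][session[i + d]] += 1
    return pair_counts


def trigger_probability(pair_counts, a, b, d):
    """Relative-frequency estimate of P(b at distance d | a)."""
    followers = pair_counts.get((a, d))
    if not followers:
        return 0.0  # a was never observed with a successor at distance d
    return followers[b] / sum(followers.values())


# Hypothetical sessions for one community of users
sessions = [
    ["home", "news", "sports"],
    ["home", "news", "weather"],
    ["home", "sports"],
]
counts = count_triggers(sessions)
print(trigger_probability(counts, "home", "news", 1))
print(trigger_probability(counts, "home", "sports", 2))
```

Computing one such table per community of users, with communities obtained beforehand by collaborative filtering, would give community-dependent estimates in the spirit of the abstract. A practical model would also need smoothing for unseen pairs, which this sketch omits.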
Document type :
Conference papers

Cited literature [12 references]

https://hal.inria.fr/inria-00172234
Contributor : Armelle Brun
Submitted on : Friday, September 14, 2007 - 3:14:38 PM
Last modification on : Thursday, January 11, 2018 - 6:19:56 AM
Long-term archiving on: Thursday, April 8, 2010 - 9:54:45 PM

File

WebistBrunBoyer.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00172234, version 1

Citation

Armelle Brun, Anne Boyer. Usage based indexing of web resources with natural language processing. 3rd International Conference on Web Information Systems and Technologies - Webist 07, INSTICC - Institute for Systems and Technologies of Information, Control and Communication ; Open University of Catalonia, Mar 2007, Barcelona, Spain. ⟨inria-00172234⟩
