hal-00654576, version 1

Cheating to achieve Formal Concept Analysis over a large formal context

Victor Codocedo (1, 2), Carla Taramasco (3), Hernan Astudillo (1)

The Eighth International Conference on Concept Lattices and their Applications - CLA 2011 (2011) 349-362

Abstract: Researchers are facing one of the main problems of the Information Era. As more articles become electronically available, it gets harder to follow trends across research domains. Knowledge models of research domains that are cheap, coherent, and fast to construct will be in high demand once the volume of information becomes unmanageable. Formal Concept Analysis (FCA) has been widely used in several areas to construct knowledge artifacts for this purpose (Ontology Development, Information Retrieval, Software Refactoring, Knowledge Discovery). However, the large number of documents and terms in research domains makes it a poor option, because of its high computational cost and an output too large for humans to process. In this article we propose a novel heuristic to create a taxonomy from a large term-document dataset using Latent Semantic Analysis and Formal Concept Analysis. We present and discuss its implementation on a real dataset from the Software Architecture community, obtained from the ISI Web of Knowledge (4400 documents).

  • 1:  Departamento de Informatica [Valparaíso, Chile]
  • Universidad Técnica Federico Santa María (UTFSM)
  • 2:  ORPAILLEUR (INRIA Lorraine - LORIA)
  • INRIA – CNRS : UMR7503 – Université Henri Poincaré - Nancy I – Université Nancy II – Institut National Polytechnique de Lorraine (INPL)
  • 3:  Centre de recherche en épistémologie appliquée (CREA)
  • CNRS : UMR7656 – Polytechnique - X
  • Domain : Computer Science/Document and Text Processing
    Computer Science/Information Retrieval
 
  • oai:hal.archives-ouvertes.fr:hal-00654576
  • Submitted on: Thursday, 22 December 2011 12:38:30
  • Updated on: Thursday, 22 December 2011 13:28:28