Extracting scientific results from research articles - Inria - Institut national de recherche en sciences et technologies du numérique
Student theses -- Hal-Inria+ Year: 2020

Extracting scientific results from research articles

Abstract

The basic unit of information used by researchers in theoretical fields is the mathematical result. We aim to build a knowledge base of these results by applying information extraction techniques to scholarly documents. We present an algorithm that extracts mathematical results, and references to mathematical results, from scientific papers, using either their PDF or their LaTeX sources. We analyse the output of our algorithm on the whole arXiv database of scientific papers and explore the resulting graph of mathematical results, which contains more than 6 million results and 4.5 million edges. We also present attempts to link theorems from different papers using a TF-IDF vectorizer or an autoencoder.
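To make the last step of the abstract concrete, the following is a minimal sketch of linking theorem statements by TF-IDF similarity. It is not the thesis's implementation: the tokenizer, the smoothed IDF formula, the helper names (`tokenize`, `tfidf_vectors`, `cosine`, `most_similar`) and the three example statements are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: plain lowercase word tokens; symbols and digits are dropped.
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(statements):
    """Return one sparse TF-IDF vector (dict: term -> weight) per statement."""
    tokenized = [tokenize(s) for s in statements]
    n_docs = len(tokenized)
    # Document frequency: number of statements in which each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Assumption: smoothed IDF, so terms found everywhere keep a positive weight.
        vectors.append(
            {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(query_index, vectors):
    """Index of the statement closest to the query, or None if nothing overlaps."""
    best, best_score = None, 0.0
    for i, vec in enumerate(vectors):
        if i == query_index:
            continue
        score = cosine(vectors[query_index], vec)
        if score > best_score:
            best, best_score = i, score
    return best


# Hypothetical theorem statements standing in for extracted results.
statements = [
    "Every bounded monotone sequence of real numbers converges.",
    "A monotone sequence of real numbers converges if and only if it is bounded.",
    "Every finite group of prime order is cyclic.",
]
vectors = tfidf_vectors(statements)
closest = most_similar(0, vectors)
```

Here `closest` is the index of the statement that shares the most heavily weighted vocabulary with the first one. On a corpus the size of arXiv, a pairwise loop like this would likely be too slow, so an indexed or approximate nearest-neighbour search would be a natural substitute.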
Main file
Rapport_de_stage_M2.pdf (2.14 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-02956526, version 1 (02-10-2020)

Identifiers

  • HAL Id: hal-02956526, version 1

Cite

Lucas Pluvinage. Extracting scientific results from research articles. Artificial Intelligence [cs.AI]. 2020. ⟨hal-02956526⟩
216 views
1290 downloads
