Master thesis

Extracting scientific results from research articles

Lucas Pluvinage 1, 2
2 VALDA - Value from Data
DI-ENS - Département d'informatique de l'École normale supérieure, Inria de Paris
Abstract : The basic unit of information used by researchers in theoretical fields is the mathematical result. We aim to build a knowledge base of these results, using information extraction techniques on scholarly documents. We present an algorithm that extracts mathematical results, and references to mathematical results, from scientific papers, using their PDF or LaTeX sources. We analyse the output of our algorithm on the whole arXiv database of scientific papers and explore the resulting graph of mathematical results, which contains more than 6 million results and 4.5 million edges. We present attempts to link theorems of different papers using a TF-IDF vectorizer or an autoencoder.

Cited literature [15 references]

https://hal.inria.fr/hal-02956526
Contributor : Lucas Pluvinage
Submitted on : Friday, October 2, 2020 - 9:42:24 PM
Last modification on : Friday, October 15, 2021 - 1:41:22 PM
Long-term archiving on : Monday, January 4, 2021 - 8:47:37 AM

File

Rapport_de_stage_M2.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02956526, version 1

Citation

Lucas Pluvinage. Extracting scientific results from research articles. Artificial Intelligence [cs.AI]. 2020. ⟨hal-02956526⟩
