Master thesis

Apprentissage par transfert pour l’extraction de relations pharmacogénomiques à partir de textes [Transfer learning for the extraction of pharmacogenomic relations from texts]

Walid Hafiane 1, 2
1 ORPAILLEUR - Knowledge representation, reasoning
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : Extracting relations between named entities is an essential text-mining task, particularly in the biomedical field, where it makes it possible to synthesize the knowledge spread across a large number of publications. Recently, deep learning approaches have significantly improved performance on relation extraction. However, the complexity of biomedical data remains a major challenge for these approaches, and current architectures do not perform well on some specific relation types (e.g., pharmacogenomic or protein-molecule relations). In this work, we propose architectures that bring a significant improvement on this task. In addition, obtaining a large amount of annotated data in the biomedical field is difficult and expensive; to compensate for this lack of data, we use transfer learning. In this context, we opted for two transfer learning strategies: frozen and fine-tuning. Our BERT-CNN-segmentation architecture with the fine-tuning strategy achieves new state-of-the-art results on two benchmark biomedical corpora, with a 32.77% absolute improvement in F-macro on PGxCorpus and a 1.73% absolute improvement in F-micro on ChemProt. These results show the usefulness of transfer learning and show that the performance of BERT transformers can be improved by exploiting the local latent information in the representation vectors and reinforcing it with the structural information obtained from sentence segmentation.
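The abstract reports one result in F-macro and the other in F-micro. As a rough illustration of how these two averages can differ, here is a minimal pure-Python sketch. It is not the evaluation code used in the thesis: the helper names and the toy labels ("treats", "causes", "none") are hypothetical, and the exact label sets scored on PGxCorpus and ChemProt are not specified here.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 score from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(gold, pred):
    """Return (micro_f1, macro_f1) for single-label classification."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p was wrong
            fn[g] += 1  # gold label g was missed
    # Macro: average the per-label F1 scores, so every label weighs the same.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    # Micro: pool the counts over all labels first, so frequent labels dominate.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return micro, macro


# Hypothetical toy data: one frequent relation type and two rare ones.
gold = ["treats", "treats", "treats", "treats", "causes", "none"]
pred = ["treats", "treats", "treats", "treats", "none", "causes"]
micro, macro = micro_macro_f1(gold, pred)
print(f"F-micro = {micro:.3f}, F-macro = {macro:.3f}")
```

On this toy data the frequent label is always predicted correctly and the two rare labels never are, so the macro average is pulled down much more than the micro average. Macro averaging is therefore the more sensitive of the two to performance on rare relation types.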
Cited literature: 53 references

https://hal.inria.fr/hal-02939161
Contributor : Walid Hafiane
Submitted on : Thursday, September 17, 2020 - 12:25:01 PM
Last modification on : Friday, September 25, 2020 - 3:11:18 AM

File

Rapport_Stage_HAFIANE_Walid.pd...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02939161, version 1

Citation

Walid Hafiane. Apprentissage par transfert pour l’extraction de relations pharmacogénomiques à partir de textes. Informatique [cs]. 2020. ⟨hal-02939161⟩

Metrics

Record views : 66
File downloads : 133