Text Simplification of Patent Documents - Inria - Institut national de recherche en sciences et technologies du numérique
Conference paper. Year: 2018

Text Simplification of Patent Documents

Abstract

This paper presents an automatic text simplification system for patent documents. The system is embedded in a broader information retrieval system that extracts IDM-related knowledge from patent documents. Extracting elements of the IDM ontology from patents involves training a machine-learning model. However, the accuracy of the model is compromised when the input text is too long, hence the need to simplify the texts before learning. Earlier studies on automatic text simplification have relied on hand-written rules or statistical approaches, but few have addressed patent documents. Patent documents are characterized by lengthy sentences and multiword terminology, which often hinder accurate parsing. In this research, we therefore present a method for automatically simplifying the text of patent documents and scientific papers by analyzing their syntactic and lexical patterns.
Main file
474537_1_En_19_Chapter.pdf (613.76 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02279758, version 1 (05-09-2019)

License

Attribution

Identifiers

Cite

Jeongwoo Kang, Achille Souili, Denis Cavallucci. Text Simplification of Patent Documents. 18th TRIZ Future Conference (TFC), Oct 2018, Strasbourg, France. pp.225-237, ⟨10.1007/978-3-030-02456-7_19⟩. ⟨hal-02279758⟩
95 views
117 downloads
