Conference papers

A Hybrid Bi-LSTM-CRF Model for Sequence Labeling Applied to the Sourcing Domain

Hasnaa Daoud; Molka Tounsi Dhouib 1, 2; Jerôme Rancati; Catherine Faron 1; Andrea Tettamanzi 1
1 WIMMICS - Web-Instrumented Man-Machine Interactions, Communities and Semantics
CRISAM - Inria Sophia Antipolis - Méditerranée, Laboratoire I3S - SPARKS - Scalable and Pervasive softwARe and Knowledge Systems
Abstract : In many areas, companies have to deal with large volumes of textual customer requests. Automating the extraction of information such as key phrases from these requests can speed up their processing. Silex France currently faces this challenge when processing sourcing requests. In this article, we share our sequence labeling results, obtained in an industrial context with a hybrid Bi-LSTM-CRF method. This work was integrated into the B2B Silex platform for service provider recommendation. Experiments on the B2B Silex platform data show that, with a good choice of extracted features and well-chosen hyper-parameters, the combination of Bi-LSTM and CRF achieves good results even when little data is available. The textual content processed consists of complete sentences written by users and is therefore subject to typing errors. To handle this type of data, we combine several types of extracted features describing the textual content: (i) semantics, (ii) syntax, (iii) word characters, and (iv) position of words.
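As an illustration of two of the feature families listed in the abstract (word characters and position of words), here is a minimal Python sketch of per-token feature extraction. It is not the authors' feature set: the function names, the specific features, and the example sentence are all assumptions made for illustration. Semantic and syntactic features are left out because they would typically require external resources such as pre-trained embeddings or a part-of-speech tagger.

```python
from typing import Dict, List


def token_features(tokens: List[str], index: int) -> Dict[str, object]:
    """Build illustrative character-level and positional features for one token.

    Hypothetical feature set: the paper's actual features are not specified here.
    """
    token = tokens[index]
    n = len(tokens)
    return {
        # Word-character features: prefixes and suffixes can stay informative
        # even when a word contains a typing error elsewhere.
        "lower": token.lower(),
        "prefix3": token[:3].lower(),
        "suffix3": token[-3:].lower(),
        "is_capitalized": token[:1].isupper(),
        "has_digit": any(ch.isdigit() for ch in token),
        "length": len(token),
        # Position features: where the token sits in the sentence.
        "is_first": index == 0,
        "is_last": index == n - 1,
        "relative_position": index / (n - 1) if n > 1 else 0.0,
    }


def sentence_features(tokens: List[str]) -> List[Dict[str, object]]:
    """Return one feature dictionary per token of the sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]


# Hypothetical sourcing request, split on whitespace for simplicity.
example = "Looking for a supplier of industrial valves".split()
features = sentence_features(example)
```

In a Bi-LSTM-CRF pipeline, feature dictionaries like these would usually be encoded as vectors and concatenated with word embeddings before being passed to the recurrent layer; that step depends on the chosen framework and is not shown.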

https://hal.inria.fr/hal-02932095
Contributor : Molka Tounsi Dhouib
Submitted on : Monday, September 7, 2020 - 3:07:13 PM
Last modification on : Tuesday, September 8, 2020 - 3:25:09 AM

File

APIA_2020_final_version.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02932095, version 1

Citation

Hasnaa Daoud, Molka Tounsi Dhouib, Jerôme Rancati, Catherine Faron, Andrea Tettamanzi. A Hybrid Bi-LSTM-CRF Model for Sequence Labeling Applied to the Sourcing Domain. 5ème Conférence Nationale sur les Applications Pratiques de l’Intelligence Artificielle (APIA 2020), Jul 2020, Angers, France. ⟨hal-02932095⟩
