Theses

Matching and mining in knowledge graphs of the Web of data - Applications in pharmacogenomics

Pierre Monnin 1
1 ORPAILLEUR - Knowledge representation, reasoning
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : In the Web of data, an increasing number of knowledge graphs are concurrently published, edited, and accessed by human and software agents. Their wide adoption makes two tasks essential: matching and mining. First, matching consists in identifying equivalent, more specific, or somewhat similar units within and across knowledge graphs. This task is crucial because concurrent publication and editing may result in coexisting and complementary knowledge graphs. However, it is challenging because of the inherent heterogeneity of knowledge graphs, e.g., in terms of granularity, vocabulary, and completeness. Motivated by an application in pharmacogenomics, we propose two approaches to match n-ary relationships represented in knowledge graphs: a symbolic rule-based approach and a numeric approach using graph embedding. We experiment on PGxLOD, a knowledge graph that we semi-automatically built by integrating pharmacogenomic relationships from three distinct sources of this domain. Second, mining consists in discovering new and useful knowledge units from knowledge graphs. The increasing size and combinatorial nature of these graphs entail scalability issues, which we address in the mining of path patterns. We also propose Concept Annotation, a refinement approach extending Formal Concept Analysis, a mathematical framework that groups entities based on their common attributes. Throughout our work, we focus on taking advantage of domain knowledge in the form of ontologies that can be associated with knowledge graphs. We show that, when considered, such domain knowledge alleviates heterogeneity and scalability issues in matching and mining approaches.

https://hal.inria.fr/tel-03122326
Contributor : Pierre Monnin
Submitted on : Tuesday, January 26, 2021 - 10:39:45 PM
Last modification on : Wednesday, November 3, 2021 - 7:57:51 AM
Long-term archiving on : Tuesday, April 27, 2021 - 7:53:49 PM

File

thesis-pmonnin.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : tel-03122326, version 1

Citation

Pierre Monnin. Matching and mining in knowledge graphs of the Web of data - Applications in pharmacogenomics. Databases [cs.DB]. Université de Lorraine, 2020. English. ⟨NNT : 2020LORR0212⟩. ⟨tel-03122326⟩
