The central role of data repositories and data models in Data Science and Advanced Analytics - Inria - Institut national de recherche en sciences et technologies du numérique
Journal article, Future Generation Computer Systems, Year: 2022

Abstract

In the age of “Data Science and Advanced Analytics”, we are witnessing a race to develop data-driven smart systems in domains such as business, finance, healthcare, environment, and cybersecurity, due to the explosion of data issued by various providers. This development helps create added value for companies and citizens. Two complementary ingredients are required to ensure valuable systems: data and models. The data dimension is mainly related to Data Science, which unifies machine learning, statistics, data mining, databases, and distributed systems. This value may be achieved by augmenting input data with resources such as Knowledge Graphs. The success of these techniques strongly depends on the quality of the input data and on the consideration of other non-functional properties related to legal, ethical, and economic aspects. On the other hand, modeling plays a crucial role in Data Science since it covers all steps of the Data Science workflow. Regarding data provenance and quality, models contribute to providing vendor-independent solutions. At the algorithmic level, models help explain the inner workings of the methods and algorithms used to system designers, users, regulators, and citizens, in order to achieve trust and accountability. Therefore, the success of Data Science depends on our ability to use it in a smart way while simultaneously exploiting data and modeling capabilities.
File not deposited

Dates and versions

hal-03904787, version 1 (17-12-2022)

Cite

Ladjel Bellatreche, Carlos Ordonez, Dominique Méry, Matteo Golfarelli, El Hassan Abdelwahed. The central role of data repositories and data models in Data Science and Advanced Analytics. Future Generation Computer Systems, 2022, 129, pp.13-17. ⟨10.1016/j.future.2021.11.027⟩. ⟨hal-03904787⟩