An Architecture for Data Warehousing in Big Data Environments

Abstract: Recent advances in information technologies have increased the capacity to collect and store data, a trend commonly referred to as Big Data. In this context, many challenges need to be addressed, Data Warehousing being one of them. The main purpose of this work is to propose an architecture for Data Warehousing in Big Data environments, taking as input a data source stored in a traditional Data Warehouse, which is transformed into a Data Warehouse in Hive. Before the architecture was proposed and implemented, a benchmark was conducted to assess the processing times of Hive and Impala and to understand how these technologies could be integrated in an architecture where Hive plays the role of the Data Warehouse and Impala is the driving force for the analysis and visualization of data. The proposed architecture was then implemented using tools such as the Hadoop ecosystem, Talend and Tableau, and validated using a data set with more than 100 million records, obtaining satisfactory results in terms of processing times.
Document type:
Conference paper
A Min Tjoa; Li Da Xu; Maria Raffai; Niina Maarit Novak. 10th International Conference on Research and Practical Issues of Enterprise Information Systems (CONFENIS), Dec 2016, Vienna, Austria. Springer International Publishing, Lecture Notes in Business Information Processing, LNBIP-268, pp.237-250, 2016, Research and Practical Issues of Enterprise Information Systems. 〈10.1007/978-3-319-49944-4_18〉

Cited literature: 23 references

https://hal.inria.fr/hal-01630532
Contributor: Hal Ifip
Submitted on: Tuesday, 7 November 2017 - 17:26:51
Last modified on: Thursday, 9 November 2017 - 01:16:28

File

Restricted access
File visible on: 2019-01-01

License

Distributed under a Creative Commons Attribution 4.0 International License

Citation

Bruno Martinho, Maribel Santos. An Architecture for Data Warehousing in Big Data Environments. A Min Tjoa; Li Da Xu; Maria Raffai; Niina Maarit Novak. 10th International Conference on Research and Practical Issues of Enterprise Information Systems (CONFENIS), Dec 2016, Vienna, Austria. Springer International Publishing, Lecture Notes in Business Information Processing, LNBIP-268, pp.237-250, 2016, Research and Practical Issues of Enterprise Information Systems. 〈10.1007/978-3-319-49944-4_18〉. 〈hal-01630532〉
