Scaling HDFS with a Strongly Consistent Relational Model for Metadata

Abstract: The Hadoop Distributed File System (HDFS) scales to store tens of petabytes of data, even though the entire file system's metadata must fit on the heap of a single Java virtual machine. In production, the size of HDFS metadata is limited to under 100 GB, because garbage collection events in bigger clusters cause heartbeats to the metadata server (NameNode) to time out. In this paper, we address the problem of how to migrate HDFS metadata to a relational model, so that we can support larger amounts of storage on a shared-nothing, in-memory, distributed database. Our main contribution is to show how to provide consistency semantics at least as strong as those of HDFS while adding support for a multiple-writer, multiple-reader concurrency model. We guarantee freedom from deadlocks by logically organizing inodes (and their constituent blocks and replicas) into a hierarchy and having all metadata operations agree on a global order for acquiring both explicit locks and implicit locks on subtrees in the hierarchy. We use transactions with pessimistic concurrency control to ensure the safety and progress of metadata operations. Finally, we show how to improve the performance of our solution by introducing a snapshotting mechanism at NameNodes that minimizes the number of round trips to the database.
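
The deadlock-freedom argument in the abstract rests on every operation acquiring its locks in one agreed global order. The following is a minimal sketch of that general idea only, not the paper's implementation: it assumes an in-memory table of per-path locks as a stand-in for database row locks, takes every lock exclusively, and orders paths so that parents come before children. The names `LockTable`, `lock_order`, `run_operation`, and `demo` are hypothetical and chosen for illustration.

```python
import threading
from contextlib import ExitStack


class LockTable:
    """Hypothetical stand-in for row locks on inodes, keyed by path."""

    def __init__(self):
        self._guard = threading.Lock()
        self._locks = {}

    def lock_for(self, path):
        # Create the per-path lock lazily; the guard keeps creation race-free.
        with self._guard:
            return self._locks.setdefault(path, threading.Lock())


def ancestors_and_self(path):
    """Return the path and all its ancestors, root first: '/a/b' -> ['/', '/a', '/a/b']."""
    parts = [p for p in path.split("/") if p]
    result = ["/"]
    for i in range(len(parts)):
        result.append("/" + "/".join(parts[: i + 1]))
    return result


def lock_order(paths):
    """Assumed global order: compare paths by their component tuples.

    A parent's tuple is a prefix of its child's, so parents always sort first,
    and any two operations agree on the relative order of shared inodes.
    """
    needed = set()
    for p in paths:
        needed.update(ancestors_and_self(p))
    return sorted(needed, key=lambda p: tuple(c for c in p.split("/") if c))


def run_operation(table, paths, action):
    """Acquire all needed locks in the global order, run the action, then release."""
    with ExitStack() as stack:
        for p in lock_order(paths):
            stack.enter_context(table.lock_for(p))
        return action()


def demo():
    """Two threads touch the same two paths in opposite argument order."""
    table = LockTable()
    log = []

    def op(src, dst):
        for _ in range(200):
            run_operation(table, [src, dst], lambda: log.append((src, dst)))

    t1 = threading.Thread(target=op, args=("/a/x", "/b/y"))
    t2 = threading.Thread(target=op, args=("/b/y", "/a/x"))
    t1.start()
    t2.start()
    t1.join(timeout=10)
    t2.join(timeout=10)
    # Second value is True only if a thread is still blocked after the timeout.
    return len(log), t1.is_alive() or t2.is_alive()
```

Because both threads sort their lock set the same way, neither can hold a lock the other needs while waiting on one the other holds, which is the circular wait that a deadlock requires. A real system would also distinguish shared from exclusive locks and avoid locking every ancestor exclusively; this sketch omits that for brevity.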
Document type:
Conference paper
David Hutchison; Takeo Kanade; Bernhard Steffen; Demetri Terzopoulos; Doug Tygar; Gerhard Weikum; Kostas Magoutis; Peter Pietzuch; Josef Kittler; Jon M. Kleinberg; Alfred Kobsa; Friedemann Mattern; John C. Mitchell; Moni Naor; Oscar Nierstrasz; C. Pandu Rangan. 4th International Conference on Distributed Applications and Interoperable Systems (DAIS), Jun 2014, Berlin, Germany. Springer, Lecture Notes in Computer Science, LNCS-8460, pp.38-51, 2014, Distributed Applications and Interoperable Systems. 〈10.1007/978-3-662-43352-2_4〉

Cited literature: 18 references

https://hal.inria.fr/hal-01287731
Contributor: Hal Ifip
Submitted on: Monday, March 14, 2016 - 10:48:49
Last modified on: Thursday, May 12, 2016 - 10:49:50
Document(s) archived on: Sunday, November 13, 2016 - 18:04:47

File

326177_1_En_4_Chapter.pdf
Files produced by the author(s)

License


Distributed under a Creative Commons Attribution 4.0 International License

Citation

Kamal Hakimzadeh, Hooman Peiro Sajjad, Jim Dowling. Scaling HDFS with a Strongly Consistent Relational Model for Metadata. David Hutchison; Takeo Kanade; Bernhard Steffen; Demetri Terzopoulos; Doug Tygar; Gerhard Weikum; Kostas Magoutis; Peter Pietzuch; Josef Kittler; Jon M. Kleinberg; Alfred Kobsa; Friedemann Mattern; John C. Mitchell; Moni Naor; Oscar Nierstrasz; C. Pandu Rangan. 4th International Conference on Distributed Applications and Interoperable Systems (DAIS), Jun 2014, Berlin, Germany. Springer, Lecture Notes in Computer Science, LNCS-8460, pp.38-51, 2014, Distributed Applications and Interoperable Systems. 〈10.1007/978-3-662-43352-2_4〉. 〈hal-01287731〉
