Conference papers

Active Data: A Data-Centric Approach to Data Life-Cycle Management

Anthony Simonet 1,*, Gilles Fedak 1, Matei Ripeanu 2, Samer Al-Kiswany 2
* Corresponding author
1 AVALON - Algorithms and Software Architectures for Distributed and HPC Platforms
Inria Grenoble - Rhône-Alpes, LIP - Laboratoire de l'Informatique du Parallélisme
Abstract : Data-intensive science offers new opportunities for innovation and discovery, provided that large datasets can be handled efficiently. Data management for data-intensive science applications is challenging: it requires support for complex data life cycles, coordination across multiple sites, fault tolerance, and scalability to tens of sites and petabytes of data. In this paper, we argue that data management for data-intensive science applications requires a fundamentally different approach from the current ad hoc, task-centric one. We propose Active Data, a fundamentally novel paradigm for data life-cycle management. Active Data follows two principles: it is data-centric and event-driven. We report on the Active Data programming model and its preliminary implementation, and discuss the benefits and limitations of the approach on recognized, challenging data-intensive science use cases.

Cited literature [20 references]
Contributor : Anthony Simonet
Submitted on : Thursday, December 19, 2013 - 4:08:06 PM
Last modification on : Tuesday, October 25, 2022 - 4:21:54 PM
Long-term archiving on : Thursday, March 20, 2014 - 12:00:42 PM


Files produced by the author(s)



Anthony Simonet, Gilles Fedak, Matei Ripeanu, Samer Al-Kiswany. Active Data: A Data-Centric Approach to Data Life-Cycle Management. PDSW '13 - 8th Parallel Data Storage Workshop, Nov 2013, Denver, United States. pp.39-44, ⟨10.1145/2538542.2538566⟩. ⟨hal-00921080⟩


