Conference papers

DataStates: Towards Lightweight Data Models for Deep Learning

Abstract : A key emerging pattern in deep learning applications is the need to capture intermediate DNN model snapshots and preserve or clone them to explore a large number of alternative training and/or inference paths. However, with increasing model complexity and new training approaches that mix data, model, pipeline and layer-wise parallelism, this pattern is challenging to address in a scalable and efficient manner. To this end, this position paper advocates for rethinking how to represent and manipulate DNN learning models. It relies on a broader notion of data states: collections of annotated, potentially distributed data sets (tensors in the case of DNN models) that AI applications can capture at key moments during runtime and revisit or reuse later. Instead of explicitly interacting with the storage layer (e.g., writing to a file), users can "tag" DNN models at key moments during runtime with metadata that expresses attributes and persistency/movement semantics. A high-performance runtime is then responsible for interpreting the metadata and performing the necessary actions in the background, while offering a rich interface to find data states of interest. This approach has benefits at several levels: new capabilities, performance portability, high performance and scalability.
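The tagging pattern described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the DataStates API: `StateRegistry`, `DataState`, `tag`, `find` and `clone` are hypothetical names, tensors are stood in for by plain Python lists, and the "runtime" is a simple in-memory list rather than a high-performance background service.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Tensors = Dict[str, List[float]]  # stand-in for real tensors (assumption)


@dataclass
class DataState:
    """One captured snapshot: a copy of the tensors plus user metadata."""
    tensors: Tensors
    metadata: Dict[str, Any] = field(default_factory=dict)


class StateRegistry:
    """Hypothetical in-memory stand-in for the runtime (not the paper's API)."""

    def __init__(self) -> None:
        self._states: List[DataState] = []

    def tag(self, tensors: Tensors, **metadata: Any) -> DataState:
        # Deep-copy so that later training steps cannot mutate the snapshot.
        state = DataState(copy.deepcopy(tensors), dict(metadata))
        self._states.append(state)
        return state

    def find(self, predicate: Callable[[Dict[str, Any]], bool]) -> List[DataState]:
        # Search data states by their metadata only.
        return [s for s in self._states if predicate(s.metadata)]

    def clone(self, state: DataState) -> Tensors:
        # Return an independent copy to start an alternative training path.
        return copy.deepcopy(state.tensors)


registry = StateRegistry()
model = {"layer0.weight": [0.1, 0.2], "layer0.bias": [0.0]}

# Tag the model at two moments; "persist" is an illustrative attribute only.
registry.tag(model, epoch=1, accuracy=0.71, persist=False)
model["layer0.weight"][0] = 0.5  # training continues and mutates the model
registry.tag(model, epoch=2, accuracy=0.83, persist=True)

# Find the most accurate snapshot and branch a new exploration path from it.
candidates = registry.find(lambda m: m["accuracy"] > 0.7)
best = max(candidates, key=lambda s: s.metadata["accuracy"])
branch = registry.clone(best)
```

The point of the sketch is the separation of concerns: the application only attaches metadata and queries by it, while what happens to a tagged state (copying, persisting, moving) is left to the registry.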

Cited literature : 36 references
Contributor : Bogdan Nicolae
Submitted on : Wednesday, September 16, 2020 - 10:15:43 PM
Last modification on : Wednesday, October 14, 2020 - 4:11:55 AM
Long-term archiving on : Friday, December 4, 2020 - 11:03:07 PM


Files produced by the author(s)


  • HAL Id : hal-02941295, version 1


Bogdan Nicolae. DataStates: Towards Lightweight Data Models for Deep Learning. SMC'20: The 2020 Smoky Mountains Computational Sciences and Engineering Conference, Aug 2020, Nashville (virtual conference), United States. ⟨hal-02941295⟩


