Progressive Data Analysis and Visualization

Abstract : We live in an era where data is abundant and growing rapidly; databases storing big data sprawl past memory and computation limits, and across distributed systems. New hardware and software systems have been built to sustain this growth in terms of storage management and predictive computation. However, these infrastructures, while good for data at scale, do not support exploratory data analysis (EDA) well, as it is commonly used, for instance, in Visual Analytics. EDA allows human users to make sense of data with little or no known model of this data and is essential in many application domains, from network security and fraud detection to epidemiology and preventive medicine. Data exploration is done through an iterative loop in which analysts interact with data through computations that return results, usually shown as visualizations, which the analyst in turn interacts with again. Due to human cognitive constraints, exploration needs highly responsive systems: at 500 ms of latency, users change their querying behavior; past five or ten seconds, users abandon tasks or lose attention. As datasets grow and computations become more complex, response time suffers. To address this problem, a new computation paradigm has emerged in the last decade under several names: online aggregation in the database community; progressive, incremental, or iterative visualization in other communities. It consists of splitting long computations into a series of approximate results that improve with time; partial or approximate results are returned rapidly to the user, who can interact with them in a fluent and iterative fashion. With the increasing growth in data, such progressive data analysis approaches will become one of the leading paradigms for data exploration systems, but they will also require major changes in algorithms, data structures, and visualization tools.
This Dagstuhl Seminar set out to discuss and address these challenges by bringing together researchers from the different research communities involved: database, visualization, and machine learning. Thus far, these communities have often been divided by a gap that hinders joint efforts to deal with forthcoming challenges in progressive data analysis and visualization. The seminar gave these researchers and practitioners a platform to exchange their ideas, experience, and visions, to jointly develop strategies for dealing with these challenges, and to create a deeper awareness of the implications of this paradigm shift. The implications are technical, but also human (both perceptual and cognitive), and the seminar provided a holistic view of the problem by gathering specialists from all the communities.
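As an illustration of the paradigm the abstract describes (splitting a long computation into a series of approximate results that improve with time), the following is a minimal Python sketch of a progressive mean. It is not taken from the seminar report: the function name `progressive_mean`, the `chunk_size` parameter, and the synthetic data are hypothetical, and the error bound is only meaningful under the assumption that rows arrive in random order.

```python
import math
import random
from typing import Iterator, Sequence, Tuple


def progressive_mean(
    values: Sequence[float], chunk_size: int = 1000
) -> Iterator[Tuple[int, float, float]]:
    """Yield (rows_seen, running_mean, half_width) after each chunk.

    Hypothetical helper for illustration. half_width is a rough 95%
    normal-approximation bound on the mean; it assumes the rows are
    processed in random order, which is an assumption of this sketch.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's method)
    for start in range(0, len(values), chunk_size):
        for x in values[start:start + chunk_size]:
            n += 1
            delta = x - mean
            mean += delta / n
            m2 += delta * (x - mean)
        variance = m2 / (n - 1) if n > 1 else 0.0
        half_width = 1.96 * math.sqrt(variance / n)
        # A partial result is available here, long before the scan ends.
        yield n, mean, half_width


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.gauss(50.0, 10.0) for _ in range(10_000)]  # synthetic data
    for rows, estimate, hw in progressive_mean(data, chunk_size=2_500):
        print(f"{rows:>6} rows: mean ~ {estimate:.2f} +/- {hw:.2f}")
```

Because the function is a generator, a caller (for example, a visualization loop) can redraw after every chunk and stop early once the estimate is precise enough for the task at hand.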
Document type : Directions of work or proceedings

https://hal.inria.fr/hal-02090121
Contributor : Jean-Daniel Fekete
Submitted on : Thursday, April 4, 2019 - 3:43:20 PM
Last modification on : Thursday, October 3, 2019 - 5:48:01 PM
Long-term archiving on: Friday, July 5, 2019 - 2:11:10 PM

File

dagrep_v008_i010_p001_18411.pd...
Files produced by the author(s)

Citation

Jean-Daniel Fekete, Danyel Fisher, Arnab Nandi, Michael Sedlmair. Progressive Data Analysis and Visualization. Oct 2018, Wadern, Germany. Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik, 2019, ⟨10.4230/DagRep.8.10.1⟩. ⟨hal-02090121⟩
