S. Agarwal, B. Mozafari, A. Panda, H. Milner, S. Madden et al., BlinkDB, Proceedings of the 8th ACM European Conference on Computer Systems, EuroSys '13, pp.29-42, 2013.
DOI : 10.1145/2465351.2465355

A. Alawini, D. Maier, K. Tufte, and B. Howe, Helping scientists reconnect their datasets, Proceedings of the 26th International Conference on Scientific and Statistical Database Management, SSDBM '14, pp.29:1-29:12, 2014.
DOI : 10.1145/2618243.2618263
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.465.801

I. Altintas, O. Barney, and E. Jaeger-Frank, Provenance Collection Support in the Kepler Scientific Workflow System, IPAW, pp.118-132, 2006.
DOI : 10.1007/11890850_14

Y. Amsterdamer, S. B. Davidson, D. Deutch, T. Milo, J. Stoyanovich et al., Putting lipstick on pig, Proceedings of the VLDB Endowment, vol.5, issue.4, pp.346-357, 2011.
DOI : 10.14778/2095686.2095693

Y. Amsterdamer, D. Deutch, T. Milo, and V. Tannen, On provenance minimization, ACM Trans. Database Syst, vol.37, issue.4, p.30, 2012.
DOI : 10.1145/1989284.1989303
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.221.5370

E. Bareinboim, J. Tian, and J. Pearl, Recovering from selection bias in causal and statistical inference, Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, AAAI'14, pp.2410-2416, 2014.

A. P. Bhardwaj, S. Bhattacherjee, A. Chavan, A. Deshpande, A. J. Elmore et al., DataHub: Collaborative data science & dataset version management at scale, CIDR, 2015.

M. Boehm, S. Tatikonda, B. Reinwald, P. Sen, Y. Tian et al., Hybrid parallelization strategies for large-scale machine learning in SystemML, Proc. VLDB Endow, pp.553-564, 2014.
DOI : 10.14778/2732286.2732292
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.429.901

P. Buneman, S. Khanna, and W. Tan, Why and Where: A Characterization of Data Provenance, ICDT, pp.316-330, 2001.
DOI : 10.1007/3-540-44503-X_20

Z. Cai, Z. Vagena, L. Perez, S. Arumugam, P. J. Haas et al., Simulation of database-valued Markov chains using SimSQL, Proceedings of the 2013 international conference on Management of data, SIGMOD '13, pp.637-648, 2013.
DOI : 10.1145/2463676.2465283

S. Chaudhuri, G. Das, and V. Narasayya, Optimized stratified sampling for approximate query processing, ACM Transactions on Database Systems, vol.32, issue.2, 2007.
DOI : 10.1145/1242524.1242526

F. Chirigati, H. Doraiswamy, T. Damoulas, and J. Freire, Data Polygamy, Proceedings of the 2016 International Conference on Management of Data, SIGMOD '16, pp.1011-1025, 2016.
DOI : 10.1145/2882903.2915245

J. Clause, W. Li, and A. Orso, Dytan, Proceedings of the 2007 international symposium on Software testing and analysis, ISSTA '07, pp.196-206, 2007.
DOI : 10.1145/1273463.1273490

A. Datta, S. Sen, and Y. Zick, Algorithmic Transparency via Quantitative Input Influence: Theory and Experiments with Learning Systems, 2016 IEEE Symposium on Security and Privacy (SP), pp.598-617, 2016.
DOI : 10.1109/SP.2016.42

S. B. Davidson and J. Freire, Provenance and scientific workflows: challenges and opportunities, ACM SIGMOD, pp.1345-1350, 2008.

S. B. Davidson, T. Milo, and S. Roy, A propagation model for provenance views of public/private workflows, Proceedings of the 16th International Conference on Database Theory, ICDT '13, pp.165-176, 2013.
DOI : 10.1145/2448496.2448517

C. Dwork, V. Feldman, M. Hardt, T. Pitassi, O. Reingold et al., Generalization in adaptive data analysis and holdout reuse, NIPS, pp.2350-2358, 2015.
DOI : 10.1126/science.aaa9375

C. Dwork and A. Roth, The Algorithmic Foundations of Differential Privacy, Foundations and Trends® in Theoretical Computer Science, vol.9, issue.3-4, pp.211-407, 2014.
DOI : 10.1561/0400000042

T. J. Green, G. Karvounarakis, and V. Tannen, Provenance semirings, Proceedings of the twenty-sixth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, PODS '07, pp.31-40, 2007.
DOI : 10.1145/1265530.1265535

J. M. Hellerstein, C. Ré, F. Schoppmann, D. Z. Wang, E. Fratkin et al., The MADlib analytics library, Proceedings of the VLDB Endowment, vol.5, issue.12, pp.1700-1711, 2012.
DOI : 10.14778/2367502.2367510

T. Herndon, M. Ash, and R. Pollin, Does high public debt consistently stifle economic growth? A critique of Reinhart and Rogoff, 2013.

S. Jain, D. Moritz, D. Halperin, B. Howe, and E. Lazowska, SQLShare, Proceedings of the 2016 International Conference on Management of Data, SIGMOD '16, pp.281-293, 2016.
DOI : 10.1145/2882903.2882957

K. Lefevre, R. Agrawal, V. Ercegovac, R. Ramakrishnan, Y. Xu et al., Limiting Disclosure in Hippocratic Databases, VLDB, pp.108-119, 2004.
DOI : 10.1016/B978-012088469-8.50013-9

Y. Low, D. Bickson, J. Gonzalez, C. Guestrin, A. Kyrola et al., Distributed GraphLab, Proceedings of the VLDB Endowment, vol.5, issue.8, pp.716-727, 2012.
DOI : 10.14778/2212351.2212354

A. Meliou, S. Roy, and D. Suciu, Causality and explanations in databases, Proceedings of the VLDB Endowment, vol.7, issue.13, pp.1715-1716, 2014.
DOI : 10.14778/2733004.2733070

V. Z. Moffitt, J. Stoyanovich, S. Abiteboul, and G. Miklau, Collaborative Access Control in WebdamLog, Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, SIGMOD '15, pp.197-211, 2015.
DOI : 10.1145/2723372.2749433

C. Muñoz and M. Smith, Big data: A report on algorithmic systems, opportunity, and civil rights. The White House, 2016.

F. Olken and D. Rotem, Simple random sampling from relational databases, VLDB, pp.160-169, 1986.

J. Stoyanovich and E. P. Goodman, Revealing algorithmic rankers. Freedom to Tinker, 2016.

F. Tramèr, V. Atlidakis, R. Geambasu, D. J. Hsu, J. Hubaux et al., Discovering unwarranted associations in data-driven applications with the FairTest testing toolkit, 1510.

Q. Wang, T. Yu, N. Li, J. Lobo, E. Bertino et al., On the correctness criteria of fine-grained access control in relational databases, VLDB, pp.555-566, 2007.

R. S. Xin, J. Rosen, M. Zaharia, M. J. Franklin, S. Shenker et al., Shark, Proceedings of the 2013 international conference on Management of data, SIGMOD '13, pp.13-24, 2013.
DOI : 10.1145/2463676.2465288

M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica, Spark: Cluster computing with working sets, HotCloud, 2010.