D. Angluin and P. Laird, Learning from noisy examples, Machine Learning, vol.2, issue.4, pp.343-370, 1988.

B. M. Atkinson, Captology: A critical review, International conference on persuasive technology, pp.171-182, 2006.

P. Brennan and A. Silman, Statistical methods for assessing observer variability in clinical measures, BMJ: British Medical Journal, vol.304, issue.6840, p.1491, 1992.

M. M. Breunig, H. P. Kriegel, and R. T. Ng, Identifying density-based local outliers, SIGMOD Record, vol.29, issue.2, pp.93-104, 2000.

F. Cabitza, D. Ciucci, and R. Rasoini, A giant with feet of clay: on the validity of the data that feed machine learning in medicine, Organizing for the Digital World, pp.121-136, 2019.

F. Cabitza, L. G. Dui, and G. Banfi, PROs in the wild: Assessing the validity of patient reported outcomes in an electronic registry, Computer Methods and Programs in Biomedicine, 2019.

F. Cabitza, A. Locoro, C. Alderighi, R. Rasoini, D. Compagnone et al., The elephant in the record: on the multiplicity of data recording work, Health informatics journal, 2019.

F. Cabitza, A. Campagner, D. Ciucci, and A. Seveso, Programmed Inefficiencies in DSS-supported Human Decision Making, Proceedings of 16th MDAI International Conference, 2019.

A. Campagner, F. Cabitza, and D. Ciucci, Exploring Medical Data Classification with Three-Way Decision Trees, Proceedings of the 12th BIOSTEC International Joint Conference, vol.5, pp.147-158, 2019.

A. Campagner, F. Cabitza, and D. Ciucci, Three-Way Classification: Ambiguity and Abstention in Machine Learning, Rough Sets - International Joint Conference, IJCRS 2019, vol.11499, pp.280-294, 2019.

F. Doshi-Velez and B. Kim, Towards a rigorous science of interpretable machine learning, 2017.

D. Dubois and H. Prade, Possibility Theory and Its Applications: Where Do We Stand?, pp.31-60, 2015.

P. N. Edwards, M. S. Mayernik, and A. L. Batcheller, Science friction: Data, metadata, and collaboration, Social Studies of Science, vol.41, issue.5, pp.667-690, 2011.

A. Esteva, B. Kuprel, and R. A. Novoa, Dermatologist-level classification of skin cancer with deep neural networks, Nature, vol.542, issue.7639, p.115, 2017.

A. R. Feinstein and D. V. Cicchetti, High agreement but low kappa: I. the problems of two paradoxes, Journal of clinical epidemiology, vol.43, issue.6, pp.543-549, 1990.

B. J. Fogg, Persuasive computers: perspectives and research directions, CHI'98, pp.225-232, 1998.

J. Goguen, The dry and the wet, Proceedings of the IFIP TC8/WG8.1 Working Conference on Information System Concepts: Improving the Understanding, pp.1-17, 1992.

R. Goebel, A. Chander, and K. Holzinger, Explainable AI: the new 42, International Cross-Domain Conference for Machine Learning and Knowledge Extraction, pp.295-303, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01934928

V. Gulshan, L. Peng, and M. Coram, Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs, JAMA, vol.316, issue.22, pp.2402-2410, 2016.

D. Gur, A. I. Bandos, and C. S. Cohen, The "laboratory" effect: comparing radiologists' performance and variability during prospective clinical and laboratory mammography interpretations, Radiology, vol.249, issue.1, pp.47-53, 2008.

H. Haenssle, C. Fink, and R. Schneiderbauer, Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists, Annals of Oncology, vol.29, issue.8, pp.1836-1842, 2018.

S. S. Han, G. H. Park, and W. Lim, Deep neural networks show an equivalent and often superior performance to dermatologists in onychomycosis diagnosis, PLoS ONE, vol.13, issue.1, e0191493, 2018.

S. Heinecke and L. Reyzin, Crowdsourced PAC learning under classification noise, 2019.

A. Holzinger, G. Langs, H. Denk, K. Zatloukal, and H. Mueller, Causability and Explainability of AI in Medicine, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol.9, issue.4, 2019.

H. Jiang and O. Nachum, Identifying and correcting label bias in machine learning, 2019.

A. Justel, D. Peña, and R. Zamar, A multivariate Kolmogorov-Smirnov test of goodness of fit, Statistics & Probability Letters, vol.35, issue.3, pp.251-259, 1997.

H. P. Kriegel, P. Kröger, E. Schubert, and A. Zimek, Interpreting and unifying outlier scores, pp.13-24, 2011.

K. Krippendorff, Content analysis: An introduction to its methodology, Sage publications, 2018.

H. Lakkaraju, E. Kamar, R. Caruana, and J. Leskovec, Interpretable & explorable approximations of black box models, 2017.

J. R. Landis and G. G. Koch, The measurement of observer agreement for categorical data, pp.159-174, 1977.

Z. C. Lipton, The mythos of model interpretability, 2016.

D. J. Mackay, Bayesian methods for adaptive models, 1992.

Z. B. Popović and J. D. Thomas, Assessing observer variability: a user's guide, Cardiovascular Diagnosis and Therapy, vol.7, p.317, 2017.

D. Quarfoot and R. A. Levine, How robust are multirater interrater reliability indices to changes in frequency distribution?, The American Statistician, vol.70, issue.4, pp.373-384, 2016.

L. Ralaivola, F. Denis, and C. N. Magnan, CN = CPCN, ICML '06, ACM, 2006.

G. Wickström and T. Bendix, The "Hawthorne effect": what did the original Hawthorne studies actually show?, Scand J Work Environ Health, vol.26, issue.4, pp.363-367, 2000.

C. M. Svensson, S. Krusekopf, and J. Lücke, Automated detection of circulating tumor cells with naive Bayesian classifiers, Cytometry Part A, vol.85, issue.6, pp.501-511, 2014.

V. N. Vapnik and A. Ya. Chervonenkis, On the uniform convergence of relative frequencies of events to their probabilities, Theory of Probability and its Applications, vol.16, issue.2, pp.264-280, 1971.

S. Wachter, B. Mittelstadt, and L. Floridi, Why a right to explanation of automated decision-making does not exist in the general data protection regulation, International Data Privacy Law, vol.7, issue.2, pp.76-99, 2017.

D. Wishart, k-means clustering with outlier detection, mixed variables and missing values, pp.216-226, 2003.

Y. Yao, An outline of a theory of three-way decisions, Lecture Notes in Computer Science, vol.7413, 2012.

L. Zadeh, Fuzzy sets as a basis for a theory of possibility, Fuzzy Sets and Systems, vol.100, pp.9-34, 1999.