
Judging competitions and benchmarks: a candidate election approach

Abstract

Machine learning progress relies on algorithm benchmarks. We study the problem of declaring a winner, or ranking "candidate" algorithms, based on results obtained from "judges" (scores on various tasks). Inspired by social science and game theory on fair elections, we compare various ranking functions, ranging from simple score averaging to Condorcet methods. We devise novel empirical criteria to assess the quality of ranking functions, including generalization to new tasks and stability under judge or candidate perturbation. We conduct an empirical comparison on the results of 5 competitions and benchmarks (one artificially generated). While prior theoretical analyses indicate that no single ranking function satisfies all desired properties, our empirical study reveals that the classical "average rank" method fares well. However, some pairwise comparison methods can achieve better empirical results.
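The classical "average rank" method mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a score matrix with one row per candidate algorithm and one column per task ("judge"), where higher scores are better, and breaks ties by candidate index.

```python
import numpy as np

def average_rank(scores):
    """Rank candidates by their mean rank across tasks.

    scores: (n_candidates, n_tasks) array; higher score is better.
    Returns candidate indices ordered from best to worst.
    """
    # Within each task, rank candidates (rank 1 = best score).
    # argsort of -scores gives a best-first ordering per task;
    # argsort-ing that ordering converts it into per-candidate ranks.
    order = np.argsort(-scores, axis=0)
    ranks = np.argsort(order, axis=0) + 1
    # Average each candidate's rank over all tasks, then sort.
    mean_ranks = ranks.mean(axis=1)
    return np.argsort(mean_ranks)

# Toy example (hypothetical data): 3 candidates, 2 tasks.
scores = np.array([[0.9, 0.8],
                   [0.8, 0.7],
                   [0.1, 0.9]])
print(average_rank(scores))  # → [0 2 1]
```

Condorcet-style methods instead compare candidates pairwise across tasks (candidate A beats B if it outscores B on a majority of tasks), which is why they can behave differently from score averaging when tasks have incommensurable score scales.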
Main file: Judging_Competitions_ESANN_HAL.pdf (362.46 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03367857 , version 1 (06-10-2021)
hal-03367857 , version 2 (02-12-2021)
hal-03367857 , version 3 (06-01-2022)

Identifiers

  • HAL Id : hal-03367857 , version 3

Cite

Adrien Pavao, Michael Vaccaro, Isabelle Guyon. Judging competitions and benchmarks: a candidate election approach. ESANN 2021 - 29th European Symposium on Artificial Neural Networks, Oct 2021, Bruges/Virtual, Belgium. ⟨hal-03367857v3⟩