Abstract : In this paper we report on how using sample-dependent coding schemes can lead to poor results when applying Rissanen's Minimum Description Length (MDL) principle [Ris89]. The MDL principle is one of the many known model selection methods in the fields of 'machine learning', 'statistics', and 'inductive inference'. We analyze the experimental results presented in [KMNR97] and provide a method to avoid overfitting, by using a coding scheme different from the one used in [KMNR97].
https://hal.inria.fr/inria-00321522
Contributor : Jakob Verbeek
Submitted on : Wednesday, February 16, 2011 - 4:58:50 PM
Last modification on : Monday, September 25, 2017 - 10:08:04 AM
Long-term archiving on : Tuesday, May 17, 2011 - 2:40:24 AM
Jakob Verbeek. Using a sample-dependent coding scheme for two-part MDL. Machine Learning & Applications (ACAI '99), The Hellenic Artificial Intelligence Society (ΕΕΤΝ), Jul 1999, Chania, Greece. ⟨inria-00321522⟩