The idea of selecting a model by penalizing a log-likelihood-type criterion goes back to the early seventies, with the pioneering works of Mallows and Akaike. One can find many consistency results for such criteria in the literature. These results are asymptotic, in the sense that one deals with a fixed family of models while the number of observations tends to infinity. A non-asymptotic theory for criteria of this type has been developed in recent years, allowing the size as well as the number of models to depend on the sample size. For these methods to be of practical relevance, it is desirable to have a precise expression for the penalty terms involved in the penalized criteria on which they are based. We will discuss some heuristics for designing data-driven penalties, review some new results, and discuss some open problems.
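To make the setting concrete, the following is a minimal Python sketch of a penalized criterion of this kind: it selects a polynomial degree by minimizing minus the maximized Gaussian log-likelihood plus an AIC-type penalty pen(m) = D_m, where D_m is the model dimension. The synthetic data, the polynomial model family, and the unit penalty constant are illustrative assumptions, not a procedure taken from this abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative synthetic data: noisy cubic signal (an assumption for this sketch).
n = 200
x = np.linspace(-1, 1, n)
y = 1.0 - 2.0 * x + 0.5 * x**3 + rng.normal(scale=0.3, size=n)

def penalized_criterion(deg, penalty_per_dim):
    """Minus the maximized Gaussian log-likelihood (up to constants) plus a linear penalty."""
    X = np.vander(x, deg + 1)                 # polynomial design matrix, degree `deg`
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ coef) ** 2)
    dim = deg + 1                             # D_m: number of fitted coefficients
    # For Gaussian errors, -max log L = (n/2) log(RSS/n) + constants common to all models.
    return 0.5 * n * np.log(rss / n) + penalty_per_dim * dim

degrees = range(11)
# AIC-type penalty: pen(m) = D_m, i.e. one unit of log-likelihood per parameter.
crit = [penalized_criterion(d, penalty_per_dim=1.0) for d in degrees]
best = min(degrees, key=lambda d: crit[d])
print(f"selected degree: {best}")
```

A data-driven penalty, in the spirit discussed here, would calibrate the constant `penalty_per_dim` from the data themselves (for instance via the slope heuristic) rather than fixing it in advance.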