Statistical learning theory provides the theoretical basis for many of today’s machine learning algorithms and is arguably one of the most beautifully developed branches of arti cial intelligence in general. It originated in Russia in the 1960s and gained wide popularity in the 1990s following the development of the so-called Support Vector Machine (SVM), which has become a standard tool for pattern recognition in a variety of domains ranging from computer vision to computational biology. Providing the basis of new learning algorithms, however, was not the only motivation for developing statistical learning theory. It was just as much a philosophical one, attempting to answer the question of what it is that allows us to draw valid conclusions from empirical data. In this article we attempt to give a gentle, non-technical overview over the key ideas and insights of statistical learning theory. We do not assume that the reader has a deep background in mathematics, statistics, or computer science. Given the nature of the subject matter, however, some familiarity with mathematical concepts and notations and some intuitive understanding of basic probability is required. There exist many excellent references to more technical surveys of the mathematics of statistical learning theory: the monographs by one of the founders of statistical learning theory (Vapnik, 1995, Vapnik, 1998), a brief overview over statistical learning theory in Section 5 of Scholkopf and Smola (2002), more technical overview papers such as Bousquet et al. (2003), Mendelson (2003), Boucheron et al. (2005), Herbrich and Williamson (2002), and the monograph Devroye et al. (1996). Statistical Learning Theory: Models, Concepts, and Results