Data mining is about knowledge and information, but only occasionally about predicting the future.

For as long as the field has existed, data miners have worked to explain the difference between data mining and other forms of data analysis. The terms ‘predictive analysis’ and ‘predictive modelling’ have been widely adopted to distinguish data mining and its models from other kinds of analysis. Unfortunately, this has led non-practitioners to the erroneous belief that data mining is all about prediction, which it is not. Rather, data mining is about information and knowledge.

Take a look at the diagram. On the left, we have the myth that has grown up around data mining: the idea that, starting from data, we create models which make predictions to guide action. This places a false emphasis on models; a more accurate picture of what really happens is shown on the right. Knowledge is applied to data, producing new knowledge, which can in turn be applied to the data: an iterative process. At any point in this cycle, knowledge and data can be used together to produce new information.

This creation of new information is sometimes called ‘prediction’, but often it is not information about the future. It may have some implications for the future, as many pieces of information do, but it is not a prediction in the usual sense of the word.

In summary, the left-hand diagram is erroneous because it leaves out knowledge, which is both an essential prerequisite for data mining and a product of it, and is used at every step. Data mining often produces models, but these are only one kind of knowledge that it can produce, the other being human knowledge (knowledge in the head).

Let’s Debunk the Myths about Data Mining