What is Data Mining?
The central notion behind data mining is this: we look for patterns of behavior in historical data with the aim of exploiting these patterns in the future. For example, we might find patterns that identify a certain group of customers (by age, income and education, say) as being more likely to respond to the promotion of a certain product. These patterns are found by analyzing historical data, and a decision has to be made about how likely the relationship is to hold in the future. In fact, this is one of the most common applications of data mining technology, and there are many technology suppliers providing solutions in this domain.
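To make the customer example concrete, here is a minimal sketch of the idea, assuming a tiny set of made-up historical records. The field names, the segment bands and the data itself are hypothetical and chosen only for illustration; real data mining tools use far larger data sets and more sophisticated methods than a simple count per group.

```python
from collections import defaultdict

# Hypothetical historical records: (age_band, income_band, responded_to_promotion)
history = [
    ("18-34", "low", False),
    ("18-34", "high", True),
    ("35-54", "high", True),
    ("35-54", "high", True),
    ("35-54", "low", False),
    ("55+", "low", False),
    ("55+", "high", False),
    ("18-34", "high", False),
]


def response_rates(records):
    """Return the fraction of customers in each segment who responded."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [responders, total]
    for age_band, income_band, responded in records:
        segment = (age_band, income_band)
        counts[segment][1] += 1
        if responded:
            counts[segment][0] += 1
    return {seg: hits / total for seg, (hits, total) in counts.items()}


rates = response_rates(history)

# The segment with the highest historical response rate is the "pattern"
# we might try to exploit in a future promotion.
best_segment = max(rates, key=rates.get)
print(best_segment, rates[best_segment])
```

Whether the best segment in the historical data will still be the best segment next year is exactly the judgment referred to above; the code cannot answer that question.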

The term ‘data mining’ embraces a large number of technologies and techniques, but for our purposes we will separate out statistical methods. These are fundamentally different from many of the methods used in data mining. We are all familiar with simple statistical methods: the mean, the standard deviation, and regression, where we fit a line to a set of data. Statistics are predefined metrics of a data set; either they make sense for the data or they do not, and it is often very difficult to establish which is the case. Data mining, on the other hand, often determines the metrics as a model is built. We’ll get to some of the methods used later, but this is an important point and worth remembering.
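As a small illustration of what "predefined metrics" means, the sketch below computes the three familiar statistics mentioned above for a made-up data set. The numbers are hypothetical, and the line is fitted with the ordinary least-squares formula, which is one common choice rather than the only one. Note that each metric is fixed in advance: the code computes a mean, a standard deviation and a straight line whether or not they suit the data.

```python
import statistics

# Hypothetical paired observations, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

mean_y = statistics.mean(ys)
stdev_y = statistics.stdev(ys)  # sample standard deviation

# Ordinary least-squares fit of the line y = slope * x + intercept.
mean_x = statistics.mean(xs)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

print(mean_y, stdev_y, slope, intercept)
```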

