The lift chart provides a visual summary of the usefulness of the information provided by one or more statistical models for predicting a binomial (categorical) outcome variable (dependent variable); for multinomial (multiple-category) outcome variables, lift charts can be computed for each category. Specifically, the chart summarizes the utility that we may expect by using the respective predictive models, as compared to using baseline information only. The lift chart is applicable to most statistical methods that compute predictions (predicted classifications) for binomial or multinomial responses.
Let us start with an example. A marketing agency is planning to send advertisements to selected households with the goal of boosting sales of a product. The agency has a list of all households, each described by a set of attributes. Each advertisement sent costs a few pennies, but the cost is well repaid if the customer buys the product. The agency therefore wants to minimize the number of advertisements sent while maximizing the number of products sold, by reaching only the consumers who will actually buy the product. To that end it develops a classifier that predicts the probability that a household is a potential customer. The lift chart can be used to fit this classifier and to express the dependency between the costs and the expected benefit. The number of all potential customers P is often unknown, so the TP rate cannot be computed and the ROC curve cannot be used, but the lift chart remains useful in such settings. TP is also often hard to measure in practice; one might have just a few measurements from a sales analysis. Even in such cases the lift chart can help the agency select the number of most promising households to which an advertisement should be sent. Of course, lift charts are also useful for many other similar problems.
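The trade-off between mailing cost and expected benefit in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `best_mailing_size`, the parameters `cost_per_ad` and `profit_per_sale`, and all numbers are hypothetical and do not come from the text, and the classifier's outputs are assumed to be usable as purchase probabilities.

```python
def best_mailing_size(probabilities, cost_per_ad, profit_per_sale):
    """Return (k, expected_profit) for the most profitable top-k mailing.

    Assumes each value in `probabilities` is the predicted chance that
    a household buys the product after receiving an advertisement.
    """
    # Rank households from most to least promising.
    ranked = sorted(probabilities, reverse=True)

    best_k, best_profit = 0, 0.0
    running_profit = 0.0
    for k, p in enumerate(ranked, start=1):
        # Expected net contribution of mailing the k-th ranked household.
        running_profit += p * profit_per_sale - cost_per_ad
        if running_profit > best_profit:
            best_k, best_profit = k, running_profit
    return best_k, best_profit


# Hypothetical predicted probabilities for four households.
k, profit = best_mailing_size([0.5, 0.2, 0.05, 0.01],
                              cost_per_ad=0.5, profit_per_sale=5.0)
print(k, profit)
```

With these made-up figures, mailing only the two highest-ranked households gives the largest expected profit; sending to the remaining two would cost more than they are expected to return.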
A lift chart, sometimes called a cumulative gains chart or a banana chart, is a measure of model performance. It shows how responses (e.g., to a direct mail solicitation or a surgical treatment) are changed by applying the model. This ratio of change, which is ideally an increase in the response rate, is called the ‘lift’. A lift chart indicates which subset of the dataset contains the greatest possible proportion of positive responses. The farther the lift curve lies above the baseline, the better the performance of the model, since the baseline represents the null model, which is no model at all. To explain a lift chart, suppose we have a two-class prediction where the outcomes are yes (a positive response) or no (a negative response). To create a lift chart, the instances in the dataset are sorted in descending order of the predicted probability of a positive response. When the data are plotted, we can see a graphical depiction of the various probabilities. While the example shown in Figure 10 plots the results of different datasets for a single model, a lift chart can also be used to plot the results of a single dataset for different models. Note that the best model is not the one with the highest lift on the data used to build it. It is the model that performs best on unseen, future data.
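The construction described above (sort by descending predicted probability, then compare the captured positives with the baseline) can be sketched as follows. This is a minimal sketch rather than a reference implementation: the function name `lift_table`, the choice of equal-sized bins, and the toy labels and scores are assumptions made for illustration.

```python
def lift_table(y_true, scores, n_bins=10):
    """Return rows of (fraction_targeted, cumulative_gain, cumulative_lift).

    y_true : sequence of 0/1 labels (1 = positive response)
    scores : predicted probabilities of a positive response
    """
    if len(y_true) != len(scores):
        raise ValueError("y_true and scores must have the same length")
    n = len(y_true)
    total_pos = sum(y_true)
    if n == 0 or total_pos == 0:
        raise ValueError("need at least one instance and one positive")

    # Sort instances by descending predicted probability.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    sorted_labels = [y_true[i] for i in order]

    rows = []
    for b in range(1, n_bins + 1):
        # Number of top-ranked instances in the first b bins (ceiling).
        cutoff = (b * n + n_bins - 1) // n_bins
        captured = sum(sorted_labels[:cutoff])
        fraction = cutoff / n
        gain = captured / total_pos   # share of all positives captured
        lift = gain / fraction        # baseline (no model) has lift 1
        rows.append((fraction, gain, lift))
    return rows


# Toy data: 10 instances, 3 positives, scores already in ranked order.
labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
for fraction, gain, lift in lift_table(labels, scores, n_bins=5):
    print(f"{fraction:.1f}  {gain:.2f}  {lift:.2f}")
```

Plotting the cumulative gain against the fraction targeted gives the gains curve, with the diagonal as the baseline; plotting the lift gives a curve that falls to 1 once the whole dataset is targeted. To compare models on unseen data, the same function would be applied to held-out labels and scores.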