Raj Raina

Reputation: 99

Creating a lift chart in R

Suppose I have the following data frame consisting of people with some score associated with them:

Score | hasDefaulted
10    | 0
13    | 0
15    | 1
17    | 0
...

I want to make a lift chart in R by first sorting the population by score, then plotting % of population on the X-axis against % of defaults on the Y-axis. I cannot find a package that gives me enough control to do this. I have explored the lift and gains packages, but I cannot figure out how to make either of them do what I described above. For example, when I try the lift package as

plotLift(sort(dataFrame$Score, decreasing=FALSE), dataFrame$hasDefaulted)

I get a strange plot.

Given what I described, the plot should end up looking like a cumulative distribution function.

Could someone show me how to use such packages properly, or direct me to a package that does the required? Thanks in advance.

Upvotes: 3

Views: 20181

Answers (3)

coding_is_fun

Reputation: 127

Although this question was asked about 5 years ago, I recently discovered a nice package that helps build gain and lift charts and display gain and lift tables: CustomerScoringMetrics.

Functions: cumGainsChart(), cumGainsTable(), liftChart(), liftTable(), etc.

Upvotes: 1

Yizhen Hai

Reputation: 61

I think you are searching for a gain chart, not a lift chart. I notice there is some confusion between them. You can refer to Lift Charts for more information.

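library(ROCR)

# Example predictions and labels that ship with ROCR
data(ROCR.simple)
pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)

# Gain chart: true positive rate against rate of positive predictions
gain <- performance(pred, "tpr", "rpp")
plot(gain, main = "Gain Chart")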
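
To make the gain/lift distinction concrete, here is a minimal base R sketch that needs no packages. The toy vectors and the variable names (score, label, pct_pop) are made up for illustration, and it assumes a higher score means a more likely positive. Gain is the cumulative share of positives captured; lift is that share divided by the share of the population targeted.

```r
# Toy data (hypothetical): assumes a higher score means a more likely positive
score <- c(0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05)
label <- c(1, 1, 0, 1, 0, 0, 1, 0, 0, 0)

# Order the labels by score, highest score first
y <- label[order(score, decreasing = TRUE)]

pct_pop <- seq_along(y) / length(y)  # share of population targeted so far
gain    <- cumsum(y) / sum(y)        # share of all positives captured so far
lift    <- gain / pct_pop            # how much better than random targeting

# Gain chart on the left, lift chart on the right
op <- par(mfrow = c(1, 2))
plot(pct_pop, gain, type = "l", main = "Gain Chart",
     xlab = "Share of population", ylab = "Share of positives")
abline(0, 1, lty = 2)  # random-targeting baseline
plot(pct_pop, lift, type = "l", main = "Lift Chart",
     xlab = "Share of population", ylab = "Lift")
abline(h = 1, lty = 2) # random-targeting baseline
par(op)
```

The gain curve climbs from 0 to 1 like a cumulative distribution, which matches what the question describes, while the lift curve falls towards 1 as the whole population is included.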

Upvotes: 6

Diego Rodrigues

Reputation: 864

I always try to build my own code rather than trying something less flexible.

Here's how I think you can tackle the problem:

# Creating an example data frame (random scores and default flags)
set.seed(1)  # only so the example is reproducible
df <- data.frame(Score = runif(100, 1, 100),
                 hasDefaulted = round(runif(100, 0, 1), 0))

# Ordering the whole data frame by score, so each flag stays with its score
df <- df[order(df$Score), ]

# Cumulative % of defaults captured so far
df$cumden <- cumsum(df$hasDefaulted) / sum(df$hasDefaulted) * 100

# Cumulative % of population
df$perpop <- seq_len(nrow(df)) / nrow(df) * 100

# Plotting
plot(df$perpop, df$cumden, type = "l",
     xlab = "% of Population", ylab = "% of Defaults")

Is that what you want?
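
The same steps can be wrapped in a small function and applied to a data frame shaped like the one in the question. This is only a sketch: gain_curve() and its arguments are hypothetical names, not part of any package, and it assumes the Score and hasDefaulted columns from the question. Note that it uses order() to reorder the default flags together with the scores; sorting only the Score vector, as in the plotLift() call in the question, breaks the pairing between each score and its flag.

```r
# gain_curve() is a hypothetical helper, not a function from any package
gain_curve <- function(data, score_col = "Score",
                       event_col = "hasDefaulted", decreasing = FALSE) {
  # order() gives the row permutation, so flags stay paired with their scores
  ord <- order(data[[score_col]], decreasing = decreasing)
  events <- data[[event_col]][ord]
  data.frame(
    pct_population = 100 * seq_along(events) / length(events),
    pct_defaults   = 100 * cumsum(events) / sum(events)
  )
}

# The four rows shown in the question
dataFrame <- data.frame(Score = c(10, 13, 15, 17),
                        hasDefaulted = c(0, 0, 1, 0))

gains <- gain_curve(dataFrame)
plot(gains$pct_population, gains$pct_defaults, type = "s",
     xlab = "% of Population", ylab = "% of Defaults")
```

Set decreasing = TRUE if a higher score should be treated as riskier and placed first on the X-axis.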

Upvotes: 9
