How do I fit a Kaplan-Meier curve from a pandas DataFrame with lifelines?

Question

I am trying to replicate the Kaplan-Meier table shown in Figure 1 here.

This is the code I wrote:

# Python code to create the above Kaplan Meier curve
from lifelines import KaplanMeierFitter
import pandas as pd

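# Durations (T) and event indicators (E: 1 = event observed, 0 = censored)
df = pd.DataFrame({
    'T': [0, 0, 0, 0, 0, 0, 2.5, 2.5, 2.5, 2.5, 2.5, 4, 4, 4, 4, 4, 5, 5, 5, 6, 6],
    'E': [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0],
})

# Create a KaplanMeierFitter object
kmf = KaplanMeierFitter()

# Fit the model to the durations and event indicators
kmf.fit(df['T'], df['E'], label='Kaplan Meier Estimate')

# Plot the estimated survival function without confidence intervals
kmf.plot(ci_show=False)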

My output plot looks different from the figure (note the scale):

When I print the survival function, the values also differ from the figure:

          Kaplan Meier Estimate
timeline                       
0.0                      1.0000
2.5                      0.9375
4.0                      0.7500
5.0                      0.6000
6.0                      0.6000

I suspect I did not translate the data into the DataFrame correctly. I tried adjusting the DataFrame by adding the one event to the start and to the end of the time frame, but the result did not change. Can someone show me how to replicate the example I'm working from?
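
One way to check whether the T and E columns encode the intended table is to compute the product-limit estimate by hand and look at the number at risk at each time. The sketch below does this in plain Python on the same lists used above. The function name kaplan_meier is made up for this post and is not part of lifelines. It assumes the usual convention that everyone whose duration is at least t, including subjects censored at t, counts as at risk at t.

```python
from collections import Counter

def kaplan_meier(durations, events):
    """Return (time, at_risk, observed_events, survival) for each distinct time.

    Hypothetical helper for cross-checking; not a lifelines function.
    """
    durations = list(durations)
    events = list(events)
    # Observed events per time (E == 1)
    deaths = Counter(t for t, e in zip(durations, events) if e)
    # Every subject leaves the risk set at its own duration (event or censored)
    leaving = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    rows = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if at_risk > 0:
            # Product-limit step: S(t) = S(previous) * (1 - d / n)
            survival *= 1 - d / at_risk
        rows.append((t, at_risk, d, survival))
        at_risk -= leaving[t]
    return rows

# Same data as in the DataFrame above
T = [0, 0, 0, 0, 0, 0, 2.5, 2.5, 2.5, 2.5, 2.5, 4, 4, 4, 4, 4, 5, 5, 5, 6, 6]
E = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0]

rows = kaplan_meier(T, E)
print(f"{'time':>5} {'at_risk':>7} {'events':>6} {'survival':>8}")
for t, n, d, s in rows:
    print(f"{t:>5} {n:>7} {d:>6} {s:>8.4f}")
```

Comparing the at_risk and events columns against the source table row by row should show which rows of the DataFrame do not match it. With a DataFrame, the same function can be called as kaplan_meier(df['T'].tolist(), df['E'].tolist()).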
