Reputation: 217
I'm trying to create a new row for "new confirmed cases" using the current row 'confirmed' - yesterday 'confirmed'. Confirmed is cumulative.
my data is as shown
Country,Date,Confirmed,Deaths,Recovered,Active
China,2020-01-21,10,5,1,100
China,2020-01-22,20,10,2,104
China,2020-01-23,30,15,3,116
France,2020-01-21,20,5,1,100
France,2020-01-22,30,10,2,118
France,2020-01-23,40,15,3,138
Output wanted
Country,Date,Confirmed,Deaths,Recovered,Active,New Confirmed
China,2020-01-21,10,5,1,100,0
China,2020-01-22,20,10,2,104,10
China,2020-01-23,30,15,3,116,10
France,2020-01-21,20,5,1,100,0
France,2020-01-22,30,10,2,118,10
France,2020-01-23,40,15,3,138,10
I know how to add a new row if using data from the same row but not sure how to use data from another row. Any tips or advice would be appreciated.
Upvotes: 0
Views: 46
Reputation: 24322
You can use shift()
method , fillna()
method and astype()
method:
df['New Confirmed']=df['Confirmed']-df['Confirmed'].shift(1).fillna(df['Confirmed']).astype(int)
Now If You print df
you will get your desired output:
Country Date Confirmed Deaths Recovered Active New Confirmed
0 China 2020-01-21 10 5 1 100 0
1 China 2020-01-22 20 10 2 104 10
2 China 2020-01-23 30 15 3 116 10
For that condition use groupby()
method,shift()
method ,fillna()
method and astype()
method:
df['New Confirmed']=df['Confirmed']-df.groupby('Country')['Confirmed'].shift(1).fillna(df['Confirmed']).astype(int)
Output of above code:
Country Date Confirmed Deaths Recovered Active New Confirmed
0 China 2020-01-21 10 5 1 100 0
1 China 2020-01-22 20 10 2 104 10
2 China 2020-01-23 30 15 3 116 10
3 France 2020-01-21 10 5 1 100 0
4 France 2020-01-22 20 10 2 104 10
5 France 2020-01-23 30 15 3 116 10
Upvotes: 2