Reputation: 169
I'm trying to analyse a COVID data set and I'm at a loss on how to reshape the data with pandas. The data set looks like the following:
I'm trying to make it look like this:
             | April 2                        | April 3                        | April 4
unique_tests | total unique tests for April 2 | total unique tests for April 3 | total unique tests for April 4
positive     | total positive for April 2     | total positive for April 3     | total positive for April 4
negative     | total negative for April 2     | total negative for April 3     | total negative for April 4
remaining    | total remaining for April 2    | total remaining for April 3    | total remaining for April 4
I have dates up to April 24.
Any ideas on how I can implement this? I can't make it work with pivot_table in pandas.
Upvotes: 0
Views: 68
Reputation: 862791
Use:
import pandas as pd

# read values with thousands separators as numbers and parse the date column as datetimes
df = pd.read_csv(file, thousands=',', parse_dates=['date'])
# group by a custom date format, sum the numeric columns, then transpose
df1 = df.groupby(df['date'].dt.strftime('%d-%b')).sum(numeric_only=True).T
Alternatively, overwrite the date column with the newly formatted strings and group by it:
df1 = df.assign(date = df['date'].dt.strftime('%d-%b')).groupby('date').sum().T
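To try this without the original file, here is a minimal self-contained sketch. The sample rows, the `region` column, and the names `csv_text`, `value_cols`, and `wide` are assumptions made for illustration, since the question does not show the real data. It groups on the actual datetimes and formats the labels only afterwards, because grouping on `'%d-%b'` strings sorts them alphabetically, which would misorder the columns once the data spans more than one month.

```python
import io
import pandas as pd

# Hypothetical sample data: the layout of the real file is not shown in the
# question, so these column names and per-region rows are assumptions.
csv_text = '''date,region,unique_tests,positive,negative,remaining
2020-04-02,A,"1,200",100,"1,000",100
2020-04-02,B,800,50,700,50
2020-04-03,A,"1,500",120,"1,300",80
2020-04-03,B,900,60,800,40
2020-04-04,A,"2,000",150,"1,800",50
'''

# thousands=',' turns "1,200" into 1200; parse_dates gives real datetimes
df = pd.read_csv(io.StringIO(csv_text), thousands=',', parse_dates=['date'])

value_cols = ['unique_tests', 'positive', 'negative', 'remaining']

# Group on the datetime column so the result stays in chronological order,
# sum only the chosen numeric columns, then transpose so dates become columns.
wide = df.groupby('date')[value_cols].sum().T

# Format the column labels for display only after aggregating.
wide.columns = wide.columns.strftime('%B %d')

print(wide)
```

Selecting `value_cols` explicitly keeps text columns such as the assumed `region` out of the sum.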
Upvotes: 1