Reputation: 2117
I have a pandas dataframe with 3 columns:
df
Date DestUserName Percent
0 2019-01-01 100 1.000000
1 2019-01-01 101 1.000000
2 2019-01-01 102 1.000000
3 2019-01-01 103 1.000000
4 2019-01-02 100 1.000000
5 2019-01-02 101 0.923077
6 2019-01-02 103 0.800000
7 2019-01-02 100 1.000000
8 2019-01-03 103 0.800000
9 2019-01-03 102 1.000000
10 2019-01-03 101 1.000000
11 2019-01-04 100 1.000000
12 2019-01-04 102 1.000000
13 2019-01-04 103 0.972222
df.dtypes
Date object
DestUserName object
Percent float64
dtype: object
I want to reshape the data so that the date (string) is either the first column or the index, the usernames/user IDs (strings) become the column names, and the percent (float64) fills the cells, like this:
100 101 102 103
2019-01-01 1.000000 1.000000 1.000000 1.000000
2019-01-02 1.000000 0.923077 NaN 0.800000
2019-01-03 NaN 1.000000 1.000000 0.800000
2019-01-04 1.000000 NaN 1.000000 0.972222
What is the best way to accomplish this? I have seen dates used as an index before, but is it a good or a bad idea to store the Date (as a string) as the index?
Upvotes: 1
Views: 248
Reputation: 2117
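From the comment Chris left: drop the exact duplicate rows first (pivot raises a ValueError when a Date/DestUserName pair appears more than once), then reshape with either
df.drop_duplicates().pivot(index='Date', columns='DestUserName', values='Percent')
or
df.drop_duplicates().set_index(['Date', 'DestUserName'])['Percent'].unstack()
Recent pandas versions require keyword arguments for pivot, so the positional form pivot('Date', 'DestUserName', 'Percent') no longer works there. Selecting ['Percent'] before unstack keeps the resulting columns flat instead of nesting them under a 'Percent' level.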
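For anyone who wants to try this end to end, here is a minimal sketch. The frame is rebuilt by hand from the sample in the question (the variable names `wide` and `wide_alt` are mine), and the final conversion to a `DatetimeIndex` is only a suggestion on the "string as index" part of the question, not something taken from Chris's comment.

```python
import pandas as pd

# Sample data retyped from the question; both Date and DestUserName are strings.
df = pd.DataFrame({
    "Date": ["2019-01-01"] * 4 + ["2019-01-02"] * 4
            + ["2019-01-03"] * 3 + ["2019-01-04"] * 3,
    "DestUserName": ["100", "101", "102", "103",
                     "100", "101", "103", "100",
                     "103", "102", "101",
                     "100", "102", "103"],
    "Percent": [1.0, 1.0, 1.0, 1.0,
                1.0, 0.923077, 0.8, 1.0,
                0.8, 1.0, 1.0,
                1.0, 1.0, 0.972222],
})

# Remove exact duplicate rows, then spread DestUserName across the columns.
deduped = df.drop_duplicates()
wide = deduped.pivot(index="Date", columns="DestUserName", values="Percent")

# Same reshape via set_index + unstack on the Percent series.
wide_alt = deduped.set_index(["Date", "DestUserName"])["Percent"].unstack()

# Optional: parse the string dates so the index supports date slicing
# and resampling. Keeping plain strings also works for simple lookups.
wide_dt = wide.copy()
wide_dt.index = pd.to_datetime(wide_dt.index)

print(wide)
```

Note that `drop_duplicates()` only helps when the repeated rows are identical. If the same Date/DestUserName pair could show up with different Percent values, `pivot` would still fail, and something like `pivot_table` with an explicit `aggfunc` would be needed instead.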
Upvotes: 1