sectechguy

Reputation: 2117

Pandas: how to set the date as the index, and is it a good idea?

I have a pandas dataframe with 3 columns:

df
    Date        DestUserName    Percent
0   2019-01-01  100             1.000000
1   2019-01-01  101             1.000000
2   2019-01-01  102             1.000000
3   2019-01-01  103             1.000000
4   2019-01-02  100             1.000000
5   2019-01-02  101             0.923077
6   2019-01-02  103             0.800000
7   2019-01-02  100             1.000000
8   2019-01-03  103             0.800000
9   2019-01-03  102             1.000000
10  2019-01-03  101             1.000000
11  2019-01-04  100             1.000000
12  2019-01-04  102             1.000000
13  2019-01-04  103             0.972222


df.dtypes
Date            object 
DestUserName    object 
Percent         float64
dtype: object

I want to reshape the data so that the date (string) is either the first column or the index, the usernames/user IDs (strings) become the column names, and the percent (float64) fills the cells, like this:

             100       101       102       103
2019-01-01   1.000000  1.000000  1.000000  1.000000
2019-01-02   1.000000  0.923077  NaN       0.800000
2019-01-03   NaN       1.000000  1.000000  0.800000
2019-01-04   1.000000  NaN       1.000000  0.972222

What is the best way to accomplish this? I have seen dates used as an index before, but is it a good or a bad idea to store the Date (string) as the index?

Upvotes: 1

Views: 248

Answers (1)

sectechguy

Reputation: 2117

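From the comment Chris left, either of these works:

df.drop_duplicates().pivot(index='Date', columns='DestUserName', values='Percent')

or

df.drop_duplicates().set_index(['Date', 'DestUserName'])['Percent'].unstack()

The drop_duplicates() call is needed because the sample data contains one fully repeated row (2019-01-02, 100, 1.0), and both pivot and unstack raise an error when an index/column pair occurs more than once. The pivot arguments are passed by keyword because pandas 2.0 and later no longer accept them positionally. Selecting ['Percent'] before unstack keeps the result's columns flat instead of nested under a 'Percent' level.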
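For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies both approaches. The variable names (deduped, wide, wide_alt, wide_dt) are just illustrative. The last step, converting the string index with pd.to_datetime, is my own suggestion rather than part of Chris's comment: a string index works fine for this table, but a DatetimeIndex is usually more convenient if you later need date-range slicing or resampling.

```python
import pandas as pd

# Sample data copied from the question. Note that the row
# (2019-01-02, "100", 1.0) appears twice.
df = pd.DataFrame({
    "Date": ["2019-01-01"] * 4 + ["2019-01-02"] * 4
            + ["2019-01-03"] * 3 + ["2019-01-04"] * 3,
    "DestUserName": ["100", "101", "102", "103",
                     "100", "101", "103", "100",
                     "103", "102", "101",
                     "100", "102", "103"],
    "Percent": [1.0, 1.0, 1.0, 1.0,
                1.0, 0.923077, 0.8, 1.0,
                0.8, 1.0, 1.0,
                1.0, 1.0, 0.972222],
})

# Drop the fully repeated row so every (Date, DestUserName) pair is unique.
deduped = df.drop_duplicates()

# Option 1: pivot, with keyword arguments.
wide = deduped.pivot(index="Date", columns="DestUserName", values="Percent")

# Option 2: set a two-level index, pick the value column, then unstack
# the inner level (DestUserName) into columns.
wide_alt = deduped.set_index(["Date", "DestUserName"])["Percent"].unstack()

# Optional (my assumption about what is useful here): turn the string
# dates into a DatetimeIndex so date-based slicing works.
wide_dt = wide.copy()
wide_dt.index = pd.to_datetime(wide_dt.index)

print(wide)
print(wide_dt.loc["2019-01-02":"2019-01-03"])
```

Missing user/date combinations come out as NaN, as in the desired output in the question. If the real data can contain repeated pairs with different Percent values, drop_duplicates() will not remove them and you would need pivot_table with an aggregation function instead.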

Upvotes: 1
