boldbrandywine
boldbrandywine

Reputation: 322

Pivoting/reshaping a dataframe to have dates as columns

Here is my dataframe:

       ID        AMT          DATE
0   1496846   54.76        2015-02-11
1   1496846   195.00       2015-01-09
2   1571558   11350.00     2015-04-30
3   1498812   135.00       2014-07-11
4   1498812   157.00       2014-08-04
5   1498812   110.00       2014-09-23
6   1498812   1428.00      2015-01-28
7   1558450   4355.00      2015-01-26
8   1858606   321.52       2015-03-27
9   1849431   1046.81      2015-03-19

I would like to make this a dataframe consisting of time series data for each ID. That is, each column name is a date (sorted), and it is indexed by ID, and the values are the AMT values corresponding to each date. I can get so far as doing something like

df.set_index("DATE").T

but from here I'm stuck.

I also tried

df.pivot(index='ID', columns='DATE', values='AMT')

but this gave me an error on having duplicate entries (the IDs).

I envision it as transposing DATE, and then grouping by unique ID and melting AMT underneath.

Upvotes: 1

Views: 119

Answers (1)

piRSquared
piRSquared

Reputation: 294506

you want to use pivot_table where there is an aggfunc parameter that handles duplicate indices.

df.pivot_table('AMT', 'DATE', 'ID', aggfunc='sum')

enter image description here


You'll want to choose how to handle the dups. I put 'sum' in there. It defaults to 'mean'

Upvotes: 1

Related Questions