Reputation: 8025
I tried using pivot tables to have more than one values in the 'values' field for the pivot_table function but it doesnt work, so im trying to see if i can do it with crosstabs. Here is my code
table=pandas.pivot_table(xl2, values='Applications', rows='Sub-Product',cols='Application Date',aggfunc=numpy.sum)
i get this when i export to csv.
Sub-Product 11/1/12 11/2/12 11/3/12
GP 190 207 65
GPF 1391 1430 1269
In python, after converting it to a pivot table, the dtype is float64() and using
<class 'pandas.core.frame.DataFrame'>
what i ultimately want is this output in csv:
Row Labels 11/1/2012 11/2/2012 11/3/2012
GP
Acquisitions 164 168 54
Applications 190 207 65
GPF
Acquisitions 1124 1142 992
Applications 1391 1430 1269
with a code along the lines of this (currently it doesnt work :/ ) :
table=pd.pivot_table(xl2, values=['Acquisitions','Applications'], rows=['Sub-Product'],cols=['Application Date'],aggfunc=np.sum)
But I can only get this:
Sub-Product ('Applications', Timestamp('2012-11-01 00:00:00', tz=None)) ('Applications', Timestamp('2012-11-02 00:00:00', tz=None)) ('Applications', Timestamp('2012-11-03 00:00:00', tz=None))
GP 190 207 65
GPF 1391 1430 1269
Any ideas on how crosstab can help? :S
This is the data in the csv file. I'm not sure why i cant get them into the proper dataframe format.
Application Date Sub-Product Applications Acquisitions
11/1/12 GP 1 1
11/1/12 GP 1 1
11/1/12 GP 1 1
11/1/12 GP 1 1
11/1/12 GPF 1 1
11/1/12 GPF 1 1
11/1/12 GPF 1 1
11/1/12 GPF 1 1
Upvotes: 3
Views: 2922
Reputation: 49826
Looks like you're really close to where you want to be. table.stack(0)
will move the first level of the column index to the row index.
In [1]: import pandas as pd
In [2]: from StringIO import StringIO
In [3]: df = pd.read_csv(StringIO("""\
...: Application-Date Sub-Product Applications Acquisitions
...: 11/1/12 GP 1 1
...: 11/1/12 GPF 1 1
...: 11/2/12 GP 1 1
...: 11/2/12 GP 1 1
...: 11/2/12 GPF 1 1
...: 11/2/12 GPF 1 1
...: 11/3/12 GP 1 1
...: 11/3/12 GP 1 1
...: 11/3/12 GP 1 1
...: 11/3/12 GPF 1 1
...: 11/3/12 GPF 1 1
...: 11/3/12 GPF 1 1
...: """), sep='\s+', parse_dates=[0])
In [4]: table = df.pivot_table(values=['Acquisitions', 'Applications'],
...: rows='Sub-Product',
...: cols='Application-Date',
...: aggfunc=sum)
In [5]: table
Out[5]:
Applications Acquisitions
Application-Date 2012-11-01 2012-11-02 2012-11-03 2012-11-01 2012-11-02 2012-11-03
Sub-Product
GP 1 2 3 1 2 3
GPF 1 2 3 1 2 3
In [6]: table.stack(0)
Out[6]:
Application-Date 2012-11-01 2012-11-02 2012-11-03
Sub-Product
GP Applications 1 2 3
Acquisitions 1 2 3
GPF Applications 1 2 3
Acquisitions 1 2 3
Upvotes: 2