Reputation: 461
I have the code below but the parameter margin=True is giving false result of total in the last column. As you see, it is supposed to give result of 19 but it gives as all value of its column are false. to_send column is achieved by subtraction of received by sent.
import pandas as pd
df2 = pd.read_csv("https://www.dropbox.com/s/90y07129zn351z9/test_data.csv?dl=1",encoding="latin-1")
df2['received'] = pd.to_datetime(df2['received'])
df2['sent'] = pd.to_datetime(df2['sent'])
df2['to_send']= df2['received']-df2['sent']
pvt_all = df2.dropna(axis=0, how='all', subset=['received', 'sent', 'to_send'])\
.pivot_table(index=['site'], values=['received','sent', 'to_send'],\
aggfunc='count', margins=True, dropna=False)
pvt_all
received sent to_send
site
2 32.0 27.0 27.0
3 20.0 17.0 17.0
4 33.0 31.0 31.0
5 40.0 31.0 31.0
All 125.0 106.0 106.0
I also did in another way as below but with even more false sum results.
import pandas as pd
df=pd.read_csv("https://www.dropbox.com/s/90y07129zn351z9/test_data.csv?dl=1", encoding="latin-1")
pvt_received=df.pivot_table(index=['site'], values = ['received','sent'], aggfunc = { 'received' : 'count' ,'sent': 'count'}, fill_value=0, margins=True)
pvt_received['to_send']=pvt_received['received']-pvt_received['sent']
column_order = ['received', 'sent','to_send']
pvt_received_ordered = pvt_received.reindex_axis(column_order, axis=1)
pvt_received_ordered.to_csv("test_pivot.csv")
table_to_send = pd.read_csv('test_pivot.csv', encoding='latin-1')
table_to_send.rename(columns={'site':'Site','received':'Date Received','sent':'Date Sent','to_send':'Date To Send'}, inplace=True)
table_to_send.set_index('Site', inplace=True)
table_to_send
Date Received Date Sent Date To Send
Site
2 32.0 27.0 5.0
3 20.0 17.0 3.0
4 33.0 31.0 2.0
5 40.0 31.0 9.0
All 106.0 106.0 0.0
Upvotes: 1
Views: 477
Reputation: 461
import pandas as pd
df2 = pd.read_csv("https://www.dropbox.com/s/90y07129zn351z9/test_data.csv?dl=1",encoding="latin-1")
df2['received'] = pd.to_datetime(df2['received'])
df2['sent'] = pd.to_datetime(df2['sent'])
#Removed operation that was before building pivot table and put it after pivot table is done!
pvt_all = df2.dropna(axis=0, how='all', subset=['received', 'sent'])\
.pivot_table(index=['site'], values=['received','sent'],\
aggfunc='count', margins=True, dropna=False)
pvt_all['to_send']= pvt_all['received']-pvt_all['sent'] #Adding operation after making pivot table
pvt_all=pvt_all[['received','sent','to_send']] #Optional, to order columns as desired
pvt_all
received sent to_send
site
2 32.0 27.0 5.0
3 20.0 17.0 3.0
4 33.0 31.0 2.0
5 40.0 31.0 9.0
All 125.0 106.0 19.0
Upvotes: 1