Reputation: 1980
I've got a dataframe I pulled from a poorly organized SQL table. That table has a unique row for every channel. I can extract that info into a Python dataframe and intend to do further processing, but for now I just want to get it into a more usable format.
Sample input:
import datetime
import numpy as np
import pandas as pd

# DataFrame.append was removed in pandas 2.0, so rows are added with pd.concat
C = pd.DataFrame()
A = np.array([datetime.datetime(2016,8,8,0,0,1,1000),45,'foo1',1], dtype=object)
B = pd.DataFrame(A.reshape(1,4),columns = ['date','chNum','chNam','value'])
C = pd.concat([C, B])
A = np.array([datetime.datetime(2016,8,8,0,0,1,1000),46,'foo2',12.3], dtype=object)
B = pd.DataFrame(A.reshape(1,4),columns = ['date','chNum','chNam','value'])
C = pd.concat([C, B])
A = np.array([datetime.datetime(2016,8,8,0,0,2,1000),45,'foo1',10], dtype=object)
B = pd.DataFrame(A.reshape(1,4),columns = ['date','chNum','chNam','value'])
C = pd.concat([C, B])
A = np.array([datetime.datetime(2016,8,8,0,0,2,1000),46,'foo2',11.3], dtype=object)
B = pd.DataFrame(A.reshape(1,4),columns = ['date','chNum','chNam','value'])
C = pd.concat([C, B])
Produces:
                         date  chNum  chNam  value
0  2016-08-08 00:00:01.001000     45   foo1      1
0  2016-08-08 00:00:01.001000     46   foo2   12.3
0  2016-08-08 00:00:02.001000     45   foo1     10
0  2016-08-08 00:00:02.001000     46   foo2   11.3
I want:
date                        foo1  foo2
2016-08-08 00:00:01.001000     1  12.3
2016-08-08 00:00:02.001000    10  11.3
I have a solution: make a list of unique dates, then for each date loop through the dataframe and pull off each channel, making a new row. That is tedious and error-prone to program, so I was wondering if there's a better way that uses pandas tools.
Upvotes: 1
Views: 696
Reputation: 294516
Use set_index, then unstack, to pivot:
C.set_index(['date', 'chNum', 'chNam'])['value'].unstack(['chNam', 'chNum'])
To get exactly what you asked for (axis is passed by keyword, since recent pandas versions no longer accept it positionally):
C.set_index(['date', 'chNam'])['value'].unstack().rename_axis(None, axis=1)
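For anyone who wants to try this end to end, here is a minimal, self-contained sketch. It rebuilds the sample data from a plain list of rows rather than the row-by-row construction in the question, and the names `rows`, `wide`, and `wide_pivot` are just illustrative. It also shows `DataFrame.pivot` as an alternative spelling of the same reshape; that is my own addition and assumes each (date, chNam) pair appears only once, as it does in the sample.

```python
import datetime

import pandas as pd

# Sample data from the question, built in one step from a list of rows.
rows = [
    (datetime.datetime(2016, 8, 8, 0, 0, 1, 1000), 45, "foo1", 1.0),
    (datetime.datetime(2016, 8, 8, 0, 0, 1, 1000), 46, "foo2", 12.3),
    (datetime.datetime(2016, 8, 8, 0, 0, 2, 1000), 45, "foo1", 10.0),
    (datetime.datetime(2016, 8, 8, 0, 0, 2, 1000), 46, "foo2", 11.3),
]
C = pd.DataFrame(rows, columns=["date", "chNum", "chNam", "value"])

# Long -> wide: one row per date, one column per channel name.
wide = (
    C.set_index(["date", "chNam"])["value"]
    .unstack()                   # move chNam from the index to the columns
    .rename_axis(None, axis=1)   # drop the leftover "chNam" columns label
)

# Alternative spelling of the same reshape using DataFrame.pivot.
wide_pivot = (
    C.pivot(index="date", columns="chNam", values="value")
    .rename_axis(None, axis=1)
)

print(wide)
```

Both `unstack` and `pivot` raise an error if the same (date, chNam) pair occurs more than once, so duplicated readings would need to be aggregated or dropped first.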
Upvotes: 2