Reputation: 123
I want to combine monthly data for multiple attributes into a single row consisting of data for the entire year. It is a similar problem to Combine multiple time-series rows into one row with Pandas. The solution in this question would work for me if the months of my years were complete, but for most of my data months are missing from individual years. If there is only data for December, the solution will fill the first column rather then the column 12.
For the example, I am only using January to June.
Here is an example of my data:
ex_df = pd.DataFrame({'year': [1966, 1966, 1966, 1967, 1967, 1967], 'month': [1, 2, 6, 2, 3, 4], 'A': [10, 5, 0, np.nan, 25, 0], 'B': [-100, -80, 100, -120, 0, 60], 'C': [-1, -18, -2, -11, 10, 6]})
Here is the optimal solution:
sol_df = pd.DataFrame({'year': [1966, 1967], 'A1': [10, np.nan], 'A2': [5, np.nan],'A3': [np.nan, 25],'A4': [np.nan, 0],'A5': [np.nan, np.nan],'A6': [0, np.nan],'B1': [-100, np.nan], 'B2': [-80, -120], 'B3': [np.nan, 0], 'B4': [np.nan, 60], 'B5': [np.nan, np.nan], 'B6': [100, np.nan],'C1': [-1, np.nan], 'C2': [-18, -11], 'C3': [np.nan, 10], 'C4': [np.nan, 6], 'C5': [np.nan, np.nan], 'C6': [-2, np.nan] })
A, B, C are attributes and the number corresponds to the month.
Upvotes: 3
Views: 2331
Reputation: 11714
a = ex_df.melt(['year', 'month'])
a['Obsevation'] = a.variable + a.month.astype(str)
a.pivot_table('value', index='year', columns='Obsevation')
Upvotes: 1
Reputation: 862761
Use:
df = ex_df.set_index(['year','month']).unstack()
df.columns = ['{}{}'.format(x, y) for x, y in df.columns]
df = df.reset_index()
print (df)
year A1 A2 A3 A4 A6 B1 B2 B3 B4 B6 C1 C2 \
0 1966 10.0 5.0 NaN NaN 0.0 -100.0 -80.0 NaN NaN 100.0 -1.0 -18.0
1 1967 NaN NaN 25.0 0.0 NaN NaN -120.0 0.0 60.0 NaN NaN -11.0
C3 C4 C6
0 NaN NaN -2.0
Explanation:
set_index
and unstack
for reshapeMultiindex
in columns with list comprehensionreset_index
Upvotes: 5