Reputation: 359
I'm processing my data. Here's my data.
I wrote my code like this:
complete_data = complete_data.groupby(['STDR_YM_CD', 'TRDAR_CD' ]).sum().reset_index()
After executing the code, I got a dataframe like the one in the picture below.
But I want to aggregate the values based on the first three characters of the SVC_INDUTY_CD column, as in the picture below.
Here is a link to my data: http://blogattach.naver.com/c356df6c7f2127fbd539596759bfc1bd1848b453f1/20170316_215_blogfile/khm2963_1489653338468_dtPz6k_csv/test2.csv?type=attachment
Thanks in advance.
Upvotes: 0
Views: 66
Reputation: 7038
I'm sure there's a better way but this is one way you could do this:
complete_data['first_three_temp'] = complete_data['SVC_INDUTY_CD'].str[:3]
complete_data = complete_data.groupby(['STDR_YM_CD', 'TRDAR_CD', 'first_three_temp' ], as_index=False).sum()
complete_data.drop('first_three_temp', axis=1, inplace=True)
This adds a temporary column containing only the first three characters of your SVC_INDUTY_CD column. You can then group by that column and drop it afterwards. As I said, there is probably a more efficient way, so I'm not sure how well this will scale if your dataset is large.
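One possible way to avoid the temporary column is to pass the three-character prefix to groupby directly as a Series. The sketch below is only an illustration: the sample rows, the SALES column, and the SVC_INDUTY_PREFIX name are made up for the example, and only STDR_YM_CD, TRDAR_CD and SVC_INDUTY_CD come from the question. It also assumes you want to keep the prefix in the result and sum only the numeric columns.

```python
import pandas as pd

# Hypothetical sample rows; only three of the column names come from the question.
df = pd.DataFrame({
    "STDR_YM_CD": [201601, 201601, 201601, 201602],
    "TRDAR_CD": [1, 1, 1, 1],
    "SVC_INDUTY_CD": ["CS100001", "CS100002", "CS200001", "CS100001"],
    "SALES": [10, 20, 5, 7],  # hypothetical numeric column to aggregate
})

# Build the grouping key on the fly instead of storing it as a column.
# Renaming it avoids a name clash with the original SVC_INDUTY_CD column.
prefix = df["SVC_INDUTY_CD"].str[:3].rename("SVC_INDUTY_PREFIX")

result = (
    df.groupby(["STDR_YM_CD", "TRDAR_CD", prefix])
      .sum(numeric_only=True)   # skip the string column when summing
      .reset_index()            # turn the group keys back into columns
)
print(result)
```

Here the rows whose codes start with the same three characters are summed together within each STDR_YM_CD and TRDAR_CD pair, and the prefix stays in the output as its own column.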
Upvotes: 1