Reputation: 1152
I'm having trouble grouping by a column name and just viewing my pandas dataframe.
I have the following dictionary:
import pandas as pd
from pandas import Timestamp
d = {'portfolio_name': {4630: 'Retirement Aggressive',
4631: 'Retirement Aggressive',
4632: 'Retirement Aggressive',
4633: 'Retirement Aggressive',
4634: 'Retirement Aggressive',
4635: 'Retirement Aggressive',
4636: 'Retirement Aggressive',
4637: 'Retirement Aggressive',
4638: 'Retirement Aggressive',
4639: 'Retirement Aggressive',
4640: 'Retirement Aggressive',
4641: 'Retirement Aggressive',
4642: 'Retirement Aggressive',
4643: 'Retirement Aggressive',
4644: 'Retirement Aggressive',
4645: 'Retirement Aggressive',
4646: 'Retirement Aggressive',
4647: 'Retirement Aggressive'},
'size_type': {4630: 'percent',
4631: 'percent',
4632: 'percent',
4633: 'percent',
4634: 'percent',
4635: 'percent',
4636: 'percent',
4637: 'percent',
4638: 'percent',
4639: 'percent',
4640: 'percent',
4641: 'percent',
4642: 'percent',
4643: 'percent',
4644: 'percent',
4645: 'percent',
4646: 'percent',
4647: 'percent'},
'portfolio_date': {4630: Timestamp('2019-12-31 00:00:00'),
4631: Timestamp('2019-12-31 00:00:00'),
4632: Timestamp('2019-12-31 00:00:00'),
4633: Timestamp('2019-12-31 00:00:00'),
4634: Timestamp('2019-12-31 00:00:00'),
4635: Timestamp('2019-12-31 00:00:00'),
4636: Timestamp('2019-12-31 00:00:00'),
4637: Timestamp('2019-12-31 00:00:00'),
4638: Timestamp('2019-12-31 00:00:00'),
4639: Timestamp('2019-09-30 00:00:00'),
4640: Timestamp('2019-09-30 00:00:00'),
4641: Timestamp('2019-09-30 00:00:00'),
4642: Timestamp('2019-09-30 00:00:00'),
4643: Timestamp('2019-09-30 00:00:00'),
4644: Timestamp('2019-09-30 00:00:00'),
4645: Timestamp('2019-09-30 00:00:00'),
4646: Timestamp('2019-09-30 00:00:00'),
4647: Timestamp('2019-09-30 00:00:00')},
'portfolio_uuid': {4630: 'ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd',
4631: 'ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd',
4632: 'ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd',
4633: 'ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd',
4634: 'ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd',
4635: 'ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd',
4636: 'ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd',
4637: 'ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd',
4638: 'ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd',
4639: 'ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83',
4640: 'ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83',
4641: 'ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83',
4642: 'ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83',
4643: 'ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83',
4644: 'ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83',
4645: 'ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83',
4646: 'ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83',
4647: 'ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83'}}
d = pd.DataFrame(d)
I want to group by portfolio_name, then by date, then by uuid, so that the result looks like this:
portfolio_name         portfolio_date  portfolio_uuid
Retirement Aggressive  2019-12-31      ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd
                       2019-09-30      ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83
I tried
d.groupby(['portfolio_name'])[['portfolio_date','portfolio_uuid']]
But I'm not able to access the grouped object. I want to be able to play around with the dataframe after grouping it. Thanks.
Upvotes: 1
Views: 39
Reputation: 21729
You can use the get_group method on the groupby object to pull out a single group. You can also use the nth method, which returns the n-th row of each group.
# nth(0) returns the first row of each group
d.groupby(['portfolio_name','portfolio_uuid']).nth(0)
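For get_group, here is a minimal sketch. It uses a small stand-in frame called df with shortened, made-up uuid values (not the real data from the question), so treat the names as assumptions. When you group by two columns, each group key is a tuple.

```python
import pandas as pd

# Small stand-in for the question's frame; uuid values are shortened placeholders
df = pd.DataFrame({
    "portfolio_name": ["Retirement Aggressive"] * 4,
    "portfolio_date": pd.to_datetime(
        ["2019-12-31", "2019-12-31", "2019-09-30", "2019-09-30"]
    ),
    "portfolio_uuid": [
        "ModelPortfolio-a", "ModelPortfolio-a",
        "ModelPortfolio-b", "ModelPortfolio-b",
    ],
})

grouped = df.groupby(["portfolio_name", "portfolio_uuid"])

# The available group keys; tuples because we grouped by two columns
keys = list(grouped.groups)

# get_group returns the rows of one group as an ordinary DataFrame
first = grouped.get_group(("Retirement Aggressive", "ModelPortfolio-a"))
print(first)
```

The result of get_group is a regular dataframe, so you can filter, sort or aggregate it like any other.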
To get each group as its own dataframe, you can iterate over the groupby object in a list comprehension:
grp = d.groupby(['portfolio_name','portfolio_uuid'])
grps = [group for _, group in grp]
grps
The list grps contains one dataframe per group.
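If the goal is just the name / date / uuid view from the question, a groupby may not even be needed. The sketch below is one possible way, again on a small stand-in frame df with placeholder uuid values (assumptions, not the original data): drop the duplicate rows and move the name and date into the index, or group by all three columns and count rows.

```python
import pandas as pd

# Small stand-in for the question's frame; uuid values are shortened placeholders
df = pd.DataFrame({
    "portfolio_name": ["Retirement Aggressive"] * 4,
    "portfolio_date": pd.to_datetime(
        ["2019-12-31", "2019-12-31", "2019-09-30", "2019-09-30"]
    ),
    "portfolio_uuid": [
        "ModelPortfolio-a", "ModelPortfolio-a",
        "ModelPortfolio-b", "ModelPortfolio-b",
    ],
})

# One row per unique (name, date, uuid) combination, newest date first
summary = (
    df[["portfolio_name", "portfolio_date", "portfolio_uuid"]]
    .drop_duplicates()
    .sort_values("portfolio_date", ascending=False)
    .set_index(["portfolio_name", "portfolio_date"])
)
print(summary)

# Alternative: group by all three columns and count the rows in each group
counts = (
    df.groupby(["portfolio_name", "portfolio_date", "portfolio_uuid"])
    .size()
    .reset_index(name="rows")
)
print(counts)
```

Both results are plain dataframes, so you can keep working with them after the grouping step.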
Upvotes: 2