turtle_in_mind

Reputation: 1152

How do I get the grouped-by values?

I'm having trouble grouping by a column name and just viewing my pandas dataframe.
I have the following dictionary:

    import pandas as pd
    from pandas import Timestamp

    d = {'portfolio_name': {4630: 'Retirement Aggressive',
  4631: 'Retirement Aggressive',
  4632: 'Retirement Aggressive',
  4633: 'Retirement Aggressive',
  4634: 'Retirement Aggressive',
  4635: 'Retirement Aggressive',
  4636: 'Retirement Aggressive',
  4637: 'Retirement Aggressive',
  4638: 'Retirement Aggressive',
  4639: 'Retirement Aggressive',
  4640: 'Retirement Aggressive',
  4641: 'Retirement Aggressive',
  4642: 'Retirement Aggressive',
  4643: 'Retirement Aggressive',
  4644: 'Retirement Aggressive',
  4645: 'Retirement Aggressive',
  4646: 'Retirement Aggressive',
  4647: 'Retirement Aggressive'},
 'size_type': {4630: 'percent',
  4631: 'percent',
  4632: 'percent',
  4633: 'percent',
  4634: 'percent',
  4635: 'percent',
  4636: 'percent',
  4637: 'percent',
  4638: 'percent',
  4639: 'percent',
  4640: 'percent',
  4641: 'percent',
  4642: 'percent',
  4643: 'percent',
  4644: 'percent',
  4645: 'percent',
  4646: 'percent',
  4647: 'percent'},
 'portfolio_date': {4630: Timestamp('2019-12-31 00:00:00'),
  4631: Timestamp('2019-12-31 00:00:00'),
  4632: Timestamp('2019-12-31 00:00:00'),
  4633: Timestamp('2019-12-31 00:00:00'),
  4634: Timestamp('2019-12-31 00:00:00'),
  4635: Timestamp('2019-12-31 00:00:00'),
  4636: Timestamp('2019-12-31 00:00:00'),
  4637: Timestamp('2019-12-31 00:00:00'),
  4638: Timestamp('2019-12-31 00:00:00'),
  4639: Timestamp('2019-09-30 00:00:00'),
  4640: Timestamp('2019-09-30 00:00:00'),
  4641: Timestamp('2019-09-30 00:00:00'),
  4642: Timestamp('2019-09-30 00:00:00'),
  4643: Timestamp('2019-09-30 00:00:00'),
  4644: Timestamp('2019-09-30 00:00:00'),
  4645: Timestamp('2019-09-30 00:00:00'),
  4646: Timestamp('2019-09-30 00:00:00'),
  4647: Timestamp('2019-09-30 00:00:00')},
 'portfolio_uuid': {4630: 'ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd',
  4631: 'ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd',
  4632: 'ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd',
  4633: 'ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd',
  4634: 'ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd',
  4635: 'ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd',
  4636: 'ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd',
  4637: 'ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd',
  4638: 'ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd',
  4639: 'ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83',
  4640: 'ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83',
  4641: 'ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83',
  4642: 'ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83',
  4643: 'ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83',
  4644: 'ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83',
  4645: 'ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83',
  4646: 'ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83',
  4647: 'ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83'}}

    d = pd.DataFrame(d)

I want to group by portfolio_name, then date, then uuid, so that the result looks like this:

    portfolio_name          portfolio_date   portfolio_uuid
    Retirement Aggressive   2019-12-31       ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd
                            2019-09-30       ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83

I tried

    d.groupby(['portfolio_name'])[['portfolio_date','portfolio_uuid']]

But I'm not able to access the grouped object. I want to be able to work with the dataframe after grouping it. Thanks.

Upvotes: 1

Views: 39

Answers (1)

YOLO

Reputation: 21729

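You can use the get_group method on the groupby object to pull out a single group as a dataframe. You can also use the nth method, which picks the row at a given position within each group.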

# first row (position 0) of each group
d.groupby(['portfolio_name','portfolio_uuid']).nth(0)
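
Since get_group is mentioned above but not shown, here is a minimal sketch of how it might be used. The shortened sample frame (two rows per uuid instead of nine) and the names key and one_group are assumptions for illustration, not part of the original data.

```python
import pandas as pd

# Shortened stand-in for the question's data (assumption: two rows per uuid).
d = pd.DataFrame(
    {
        "portfolio_name": ["Retirement Aggressive"] * 4,
        "size_type": ["percent"] * 4,
        "portfolio_date": pd.to_datetime(
            ["2019-12-31", "2019-12-31", "2019-09-30", "2019-09-30"]
        ),
        "portfolio_uuid": [
            "ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd",
            "ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd",
            "ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83",
            "ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83",
        ],
    },
    index=[4630, 4631, 4639, 4640],
)

grp = d.groupby(["portfolio_name", "portfolio_uuid"])

# With several grouping columns, each group key is a tuple of values.
key = ("Retirement Aggressive", "ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83")

# get_group returns an ordinary DataFrame holding only that group's rows.
one_group = grp.get_group(key)
print(one_group)
```

The result is a regular dataframe, so the usual indexing, filtering and aggregation all work on it.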

To get each group as its own dataframe, you can use a list comprehension:

grp = d.groupby(['portfolio_name','portfolio_uuid'])
# iterating over a groupby yields (key, sub-dataframe) pairs
grps = [group for _, group in grp]

The grps list then contains one dataframe per group.
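
For the layout asked for in the question (one row per name, date and uuid), one possible sketch is to count rows per combination with size() and turn the result back into a flat dataframe with reset_index(). This is a different technique from nth, and the shortened sample frame and the n_rows column name are assumptions made for illustration.

```python
import pandas as pd

# Shortened stand-in for the question's data (assumption: two rows per uuid).
d = pd.DataFrame(
    {
        "portfolio_name": ["Retirement Aggressive"] * 4,
        "size_type": ["percent"] * 4,
        "portfolio_date": pd.to_datetime(
            ["2019-12-31", "2019-12-31", "2019-09-30", "2019-09-30"]
        ),
        "portfolio_uuid": [
            "ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd",
            "ModelPortfolio-4632664b7290a7c3dea508f09d9d66cd",
            "ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83",
            "ModelPortfolio-b5e3b7b546c09eed42709f0f7f2d5e83",
        ],
    },
    index=[4630, 4631, 4639, 4640],
)

# size() counts the rows in each (name, date, uuid) combination;
# reset_index() moves the group keys back into ordinary columns.
summary = (
    d.groupby(["portfolio_name", "portfolio_date", "portfolio_uuid"])
    .size()
    .reset_index(name="n_rows")  # "n_rows" is a hypothetical column name
)
print(summary)
```

Because summary is a plain dataframe rather than a groupby object, it can be printed, filtered or merged like any other frame.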

Upvotes: 2