Reputation: 769
I'm trying to get a list of indices out of a pandas dataframe.
First do an import.
import pandas as pd
Construct a pandas dataframe.
# Create dataframe
data = {'name': ['Jason', 'Jason', 'Tina', 'Tina', 'Tina', 'Jason', 'Tina'],
'reports': [4, 24, 31, 2, 3, 5, 10],
'coverage': [True, False, False, False, True, True, False]}
df = pd.DataFrame(data)
print(df)
Output:
coverage name reports
0 True Jason 4
1 False Jason 24
2 False Tina 31
3 False Tina 2
4 True Tina 3
5 True Jason 5
6 False Tina 10
I would like to have the indices on the left of the dataframe when the coverage is set to True, but I would like to have this for every name separately. Preferably do this without an explicit for-loop.
Desired output is something like this.
list_Jason = [0, 5]
list_Tina = [4]
Attempted solution: I thought I should use 'groupby' and then access the coverage column. From there I don't know how to proceed. All help is appreciated.
df.groupby('name')['coverage']
Upvotes: 3
Views: 1405
Reputation: 402253
This is doable, using boolean indexing
first followed by the groupby:
In [942]: df[df.coverage].groupby('name').agg({'reports' : lambda x: list(x.index)})
Out[942]:
reports
name
Jason [0, 5]
Tina [4]
You may use dfGroupBy.agg
to get your output as a column of lists.
Upvotes: 1
Reputation: 3382
This should work:
grouped=df.groupby('name').apply(lambda x: x.index[x.coverage].values)
output:
name
Jason [0, 5]
Tina [4]
Upvotes: 0
Reputation: 2293
You want to get the index out for each group.
this is stored in the 'groups' attribute of a groupbydataframe.
#filter for coverage==True
#group by 'name'
#access the 'groups' attribute
by_person = df[df.coverage].groupby('name').groups
will return:
{'Jason': Int64Index([0, 5], dtype='int64'),
'Tina': Int64Index([4], dtype='int64')}
From which you can access the individuals as you would a regular dictionary:
by_person['Jason']
returns:
Int64Index([0, 5], dtype='int64')
Which you can treat like a regular list.
Upvotes: 2