Reputation: 203
I filtered certain Ids from an xlsx file and loaded the result into a dataframe. There are 3 Ids in total. In the xlsx file, Id1 has 5 row values, Id2 has 3 row values, and Id3 has 19 row values.
As a first step, I extracted just the row value for each Id (in my case the row value is a time in %H:%M:%S.%f format, and it is unique for each Id).
My dataframe looks like this:
import pandas as pd
df = pd.DataFrame(
    [['Id1', '01:22:52.134'], ['Id2', '03:21:31.123'], ['Id1', '21:12:52.544'],
     ['Id3', '23:12:31.216'], ['Id1', '10:22:02.134'], ['Id2', '06:52:48.184'],
     ['Id3', '12:52:46.188'], ['Id3', '06:52:46.184'], ['Id1', '13:33:46.235'],
     ['Id2', '14:35:12.235'], ['Id3', '14:59:12.177']],
    columns=['Ids', 'Time'],
)
My request: I want to extract row values for my selected Ids, but not all of them. For example:
- 1 row value for Id1 (initially contains 5)
- 2 row values for Id2 (initially contains 3)
- 17 row values for Id3 (initially contains 19)
Upvotes: 1
Views: 781
Reputation: 862761
Use:
ids = {'Id1':1, 'Id2':2, 'Id3':17}
df = df.groupby('Ids', group_keys=False).apply(lambda x: x.head(ids[x.name]))
print (df)
Ids Time
0 Id1 01:22:52.134
1 Id2 03:21:31.123
5 Id2 06:52:48.184
3 Id3 23:12:31.216
6 Id3 12:52:46.188
7 Id3 06:52:46.184
10 Id3 14:59:12.177
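A note on why Id3 shows only four rows even though 17 were requested: head(n) returns at most the rows a group actually has, and the sample frame contains fewer rows per Id than the original xlsx file. The sketch below rebuilds the sample frame and compares requested with available counts; the names available, returned and id3_rows are illustrative only.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame(
    [['Id1', '01:22:52.134'], ['Id2', '03:21:31.123'], ['Id1', '21:12:52.544'],
     ['Id3', '23:12:31.216'], ['Id1', '10:22:02.134'], ['Id2', '06:52:48.184'],
     ['Id3', '12:52:46.188'], ['Id3', '06:52:46.184'], ['Id1', '13:33:46.235'],
     ['Id2', '14:35:12.235'], ['Id3', '14:59:12.177']],
    columns=['Ids', 'Time'],
)
ids = {'Id1': 1, 'Id2': 2, 'Id3': 17}

# Number of rows actually present for each Id in the sample.
available = df['Ids'].value_counts()

# head(n) is capped by the group size, so this is what each group can return.
returned = {k: min(n, int(available[k])) for k, n in ids.items()}
print(returned)

# Asking for 17 rows of Id3 yields only the rows that exist.
id3_rows = df[df['Ids'] == 'Id3'].head(ids['Id3'])
print(len(id3_rows))
```

On the full data (19 rows for Id3) the same call would return 17 rows.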
Explanation: use groupby on the Ids column, and for each group call head with the row count looked up in the dictionary.
Upvotes: 2
Reputation: 402563
I'd recommend doing this with a groupby + pd.concat. First, you'll need a mapping:
mapping = {'Id1' : 1, 'Id2' : 2, 'Id3' : 17}
Now, use mapping to fetch only your desired number of rows from each group with head:
pd.concat(
    [g.head(mapping[k]) for k, g in df.groupby('Ids')], axis=0
)
Ids Time
0 Id1 01:22:52.134
1 Id2 03:21:31.123
5 Id2 06:52:48.184
3 Id3 23:12:31.216
6 Id3 12:52:46.188
7 Id3 06:52:46.184
10 Id3 14:59:12.177
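As an alternative sketch (not part of the answer above), the same selection can be written without a Python-level loop by numbering the rows within each group with GroupBy.cumcount and keeping those whose position is below the mapped count. The names keep and result are illustrative. Unlike the concat version, this keeps the rows in their original order rather than grouping them by Id, and any Id missing from mapping would be dropped, because the comparison against NaN is False.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame(
    [['Id1', '01:22:52.134'], ['Id2', '03:21:31.123'], ['Id1', '21:12:52.544'],
     ['Id3', '23:12:31.216'], ['Id1', '10:22:02.134'], ['Id2', '06:52:48.184'],
     ['Id3', '12:52:46.188'], ['Id3', '06:52:46.184'], ['Id1', '13:33:46.235'],
     ['Id2', '14:35:12.235'], ['Id3', '14:59:12.177']],
    columns=['Ids', 'Time'],
)
mapping = {'Id1': 1, 'Id2': 2, 'Id3': 17}

# cumcount gives each row its 0-based position within its own Id group;
# map looks up how many rows are wanted for that row's Id.
keep = df.groupby('Ids').cumcount() < df['Ids'].map(mapping)

# Boolean indexing keeps the first n rows of each Id, in original row order.
result = df[keep]
print(result)
```

If the grouped ordering of the concat output is wanted, result.sort_values('Ids', kind='stable') should restore it while preserving the order of rows within each Id.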
Upvotes: 3