Reputation: 51
I'm new to Python and I'm trying to understand how to select n rows for each index value within a DataFrame and build a new DataFrame containing only the selected rows.
My df looks like this:
Col1 Col2 Col3 etc
A
A
A
A
B
B
B
B
I would basically like to take the first two rows for each index value, to get:
Col1 Col2 Col3 etc.
A
A
B
B
I tried to do this with a for loop and iloc as shown below, but the loop stops at index A:
for i in df:
sel=df.iloc[:3]
I'm aware this is a basic question, but the more I read, the more confused I get by for, apply, range, etc.
Please help! Thanks
Upvotes: 1
Views: 168
Reputation: 148890
A slight variation on @Chris's answer if A, B, etc. are in the index and not in the first column. You should first reset the index, use groupby and head, then set the index back and remove its name:
df.reset_index().groupby('index').head(2).set_index('index').rename_axis(None)
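For what it's worth, here is a small self-contained sketch of that chain. The sample frame and its values are made up for illustration; only the A/B labels and the Col1/Col2 names come from the question. It also shows groupby(level=0), which, assuming the labels live in a single-level index, should select the same rows without the reset/set round trip.

```python
import pandas as pd

# Hypothetical sample data: the labels A/B are in the index, not in a column.
df = pd.DataFrame(
    {"Col1": range(8), "Col2": range(10, 18)},
    index=list("AAAABBBB"),
)

# The chain from the answer: an unnamed index becomes a column called
# "index" after reset_index(), so that is the name to group on.
result = (
    df.reset_index()
      .groupby("index")
      .head(2)
      .set_index("index")
      .rename_axis(None)
)

# Alternative: group directly on the index level.
alt = df.groupby(level=0).head(2)

print(result)
```

Note that if the index already has a name, reset_index() uses that name for the new column instead of 'index', so the groupby and set_index calls would need that name.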
Upvotes: 0
Reputation: 16147
If you want to get the first two rows of each group you can do the following:
df.groupby('Col1').head(2)
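A minimal runnable sketch of this, assuming the A/B labels are stored in a regular column named Col1 (the sample values are invented for the example):

```python
import pandas as pd

# Hypothetical sample data: the labels A/B are in the column Col1.
df = pd.DataFrame({"Col1": list("AAAABBBB"), "Col2": range(8)})

# head(2) on a groupby keeps the first two rows of every group,
# in their original order and with their original row labels.
first_two = df.groupby("Col1").head(2)

print(first_two)
```

Change the 2 to whatever n you need. The result keeps the original row labels, so add .reset_index(drop=True) if you want a fresh consecutive index.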
Upvotes: 1