FedePy

Reputation: 51

How to iterate over DataFrame to select rows

I'm new to Python and I'm trying to understand how to select n rows for each index value in a DataFrame and build a new DataFrame containing only the selected rows.

My df looks like this:

      Col1 Col2 Col3 etc
   A
   A
   A
   A
   B
   B
   B
   B

I would basically like to take the first two rows for each index value, to get:

     Col1 Col2 Col3 etc.
   A
   A
   B
   B

I tried to do this with a for loop and iloc, as shown below, but the loop stops at index A:

   for i in df:
       sel=df.iloc[:3]

I'm aware this is a basic question, but the more I read, the more confused I get with for, apply, range, etc.

Please help! Thanks

Upvotes: 1

Views: 168

Answers (2)

Serge Ballesta

Reputation: 148890

A slight variation on @Chris's answer if A, B, etc. are in the index and not in the first column. You should first reset the index, then use groupby and head, and finally set the index again and remove its name:

df.reset_index().groupby('index').head(2).set_index('index').rename_axis(None)
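To make the chain above easier to try out, here is a minimal sketch on made-up data. The column values and the names `via_reset` and `via_level` are assumptions for illustration, not from the question. It also shows `groupby(level=0)`, which groups on the index directly and should give the same rows without the reset/set round trip.

```python
import pandas as pd

# Hypothetical sample data: the labels A and B live in the (unnamed) index
df = pd.DataFrame(
    {"Col1": range(8), "Col2": range(10, 18)},
    index=list("AAAABBBB"),
)

# The chain from this answer: reset_index() turns the unnamed index into a
# column called 'index', which is then used as the grouping key
via_reset = (
    df.reset_index()
      .groupby("index")
      .head(2)
      .set_index("index")
      .rename_axis(None)
)

# Alternative: group on the index level itself
via_level = df.groupby(level=0).head(2)

print(via_reset)
print(via_level)
```

Both results keep the original row order, so the first two A rows come before the first two B rows.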

Upvotes: 0

Chris

Reputation: 16147

If you want to get the first two rows of each group you can do the following:

df.groupby('Col1').head(2)
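As a rough, self-contained sketch of the line above: the data here is invented and assumes the group labels sit in a column named `Col1`. For comparison it also spells out an explicit loop over the groups. The loop in the question iterates over column names (`for i in df`) and slices the whole frame each time with `df.iloc[:3]`, which is why it never gets past the A rows.

```python
import pandas as pd

# Hypothetical data: the group labels are stored in the column 'Col1'
df = pd.DataFrame({
    "Col1": list("AAAABBBB"),
    "Col2": range(8),
})

# First two rows of every group, in original row order
first_two = df.groupby("Col1").head(2)

# The same idea written as an explicit loop: iterate over the groups
# (not over df itself) and slice each group with iloc
parts = [group.iloc[:2] for _, group in df.groupby("Col1")]
looped = pd.concat(parts)

print(first_two)
print(looped)
```

The `head(2)` version is shorter and avoids building the intermediate list, so the loop is mainly useful for seeing what the groupby does.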

Upvotes: 1
