Reputation: 49
Let's say I have a dataframe like the one below:
   pid name
0    5    A
1    5    X
2    5    C
3    5    Q
4    3    H
5    3    E
6    4    U
7    5    J
I don't know the value of pid in the first row in advance, but I would like to get all the rows from the beginning until the value of pid changes. So in this case, all the consecutive rows with pid = 5 at the top should be printed. Note that the last row, which also has pid = 5, should not be in the result because it is not part of that first run.
So the result should be:
   pid name
0    5    A
1    5    X
2    5    C
3    5    Q
name is just an ordinary column and plays no part in the selection.
Upvotes: 0
Views: 643
Reputation: 25259
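Try this. diff().ne(0) flags every row where pid differs from the previous row (the first row is flagged too, since its diff is NaN), cumsum() turns those flags into a run number, and eq(1) keeps only the first run: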
df_final = df[df.pid.diff().ne(0).cumsum().eq(1)]
Out[909]:
   pid name
0    5    A
1    5    X
2    5    C
3    5    Q
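For anyone who wants to see the intermediate steps, here is a minimal, self-contained sketch of the same idea, rebuilt from the sample data in the question. The variable names (run_start, run_id, first_run, first_run_alt) are just illustrative, and the second variant is an alternative I am adding for comparison, not part of the one-liner above. It assumes pid is numeric, since diff() needs that.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "pid": [5, 5, 5, 5, 3, 3, 4, 5],
    "name": list("AXCQHEUJ"),
})

# diff() is NaN on the first row and 0 wherever pid repeats the previous row,
# so ne(0) is True on the first row of every run of equal values.
run_start = df["pid"].diff().ne(0)

# cumsum() turns those markers into a run number: 1 for the first run,
# 2 for the second, and so on.
run_id = run_start.cumsum()

# Keep only the rows that belong to the first run.
first_run = df[run_id.eq(1)]

# Alternative sketch that does not rely on diff(): compare every row with the
# first pid and keep rows only while no mismatch has been seen yet.
mismatch = df["pid"].ne(df["pid"].iloc[0])
first_run_alt = df[mismatch.cumsum().eq(0)]

print(first_run)
```

The alternative should also work for non-numeric pid values (strings, for example), because it only compares for equality.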
Upvotes: 2