user2003052
user2003052

Reputation: 119

Pandas extract previous row on value change

I have two columns of data 'Part & Qty' where some part numbers are repeated across multiple rows. The quantity value I need to retrieve is contained in the final row before the part number changes.

My code (below) adds a True/False column to flag when the part number changes. I had thought when the flag is 'True' I want to retrieve the data from the previous row, however this does not work for the first and last rows.

Running my code gives the output (left) vs the data I'm trying to extract is(right):

output enter image description here

What is the best way to achieve this?

    import pandas as pd
    import numpy as np
    df = pd.DataFrame({
        'part_no_2': [22, 22, 22, 23, 23, 24, 25, 25, 25, 26],
        'qty': [0, 0, 4, 44, 22, 0, 7, 16, 5, 6]})

    df['part_no_change'] = np.where(df["part_no_2"].shift() != df["part_no_2"], True, False) #look for when PNo changes

    df

Upvotes: 2

Views: 1051

Answers (2)

Pedro Maia
Pedro Maia

Reputation: 2712

Pandas has a built-in method to do this:

df.drop_duplicates(subset=['part_no_2'], keep='last')

Upvotes: 2

akuiper
akuiper

Reputation: 214927

Try shift(-1):

df[df.part_no_2 != df.part_no_2.shift(-1)]

   part_no_2  qty
2         22    4
4         23   22
5         24    0
8         25    5
9         26    6

Upvotes: 4

Related Questions