nrcjea001
nrcjea001

Reputation: 1057

Pandas: set preceding values conditional on current value in column (by group)

I have a pandas data frame where values should be greater or equal to preceding values. In cases where the current value is lower than the preceding values, the preceding values must be set equal to the current value. This is best explained by example below:

data = {'group':['A', 'A', 'A', 'A', 'A', 'B', 'B', 
                'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C'], 
                 'value':[0, 1, 2, 3, 2, 0, 1, 2, 3, 1, 5, 0, 1, 0, 3, 2]} 
df = pd.DataFrame(data)
df

  group value
0   A     0
1   A     1
2   A     2
3   A     3
4   A     2
5   B     0
6   B     1
7   B     2
8   B     3
9   B     1
10  B     5
11  C     0
12  C     1
13  C     0
14  C     3
15  C     2

and the result I am looking for is:

  group value
0   A     0
1   A     1
2   A     2
3   A     2
4   A     2
5   B     0
6   B     1
7   B     1
8   B     1
9   B     1
10  B     5
11  C     0
12  C     0
13  C     0
14  C     2
15  C     2

Upvotes: 1

Views: 86

Answers (2)

Adam.Er8
Adam.Er8

Reputation: 13393

So here's my go! (Special thanks to @jezrael for helping me simplify it considerably!)

I'm basing this on Expanding Windows, in reverse, to always get a suffix of the elements in each group (from the last element, expanding towards first).

this expanding window has the following logic: For element in index i, you get a Series containing all elements in group with indices >=i, and I need to return a new single value for i in the result.
What is the value corresponding to this suffix? its minimum! because if the later elements are smaller, we need to take the smallest among them.

then we can assign the result of this operation to df['value'].

try this:

df['value'] = (df.iloc[::-1]
                .groupby('group')['value']
                .expanding()
                .min()
                .reset_index(level=0, drop=True)
                .astype(int))
print (df)

Output:

   group  value
0      A      0
1      A      1
2      A      2
3      A      2
4      A      2
5      B      0
6      B      1
7      B      1
8      B      1
9      B      1
10     B      5
11     C      0
12     C      0
13     C      0
14     C      2
15     C      2

Upvotes: 1

Jamie
Jamie

Reputation: 146

I didnt get your output but I believe you are looking for something like

df['fwd'] = df.value.shift(-1)
df['new'] = np.where(df['value'] > df['fwd'], df['fwd'], df['value'])

Upvotes: 0

Related Questions