John

Reputation: 1947

Omitting loop while referring to next elements of pandas DataFrame

Let's consider the following dataframe:

import pandas as pd
import numpy as np
df = pd.DataFrame([1, 2, 3, 4, 3, 2, 5, 6, 4, 2, 1, 6])

I want to do the following: if the i-th element of the dataframe is greater than the mean of the next two elements, assign 1 to the i-th element; otherwise, assign -1. The last two elements, which do not have two successors, should stay unchanged.

My solution

An obvious solution is the following:

df_copy = df.copy()
for i in range(len(df) - 2):
    # Compare element i with the mean of the two elements that follow it.
    if df.iloc[i, 0] > df.iloc[i + 1:i + 3, 0].mean():
        df_copy.iloc[i, 0] = 1
    else:
        df_copy.iloc[i, 0] = -1

However, I find it a little cumbersome, and I'm wondering whether there is a loop-free solution to this kind of problem.

Desired output

    0
0   -1
1   -1
2   -1
3   1
4   -1
5   -1
6   -1
7   1
8   1
9   -1
10  1
11  6

Upvotes: 2

Views: 34

Answers (1)

mozway

Reputation: 261900

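You can combine rolling(2).mean() with shift(-2). The rolling mean at position j averages elements j-1 and j, so shifting it by -2 aligns the mean of the next two elements with each row: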

df['out'] = np.where(df[0].gt(df[0].rolling(2).mean().shift(-2)), 1, -1)

Output:

    0  out
0   1   -1
1   2   -1
2   3   -1
3   4    1
4   3   -1
5   2   -1
6   5   -1
7   6    1
8   4    1
9   2   -1
10  1   -1
11  6   -1
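
The alignment can be checked by printing the intermediate series side by side. This is only an illustrative sketch; the names s, roll, and nxt are not part of the answer above.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 3, 2, 5, 6, 4, 2, 1, 6])

# roll[j] is the mean of s[j-1] and s[j]; roll[0] is NaN (incomplete window).
roll = s.rolling(2).mean()

# Shifting by -2 moves roll[i+2] to position i,
# i.e. the mean of s[i+1] and s[i+2]. The last two rows become NaN.
nxt = roll.shift(-2)

print(pd.DataFrame({"s": s, "roll": roll, "nxt": nxt}))
```

Because any comparison against NaN evaluates to False, the last two rows fall into the -1 branch of np.where in the version above.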

To keep the last two items unchanged:

m = df[0].rolling(2).mean().shift(-2)
df['out'] = np.where(df[0].gt(m), 1, -1)
df['out'] = df['out'].mask(m.isna(), df[0])

Output:

    0  out
0   1   -1
1   2   -1
2   3   -1
3   4    1
4   3   -1
5   2   -1
6   5   -1
7   6    1
8   4    1
9   2   -1
10  1    1
11  6    6
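
If the number of following elements may vary, the same idea could be wrapped in a small helper. The function name compare_to_next_mean and its parameter n are hypothetical, and the sketch assumes the input has no missing values of its own. It is cross-checked against an explicit loop for n = 2.

```python
import numpy as np
import pandas as pd


def compare_to_next_mean(s: pd.Series, n: int = 2) -> pd.Series:
    """Return 1 where s[i] > mean of the n following elements, else -1.

    The last n elements have no full window and are returned unchanged.
    Assumes s itself contains no NaN values.
    """
    # Mean of the n elements following each position (NaN for the last n).
    nxt = s.rolling(n).mean().shift(-n)
    out = pd.Series(np.where(s.gt(nxt), 1, -1), index=s.index)
    # Restore the original values where no full window of successors exists.
    return out.mask(nxt.isna(), s)


s = pd.Series([1, 2, 3, 4, 3, 2, 5, 6, 4, 2, 1, 6])
result = compare_to_next_mean(s)

# Cross-check against an explicit loop applying the same rule.
expected = s.copy()
for i in range(len(s) - 2):
    expected.iloc[i] = 1 if s.iloc[i] > s.iloc[i + 1:i + 3].mean() else -1

assert result.tolist() == expected.tolist()
```

With n=1 the helper compares each element with its immediate successor only.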

Upvotes: 2
