Python pandas dataframe add element to previous row values and create a new column

Question

Volume = [-1 -2 0 1 3 -1 -2 -1 -2 2 -2 -1 0]

Net_rate = [1 0 0 1 0 1 2 3 0 0 1 0]

I have the Pandas dataframe 'Volume' and need to create the column 'Net_rate'.

The algorithmic calculation is like that:

if Volume[i]*Volume[i-1]>0 then Net_rate[i-1]+1 else 0

I tried the following code but returns error:

def f_DataSlider(df, input, n, name):
    """Calculate the moving average for the given data.

    :param df: pandas.DataFrame
    :param n: 
    :return: pandas.DataFrame
    """
    Slider = pd.Series(df[input].shift(n), name=str(name))
    df = df.join(Slider)
    return df

data = f_DataSlider(data, 'Volume', 1, 'Volume_1')

mask2 = (data['Volume']*dfohlc['Volume_1'])>0
dfohlc.loc[mask2, 'Net_rate'] = dfohlc['Net_rate'].shift(-1)+1
dfohlc.loc[~mask2, 'Net_rate'] = 1

How should I code it? Could anyone help me, please?

user3483203 · Accepted Answer

Using `numba`

You can make use of numba here, just make sure to pip install numba first. This option is slower than the cumsum option, but it helps to demonstrate how an algorithm for this problem would look.

Setup

from numba import njit

Create a boolean series using shift:

s = df.Volume.mul(df.Volume.shift(-1)).gt(0).tolist()

@njit
def increment(s):
    rate = [int(s[0]>0)]
    for idx, el in enumerate(s[1:]):
        if el:
            rate.append(rate[idx]+1)
        else:
            rate.append(0)
    return rate

increment(s)

[1, 0, 0, 1, 0, 1, 2, 3, 0, 0, 1, 0, 0]

Using `cumsum` and `groupby`:

s = df.Volume.mul(df.Volume.shift(-1)).gt(0).astype(int)

s.groupby(s.ne(s.shift()).cumsum()).cumsum()

0     1
1     0
2     0
3     1
4     0
5     1
6     2
7     3
8     0
9     0
10    1
11    0
12    0
Name: Volume, dtype: int32

Explanation

We can use shift for the Volume[i] * Volume[i-1] check:

df.Volume.mul(df.Volume.shift(-1))

0     2.0
1    -0.0
2     0.0
3     3.0
4    -3.0
5     2.0
6     2.0
7     2.0
8    -4.0
9    -4.0
10    2.0
11   -0.0
12    NaN
Name: Volume, dtype: float64

We can take this series, and check it is greater than 0:

df.Volume.mul(df.Volume.shift(-1)).gt(0).astype(int)

0     1
1     0
2     0
3     1
4     0
5     1
6     1
7     1
8     0
9     0
10    1
11    0
12    0
Name: Volume, dtype: int32

Now you can see the structure of the output is becoming clear, but we need to increment based on consecutive ranges of 1, which we can do using a groupby trick to groupby consecutive values, and take the cumsum.

Python pandas dataframe add element to previous row values and create a new column

Answers (1)

Using `numba`

Using `cumsum` and `groupby`:

Related Questions

Python pandas dataframe add element to previous row values and create a new column

Answers (1)

Using numba

Using cumsum and groupby:

Related Questions

Using `numba`

Using `cumsum` and `groupby`: