A_HinkBy
A_HinkBy

Reputation: 13

Trying to create a new dataframe column in pandas based on a dataframe related if statement

I'm learning Python & pandas and practicing with different stock calculations. I've tried to search help with this but just haven't found a response similar enough or then didn't understand how to deduce the correct approach based on the previous responses.

I have read stock data of a given time frame with datareader into dataframe df. In df I have Date Volume and Adj Close columns which I want to use to create a new column "OBV" based on given criteria. OBV is a cumulative value that adds or subtracts the value of the volume today to the previous' days OBV depending on the adjusted close price.

The calculation of OBV is simple:

If Adj Close is higher today than Adj Close of yesterday then add the Volume of today to the (cumulative) volume of yesterday.

If Adj Close is lower today than Adj Close of yesterday then substract the Volume of today from the (cumulative) volume of yesterday.

On day 1 the OBV = 0

This is then repeated along the time frame and OBV gets accumulated.

Here's the basic imports and start

import numpy as np
import pandas as pd
import pandas_datareader

import datetime
from pandas_datareader import data, wb

start = datetime.date(2012, 4, 16)
end = datetime.date(2017, 4, 13)

# Reading in Yahoo Finance data with DataReader
df = data.DataReader('GOOG', 'yahoo', start, end)

import matplotlib
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

#This is what I cannot get to work, and I've tried two different ways.

#ATTEMPT1

def obv1(column):
    if column["Adj Close"] > column["Adj close"].shift(-1):
        val = column["Volume"].shift(-1) + column["Volume"]
    else:
        val = column["Volume"].shift(-1) - column["Volume"] 
    return val

df["OBV"] = df.apply(obv1, axis=1)

#ATTEMPT 2

def obv1(df):
    if df["Adj Close"] > df["Adj close"].shift(-1):
        val = df["Volume"].shift(-1) + df["Volume"]
    else:
        val = df["Volume"].shift(-1) - df["Volume"] 
    return val

df["OBV"] = df.apply(obv1, axis=1)

Both give me an error.

Upvotes: 1

Views: 1303

Answers (1)

piRSquared
piRSquared

Reputation: 294318

Consider the dataframe df

np.random.seed([3,1415])
df = pd.DataFrame(dict(
        Volume=np.random.randint(100, 200, 10),
        AdjClose=np.random.rand(10)
    ))

print(df)

   AdjClose  Volume
0  0.951710     111
1  0.346711     198
2  0.289758     174
3  0.662151     190
4  0.171633     115
5  0.018571     155
6  0.182415     113
7  0.332961     111
8  0.150202     113
9  0.810506     126

Multiply the Volume by -1 when change in AdjClose is negative. Then cumsum

(df.Volume * (~df.AdjClose.diff().le(0) * 2 - 1)).cumsum()

0    111
1    -87
2   -261
3    -71
4   -186
5   -341
6   -228
7   -117
8   -230
9   -104
dtype: int64

Include this along side the rest of the df

df.assign(new=(df.Volume * (~df.AdjClose.diff().le(0) * 2 - 1)).cumsum())

   AdjClose  Volume  new
0  0.951710     111  111
1  0.346711     198  -87
2  0.289758     174 -261
3  0.662151     190  -71
4  0.171633     115 -186
5  0.018571     155 -341
6  0.182415     113 -228
7  0.332961     111 -117
8  0.150202     113 -230
9  0.810506     126 -104

Upvotes: 6

Related Questions