Devang Mukherjee
Devang Mukherjee

Reputation: 197

Trying to implement basic subtraction function between two rows of dataframe based on the months of datetime

I'm trying to find the average monthly gap between stock closing and opening prices which are in dataframe.
The basic function looks like this

for i in range(1,len(df)):
    #difference between today's opening price minus yesterday's closing price
    gap+=(df['Open'][i]-df['Close'][i-1])

But I want to find this avg_gap for every month in my dataframe. And store the avg_gaps in a list.
Dataframe has each days opening, highest, lowest and closing price with the associated date (in datetime format)
Looks like this

Date        Open    High    Low      Close
2019-01-01  10881.7 10923.6 10807.1 10910.1
2019-01-02  10868.8 10895.3 10735.0 10792.5
2019-01-03  10796.8 10814.0 10661.2 10672.2
2019-01-04  10699.7 10741.0 10628.6 10727.3
2019-01-07  10804.8 10835.9 10750.1 10771.8
2019-01-08  10786.2 10818.4 10733.2 10802.1

Like this the dataframe has data for 1.5 years

This is what I have tried

list=[]
gap=0
count=0

for i in range(1,len(df)):   
    if(df['Date'][i].month==df['Date'][i-1].month):
        count+=1
        gap+=(df['Open'][i]-df['Close'][i-1])
    if(df['Date'][i].month<df['Date'][i-1].month):
        avg_gap=gap/count
        list.append(avg_gap)
        count=0
        gap=0

This isn't giving me the correct answer.

I want 12 avg_gap for every month of the year to be appened in the list.

avg_gaps are calculated by (total gap in that month)/(total no of days in that month)

What can I do?

Upvotes: 1

Views: 50

Answers (1)

nathan.j.mcdougall
nathan.j.mcdougall

Reputation: 524

You will want to create a month column in your DataFrame to group by. Then, write a function which calculates the average gap.

Gap Calculation

You are adding up lots of differences as follows:

 df['Open'][1]-df['Close'][0]
+df['Open'][2]-df['Close'][1]
+df['Open'][3]-df['Close'][2]
+...

However, we could group this a different way as follows:

 (df['Open'][1]+df['Open'][2]+df['Open'][3]+...)
-(df['Close'][0]+df['Close'][1]+df['Close'][2]+...)

So a simpler way to perform your calculation is just as

df['Open'].iloc[1:].sum()-df['Close'].iloc[:-1].sum()

Code

def average_gap(month_df):
    openings = month_df['Open'].iloc[1:].sum()
    closings = month_df['Close'].iloc[:-1].sum()
    number = month_df.shape[0]-1
    
    return (openings-closings)/number

df['month'] = df['Date'].dt.month
gap_by_month = df.groupby('month').apply(average_gap)

Upvotes: 1

Related Questions