STEIN
STEIN

Reputation: 303

Analysing timeseries data using Panda dataframes

I have some timeseries data as per below that I want to do some specific analysis on

"timestamp","epic","closeprice_bid","closeprice_ask","last_traded_volume"
"2020-03-24 12:00:00","KA.D.BARC.DAILY.IP","91.17","91.38","7836277"
"2020-03-24 13:00:00","KA.D.BARC.DAILY.IP","90.33","90.66","8001075"
"2020-03-24 14:00:00","KA.D.BARC.DAILY.IP","89.96","90.22","11490520"
"2020-03-24 15:00:00","KA.D.BARC.DAILY.IP","91.62","91.89","9014323"
"2020-03-24 16:00:00","KA.D.BARC.DAILY.IP","93.84","94.23","7270054"
"2020-03-24 16:00:00","KA.D.BARC.DAILY.IP","93.84","94.23","7270054.0"
"2020-03-25 08:00:00","KA.D.BARC.DAILY.IP","109.47","109.89","25414762.0"
"2020-03-25 08:00:00","KA.D.BARC.DAILY.IP","109.47","109.89","25414762

I want to simulate a basic trading strategy whereby using pandas dataframes I am able to analyse the timeseries data by 1) checking to see if there was a difference of ≥1% or ≤1% between the previous days last closeprice_bid and today's first closeprice_bid 2) for every hour of data check if the closeprice_bid is ≥3% or ≤3% of today's opening closeprice_bid.

Can someone please provide some guidance as to how go about doing the analysis described above using pandas

I have loaded the data into a df using the code below:

cols = ['timestamp', 'epic', 'closeprice_bid', 'closeprice_ask','last_traded_volume']
stock_data = pd.read_csv('barc.csv', header=0, names=cols)
stock_data['closeprice_bid'] = pd.to_numeric(stock_data['closeprice_bid'], errors='coerce')

Upvotes: 1

Views: 89

Answers (2)

Darth Vader
Darth Vader

Reputation: 921

you can do something like :

 (stock_data['closeprice_bid'].shift(-1) -stock_data['closeprice_bid'])/stock_data['closeprice_bid'] > 0.01

and similar for per hour case.

Upvotes: 1

NYC Coder
NYC Coder

Reputation: 7594

You can try something like this:

df['date'] = pd.to_datetime(df['timestamp']).dt.date
df['time'] = pd.to_datetime(df['timestamp']).dt.time
df.sort_values(by=['date', 'time'], inplace=True)
first = df.groupby(by='date').first()['closeprice_bid']
df_first = first.to_frame().reset_index()
last = df.groupby(by='date').last()['closeprice_bid']
df_last = last.to_frame().reset_index()

df_merged = df_first.merge(df_last, left_on=['date'], right_on=['date'])
df_merged.set_index(['date'], inplace=True)
df_merged['pct_change'] = df_merged.pct_change(axis=1)['closeprice_bid_y']
df_merged['greater_than_1'] = df_merged['pct_change'] > 1
print(df_merged)

            closeprice_bid_x  closeprice_bid_y  pct_change  greater_than_1
date
2020-03-24             91.17             93.84    0.029286           False
2020-03-25            109.47            109.47    0.000000           False

Upvotes: 0

Related Questions