DJHeels

Reputation: 89

Determining Minimum and Maximum Values During Particular Time in Pandas

I have a dataframe similar to the below:

import pandas as pd

df = pd.DataFrame({'ticker':['A', 'A', 'A', 'A', 'A', 
                             'B', 'B', 'B', 'B', 'B'], 
                   'date':['2020-05-01 09:30:00', '2020-05-01 09:31:00', '2020-05-01 09:32:00', '2020-05-01 09:33:00', '2020-05-01 09:34:00', 
                           '2020-05-01 09:30:00', '2020-05-01 09:31:00', '2020-05-01 09:32:00', '2020-05-01 09:33:00', '2020-05-01 09:34:00'], 
                   'datetime':['2020-05-01 09:30:00', '2020-05-01 09:31:00', '2020-05-01 09:32:00', '2020-05-01 09:33:00', '2020-05-01 09:34:00', 
                               '2020-05-01 09:30:00', '2020-05-01 09:31:00', '2020-05-01 09:32:00', '2020-05-01 09:33:00', '2020-05-01 09:34:00'], 
                   'low':[10.00, 11.00, 12.00, 11.00, 11.50,
                          15.00, 14.00, 13.00, 12.00, 12.50], 
                   'high':[10.25, 11.25, 12.25, 11.25, 11.75,
                          15.25, 14.25, 13.25, 12.25, 12.75], 
                   'close':[10.20, 11.20, 12.20, 11.20, 11.70,
                          15.20, 14.20, 13.20, 12.20, 12.75]})

# the strings include seconds, so the format needs %S as well
df['date'] = pd.to_datetime(df['date'], format='%Y-%m-%d %H:%M:%S')
df['datetime'] = pd.to_datetime(df['datetime'], format='%Y-%m-%d %H:%M:%S')
df = df.set_index(['ticker', 'date'])

I am trying to figure out how to add columns to my dataframe that hold the running minimum and maximum price of each ticker, that is, the lowest and highest price between the first row of that day and the current row.

I've tried various .groupby and .loc statements, but can't seem to get the right one.

Any help would be greatly appreciated!

Upvotes: 1

Views: 99

Answers (2)

gosuto

Reputation: 5741

You are looking for .expanding() in combination with .min() and .max(), e.g.:

df.groupby('ticker')[['low', 'high', 'close']].expanding().min()
                                    low   high  close
ticker ticker date                                   
A      A      2020-05-01 09:30:00  10.0  10.25   10.2
              2020-05-01 09:31:00  10.0  10.25   10.2
              2020-05-01 09:32:00  10.0  10.25   10.2
              2020-05-01 09:33:00  10.0  10.25   10.2
              2020-05-01 09:34:00  10.0  10.25   10.2
B      B      2020-05-01 09:30:00  15.0  15.25   15.2
              2020-05-01 09:31:00  14.0  14.25   14.2
              2020-05-01 09:32:00  13.0  13.25   13.2
              2020-05-01 09:33:00  12.0  12.25   12.2
              2020-05-01 09:34:00  12.0  12.25   12.2

It works like .rolling(), except that the window is not of fixed size: it starts at the first row of each group and grows by one row at every step.
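To get these values back into the original dataframe as columns, something along these lines should work. This is only a sketch on a shortened version of the question's data: the column names min_so_far and max_so_far are made up for illustration, and it assumes all rows fall on a single day and are already sorted by time within each ticker.

```python
import pandas as pd

# Shortened sample with the same (ticker, date) MultiIndex as in the question
df = pd.DataFrame({
    'ticker': ['A', 'A', 'A', 'B', 'B', 'B'],
    'date': pd.to_datetime(['2020-05-01 09:30:00',
                            '2020-05-01 09:31:00',
                            '2020-05-01 09:32:00'] * 2),
    'low': [10.00, 11.00, 9.50, 15.00, 14.00, 14.50],
    'high': [10.25, 11.25, 9.75, 15.25, 14.25, 14.75],
}).set_index(['ticker', 'date'])

grouped = df.groupby(level='ticker')

# The expanding result carries the group key as an extra leading index level;
# dropping it restores the (ticker, date) index so the assignment aligns.
df['min_so_far'] = grouped['low'].expanding().min().droplevel(0)  # hypothetical column name
df['max_so_far'] = grouped['high'].expanding().max().droplevel(0)  # hypothetical column name

print(df)
```

If the data spans several days, the grouping would also need a per-day key, otherwise the window keeps growing across days.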

Upvotes: 2

coco18

Reputation: 1085

I am not sure what exactly you want, but I think the following code will work:

df[['date','low']].loc[df['ticker']=='A'].min()

Output:

date    2020-05-01 09:30:00
low                      10

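You don't need to set the index to the ticker and date. Note that the code above only works while ticker and date are still ordinary columns, i.e. if the set_index call is skipped.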
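If the goal is a running value per row rather than one overall minimum, the flat-column layout can be used in roughly this way. This is a rough sketch, not a tested solution for the full data: it uses groupby with cummin/cummax (a different technique from the single .min() call above), the columns min_so_far and max_so_far are invented names, and the rows are assumed to be sorted by time within each ticker. The sample data here is made up to include a second day.

```python
import pandas as pd

# Flat dataframe: ticker and datetime stay ordinary columns (no set_index)
df = pd.DataFrame({
    'ticker': ['A', 'A', 'A', 'B', 'B', 'B'],
    'datetime': pd.to_datetime(['2020-05-01 09:30:00', '2020-05-01 09:31:00',
                                '2020-05-04 09:30:00'] * 2),
    'low': [10.00, 9.00, 12.00, 15.00, 16.00, 13.00],
    'high': [10.25, 9.25, 12.25, 15.25, 16.25, 13.25],
})

# Calendar day of each row, so the running values restart every day
day = df['datetime'].dt.normalize()
grouped = df.groupby(['ticker', day])

# cummin/cummax keep the original row index, so they can be assigned directly
df['min_so_far'] = grouped['low'].cummin()   # hypothetical column name
df['max_so_far'] = grouped['high'].cummax()  # hypothetical column name

print(df)
```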

Upvotes: 0
