Assign first value in the day to the rest of the rows for that day using Pandas

Question

I have a pandas DataFrame containing intraday data for 2 stocks. The index is a time series sampled by minute (e.g. 1/1/2017 9:30, 1/1/2017 9:31, 1/1/2017 9:32, ...). There are only two columns, "Price A" and "Price B", and 52000 rows in total. I need to create a new column that stores the 9:30 am value for every day. For example, if the 9:30 am "Price A" on 1/1/2017 is 150, I need to store that value in a new column called "Open A" for every row on that same day.

Sample input:

                     Price A  Price B
date                                 
2017-01-01 09:30:00      150        1
2017-01-01 09:31:00      153        2
2017-01-01 09:31:00      149        3
2017-01-01 09:31:00      151        4
2017-02-01 09:30:00      145        1
2017-02-01 09:31:00      139        2
2017-02-01 09:31:00      142        3
2017-02-01 09:31:00      149        4

I tried to simply use:

for ind in df.index: df['Open A'][ind] = 2

just as a test, but this seems to take forever. I also read "How to iterate over rows in a DataFrame in Pandas?" but it doesn't seem to help. Does anybody have a suggestion? Thanks

cs95 · Accepted Answer

If needed, set your index to datetime -

df.index = pd.to_datetime(df.index, errors='coerce')

df

                     Price A  Price B
date                                 
2017-01-01 09:30:00      150        1
2017-01-01 09:31:00      153        2
2017-01-01 09:31:00      149        3
2017-01-01 09:31:00      151        4
2017-02-01 09:30:00      145        1
2017-02-01 09:31:00      139        2
2017-02-01 09:31:00      142        3
2017-02-01 09:31:00      149        4
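
Note that errors='coerce' replaces any label that cannot be parsed with NaT instead of raising, so it can be worth counting those before grouping by day. A minimal sketch on made-up data (the malformed label and the variable n_bad are hypothetical, only there for illustration):

```python
import pandas as pd

# Hypothetical frame whose index holds strings, one of them malformed.
df = pd.DataFrame(
    {"Price A": [150, 153, 149]},
    index=["2017-01-01 09:30:00", "not a timestamp", "2017-01-01 09:32:00"],
)

# Unparseable labels become NaT instead of raising an error.
df.index = pd.to_datetime(df.index, errors="coerce")

# Count the labels that failed to parse.
n_bad = int(df.index.isna().sum())
print(n_bad)

# Keep only the rows with a valid timestamp.
df = df[df.index.notna()]
```

If n_bad is not zero, it is probably better to look at the raw labels than to drop the rows silently.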

An assumption here is that each day's recordings start at 9:30, so the first row of every day is that day's opening value. That makes our job really easy.
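
That assumption can be checked before relying on it. A rough sketch, using made-up data in which the second day starts one minute late (the names first_stamp and late_days are hypothetical):

```python
from datetime import time

import pandas as pd

# Made-up data: the second day has no 09:30 row.
idx = pd.to_datetime([
    "2017-01-01 09:30:00",
    "2017-01-01 09:31:00",
    "2017-01-02 09:31:00",
    "2017-01-02 09:32:00",
])
df = pd.DataFrame({"Price A": [150, 153, 139, 142]}, index=idx)

# Earliest timestamp recorded on each calendar day.
first_stamp = df.index.to_series().groupby(df.index.normalize()).min()

# Days whose earliest row is not stamped 09:30.
late_days = first_stamp[first_stamp.dt.time != time(9, 30)]
print(late_days)
```

An empty late_days means every day in the data does start at 9:30.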

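Use groupby with a pd.Grouper (to bucket the rows by calendar day) + transform + first (to broadcast each day's first value to all of that day's rows) -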

df['Open A'] = df.groupby(pd.Grouper(freq='1D'))['Price A'].transform('first')
df

                     Price A  Price B  Open A
date                                         
2017-01-01 09:30:00      150        1     150
2017-01-01 09:31:00      153        2     150
2017-01-01 09:31:00      149        3     150
2017-01-01 09:31:00      151        4     150
2017-02-01 09:30:00      145        1     145
2017-02-01 09:31:00      139        2     145
2017-02-01 09:31:00      142        3     145
2017-02-01 09:31:00      149        4     145
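
If the first row of a day is not guaranteed to be the 9:30 one (pre-open rows, for instance), one alternative is to pick out the 9:30 rows explicitly and map them back by date. This is only a sketch on made-up data, not part of the approach above; it assumes at most one 09:30 row per day, and the name opens is hypothetical:

```python
import pandas as pd

# Made-up data: the first day has a pre-open row at 09:29.
idx = pd.to_datetime([
    "2017-01-01 09:29:00",
    "2017-01-01 09:30:00",
    "2017-01-01 09:31:00",
    "2017-01-02 09:30:00",
    "2017-01-02 09:31:00",
])
df = pd.DataFrame({"Price A": [148, 150, 153, 145, 139]}, index=idx)

# Rows stamped exactly 09:30 (assumed: at most one per day).
opens = df.at_time("09:30")["Price A"]

# Re-key those rows by calendar day (midnight timestamps).
opens.index = opens.index.normalize()

# Look up each row's day; a day without a 09:30 row would get NaN.
df["Open A"] = opens.reindex(df.index.normalize()).to_numpy()
print(df)
```

Here the 09:29 row also gets 150, whereas transform('first') would have used 148 for the whole first day.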
