Loai Alnouri
Loai Alnouri

Reputation: 133

Pandas: Iterating over rows, Adding and Subtracting Dates, Appending those dates to a new column depending on row value

To give a brief overview of whats going on, I am observing temperature fluctutations and have filtered data from indoor and outdoor temp in an office only where temperature fluctuates. these fluctuations only occur in the mornings and at night as during the day, the temp is controlled. I will be using an ANN to learn from these fluctuations and model how long it would take for temp to change depending on other variables like OutdoorTemp, SolarDiffuseRate, etc.

Question 1: How do I iterate by row, firstly, looking at times and adding a binary column where 0 would be mornings, and 1 would be the night-time.

Question 2: for each day, there will be a different length of series of rows for mornings and evenings depending on how long it takes the temperature to change between 22 degrees and 17 degrees. How do I add a column for each day, and each morning and evening, which states the time it took for the temp to get from X to Y.

Basically adding or subtracting time to get the difference, then appending per morning or night.

                     OfficeTemp  OutdoorTemp  SolarDiffuseRate 
DateTime                                                         
2006-01-01 07:15:00   19.915275       0.8125               0.0   
2006-01-01 07:30:00   20.463506       0.8125               0.0   
2006-01-01 07:45:00   20.885112       0.8125               0.0   
2006-01-01 20:15:00   19.985398       8.3000               0.0   
2006-01-01 20:30:00   19.157857       8.3000               0.0   
...                         ...          ...               ...   
2006-06-30 22:00:00   18.056205      22.0125               0.0   
2006-06-30 22:15:00   17.993072      19.9875               0.0   
2006-06-30 22:30:00   17.929643      19.9875               0.0   
2006-06-30 22:45:00   17.867148      19.9875               0.0   
2006-06-30 23:00:00   17.804429      19.9875               0.0   

Upvotes: 0

Views: 58

Answers (1)

Ted
Ted

Reputation: 1233

df = pd.DataFrame(index=pd.date_range('2006-01-01', '2006-06-30', freq='15min'))
df['OfficeTemp'] = np.random.normal(loc=20, scale=5, size=df.shape[0])
df['OutdoorTemp'] = np.random.normal(loc=12, scale=5, size=df.shape[0])
df['SolarDiffuseRate'] = 0.0

Question 1:

df['PartofDay'] = df.index.hour.map(lambda x: 0 if x < 12 else 1)

For question 2, a tolerance would need to be defined (the temperature is never going to be exactly 17 or 22 degrees).

import numpy as np

def temp_change_duration(group):
   tol=0.3
   first_time = group.index[np.isclose(group['OfficeTemp'], 17, atol=tol)][0]
   second_time = group.index[np.isclose(group['OfficeTemp'], 22, atol=tol)][0]
   return(abs(second_time-first_time))

Then apply this function to our df:

df.groupby([df.index.day, 'PartofDay']).apply(temp_change_duration)

This will get you most of the way there, but will give funny answers using the normally distributed synthetic data I've generated. See if you can adapt temp_change_duration to work with your data

Upvotes: 1

Related Questions