Reputation: 133
To give a brief overview of whats going on, I am observing temperature fluctutations and have filtered data from indoor and outdoor temp in an office only where temperature fluctuates. these fluctuations only occur in the mornings and at night as during the day, the temp is controlled. I will be using an ANN to learn from these fluctuations and model how long it would take for temp to change depending on other variables like OutdoorTemp, SolarDiffuseRate, etc.
Question 1: How do I iterate by row, firstly, looking at times and adding a binary column where 0 would be mornings, and 1 would be the night-time.
Question 2: for each day, there will be a different length of series of rows for mornings and evenings depending on how long it takes the temperature to change between 22 degrees and 17 degrees. How do I add a column for each day, and each morning and evening, which states the time it took for the temp to get from X to Y.
Basically adding or subtracting time to get the difference, then appending per morning or night.
OfficeTemp OutdoorTemp SolarDiffuseRate
DateTime
2006-01-01 07:15:00 19.915275 0.8125 0.0
2006-01-01 07:30:00 20.463506 0.8125 0.0
2006-01-01 07:45:00 20.885112 0.8125 0.0
2006-01-01 20:15:00 19.985398 8.3000 0.0
2006-01-01 20:30:00 19.157857 8.3000 0.0
... ... ... ...
2006-06-30 22:00:00 18.056205 22.0125 0.0
2006-06-30 22:15:00 17.993072 19.9875 0.0
2006-06-30 22:30:00 17.929643 19.9875 0.0
2006-06-30 22:45:00 17.867148 19.9875 0.0
2006-06-30 23:00:00 17.804429 19.9875 0.0
Upvotes: 0
Views: 58
Reputation: 1233
df = pd.DataFrame(index=pd.date_range('2006-01-01', '2006-06-30', freq='15min'))
df['OfficeTemp'] = np.random.normal(loc=20, scale=5, size=df.shape[0])
df['OutdoorTemp'] = np.random.normal(loc=12, scale=5, size=df.shape[0])
df['SolarDiffuseRate'] = 0.0
Question 1:
df['PartofDay'] = df.index.hour.map(lambda x: 0 if x < 12 else 1)
For question 2, a tolerance would need to be defined (the temperature is never going to be exactly 17 or 22 degrees).
import numpy as np
def temp_change_duration(group):
tol=0.3
first_time = group.index[np.isclose(group['OfficeTemp'], 17, atol=tol)][0]
second_time = group.index[np.isclose(group['OfficeTemp'], 22, atol=tol)][0]
return(abs(second_time-first_time))
Then apply this function to our df
:
df.groupby([df.index.day, 'PartofDay']).apply(temp_change_duration)
This will get you most of the way there, but will give funny answers using the normally distributed synthetic data I've generated. See if you can adapt temp_change_duration
to work with your data
Upvotes: 1