Reputation: 627
Is there a pandas way to do that:
predicted_sells = []
for row in df.values:
index_tms = row[0]
delta = index_tms + timedelta(hours=1)
try:
sells_to_predict = df.loc[delta]['cars_sold']
except KeyError:
new_element = None
predicted_sells.append(sells_to_predict)
df['sell_to_predict'] = predicted_sells
example explanation:
sell is the number of cars I sold at the time tms. sell_to_predict is the number of cars I sold the hour after. I want to predict that. So I want to build a new column containing at the time tms the number of cars I will sell at the time tms+1h
before my code it looks like that
tms sell
2015-11-23 15:00:00 6
2015-11-23 16:00:00 2
2015-11-23 17:00:00 10
after it looks like that
tms sell sell_to_predict
2015-11-23 15:00:00 6 2
2015-11-23 16:00:00 2 10
2015-11-23 17:00:00 10 NaN
I create a new column based on a shift of an other column, but that's not a shift in number of columns. That's a shift based on an index (here the index is a timestamp)
Here is an other example, little more complex :
before :
sell random
store hour
1 1 1 9
2 7 7
2 1 4 3
2 2 3
after :
sell random predict
store hour
1 1 1 9 7
2 7 7 NaN
2 1 4 3 2
2 2 3 NaN
Upvotes: 2
Views: 9737
Reputation: 627
the answer was to resample so I won't have any hole, and then apply the answer for this question : How do you shift Pandas DataFrame with a multiindex?
Upvotes: 2
Reputation: 9946
have you tried shift
?
e.g.
df = pd.DataFrame(list(range(4)))
df.columns = ['sold']
df['predict'] = df.sold.shift(-1)
df
sold predict
0 0 1
1 1 2
2 2 3
3 3 NaN
Upvotes: 2