Reputation: 13244
Let's assume a dataframe using datetimes as index, where we have a column named 'Score', initialy set to 10:
score
2016-01-01 10
2016-01-02 10
2016-01-03 10
2016-01-04 10
2016-01-05 10
2016-01-06 10
2016-01-07 10
2016-01-08 10
I want to substract a fixed value (let's say 1) from the score, but only when the index is between certain dates (for example between the 3rd and the 6th):
score
2016-01-01 10
2016-01-02 10
2016-01-03 9
2016-01-04 9
2016-01-05 9
2016-01-06 9
2016-01-07 10
2016-01-08 10
Since my real dataframe is big, and I will be doing this for different dateranges and different fixed values N for each one of them, I'd like to achieve this without requiring to create a new column set to -N for each case.
Something like numpy's where
function, but for a certain range, and allowing me to sum/substract to current value if the condition is met, and do nothing otherwise. Is there something like that?
Upvotes: 2
Views: 1913
Reputation: 2410
I would do something like that using query :
import pandas as pd
df = pd.DataFrame({"score":pd.np.random.randint(1,10,100)},
index=pd.date_range(start="2018-01-01", periods=100))
start = "2018-01-05"
stop = "2018-04-08"
df.query('@start <= index <= @stop ') - 1
Edit : note that something using eval
which goes to boolean, can be used but in a different manner because pandas where
acts on the False
values.
df.where(~df.eval('@start <= index <= @stop '),
df['score'] - 1, axis=0, inplace=True)
See how I inverted the comparison operators (with ~
), in order to get what I wanted. It's efficient but not really clear. Of course, you can also use pd.np.where
and all is good in the world.
Upvotes: 2
Reputation: 1167
Assuming dates are datetime dtype:
#if date is own column:
df.loc[df['date'].dt.day.between(3,6), 'score'] = df['score'] - 1
#if date is index:
df.loc[df.index.day.isin(range(3,7)), 'score'] = df['score'] - 1
Upvotes: 2
Reputation: 93151
Use index slicing:
df.loc['2016-01-03':'2016-01-06', 'score'] -= 1
Upvotes: 5