Roman Rdgz
Roman Rdgz

Reputation: 13244

Python Pandas sum a constant value in Columns If date between 2 dates

Let's assume a dataframe using datetimes as index, where we have a column named 'Score', initialy set to 10:

            score
2016-01-01  10
2016-01-02  10
2016-01-03  10
2016-01-04  10
2016-01-05  10
2016-01-06  10
2016-01-07  10
2016-01-08  10

I want to substract a fixed value (let's say 1) from the score, but only when the index is between certain dates (for example between the 3rd and the 6th):

            score
2016-01-01  10
2016-01-02  10
2016-01-03  9
2016-01-04  9
2016-01-05  9
2016-01-06  9
2016-01-07  10
2016-01-08  10

Since my real dataframe is big, and I will be doing this for different dateranges and different fixed values N for each one of them, I'd like to achieve this without requiring to create a new column set to -N for each case.

Something like numpy's where function, but for a certain range, and allowing me to sum/substract to current value if the condition is met, and do nothing otherwise. Is there something like that?

Upvotes: 2

Views: 1913

Answers (3)

Nathan Furnal
Nathan Furnal

Reputation: 2410

I would do something like that using query :

import pandas as pd

df = pd.DataFrame({"score":pd.np.random.randint(1,10,100)}, 
    index=pd.date_range(start="2018-01-01", periods=100))

start = "2018-01-05"
stop = "2018-04-08"

df.query('@start <= index <= @stop ') - 1  

Edit : note that something using eval which goes to boolean, can be used but in a different manner because pandas where acts on the False values.

df.where(~df.eval('@start <= index <=  @stop '), 
         df['score'] - 1, axis=0, inplace=True)

See how I inverted the comparison operators (with ~), in order to get what I wanted. It's efficient but not really clear. Of course, you can also use pd.np.where and all is good in the world.

Upvotes: 2

manwithfewneeds
manwithfewneeds

Reputation: 1167

Assuming dates are datetime dtype:

#if date is own column:    
df.loc[df['date'].dt.day.between(3,6), 'score'] = df['score'] - 1

#if date is index:    
df.loc[df.index.day.isin(range(3,7)), 'score'] = df['score'] - 1

Upvotes: 2

Code Different
Code Different

Reputation: 93151

Use index slicing:

df.loc['2016-01-03':'2016-01-06', 'score'] -= 1

Upvotes: 5

Related Questions