armara

Reputation: 557

Create mask based on element inside Pandas Series

I have a dataframe with around 250k rows. Here's code to create a small version of it:

import pandas as pd

some_dict = {'month end': {0: pd.Timestamp('2020-10-31 00:00:00'), 1: pd.Timestamp('2020-10-31 00:00:00')}, 'end date': {0: pd.Timestamp('2021-02-21 00:00:00'), 1: pd.Timestamp('2021-10-20 00:00:00')}, 'value': {0: 36.15, 1: 40.10}}
df = pd.DataFrame(some_dict)

Previously, I used a mask to get value where end date falls between month end + 2 days and month end + 3 days, like so:

mask = (df['end date'] > (df['month end'] + pd.Timedelta(2, 'days'))) & (df['end date'] <= (df['month end'] + pd.Timedelta(3, 'days')))
df['test'] = df.loc[mask, 'value']

Right now I'm trying to create a mask like the one below:

mask = (df['End Date'].day > (df['month end'] + pd.Timedelta(2, 'days')).day) & (df['End Date'].day <= (df['month end'] + pd.Timedelta(3, 'days')).day) & (df['End Date'] >= (df['month end'] + pd.Timedelta(3, 'days')))
df['Greater than 2 days up to 3 days'] = df.loc[mask, 'value']

which should be True for the row with value 40.10 and False for the row with value 36.15. But both df['End Date'].day and (df['month end'] + pd.Timedelta(2, 'days')).day produce an error saying 'Series' object has no attribute 'day'. Is it possible to create a mask that uses an attribute of the objects stored inside the Series? In my case the values are pd.Timestamp objects, and I would like to use pd.Timestamp.day.

The mask above is for one column. If this works smoothly, I'll do the same for other columns, such as Greater than 4 days up to 5 days, Greater than 2 weeks up to 3 weeks, Greater than 2 months up to 3 months, etc.

EDIT: I tried creating extra columns holding the month end days, like so:

df['month end day'] = df.loc[0, 'month end'].day

and then creating the mask with df['month end day'] instead of df['month end'].day. This won't work for the end date column, because its value differs from row to row, and writing

df['end date day'] = df.loc[:, 'end date'].day

gives the same error: 'Series' object has no attribute 'day'.

Upvotes: 0

Views: 524

Answers (1)

Just as you need the .str accessor to use string methods on a Series, you need the .dt accessor to use the datetime-like properties of its values. When you pull a single Timestamp out of a Series, you don't need .dt, because you are working with the Timestamp directly.
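To illustrate the difference, here is a minimal sketch. The Series s is a made-up example (its two dates are taken from the question's end date column), not something from the original post.

```python
import pandas as pd

# Hypothetical Series holding the two end dates from the question
s = pd.Series(pd.to_datetime(["2021-02-21", "2021-10-20"]))

# Series level: datetime properties are reached through the .dt accessor
days = s.dt.day

# Scalar level: a single element is a pd.Timestamp, so .day works directly
first_day = s.loc[0].day

# The Series itself has no .day attribute, which is what the error reports
series_has_day = hasattr(s, "day")
```

Here days is a Series of day-of-month integers, while first_day is a plain integer taken from one Timestamp.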

In short, use df['end date'].dt.day and you should be fine. The same applies to computed Series, for example (df['month end'] + pd.Timedelta(2, 'days')).dt.day.
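As a rough sketch, the mask from the question could be rewritten with .dt as shown below. It assumes the lowercase column name 'end date' from the sample frame (the question's mask uses 'End Date', which does not exist in that frame), and the helper names lower, upper and end_day are introduced here only for readability.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({
    "month end": pd.to_datetime(["2020-10-31", "2020-10-31"]),
    "end date": pd.to_datetime(["2021-02-21", "2021-10-20"]),
    "value": [36.15, 40.10],
})

# Bounds are still datetime Series, so .dt is needed to read their day
lower = df["month end"] + pd.Timedelta(2, "days")
upper = df["month end"] + pd.Timedelta(3, "days")

# Day of month for every row of the end date column
end_day = df["end date"].dt.day

# Same three conditions as in the question, with .dt.day instead of .day
mask = (
    (end_day > lower.dt.day)
    & (end_day <= upper.dt.day)
    & (df["end date"] >= upper)
)

# Rows where the mask is False receive NaN in the new column
df["Greater than 2 days up to 3 days"] = df.loc[mask, "value"]
```

This only removes the AttributeError. Whether comparing days of the month expresses the intended condition is a separate question: with the sample data, the bounds fall on days 2 and 3 of November while the end dates fall on days 21 and 20, so the mask is False for both rows.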

Upvotes: 1
