Creating a new variable and applying a conditional value based on a date range with a pandas DataFrame

Question

New to Python and coding in general here so this should be pretty basic for most of you.

I basically created this dataframe with a Datetime index.

Here's the dataframe

df = pd.date_range(start='2018-01-01', end='2019-12-31', freq='D')

I would now like to add a new variable to my df called "vacation" with a value of 1 if the date is between 2018-06-24 and 2018-08-24 and value of 0 if it's not between those dates. How can I go about doing this? I've created a variable with a range of vacation but I'm not sure how to put these two together along with creating a new column for "vacation" in my dataframe.

vacation = pd.date_range(start = '2018-06-24', end='2018-08-24')

Thanks in advance.

Gautham Pughazhendhi · Accepted Answer

First, pd.date_range(start='2018-01-01', end='2019-12-31', freq='D') does not create a DataFrame; it creates a DatetimeIndex. You can turn it into a DataFrame by using it either as the index or as a separate column.
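To see the distinction concretely, here is a minimal check. The variable name `idx` is only illustrative and does not come from the question.

```python
import pandas as pd

# pd.date_range returns an index object, not a table.
idx = pd.date_range(start='2018-01-01', end='2019-12-31', freq='D')

print(type(idx).__name__)             # DatetimeIndex
print(isinstance(idx, pd.DataFrame))  # False
print(len(idx))                       # 730 daily timestamps (2018 and 2019)
```

Because a DatetimeIndex has no columns, assigning something like idx['vacation'] cannot work, which is why it has to be wrapped in a DataFrame first.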

# Having it as an index

import numpy as np
import pandas as pd

datetime_index = pd.date_range(start='2018-01-01', end='2019-12-31', freq='D')
df = pd.DataFrame({}, index=datetime_index)
# Using numpy.where() to create the Vacation column (1 inside the range, both ends inclusive; 0 otherwise)
df['Vacation'] = np.where((df.index >= '2018-06-24') & (df.index <= '2018-08-24'), 1, 0)

Or

# Having it as a column

import numpy as np
import pandas as pd

datetime_index = pd.date_range(start='2018-01-01', end='2019-12-31', freq='D')
df = pd.DataFrame({'Date': datetime_index})
# Using numpy.where() to create the Vacation column (1 inside the range, both ends inclusive; 0 otherwise)
df['Vacation'] = np.where((df['Date'] >= '2018-06-24') & (df['Date'] <= '2018-08-24'), 1, 0)

Note: Displaying only the first five rows of the dataframe df.
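Since the question already builds a `vacation` date range, another option is to reuse it directly rather than spelling out the two comparisons. The following is a sketch of that alternative, not the approach from the answer above; the names `dates` and `df_col` are illustrative.

```python
import pandas as pd

# Daily dates for 2018-2019, as in the question.
dates = pd.date_range(start='2018-01-01', end='2019-12-31', freq='D')
# The vacation range the asker had already created.
vacation = pd.date_range(start='2018-06-24', end='2018-08-24')

# Dates as the index: isin() returns a boolean array,
# and astype(int) converts True/False to 1/0.
df = pd.DataFrame(index=dates)
df['Vacation'] = df.index.isin(vacation).astype(int)

# Dates as a column: Series.between() includes both endpoints by default.
df_col = pd.DataFrame({'Date': dates})
df_col['Vacation'] = df_col['Date'].between('2018-06-24', '2018-08-24').astype(int)

# Quick sanity check: number of days flagged as vacation.
print(df['Vacation'].sum())
```

Both variants flag the same days as the numpy.where() version. isin() is convenient when the vacation days are not one contiguous range, for example several separate holiday periods concatenated into one index.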
