cphill

Reputation: 5914

Python - Pandas - Set New DataFrame from SubSelect Mask Not Index Error

I am trying to build a new DataFrame from user-supplied values for the date variables below. The raw date column (Date) comes into pandas in the format 7/5/17. Following what I assume are best practices, I convert the field to datetime, which produces values in yyyy-mm-dd format: '2017-12-01', '2017-12-02', '2017-12-03', '2017-12-04', '2017-12-05', and so on. From there I try to subselect my DataFrame to the rows whose date falls within a date_range between my start and end dates, and then show only the columns named by the variables X and y. However, the subselect line raises KeyError('{mask} not in index'.format(mask=objarr[mask])). What in my code could be throwing that error? Is it due to the datetime formatting?

# date column and conversion to datetime64[ns]
dateColumn = pd.to_datetime(rawData['Date'])

# date start
dateStart = '12/1/17'

# date end
dateEnd = '2/28/18'

# date range
dateRange = pd.date_range(dateStart, dateEnd)


# dependent variable
y = 'Leads'

# independent variable(s)
X = 'Clicks'

Sub-select the X and y columns for Date rows between 12/1/17 and 2/28/18:

print(rawData[rawData[dateColumn].isin(dateRange)][X,y])

Upvotes: 0

Views: 248

Answers (1)

pomber

Reputation: 23980

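You are indexing rawData with the column itself (the dateColumn Series) instead of a column name, so pandas tries to look up every date value as a column label and raises the KeyError. It is not a datetime formatting problem. Call isin directly on dateColumn to get a boolean mask, and pass the column names as a list ([[X, y]]), because [X, y] is read as a single tuple key: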

print(rawData[dateColumn.isin(dateRange)][[X,y]])
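For anyone who wants to try this end to end, here is a minimal sketch of the same fix. The sample rows in rawData are made up for illustration (the question does not show the real data), and the explicit format string assumes the dates are month/day/two-digit-year as in the question. It also shows Series.between as one possible alternative to building a date_range, which is my suggestion rather than something from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's rawData
rawData = pd.DataFrame({
    "Date": ["11/30/17", "12/1/17", "1/15/18", "2/28/18", "3/1/18"],
    "Clicks": [10, 20, 30, 40, 50],
    "Leads": [1, 2, 3, 4, 5],
})

# Convert the text dates to datetime64[ns]; format assumed to be m/d/yy
dateColumn = pd.to_datetime(rawData["Date"], format="%m/%d/%y")

dateStart = pd.Timestamp("2017-12-01")
dateEnd = pd.Timestamp("2018-02-28")
dateRange = pd.date_range(dateStart, dateEnd)

y = "Leads"   # dependent variable
X = "Clicks"  # independent variable

# Boolean mask built from the Series itself, not rawData[dateColumn]
mask = dateColumn.isin(dateRange)

# Select the rows with the mask and the columns with a list of names
subset = rawData.loc[mask, [X, y]]
print(subset)

# Alternative mask without materialising every day in the range
# (between includes both endpoints by default)
alt_mask = dateColumn.between(dateStart, dateEnd)
```

Using .loc with the mask and the column list in one step also avoids the chained rawData[...][...] lookup, though the chained form from the answer gives the same result here.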

Upvotes: 1
