Michael Qiu

Reputation: 13

Unable to get dummies for a specific variable in pandas dataframe

I am trying to create dummy variables for one of the columns in this dataset, but I get an error that I do not know how to resolve. Any clues?

The code:

import pandas as pd

df = pd.read_excel('DID dataset.xlsx', sheet_name='All2')
Location_dummy = pd.get_dummies(df['Location'], drop_first=True)

The data: https://gyazo.com/79af7378c4e06c0f36f7f43d03a65119

The error:

Location_dummy = pd.get_dummies(df['Location'], drop_first=True)
Traceback (most recent call last):

File "<ipython-input-5-f9cbe04c43a1>", line 1, in <module>
Location_dummy = pd.get_dummies(df['Location'], drop_first=True)

File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2685, in __getitem__
return self._getitem_column(key)

File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2692, in _getitem_column
return self._get_item_cache(key)

File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\generic.py", line 2486, in _get_item_cache
values = self._data.get(item)

File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\internals.py", line 4115, in get
loc = self.items.get_loc(item)

File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 3065, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))

File "pandas\_libs\index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc

File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc

File "pandas\_libs\hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item

File "pandas\_libs\hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item

KeyError: 'Location'

The same error occurs when I just type in

df['Location']

I am able to obtain dummies for my other variables, so is there something wrong with this particular column in my Excel dataset, or is it something else?

Upvotes: 1

Views: 1178

Answers (1)

MD Rijwan

Reputation: 481

Your code is fine. The problem is most likely the column name: the KeyError means no column is named exactly 'Location', and a common cause is a leading or trailing space in the Excel header. To check, print the column names:

print("Column headings:")
print(df.columns)

If the output shows a name such as 'Location ' or ' location', use that exact name (for example df['Location ']) and change your get_dummies call accordingly.
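Alternatively, you could clean the headers once so that df['Location'] works as originally written. Here is a minimal sketch of that idea. The small DataFrame is a hypothetical stand-in for the Excel sheet (the real column names and values are not known here), and it assumes the cause really is a trailing space in the header:

```python
import pandas as pd

# Hypothetical stand-in for the Excel sheet: note the trailing space in "Location ".
df = pd.DataFrame({"Location ": ["North", "South", "North"], "Sales": [1, 2, 3]})

# repr() puts quotes around each name, so stray whitespace is easy to spot.
print([repr(c) for c in df.columns])

# Strip surrounding whitespace from string headers; leave non-string headers untouched.
df = df.rename(columns=lambda c: c.strip() if isinstance(c, str) else c)

# The lookup from the question now succeeds.
Location_dummy = pd.get_dummies(df["Location"], drop_first=True)
print(Location_dummy)
```

If the header differs in case rather than spacing (for example 'location'), stripping alone will not help; in that case use the exact name shown by df.columns.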

Upvotes: 1
