Reputation: 13
I am trying to create dummy variables for one of the columns in this dataset, but I get an error that I do not know how to resolve. Any clues?
The code:
import pandas as pd
df = pd.read_excel('DID dataset.xlsx', sheet_name='All2')
Location_dummy = pd.get_dummies(df['Location'], drop_first=True)
The data: https://gyazo.com/79af7378c4e06c0f36f7f43d03a65119
The error:
Location_dummy = pd.get_dummies(df['Location'], drop_first=True)
Traceback (most recent call last):
File "<ipython-input-5-f9cbe04c43a1>", line 1, in <module>
Location_dummy = pd.get_dummies(df['Location'], drop_first=True)
File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2685, in __getitem__
return self._getitem_column(key)
File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2692, in _getitem_column
return self._get_item_cache(key)
File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\generic.py", line 2486, in _get_item_cache
values = self._data.get(item)
File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\internals.py", line 4115, in get
loc = self.items.get_loc(item)
File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 3065, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\_libs\index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Location'
The same error occurs when I just type in
df['Location']
Is there something wrong with this particular column in my Excel dataset, given that I am able to obtain dummies for my other variables, or is it something else?
Upvotes: 1
Views: 1178
Reputation: 481
Your code is fine. A KeyError means pandas found no column named exactly 'Location', so the problem is most likely the column name itself: it probably has a leading or trailing space, or different capitalization. To check, print the column names:
print("Column headings:")
print(df.columns)
Then select the column by the exact name that is printed, for example df['Location ']
or df[' location'],
and change your get_dummies call accordingly.
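If stray whitespace does turn out to be the cause, you could instead clean the headers once and keep your original code. This is only a minimal sketch: the small DataFrame below is a made-up stand-in for your Excel sheet (the trailing space in 'Location ' and the 'Sales' column are assumptions for illustration), since I cannot see the actual file.

```python
import pandas as pd

# Hypothetical stand-in for the Excel sheet; the header 'Location ' has a
# trailing space, which is assumed to be the cause of the KeyError.
df = pd.DataFrame({"Location ": ["North", "South", "East"], "Sales": [10, 20, 30]})

# Printing the names as a list shows the quotes, so stray spaces are visible.
print(df.columns.tolist())

# Strip leading and trailing whitespace from every column name.
df.columns = df.columns.str.strip()

# The original call now finds the column.
location_dummy = pd.get_dummies(df["Location"], drop_first=True)
print(location_dummy)
```

Note that this only removes whitespace. If the header differs in capitalization (for example 'location'), you still need to use the name exactly as printed or rename the column.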
Upvotes: 1