Reputation: 29
I have an excel spreadsheet with a column called "Description" that I would like to itereate through. In this column are items that I use as criteria to categorize each item. For example, if the description column says "Interest on Investment" I would like the code to see that in the description column and return "Investment Income" in another column called "Category." I have tried this and can not seem to get it to work. You can see below that I have tried this different ways and I think that I am missing something simple such as classifying a string even though I have attempted this as well. If there are any questions please let me know or if more info is needed as I am just learning.
import pandas as pd
GLDetail = pd.read_excel("GL_DetailFY19.xlsx")
GLDetail["Category"] = ["Interest Income" if desc == "Income on Investment" else cat
for desc, cat in zip(GLDetail["Description"], GLDetail["Category"])]
GLDetail.to_excel(r'C:\Users\Gregg.Sawdy\Desktop\newxlrd\Test.xlsx', index = False)
print(GLDetail)
This is the error Traceback (most recent call last): File "C:\Users\Gregg.Sawdy\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pandas\core\indexes\base.py", line 2891, in get_loc return self._engine.get_loc(casted_key) File "pandas_libs\index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc File "pandas_libs\index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc File "pandas_libs\hashtable_class_helper.pxi", line 1675, in pandas._libs.hashtable.PyObjectHashTable.get_item File "pandas_libs\hashtable_class_helper.pxi", line 1683, in pandas._libs.hashtable.PyObjectHashTable.get_item KeyError: 'Category'
The above exception was the direct cause of the following exception:
Traceback (most recent call last): File "C:\Users\Gregg.Sawdy\Desktop\GLCatogories.py", line 6, in for desc, cat in zip(GLDetail["Description"], GLDetail["Category"])] File "C:\Users\Gregg.Sawdy\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pandas\core\frame.py", line 2902, in getitem indexer = self.columns.get_loc(key) File "C:\Users\Gregg.Sawdy\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pandas\core\indexes\base.py", line 2893, in get_loc raise KeyError(key) from err KeyError: 'Category'
Upvotes: 1
Views: 170
Reputation: 167
Your problem here is that you're using the equality operator ==
instead of the assignment operator =
.
What you want is this:
for _, row in GLDetail.iterrows():
if row["Description"] == "Income on Investment":
row["Category"] = "Interest Income"
Notice that I made a couple other changes to your code as well, to make it cleaner. I changed ind
to _
, which means "this variable is unimportant and will not be used". Second, I got rid of the line GLDetail.loc[ind, "Category"];
, which does nothing. Third, I got rid of the else: pass
. I also corrected the spelling on "income".
Although a more python method would be to use a list comprehension. In this case:
GLDetail["Category"] = ["Interest Income" if desc == "Income on Investment" else cat
for desc, cat in zip(GLDetail["Description"], GLDetail["Category"])]
Upvotes: 1