gsawdy
gsawdy

Reputation: 29

Python for loop based on criteria in one column return result in another column

I have an excel spreadsheet with a column called "Description" that I would like to itereate through. In this column are items that I use as criteria to categorize each item. For example, if the description column says "Interest on Investment" I would like the code to see that in the description column and return "Investment Income" in another column called "Category." I have tried this and can not seem to get it to work. You can see below that I have tried this different ways and I think that I am missing something simple such as classifying a string even though I have attempted this as well. If there are any questions please let me know or if more info is needed as I am just learning.

import pandas as pd

GLDetail = pd.read_excel("GL_DetailFY19.xlsx")

GLDetail["Category"] = ["Interest Income" if desc == "Income on Investment" else cat
                        for desc, cat in zip(GLDetail["Description"], GLDetail["Category"])]

GLDetail.to_excel(r'C:\Users\Gregg.Sawdy\Desktop\newxlrd\Test.xlsx', index = False)

print(GLDetail)

This is the error Traceback (most recent call last): File "C:\Users\Gregg.Sawdy\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pandas\core\indexes\base.py", line 2891, in get_loc return self._engine.get_loc(casted_key) File "pandas_libs\index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc File "pandas_libs\index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc File "pandas_libs\hashtable_class_helper.pxi", line 1675, in pandas._libs.hashtable.PyObjectHashTable.get_item File "pandas_libs\hashtable_class_helper.pxi", line 1683, in pandas._libs.hashtable.PyObjectHashTable.get_item KeyError: 'Category'

The above exception was the direct cause of the following exception:

Traceback (most recent call last): File "C:\Users\Gregg.Sawdy\Desktop\GLCatogories.py", line 6, in for desc, cat in zip(GLDetail["Description"], GLDetail["Category"])] File "C:\Users\Gregg.Sawdy\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pandas\core\frame.py", line 2902, in getitem indexer = self.columns.get_loc(key) File "C:\Users\Gregg.Sawdy\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pandas\core\indexes\base.py", line 2893, in get_loc raise KeyError(key) from err KeyError: 'Category'

Upvotes: 1

Views: 170

Answers (1)

Conor
Conor

Reputation: 167

Your problem here is that you're using the equality operator == instead of the assignment operator =.

What you want is this:

for _, row in GLDetail.iterrows():
    if row["Description"] == "Income on Investment":
        row["Category"] = "Interest Income"

Notice that I made a couple other changes to your code as well, to make it cleaner. I changed ind to _, which means "this variable is unimportant and will not be used". Second, I got rid of the line GLDetail.loc[ind, "Category"];, which does nothing. Third, I got rid of the else: pass. I also corrected the spelling on "income".

Although a more python method would be to use a list comprehension. In this case:

GLDetail["Category"] = ["Interest Income" if desc == "Income on Investment" else cat
                        for desc, cat in zip(GLDetail["Description"], GLDetail["Category"])]

Upvotes: 1

Related Questions