Chong
Chong

Reputation: 21

Python: Getting values of one column based on another column name

I am new to python and trying to read a csv file and generate a sales report. For example, I have a dataset as per below

Category  | State     | Sales| Profits|
Clothes   | California| 2389 | 608
Stationery| Georgia   | 687  | 54
Gadgets   | Washington| 324  | 90

How can I get the sum of the profit based on the state and category without using pandas? Meaning I need to sum the values of "Sales" and "Profit" when category is "Clothes".

I am using the code below currently, which requires a lot manual effort.

with open(path,"r") as store_file:
    reader = csv.DictReader(superstore_file)
    
    total_sales_clothes = 0
    total_sales_stationery = 0
    total_sales_gadgets = 0
    
    for row in reader:
        category = row.get("Category")
        if category=="Clothes":
            sales_clothes = float(row.get("Sales"))
            total_sales_clothes += sales_clothes
        elif category=="Stationery":
            sales_stationery = float(row.get("Sales"))
            total_sales_stationery += sales_stationery
        elif category=="Gadgets":
            sales_office_gadgets = float(row.get("Sales"))
            total_sales_gadgets += sales_gadgets 
    print("Total Sales for Clothes is: {0}".format(total_sales_clothes))
    print("Total Sales for Stationery is {0}".format(total_sales_stationery))
    print("Total Sales for Gadgets is {0}".format(total_sales_gadgets))

Upvotes: 2

Views: 254

Answers (1)

edison16029
edison16029

Reputation: 346

You can use the python dict. This way, you don't have to hardcode the categories The python dictionary is a key-value data structure which is extremely useful. In your case, key would be the categories and value would be the total sales. Eg.

{ "Clothes" : 2389, "Stationery" : 0, "Gadgets" : 0, }

Edit : Note that you should check if the category exists. If it does, just add the sales value, else just assign the value.

with open(path,"r") as store_file:

reader = csv.DictReader(superstore_file)

total_sales = {}

for row in reader:
    category = row.get("Category")
    if category in total_sales:
        total_sales[category] += float(row.get(category))
    else:
        total_sales[category] = float(row.get(category))

print("Total Sales for Clothes is: {0}".format(total_sales['Clothes']))

If you want to traverse through the dict and print sales for all the categories use

for key in total_sales:
    print("Total Sales for " + key + " is : " + total_sales[key])

And, though you mentioned you don't want to use Pandas, I would suggest you to check its usage if you are going to work in these types of CSV Dataset for long period. Once you start, you ll find how easy it makes your job. You will save more time than what you spend learning Pandas.

Upvotes: 1

Related Questions