LeeHarveyOswald
LeeHarveyOswald

Reputation: 39

Sum Values Associated with a String

I am using Python to play around with a CSV file. I am trying to create a loop that will return values associated with the strings in the first column. Specifically, for every row that contains the string "string" I would like the values of column 2 to be summed. I have many rows of data that contain the string "string", and I would like every value of the second column associated with the string "string" summed to give me one single number.

Thank you for your help in advance.

             Column1                         Column2                     Column 3
9           NaN                        40                      101

12        String                        50                      102

Upvotes: 0

Views: 476

Answers (2)

JohntheTyro
JohntheTyro

Reputation: 126

If you are using Python, this might help you. I assume you are using Excel to store the values, so what you need to do is: First get the openpyxl package pip install openpyxl install into your python

import json
from openpyxl import load_workbook

workbook = load_workbook(filename='C:/Users/john/Desktop/fjdc.xlsx')
sheet = workbook["sheet1"]

sum = 0

# Using the values_only because you want to return the cells' values
# whatever the min and max col you want
for row in sheet.iter_rows(min_row=0,
                           min_col=0,
                           max_col=3,
                           values_only=True):
 if(row[0]=="Full_Shadow"):
   sum+=float(row[1])

# Using json here to be able to format the output for displaying later
print(json.dumps(sum))

Upvotes: 0

Pierre D
Pierre D

Reputation: 26211

I highly encourage you to look into pandas.

Its pd.read_csv() is second to none (IMO) in terms of speed and features. And then, you get pretty awesome analytics capabilities.

Anyway, for your question, there are many ways you can approach that. Here is one:

import pandas as pd

df = pd.read_csv('myfile.csv')
df.loc[df['Shadow'] == 'Full Shadow', 'February'].sum()
# gives: 35.37

Or, to give you a taste of cool things you can do in one line of pandas:

df.groupby('Shadow').sum()

Gives:

             February  March
Shadow                      
Full Shadow     35.37  41.05
No Record       26.60  39.97

Upvotes: 1

Related Questions