Reputation: 39
I am using Python to play around with a CSV file. I am trying to create a loop that will return values associated with the strings in the first column. Specifically, for every row that contains the string "string" I would like the values of column 2 to be summed. I have many rows of data that contain the string "string", and I would like every value of the second column associated with the string "string" summed to give me one single number.
Thank you for your help in advance.
Column1 Column2 Column 3
9 NaN 40 101
12 String 50 102
Upvotes: 0
Views: 476
Reputation: 126
If you are using Python, this might help you.
I assume you are using Excel to store the values, so what you need to do is:
First get the openpyxl package pip install openpyxl
install into your python
import json
from openpyxl import load_workbook
workbook = load_workbook(filename='C:/Users/john/Desktop/fjdc.xlsx')
sheet = workbook["sheet1"]
sum = 0
# Using the values_only because you want to return the cells' values
# whatever the min and max col you want
for row in sheet.iter_rows(min_row=0,
min_col=0,
max_col=3,
values_only=True):
if(row[0]=="Full_Shadow"):
sum+=float(row[1])
# Using json here to be able to format the output for displaying later
print(json.dumps(sum))
Upvotes: 0
Reputation: 26211
I highly encourage you to look into pandas
.
Its pd.read_csv()
is second to none (IMO) in terms of speed and features. And then, you get pretty awesome analytics capabilities.
Anyway, for your question, there are many ways you can approach that. Here is one:
import pandas as pd
df = pd.read_csv('myfile.csv')
df.loc[df['Shadow'] == 'Full Shadow', 'February'].sum()
# gives: 35.37
Or, to give you a taste of cool things you can do in one line of pandas:
df.groupby('Shadow').sum()
Gives:
February March
Shadow
Full Shadow 35.37 41.05
No Record 26.60 39.97
Upvotes: 1