Reputation: 43
Beginner programmer here. I've been given a CSV file. One column contains dates (as strings), and another column contains a string that is either 'Absent' or 'Present'.
What I'm trying to do is work out, for each date, the percentage of kids that showed up.
For example, as the end result I'd like a list of lists, each containing a date and the percentage of students that attended, like so:
Attendance = [['08/22/2016', 89.013], ['08/26/2016', 84.33]]
The only problem is I don't know how to get to that point.
Could someone show me how I'd get from point A to point B?
Edit: for this example, let's say:
import csv
file_o = open(csvFile, 'r')
csvF = csv.reader(file_o)
for line in csvF:
    line[0]  # contains date
    line[1]  # contains 'Absent' or 'Present'
Upvotes: 0
Views: 73
Reputation: 77347
A dict seems like the easiest approach. Use it to record a list of present/absent values for each date and then sum those up. Since you want only certain dates, I've initialized the tracking dictionary with those dates and just ignore the others.
(note: updated to a working example)
import csv

# write a test file
with open('mytest.csv', 'w') as fp:
    fp.write("""08/22/2016,Present,Fiona
08/22/2016,Absent,Ralph
08/23/2016,Present,Fiona
08/23/2016,Absent,Ralph
08/24/2016,Present,Fiona
08/24/2016,Absent,Ralph
08/25/2016,Present,Fiona
08/25/2016,Absent,Ralph
""")

# initialize tracker with wanted dates.
wanted_dates = ['08/22/2016', '08/25/2016', '08/30/2016']
tracker = {wanted: [] for wanted in wanted_dates}

with open('mytest.csv', newline='') as fp:
    reader = csv.reader(fp)
    for row in reader:
        if row:
            date = row[0]
            # only add wanted dates
            if date in tracker:
                present = row[1].lower()
                tracker[date].append(present == 'present')

# create final report. iterate over a copy of tracker's items because we
# replace tracker's values during enumeration.
for date, present_list in list(tracker.items()):
    if not present_list:
        # no data, so show 0
        present_list = [0]
    tracker[date] = float(sum(present_list)) / len(present_list) * 100

for date, percent in sorted(tracker.items()):
    print('{} {:2.2f}'.format(date, percent))
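If you'd rather get the list-of-lists shape from the question, covering every date in the file instead of a preselected set, something along these lines might work. This is only a sketch: the function name attendance_by_date is made up for illustration, and it assumes the date is in column 0 and the 'Present'/'Absent' string is in column 1.

```python
import csv
from collections import defaultdict


def attendance_by_date(csv_path):
    """Return [[date, percent_present], ...] for every date in the file.

    Hypothetical helper: assumes column 0 is the date string and
    column 1 is 'Present' or 'Absent' (case-insensitive).
    """
    # counts[date] = [rows marked present, total rows seen]
    counts = defaultdict(lambda: [0, 0])
    with open(csv_path, newline='') as fp:
        for row in csv.reader(fp):
            if len(row) < 2:
                continue  # skip blank or malformed rows
            date, status = row[0], row[1].strip().lower()
            counts[date][0] += status == 'present'
            counts[date][1] += 1
    # Dates are MM/DD/YYYY strings, so this plain sort is only
    # chronological when all rows fall within the same year.
    return [[date, 100.0 * present / total]
            for date, (present, total) in sorted(counts.items())]
```

Run against the mytest.csv written above, attendance_by_date('mytest.csv') should return one [date, 50.0] pair for each of the four dates, since each date has one 'Present' and one 'Absent' row.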
Upvotes: 1