yessir1324

Reputation: 43

What data structure should I organize my data in?

Beginner programmer here. I've been given a CSV file. One column contains dates (stored as strings), and another column contains a string that is either 'Absent' or 'Present'.

What I'm trying to accomplish is to calculate, for each date, the percentage of kids who showed up.

For example, the end result might be a list of lists, each holding a date and the percentage of students who attended on that date, like so:

Attendance = [['08/22/2016', 89.013], ['08/26/2016', 84.33]]

The only problem is I don't know how to get to that point.

Could someone show me how I'd get from point A to point B?

Edit: for this example let's say

import csv

# csvFile holds the path to the csv file
with open(csvFile, 'r', newline='') as file_o:
    csvF = csv.reader(file_o)
    for line in csvF:
        line[0]  # contains date
        line[1]  # contains 'Absent' or 'Present'

Upvotes: 0

Views: 73

Answers (1)

tdelaney

Reputation: 77347

A dict seems like the easiest approach. Use it to record a list of present/absent values (as booleans) for each date, then divide the number of 'present' values by the length of the list to get the percentage. Since you want only certain dates, I've initialized the tracking dictionary with those dates and just ignore the others.

(note: updated to a working example)

import csv

# write a test file (the trailing blank lines exercise the "if row" check below)
with open('mytest.csv', 'w') as test_file:
    test_file.write("""08/22/2016,Present,Fiona
08/22/2016,Absent,Ralph
08/23/2016,Present,Fiona
08/23/2016,Absent,Ralph
08/24/2016,Present,Fiona
08/24/2016,Absent,Ralph
08/25/2016,Present,Fiona
08/25/2016,Absent,Ralph



""")

# initialize tracker with wanted dates.
wanted_dates = ['08/22/2016', '08/25/2016', '08/30/2016']
tracker = {wanted:[] for wanted in wanted_dates}

with open('mytest.csv', newline='') as fp:
    reader = csv.reader(fp)
    for row in reader:
        if row:
            date = row[0]
            # only add wanted dates
            if date in tracker:
                present = row[1].lower()
                tracker[date].append(present == 'present')

# create final report. iterate over a copy of tracker's items because we
# will change tracker during enumeration.
for date, present_list in list(tracker.items()):
    if not present_list:
        # no data for this date, so report 0 percent
        present_list = [0]
    tracker[date] = float(sum(present_list))/len(present_list) * 100

for date, percent in sorted(tracker.items()):
    print('{} {:2.2f}'.format(date, percent))
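
If you want every date in the file rather than a fixed set, and want the result in the list-of-lists shape from the question, the same dict idea can be sketched roughly as below. The function name attendance_by_date and the sample data are made up for illustration, and this assumes the status is always in the second column. It keeps two running counts per date instead of a full list of booleans.

```python
import csv
import io
from collections import defaultdict


def attendance_by_date(rows):
    """Return [[date, percent_present], ...] for every date seen in rows."""
    # counts[date] = [number present, total records]
    counts = defaultdict(lambda: [0, 0])
    for row in rows:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        date, status = row[0], row[1]
        # a bool adds as 0 or 1, so this counts the 'present' rows
        counts[date][0] += status.strip().lower() == 'present'
        counts[date][1] += 1
    return [[date, 100.0 * present / total]
            for date, (present, total) in sorted(counts.items())]


# Hypothetical sample data standing in for the real csv file.
sample = io.StringIO(
    "08/22/2016,Present\n"
    "08/22/2016,Absent\n"
    "08/22/2016,Present\n"
    "08/22/2016,Present\n"
    "08/26/2016,Absent\n"
    "08/26/2016,Present\n"
)
attendance = attendance_by_date(csv.reader(sample))
print(attendance)  # [['08/22/2016', 75.0], ['08/26/2016', 50.0]]
```

One caveat: sorting MM/DD/YYYY strings only gives chronological order while all dates fall in the same year. If the file spans several years, parsing the dates with datetime.strptime before sorting would be safer.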

Upvotes: 1
