Sam

Reputation: 1062

Python script not looping correctly

I am using this Python code to look through a CSV file, which has dates in one column and values in the other. I want to record the minimum value for each year, but my code is not looping through the rows correctly. What's my stupid mistake? Cheers

import csv
refMin = 40

with open('data.csv') as csvfile:
        reader = csv.reader(csvfile, delimiter=',',quotechar='|', quoting=csv.QUOTE_ALL)
        for i in range(1968,2014):
            for row in reader:
                if str(row[0])[:4] == str(i):
                    if float(row[1]) <= refMin:
                        refMin = float(row[1])
            print 'The minimum value for ' + str(i) + ' is: ' + str(refMin)

Upvotes: 0

Views: 221

Answers (2)

Weeble

Reputation: 17910

The reader can only be iterated once. The first time around the for i in range(1968,2014) loop, you consume every item in the reader. So the second time around that loop, there are no items left.
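A minimal sketch of that exhaustion, written for Python 3. The io.StringIO object and its two rows are made-up stand-ins for data.csv, not the asker's data:

```python
import csv
import io

# Hypothetical two-row sample standing in for data.csv.
sample = io.StringIO("1968-01-01,12.5\n1969-01-01,9.0\n")
reader = csv.reader(sample)

first_pass = list(reader)   # consumes every row
second_pass = list(reader)  # the reader is already at the end

print(len(first_pass))   # 2
print(len(second_pass))  # 0
```

The inner for row in reader loop in the question behaves like second_pass for every year after 1968.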

If you want to compare every value of i against every row in the file, you could swap your loops around, so that the loop for row in reader is on the outside and only runs once, and the i loop runs once per row. Or you could rewind the file and create a new reader each time round, although that rereads the whole file for every year and will be slower.

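If you want to process the entire file in one pass, you'll need a dictionary of per-year minimums to replace the single refMin. When processing each row, either iterate through the dictionary keys, or look up the entry for the current row's year. On the other hand, if you're happy to read the file multiple times, move the line reader = csv.reader(...) inside the outer loop and rewind the file with csvfile.seek(0) before it; a new reader on an already-exhausted file object yields nothing. Either way, note that your refMin is never reset, so it carries the lowest value from earlier years into later ones.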
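The multi-pass option could look roughly like this in Python 3. The function name yearly_minimums_multipass and the in-memory sample are hypothetical, and the sketch tracks None instead of the starting value of 40 so that a year with no rows is visible:

```python
import csv
import io


def yearly_minimums_multipass(csvfile, years):
    """Rescan the file once per year and return {year: minimum or None}."""
    result = {}
    for year in years:
        csvfile.seek(0)               # rewind, otherwise the new reader is empty
        reader = csv.reader(csvfile)
        year_min = None               # reset for every year
        for row in reader:
            if row[0][:4] == str(year):
                value = float(row[1])
                if year_min is None or value < year_min:
                    year_min = value
        result[year] = year_min
    return result


# Hypothetical sample standing in for data.csv.
sample = io.StringIO("1968-01-01,12.5\n1968-06-01,7.25\n1969-01-01,9.0\n")
mins = yearly_minimums_multipass(sample, range(1968, 1971))
print(mins)
```

This reads the file once per year, so for 46 years it does 46 full scans. That is fine for a small file but wasteful for a large one.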

Here's an untested idea for doing it in one pass:

import csv
import collections

# Each year starts at the question's ceiling of 40 until a lower value is seen.
refMin = collections.defaultdict(lambda: 40)
allowed_years = set(range(1968, 2014))

with open('data.csv') as csvfile:
    reader = csv.reader(csvfile, delimiter=',', quotechar='|', quoting=csv.QUOTE_ALL)
    for row in reader:
        year = int(str(row[0])[:4])
        if year not in allowed_years:
            continue  # ignore rows outside 1968-2013
        value = float(row[1])
        if value <= refMin[year]:
            refMin[year] = value

for year in range(1968, 2014):
    print('The minimum value for ' + str(year) + ' is: ' + str(refMin[year]))

defaultdict is just like a regular dictionary except that it has a default value for keys that haven't previously been set.
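A small illustration of that behaviour, using the same default of 40. The name ref_min and the value 12.5 are just for this sketch:

```python
import collections

# The factory is called whenever a missing key is looked up.
ref_min = collections.defaultdict(lambda: 40)

print(ref_min[1968])    # 40, and the key 1968 is now stored
ref_min[1968] = 12.5
print(ref_min[1968])    # 12.5
print(sorted(ref_min))  # [1968]
```

One thing to keep in mind: merely reading a missing key inserts it, so years with no data still end up in the dictionary with the value 40.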

Upvotes: 4

njzk2

Reputation: 39397

I would refactor that to read the file only once:

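import csv
from collections import defaultdict

# Maps a four-character year string to every value recorded in that year.
refByYear = defaultdict(list)

with open('data.csv') as csvfile:
    reader = csv.reader(csvfile, delimiter=',', quotechar='|', quoting=csv.QUOTE_ALL)
    for row in reader:
        refByYear[str(row[0])[:4]].append(float(row[1]))

for year in range(1968, 2014):
    values = refByYear[str(year)]
    if values:  # min() raises ValueError on an empty list
        print('The minimum value for ' + str(year) + ' is: ' + str(min(values)))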

Here I store all values for each year, which may be useful for other purposes, or totally useless.
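As an example of those other purposes, here is a rough Python 3 sketch that derives the minimum, maximum and mean from the stored lists. The rows list is invented sample data, and values_by_year and summary are names chosen for this illustration:

```python
from collections import defaultdict

# Hypothetical (date, value) rows standing in for the parsed CSV.
rows = [("1968-01-01", "12.5"), ("1968-06-01", "7.5"), ("1969-01-01", "9.0")]

values_by_year = defaultdict(list)
for date, value in rows:
    values_by_year[date[:4]].append(float(value))

# With every value kept, any per-year statistic is one expression away.
summary = {}
for year, values in sorted(values_by_year.items()):
    summary[year] = (min(values), max(values), sum(values) / len(values))
    print(year, summary[year])
```

The trade-off is memory: every value is held until the end, whereas tracking only the running minimum needs one number per year.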

Upvotes: 0
