Information Technology

Reputation: 2333

How to properly iterate through an array of rows from a CSV file, after the array has been created?

Very sorry if this was answered before but I searched StackOverflow and could not find a clear solution that solved the problem.

I have a CSV file called “myFile.csv”

I open and read the CSV file, assigning each row to an array called “myArray” that is intended to be used later, for different things…

import csv

with open(r"..\dirX\myFile.csv", 'rb') as fileHandle:
    myArray = []
    for row in csv.reader(fileHandle, delimiter=','):
        myArray.append(row)

I can successfully print individual rows from the array…

print myArray[0]    # Works fine!  Prints header row.
print myArray[1]    # Works fine!  Prints first data row.

However, when I try to loop through the array created from the CSV file in order to extract each row, I get a failure. The for loop code looks as follows...

for idx, row in myArray:  # <--- This is where the error message points
    print 'Index = ' + str(idx)
    print row

The error message I get is:

Traceback (most recent call last):
  File "myScript.py", line 155, in <module>
    for idx, row in myArray:
ValueError: too many values to unpack

My Question: Exactly why does this happen and what is the best way to correct this problem?

Upvotes: 3

Views: 2811

Answers (2)

bruno desthuilliers

Reputation: 77892

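Lists don't behave differently from any other sequence with respect to iteration: you only get the items, not the indices (just as iterating over the csv.reader gave you only rows, not indices). Your loop therefore tries to unpack each row into idx and row, which fails because a row holds more than two fields.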

If you want to have both indices and items, you can use enumerate():

for index, item in enumerate(somelist):
    print("item at {} is {}".format(index, item))
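To make the cause of the error concrete, here is a minimal sketch written for Python 3. The in-memory CSV text and the name my_array are stand-ins for the asker's file and list, not part of the original code.

```python
import csv
import io

# Hypothetical stand-in for myFile.csv: a header row plus two data rows.
sample = io.StringIO("name,city,age\nAnn,Oslo,31\nBob,Lima,45\n")
my_array = list(csv.reader(sample))

# Each item is a three-field row, so unpacking it into two names fails.
try:
    for idx, row in my_array:
        pass
    error = None
except ValueError as exc:
    error = str(exc)

# enumerate() wraps each row in an (index, row) pair, which unpacks cleanly.
pairs = [(idx, row) for idx, row in enumerate(my_array)]
print(error)
print(pairs[0])
```

Note that if every row happened to have exactly two fields, the first loop would run without an error and silently bind idx and row to the two cells, which is arguably worse than the exception.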

Update:

Because it's enumerated, isn't "item" immutable? What if I want to change its data later (e.g. replace text strings)?

Iteration doesn't make anything more or less mutable. Either an object is mutable (and mutating it in the for loop body works just the same as outside a loop) or it isn't, period.

It seems you are confused by the difference between mutating and rebinding. Here's an example with a list of mutable objects:

>>> data = [dict(a=i) for i in xrange(3)]
>>> data
[{'a': 0}, {'a': 1}, {'a': 2}]
>>> for item in data:
...     item["b"] = item["a"] + 42
... 
>>> data
[{'a': 0, 'b': 42}, {'a': 1, 'b': 43}, {'a': 2, 'b': 44}]

As you can see, the items are perfectly mutable.

Now, you can't do this with a list of immutable objects, not because of the for loop (whether you use enumerate or not is irrelevant here) but because immutable objects are, well, immutable. Let's first check this outside a for loop:

>>> s = "foo 1"
>>> s.replace("1", "2")
'foo 2'
>>> s
'foo 1'

As you can see, str.replace() returns a new string and leaves the original unchanged (of course, since strings are immutable). If you want s to become "foo 2", you have to rebind s to make it point to another string:

>>> s
'foo 1'
>>> id(s)
139792880514032
>>> s = s.replace("1", "2")
>>> s
'foo 2'
>>> id(s)
139792880474080

Note that rebinding a variable does NOT affect other variables pointing to the same object:

>>> s1 = "aaa"
>>> id(s1)
139792880524584
>>> s2 = "bbb"
>>> id(s2)
139792880522104
>>> s1 = "aaa"
>>> s1
'aaa'
>>> id(s1)
139792880524584
>>> s2 = s1
>>> s2
'aaa'
>>> id(s2)
139792880524584
>>> s2 is s1
True
>>> # now let's rebind s1:    
>>> s1 = "bbb"
>>> s1
'bbb'
>>> id(s1)
139792880522104
>>> s2
'aaa'
>>> id(s2)
139792880524584
>>> s2 is s1
False
>>> 

So rebinding the iteration variable (item in the snippets above) technically works (the variable IS rebound), but it won't affect the list or whatever you are iterating over (just like rebinding s1 doesn't impact s2):

>>> data = ["aaa", "bbb", "ccc"]
>>> for item in data:
...     print "item before : {}".format(item)
...     item = 42
...     print "item after : {}".format(item)
...     print "data : {}".format(data)
... 
item before : aaa
item after : 42
data : ['aaa', 'bbb', 'ccc']
item before : bbb
item after : 42
data : ['aaa', 'bbb', 'ccc']
item before : ccc
item after : 42
data : ['aaa', 'bbb', 'ccc']

So if you have a list of strings and want to update the list in place, you have to mutate the list itself - which requires having the matching indexes too, which you get using enumerate:

>>> data = ["aaa", "bbb", "ccc"]
>>> for index, item in enumerate(data):
...     data[index] = item.upper()
... 
>>> data
['AAA', 'BBB', 'CCC']

Notice that here we are not rebinding the iteration variable but mutating the data list itself. It works just the same as without a for loop:

>>> data = ["aaa", "bbb", "ccc"]
>>> item = data[0]
>>> item
'aaa'
>>> item = "AAA"
>>> item
'AAA'
>>> data
['aaa', 'bbb', 'ccc']

versus:

>>> data = ["aaa", "bbb", "ccc"]
>>> data[0] = "AAA"
>>> data
['AAA', 'bbb', 'ccc']
>>> 
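The same distinction applies to rows read from a CSV file, where each row is itself a list of strings. The following is a small sketch (Python 3 syntax, with made-up sample rows standing in for myArray) showing that rebinding the loop variable changes nothing, while assigning into the row mutates it in place.

```python
rows = [["name", "city"], ["ann", "oslo"], ["bob", "lima"]]

# Rebinding the loop variable leaves the outer list untouched.
for row in rows:
    row = [cell.upper() for cell in row]
unchanged = [list(r) for r in rows]

# Assigning into the row mutates the list object that `rows` holds.
for row in rows:
    for col, cell in enumerate(row):
        row[col] = cell.upper()

print(unchanged)
print(rows)
```

Here the outer loop needs no index because each row is a mutable list; the inner loop uses enumerate because the cells are immutable strings that have to be replaced by position.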

Upvotes: 2

galaxyan

Reputation: 6111

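The error means the opposite of having too few values: each item holds more values than there are variable names to unpack them into.

Example:

lst = [1, 2]
a, b = lst       # OK: two names, two values

lst = [1, 2, 3]
a, b = lst       # ValueError: too many values to unpack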

So if you need idx, use enumerate; every iteration will then produce exactly two values, the index and the row:

for idx, row in enumerate(myArray): 
    print 'Index = ' + str(idx)
    print row
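If the header row should be skipped while the index still matches each row's position in the full list, one option is to slice the list and pass the start argument to enumerate. This is only a sketch in Python 3 syntax; my_array and its contents are invented sample data.

```python
my_array = [["name", "city"], ["Ann", "Oslo"], ["Bob", "Lima"]]

# Skip the header (position 0) and start counting at 1 so that
# idx is the row's position in my_array, not in the slice.
numbered = []
for idx, row in enumerate(my_array[1:], start=1):
    numbered.append((idx, row))
    assert my_array[idx] is row

print(numbered)
```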

If you want to change the elements, you can build a new list as you go:

res = []
for idx, row in enumerate(myArray):
    print 'Index = ' + str(idx)
    print row
    changed_row = row  # apply your change to the row here
    res.append(changed_row)
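When the index is not needed, the same build-a-new-list approach can be written as a list comprehension. The sketch below uses Python 3 syntax; strip_cells is a hypothetical helper standing in for whatever change you want to apply, and the sample rows are invented.

```python
def strip_cells(row):
    """Hypothetical per-row change: trim whitespace from every cell."""
    return [cell.strip() for cell in row]

my_array = [["name", "city"], [" Ann ", "Oslo "], ["Bob", " Lima"]]

# Keep the header as is and build new rows for everything after it.
header, data_rows = my_array[0], my_array[1:]
res = [header] + [strip_cells(row) for row in data_rows]
print(res)
```

Because strip_cells returns a new list for each row, my_array keeps its original contents, which can be handy if the unmodified rows are needed later.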

Upvotes: 2
