Reputation: 9
It's probably easier if I explain by showing you the .csv file I'm trying to manipulate:
https://www.dropbox.com/s/4kms4hm28y7sv8w/Test1.csv
I have many hundreds of lines of data such as this, but we've decided we want it in a different format, with each fossil genus and species (columns W, X, Y) in a row of its own.
I have very limited Python knowledge, but I wanted to try to use it regardless to split these cells and insert each value into a new row below the row it was split from. I was then going to drag the values across to the correct column manually and drag down the other details in Excel.
The code:
#nektonic=[row[22].split(',') for row in data]
#infaunal=[row[23].split(',') for row in data]
#epifaunal=[row[24].split(',') for row in data]
f=0
r=0
def splitfossils(f, r):
    #f=0 #fossil index: counter that moves the selection along the fossils in a cell that are being split by commas
    for row in data:
        r=(data.index(row)+1) #row index: counter so that split fossils can be inserted beneath the row that is being processed; the +1 is to ensure that the counter starts on 1, not 0.
        if row[22] == '':
            continue #if no fossils are found, move onto the next row
        else:
            nektonic=[row[22].split(',')] #nektonic fossils are found to be in the 23rd column of the spreadsheet
            if len(nektonic) == 1:
                data.insert(r,(nektonic[f])) #if only one fossil is present in the nektonic list, insert only that fossil and do not increase counter number
            else:
                while f < len(nektonic): #the while loop will loop until the split fossils have been processed
                    data.insert(r,(nektonic[f])) #each split fossil will be inserted into a row below
                    f=f+1 #the fossil index moves on to the next fossil
                    r=r+1 #the next fossil will be inserted into the row below the previous fossil
                    return f
                    return r
splitfossils(f, r)
The current error message is that the list index is out of range (highlighting row 19 and 34).
I tried playing around for a while by passing various variables through the function to see if that made a difference, but the previous error I had was that the "for" loop would not iterate. The length of the "data" list was 29, but when I printed nektonic[f], the only value printed was "Stomohamites Simplex", the only value from 1W in the spreadsheet.
I'm not really sure whether all these loops within loops would work; as I said, my knowledge is very basic. Could anyone tell me what's wrong with the code and what might have been an easier way to solve this problem?
Thanks
Edit: I completely changed my approach and have done this instead. It now works, thank you very much for all of your help.
import csv
out=open("Test1.csv", "rb")
data=csv.reader(out)
data=[row for row in data]
out.close()
nektonic=[]
def splitfossils():
    for row in data:
        nektonic=row[22].split(',')
        if len(nektonic)>1:
            for fossil in nektonic:
                newrow=[0 for i in range(22)]
                newrow.append(fossil)
                output.writerow(newrow)
        else:
            output.writerow(row)
    return data
out=open("new_test2.csv", "wb")
output=csv.writer(out)
splitfossils()
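For anyone on Python 3, here is a hedged sketch of the same idea, extended to all three fossil columns (W, X, Y, i.e. indices 22 to 24). The function name `split_fossils`, the constant `FOSSIL_COLS`, and the sample values are made up for illustration, and copying the other cells down into each new row is an assumption about the desired output, not something the code above does. It uses in-memory text instead of the real files so it can be run as-is.

```python
import csv
import io

FOSSIL_COLS = (22, 23, 24)  # columns W, X, Y (nektonic, infaunal, epifaunal)

def split_fossils(rows, cols=FOSSIL_COLS):
    """Yield one output row per fossil, copying the other cells down."""
    for row in rows:
        wrote_any = False
        for col in cols:
            cell = row[col] if col < len(row) else ""
            for fossil in (name.strip() for name in cell.split(",")):
                if not fossil:
                    continue
                new_row = list(row)
                for other in cols:          # blank all fossil columns...
                    if other < len(new_row):
                        new_row[other] = ""
                new_row[col] = fossil       # ...then keep just this fossil
                wrote_any = True
                yield new_row
        if not wrote_any:
            yield row  # rows without fossils pass through unchanged

# Tiny in-memory stand-in for Test1.csv: 25 columns, fossils in W and X.
sample = [""] * 25
sample[0] = "site1"
sample[22] = "Stomohamites simplex,Hamites sp."
sample[23] = "Inoceramus sp."
source = io.StringIO()
csv.writer(source).writerow(sample)
source.seek(0)

target = io.StringIO()
csv.writer(target).writerows(split_fossils(csv.reader(source)))
print(target.getvalue())
```

With real files in Python 3, the usual pattern would be `with open("Test1.csv", newline="") as src` for reading and `with open("new_test2.csv", "w", newline="") as dst` for writing, so that both files are closed automatically.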
Upvotes: 0
Views: 730
Reputation: 2864
The problem is that you are modifying the list while you are iterating over it, which is not a good approach in Python. Instead, build a new list from your data (this is memory efficient, because the existing row objects are referenced rather than copied). Something like this:
import csv
out=open("Test1.csv", "rb")
data=csv.reader(out)
data=[row for row in data]
out.close()
#nektonic=[row[22].split(',') for row in data]
#infaunal=[row[23].split(',') for row in data]
#epifaunal=[row[24].split(',') for row in data]
def splitfossils():
    result = [] #new list, so that data is not modified while it is being iterated
    for row in data:
        if row[22] == '':
            continue #if no fossils are found, move onto the next row
        else:
            nektonic=row[22].split(',') #split() already returns a list, so no extra brackets
            result.append(row) #keep the original row
            result.append(nektonic) #followed by a row holding the split fossils
    return result
print splitfossils()
I am not sure whether the code above fully solves your problem, but try approaching it this way.
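To make the problem concrete, here is a minimal sketch in Python 3 syntax. The three-column table is made up, column 2 stands in for column 22 of the real file, and both function names are hypothetical. Inserting into the list while looping over it means the loop next visits the short row that was just inserted, which is a likely source of the "list index out of range" error; building a separate list avoids that. Adding new rows only when a cell holds more than one fossil is an assumption about the desired output.

```python
# Made-up sample data; column 2 stands in for column 22 of the real CSV.
data = [
    ["site1", "bed1", "Stomohamites simplex,Hamites sp."],
    ["site2", "bed2", ""],
]

def insert_while_iterating(rows, col):
    """Mimic the original approach: mutate the list being looped over."""
    rows = [list(r) for r in rows]  # work on a copy so the demo is repeatable
    try:
        for i, row in enumerate(rows):
            if row[col] == "":
                continue
            # The short row lands right after the current one and is
            # visited on the next pass of the loop.
            rows.insert(i + 1, row[col].split(","))
    except IndexError as exc:
        return "IndexError: %s" % exc
    return rows

def split_into_new_list(rows, col):
    """Build a fresh list: each original row, then one row per fossil."""
    result = []
    for row in rows:
        result.append(row)
        fossils = [f.strip() for f in row[col].split(",") if f.strip()]
        if len(fossils) > 1:
            for fossil in fossils:
                result.append([""] * col + [fossil])
    return result

print(insert_while_iterating(data, 2))
for row in split_into_new_list(data, 2):
    print(row)
```

The second function never touches `data`, so the loop only ever sees the original rows.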
Upvotes: 0
Reputation: 94539
In Python, indentation matters. Hence, the code
while f < len(nektonic): #the while loop will loop until the split fossils have been processed
    data.insert(r,(nektonic[f])) #each split fossil will be inserted into a row below
    f=f+1 #the fossil index moves on to the next fossil
    r=r+1 #the next fossil will be inserted into the row below the previous fossil
    return f
    return r
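returns after a single iteration, because the return f is hit right away. You probably meant to indent that a bit further left (both of the returns, actually). Note also that a function exits at the first return it reaches, so return r can never run; to hand back both values, use return f, r.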
That being said, in Python you don't need to use indices to iterate over a list; you would just do:
for fossil in nektonic:
    data.insert(r, fossil)
The same goes for the outer loop, which iterates over the rows.
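A small sketch of both points, in Python 3 syntax with made-up fossil names and hypothetical function names: a return indented inside the loop body ends the function on the first pass, whereas one aligned with the loop runs only after the loop finishes, and enumerate supplies an index in the cases where one is still needed.

```python
fossils = ["Hamites", "Stomohamites", "Anisoceras"]  # sample values only

def count_return_inside(items):
    count = 0
    for item in items:
        count += 1
        return count  # inside the loop body: the function exits on the first pass
    return count

def count_return_after(items):
    count = 0
    for item in items:
        count += 1
    return count  # aligned with the loop: runs once the loop has finished

def insert_below(rows, r, items):
    """Insert items into rows starting at index r, keeping their order."""
    for offset, item in enumerate(items):
        rows.insert(r + offset, item)
    return rows

print(count_return_inside(fossils))
print(count_return_after(fossils))
print(insert_below(["row0", "row1"], 1, fossils))
```

Note that `insert_below` loops over `items`, not over `rows`, so it does not modify the list it is iterating over.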
Upvotes: 4