user3325965

Reputation: 55

CSV file parsing (python)

I am having some issues parsing a csv file with 14 columns.

for row in training_set_data:
    if skiprow:
        skiprow = False
    else:
        for r in range(len(row)):
            row[r] = float(row[r])
        training_set.append(row)

This seems to work for getting a list of the vectors. The next thing I want to do is collect the first 13 entries of each row into one set of vectors, and collect the last column into a separate set. My code for the 13-entry vectors currently looks like this:

def inputVector(inputs):
    for r in inputs:
        inputs.pop(13)
    return inputs

This is not working: when I print the result, each vector is still 14 entries long. Can anyone tell me what I am doing wrong? Sorry if the question doesn't make too much sense, I am pretty new to coding.

Edit: First 11 lines of the csv file and the call to inputVector:

53,1,3,130,197,1,2,152,0,1.2,3,0,3,0
42,1,4,136,315,0,0,125,1,1.8,2,0,6,1
46,1,4,140,311,0,0,120,1,1.8,2,2,7,1
42,1,4,140,226,0,0,178,0,0,1,0,3,0
54,1,4,140,239,0,0,160,0,1.2,1,0,3,0
67,0,3,115,564,0,2,160,0,1.6,2,0,7,0
65,0,3,140,417,1,2,157,0,0.8,1,1,3,0
56,0,4,134,409,0,2,150,1,1.9,2,2,7,1
65,0,3,160,360,0,2,151,0,0.8,1,0,3,0
57,0,4,120,354,0,0,163,1,0.6,1,0,3,0
55,0,4,180,327,0,1,117,1,3.4,2,0,3,1

inputV = inputVector(training_set)

Upvotes: 1

Views: 162

Answers (2)

demented hedgehog

Reputation: 7538

Try something like this:

first_13s = []
last_1s = []

for r in inputs:
    first_13s.append(r[:13])
    last_1s.append(r[13])
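
As a quick check, the same split can be run on two rows taken from the question's sample data. The names sample_rows, features and labels are only illustrative and do not come from the original code.

```python
# Two rows copied from the sample data in the question, already converted to floats.
sample_rows = [
    [53.0, 1.0, 3.0, 130.0, 197.0, 1.0, 2.0, 152.0, 0.0, 1.2, 3.0, 0.0, 3.0, 0.0],
    [42.0, 1.0, 4.0, 136.0, 315.0, 0.0, 0.0, 125.0, 1.0, 1.8, 2.0, 0.0, 6.0, 1.0],
]

# Slicing builds new lists, so sample_rows itself is left untouched.
features = [row[:13] for row in sample_rows]  # first 13 entries of each row
labels = [row[13] for row in sample_rows]     # last (14th) entry of each row

print(len(features[0]), labels)
```

Because slicing copies, the original 14-entry rows stay available if they are needed later.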

Also, you can replace several lines of your first block of code just by using training_set_data[1:], which skips the first row (assuming training_set_data is a list).

Python list slicing is very handy; see "Explain Python's slice notation".

You can also use a list comprehension for your float conversion:

for r in range(len(row)):
    row[r] = float(row[r])

becomes

row = [float(r) for r in row]

So the first block can be written like this:

for row in training_set_data[1:]:
    row = [float(r) for r in row]
    training_set.append(row)
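
If training_set_data is a csv.reader rather than a list, slicing with [1:] will not work, because reader objects can only be iterated. Here is a minimal sketch of the same loop under that assumption. It also assumes, as the skiprow flag in the question suggests, that the first row is a header. The header text and the in-memory file are made up for illustration.

```python
import csv
import io

# Stand-in for the real file: a made-up header line plus two rows from the question.
fake_file = io.StringIO(
    "c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11,c12,c13,c14\n"
    "53,1,3,130,197,1,2,152,0,1.2,3,0,3,0\n"
    "42,1,4,136,315,0,0,125,1,1.8,2,0,6,1\n"
)

training_set_data = csv.reader(fake_file)
next(training_set_data)  # consume the header row instead of slicing with [1:]

training_set = []
for row in training_set_data:
    # csv.reader yields lists of strings, so convert each field to float.
    training_set.append([float(value) for value in row])

print(training_set[0][:3])
```

With a real file, the reader would be built from something like open(path, newline="") in place of the StringIO object.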

Upvotes: 2

mhlester

Reputation: 23211

The problem is this code:

def inputVector(inputs):
    for r in inputs:
        inputs.pop(13)
    return inputs

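You're iterating over inputs but calling pop on inputs itself rather than on r, so each pass removes a whole row (the one at index 13) from the outer list and leaves every remaining row at 14 entries. To remove element 13 from each row, do this instead: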

def inputVector(inputs):
    for r in inputs:
        r.pop(13)  # <-- replaced inputs with r
    return inputs
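
To see the difference, here is a small sketch that runs both versions on made-up data (40 rows of 14 zeros). The function names are hypothetical. It also includes a slicing variant that leaves the caller's rows unmodified; that variant is an alternative, not what this answer proposes.

```python
def input_vector_buggy(inputs):
    # Original logic: pops the row at index 13 of the outer list on every pass.
    for r in inputs:
        inputs.pop(13)
    return inputs

def input_vector_fixed(inputs):
    # Corrected logic: pops entry 13 of each row (modifies the rows in place).
    for r in inputs:
        r.pop(13)
    return inputs

def input_vector_copy(inputs):
    # Alternative: build new 13-entry rows and leave the originals alone.
    return [r[:13] for r in inputs]

def make_rows():
    # Made-up data: 40 rows with 14 entries each.
    return [[0.0] * 14 for _ in range(40)]

shrunk = input_vector_buggy(make_rows())
print(len(shrunk), len(shrunk[0]))    # rows were dropped, row length unchanged

fixed = input_vector_fixed(make_rows())
print(len(fixed), len(fixed[0]))      # all rows kept, each one entry shorter

original = make_rows()
copied = input_vector_copy(original)
print(len(copied[0]), len(original[0]))
```

With a short list, the original version can instead raise IndexError, since there may be no row at index 13. Note also that r.pop(13) removes the last column from the rows of training_set themselves, so the labels need to be collected first; pop returns the removed value, so they could be gathered in the same loop.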

Upvotes: 2
