Neeraj

Reputation: 11

Sorting a CSV file in Python 2.7

I am trying to sort a CSV file on column 3. Python sorts the CSV correctly except for two rows, and I am really confused.

Here is the code I am using:

import csv
import operator
import numpy

sample = open('data.csv','rU')
csv1 = csv.reader(sample,delimiter=',')
sort=sorted(csv1,key=lambda x:x[3])
for eachline in sort:
    print eachline

And here is the output. From the third row onward the output looks good. Any ideas?

['6/23/02', 'Julian Jaynes', '618057072', '12.5']
['7/15/98', 'Timothy "The Parser" Campbell', '968411304', '18.99']
['10/4/04', 'Randel Helms', '879755725', '4.5']
['9/30/03', 'Scott Adams', '740721909', '4.95']
['10/4/04', 'Benjamin Radcliff', '804818088', '4.95']
['1/21/85', 'Douglas Adams', '345391802', '5.95']
['12/3/99', 'Richard Friedman', '60630353', '5.95']
['1/12/90', 'Douglas Hofstadter', '465026567', '9.95']
['9/19/01', 'Karen Armstrong', '345384563', '9.95']
['6/23/02', 'David Jones', '198504691', '9.95']
['REVIEW_DATE', 'AUTHOR', 'ISBN', 'DISCOUNTED_PRICE']

Upvotes: 1

Views: 190

Answers (1)

Padraic Cunningham

Reputation: 180401

You are sorting strings, so the values are compared as text rather than as numbers. To sort by price, use float(x[3]):

sort=sorted(csv1,key=lambda x:float(x[3]))

If you want to sort by the third column, that is x[2] (indexes start at 0), and it can be cast to int:

sort=sorted(csv1,key=lambda x:int(x[2]))

You will also need to skip the header to avoid a ValueError:

csv1 = csv.reader(sample,delimiter=',')
header = next(csv1)
sort=sorted(csv1,key=lambda x:int(x[2]))
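
Putting those pieces together, here is a minimal sketch that sorts by the price column instead. It assumes a few rows copied from the question's output and feeds them to csv.reader as a list of lines in place of data.csv; the variable names (lines, rows, sorted_rows) are only illustrative.

```python
import csv

# A handful of rows taken from the question; this list stands in for data.csv.
lines = [
    'REVIEW_DATE,AUTHOR,ISBN,DISCOUNTED_PRICE',
    '1/21/85,Douglas Adams,345391802,5.95',
    '6/23/02,Julian Jaynes,618057072,12.5',
    '10/4/04,Randel Helms,879755725,4.5',
    '9/30/03,Scott Adams,740721909,4.95',
]

reader = csv.reader(lines, delimiter=',')
header = next(reader)  # take the header off first so float() never sees it
rows = sorted(reader, key=lambda row: float(row[3]))  # numeric sort on price

# Put the header back on top of the sorted data rows.
sorted_rows = [header] + rows
for row in sorted_rows:
    print(row)
```

With a real file you would pass the open file object to csv.reader instead of the list.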

Python compares strings character by character, so "2" sorts after "12" unless you cast to int:

In [82]: "2" < "12"
Out[82]: False

In [83]: int("2") < int("12")
Out[83]: True
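
The same effect probably explains the output in the question. As a small sketch using the price strings from that output (in an arbitrary starting order), a plain string sort puts '12.5' and '18.99' first because '1' sorts before '4', and puts the header text last because uppercase letters sort after digits:

```python
# Price strings from the question's output, plus the header cell.
prices = ['9.95', '4.5', '18.99', '5.95', '12.5', '4.95']

as_strings = sorted(prices)             # compared character by character
as_numbers = sorted(prices, key=float)  # compared as numbers

# The header cell sorts after every price because 'D' is greater than any digit.
with_header = sorted(prices + ['DISCOUNTED_PRICE'])

print(as_strings)
print(as_numbers)
print(with_header[-1])
```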

Upvotes: 1
