Reputation: 21
I am relatively new to python and I find it to be useful since I do need to routinely find values out of a large csv file thus I tried to use it.
This is my csv file:
Name, Tag, Size, Height1, Height2,
Name1, B1, 244,42798,5900
Name2, B4, 200,22798,2234
Name3, B5, 240,25798,2745
Name4, B7, 220,32798,4590
I tried to use this code but I still get them jumbled up.
import csv
input = open('file.csv','r')
number_top_values = raw_input(‘How many top values you need to find?’) #number of top values
file = csv.reader(input)
line1 = file.next()
height = [(row[3],(row[4])) for row in file]
height.sort(key = lambda x: x[1])
height.reverse()
height = height[:number_top_values]
print height
I need to find in the column the top values for Height1 and Height2 (top 2 or top 3 so on depending on how many top values I need to find) and get the entire row having these top values. Any suggestions or possible answers will be a great help. Thanks.
Upvotes: 2
Views: 1002
Reputation: 129001
You're currently using this:
height = [(row[3],(row[4])) for row in file]
height.sort(key = lambda x: x[1])
height.reverse()
In the first line there, you remove some of the data you need (since you need the whole row). The third line can be made redundant with a modification to the second line. Together:
height = list(file)
height.sort(key=lambda x: int(x[3]), reverse=True)
This sorts on Height1
. If you want to sort on Height2
, change 3
to 4
. If you want to sort on one and then the other, you can sort twice or do something a little trickier:
height.sort(key=lambda x: (int(x[3]), int(x[4])), reverse=True)
Upvotes: 2
Reputation: 10526
Mostly what icktoofay said :)
Working code:
import csv
inputfile = open('file.csv','r')
#don't forget int() to convert input to integer for slicing
while(True):
try:
number_top_values = int(raw_input('How many top values you need to find?')) #number of top values
except ValueError:
print "Invalid input! Please try again"
else:
break
csvfile = csv.reader(inputfile)
height = list(csvfile)[1:] #exclude first line for labels
height1 = sorted(height, key = lambda x: int(x[3]), reverse=True)[:number_top_values]
height2 = sorted(height, key = lambda x: int(x[4]), reverse=True)[:number_top_values]
# or height.sort(key = lambda x: (int(x[3]),int(x[4])), reverse=True) for double sort
print height1
print height2
inputfile.close()
Upvotes: 1