Reputation: 16656
I would like to generate a report whose first column contains the duration of my SQL queries. The report should be sorted from the highest duration to the lowest.
Code:
import os
directory = "./"
results = {}
def isfloat(value):
    try:
        float(value)
        return True
    except ValueError:
        pass
for root,dirs,files in os.walk(directory):
    for file in files:
        if file.endswith(".csv"):
            input_file=open(file, 'r')
            for line in input_file:
                if line:
                    try:
                        duration=line.split(',')[13].split(' ')[1]
                        if isfloat(duration): # check if string is a float
                            results[duration]=line
                    except:
                        pass
output_file = open('report.csv', 'w')
for k,v in sorted(results.items()):
    print k
    output_file.write(k + ',' + v)
output_file.close()
Output:
1266.114
1304.450
1360.771
1376.104
1514.518
500.105
519.432
522.594
522.835
528.622
529.664
Why is the sorted() function messing up the order of my results?
Upvotes: 1
Views: 70
Reputation: 85612
You can actually convert the strings to floats:
if isfloat(duration): # check if string is a float
    results[float(duration)] = line
or:
try:
    results[float(duration)] = line
except ValueError:
    pass
With the second version you no longer need your isfloat() function.
This should give you properly sorted output.
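As a rough sketch of how the float conversion could fit into the whole report, the snippet below collects (duration, line) pairs in a list and sorts them from highest to lowest, as the question asks. The helper names extract_duration, sort_by_duration and make_line are hypothetical, and the sample lines are made up to match the field layout implied by the question (14th comma-separated column, second space-separated token). A list is used instead of a dict on the assumption that two queries may share the same duration, which would otherwise overwrite each other as dict keys.

```python
def extract_duration(line):
    """Return the duration found in the 14th column as a float, or None."""
    try:
        # Same field layout as in the question's code.
        return float(line.split(',')[13].split(' ')[1])
    except (IndexError, ValueError):
        # Line is too short, or the token is not a number.
        return None


def sort_by_duration(lines):
    """Return (duration, line) pairs, highest duration first."""
    rows = []
    for line in lines:
        duration = extract_duration(line)
        if duration is not None:
            rows.append((duration, line))
    # Sort on the float only, in descending order.
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows


def make_line(duration):
    """Build a made-up CSV line with the duration in the 14th column."""
    return ','.join(['x'] * 13 + ['duration: %s ms' % duration]) + '\n'


sample = [
    make_line('500.105'),
    make_line('1266.114'),
    'too,short\n',          # skipped: no 14th column
    make_line('519.432'),
]

rows = sort_by_duration(sample)
for duration, line in rows:
    print(duration)
```

Note that once the durations are floats, the question's `output_file.write(k + ',' + v)` would raise a TypeError, because a float cannot be concatenated with a string. Writing `'%s,%s' % (duration, line)` instead avoids that.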
Upvotes: 1
Reputation: 1125058
Your keys are strings, not numbers. They are sorted lexicographically.
Convert to a number first if you want numeric sorting:
for k,v in sorted(results.items(), key=lambda k_v: float(k_v[0])):
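To see the difference between the two orderings in isolation, here is a small sketch using a few of the duration values from the question's output, kept as strings. Passing reverse=True is an addition not shown in the line above; it gives the highest-to-lowest order the question asks for.

```python
# Durations copied from the question's output, stored as strings.
durations = ["500.105", "1266.114", "519.432", "1514.518"]

# Default string comparison goes character by character, so "1266.114"
# sorts before "500.105" because "1" < "5".
lexicographic = sorted(durations)

# key=float compares the numeric values but keeps the original strings.
numeric = sorted(durations, key=float)

# reverse=True puts the longest duration first.
descending = sorted(durations, key=float, reverse=True)

print(lexicographic)  # ['1266.114', '1514.518', '500.105', '519.432']
print(numeric)        # ['500.105', '519.432', '1266.114', '1514.518']
print(descending)     # ['1514.518', '1266.114', '519.432', '500.105']
```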
Upvotes: 4