Reputation: 91
I wrote this code, but it is very slow. Is there a way to make it faster?
import csv
from statistics import median

with open('test.txt') as f:
    reader = csv.reader(f, skipinitialspace=True, delimiter=' ')
    next(reader)
    grades = [float(row[2]) for row in reader]

for mean_list_value in grades:
    normalization = (mean_list_value / median(grades)) * 500
    print(normalization)
test.txt looks like this (approx. 50,000 lines):
Nr Name Grade
2 Max 5.7
5 Linda 6.9
6 Lena 8.0
10 Daniel 4.5
11 Michelle 9.1
.
.
.
Thank you for all your help.
Upvotes: 0
Views: 76
Reputation: 1654
In your code you're calculating the median 50,000 times, even though it is always the same value. Since computing the median requires sorting your 50,000 values each time, this ends up being quite expensive.
Below is a numpy-based snippet.
import numpy as np

data = np.loadtxt('test.txt', dtype=str)   # load everything as strings
grades = data[1:, 2].astype(float)         # skip the header row, take the Grade column
norm_grades = grades / np.median(grades) * 500
print(norm_grades)
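If you would rather stay with the standard library, hoisting the median out of the loop is already enough to remove the repeated sorting. A minimal sketch (the small `grades` list here is a stand-in for your parsed data):

```python
from statistics import median

grades = [5.7, 6.9, 8.0, 4.5, 9.1]  # stand-in for the 50K parsed grades

# Compute the median once, outside the loop
med = median(grades)
normalized = [g / med * 500 for g in grades]
print(normalized)
```

This turns the original O(n² log n) work (one sort per grade) into a single O(n log n) sort followed by a linear pass.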
Upvotes: 3