Reputation: 24621
I have the following example:
for line in IN.readlines():
    line = line.rstrip('\n')
    mas = line.split('\t')
    row = ( int(mas[0]), int(mas[1]), mas[2], mas[3], mas[4] )
    self.inetnums.append(row)
IN.close()
With a 120 MB file, the script takes about 10 seconds. Can I reduce this time?
Upvotes: 0
Views: 90
Reputation: 63707
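You may gain some speed if you use a list comprehension. Note that this version converts every field with int(), so it only applies when all columns are numeric:
inetnums = [tuple(int(x) for x in line.rstrip('\n').split('\t')) for line in fin]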
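For the file layout in the question (two integer columns followed by three text columns), converting every field with int() would fail, so a comprehension has to convert only the first two fields. A minimal sketch, assuming tab-separated input with exactly five columns; `parse_rows` and the sample data are hypothetical names used only for illustration:

```python
import io

def parse_rows(lines):
    """Build (int, int, str, str, str) tuples from tab-separated lines."""
    return [
        (int(a), int(b), c, d, e)
        for a, b, c, d, e in (line.rstrip('\n').split('\t') for line in lines)
    ]

# Hypothetical sample standing in for the real file.
sample = io.StringIO("1\t10\tNET-A\tUS\tfoo\n11\t20\tNET-B\tDE\tbar\n")
rows = parse_rows(sample)
print(rows)
```

Unpacking into five names also means a line with the wrong number of columns raises a ValueError instead of being silently truncated.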
Here is the profile information for the two versions:
>>> def foo2():
        fin.seek(0)
        inetnums=[]
        for line in fin:
            line = line.rstrip('\n')
            mas = line.split('\t')
            row = ( int(mas[0]), int(mas[1]), mas[2], mas[3])
            inetnums.append(row)

>>> def foo1():
        fin.seek(0)
        inetnums=[[int(x) for x in line.rstrip('\n').split('\t')] for line in fin]
>>> cProfile.run("foo1()")
         444 function calls in 0.004 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.003    0.003    0.004    0.004 <pyshell#362>:1(foo1)
        1    0.000    0.000    0.004    0.004 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
      220    0.000    0.000    0.000    0.000 {method 'rstrip' of 'str' objects}
        1    0.000    0.000    0.000    0.000 {method 'seek' of 'file' objects}
      220    0.000    0.000    0.000    0.000 {method 'split' of 'str' objects}

>>> cProfile.run("foo2()")
         664 function calls in 0.006 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.005    0.005    0.006    0.006 <pyshell#360>:1(foo2)
        1    0.000    0.000    0.006    0.006 <string>:1(<module>)
      220    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
      220    0.001    0.000    0.001    0.000 {method 'rstrip' of 'str' objects}
        1    0.000    0.000    0.000    0.000 {method 'seek' of 'file' objects}
      220    0.001    0.000    0.001    0.000 {method 'split' of 'str' objects}
>>>
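Measurements this small can vary from run to run, so repeating them may give a steadier comparison. A rough sketch using timeit, assuming an in-memory, all-numeric sample in place of the real file; `loop_version` and `comprehension_version` are hypothetical names, and both perform the same conversion so that only the loop style differs:

```python
import timeit

# Hypothetical all-numeric sample: 220 tab-separated lines.
lines = ["%d\t%d\t%d\n" % (i, i + 1, i + 2) for i in range(220)]

def loop_version():
    """Explicit for loop with list.append."""
    out = []
    for line in lines:
        out.append([int(x) for x in line.rstrip('\n').split('\t')])
    return out

def comprehension_version():
    """Same conversion written as a single list comprehension."""
    return [[int(x) for x in line.rstrip('\n').split('\t')] for line in lines]

# Take the best of several repeats to reduce timing noise.
loop_time = min(timeit.repeat(loop_version, number=200, repeat=5))
comp_time = min(timeit.repeat(comprehension_version, number=200, repeat=5))
print("loop: %.4fs  comprehension: %.4fs" % (loop_time, comp_time))
```

The absolute numbers depend on the machine and Python version, so only the relative difference is meaningful.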
Upvotes: 2
Reputation: 133504
Remove the readlines() call and just do:
for line in IN:
Using readlines(), you create a list of all the lines in the file and then access each one, which you don't need to do. Without it, the for loop iterates over the file object itself, which is an iterator that returns one line at a time.
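Put together, the loop without readlines() might look like the sketch below. It assumes the same five-column, tab-separated layout as the question; `load_inetnums` and the temporary file are hypothetical, and the `with` block stands in for the explicit IN.close().

```python
import os
import tempfile

def load_inetnums(path):
    """Read tab-separated rows by iterating over the file one line at a time."""
    inetnums = []
    with open(path) as infile:   # the file is closed automatically on exit
        for line in infile:      # no readlines(), so no intermediate list of lines
            mas = line.rstrip('\n').split('\t')
            inetnums.append((int(mas[0]), int(mas[1]), mas[2], mas[3], mas[4]))
    return inetnums

# Hypothetical two-line file used only for demonstration.
with tempfile.NamedTemporaryFile('w', suffix='.tsv', delete=False) as tmp:
    tmp.write("1\t10\tNET-A\tUS\tfoo\n11\t20\tNET-B\tDE\tbar\n")

result = load_inetnums(tmp.name)
os.remove(tmp.name)
print(result)
```

The most certain saving here is memory, since the full list of lines is never built. How much time it saves depends on the file and the Python version, so it is worth measuring on the real data.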
Upvotes: 4