Reputation: 168
I have a Python script that reads two TIFF images, finds the unique value combinations, counts the occurrences of each, and saves the counts to a txt file.
You can find the full script at www.spatial-ecology.net
The result is:
tif1
2 2 3
0 0 3
2 3 3
tif2
2 2 3
3 3 4
1 1 1
result (tif1 value, tif2 value, count)
2 2 2
3 3 1
0 3 2
3 4 1
2 1 1
3 1 2
The script works fine. This is how it is implemented:
read line by line (for irow in range(rows):) so the full image is not loaded into memory (eventually a flag option could be added to read 10 lines at a time)
go through the arrays and create a tuple
check if the tuple is already stored in the dict()
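The steps above can be sketched roughly as follows. This is not the original script (which reads the TIFFs with a raster library, row by row); it is a minimal stand-in that operates on plain nested lists, using the sample values from the question:

```python
from collections import defaultdict

def count_combinations(tif1_rows, tif2_rows):
    """Count unique (tif1, tif2) value pairs, processing one row at a time."""
    counts = defaultdict(int)
    for row1, row2 in zip(tif1_rows, tif2_rows):
        # pair up the two rasters element-wise and tally each combination
        for v1, v2 in zip(row1, row2):
            counts[(v1, v2)] += 1
    return counts

tif1 = [[2, 2, 3], [0, 0, 3], [2, 3, 3]]
tif2 = [[2, 2, 3], [3, 3, 4], [1, 1, 1]]
print(dict(count_combinations(tif1, tif2)))
```

Run on the sample data this reproduces the counts shown in the result above, e.g. the pair (2, 2) occurs twice and (3, 4) once.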
My question is: what are the tricks in this case to speed up the process?
I tested saving the results in a two-dimensional array rather than a dict(), but that slowed down the process. I checked this link, and maybe Python's map function can improve the speed. Is this the case?
Thanks in advance, Giuseppe
Upvotes: 1
Views: 148
Reputation: 21914
If you want some real advice on performance, you need to post the parts of your code that are directly relevant to what you want to do. If you won't at least do that, there's very little we can do. That said, if you are having trouble discovering where exactly your inefficiencies are, I would highly recommend using Python's cProfile module. Usage is as follows:
import cProfile

def foo(*args, **kwargs):
    # do something
    pass

cProfile.run("foo()")
This will print a detailed time profile of your code that lets you know which steps of your code take up the most time. Usually it'll be a couple of methods that either get called far more frequently than they should be, or do some silly extra processing, leading to a performance bottleneck.
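If the output is long, you can also use the standard library's pstats module to sort and trim it. A small sketch (the slow_sum function here is a made-up stand-in for your own code):

```python
import cProfile
import io
import pstats

def slow_sum(n):
    # deliberately wasteful: builds a one-element list on every iteration
    total = 0
    for i in range(n):
        total += sum([i])
    return total

profiler = cProfile.Profile()
profiler.enable()
slow_sum(10000)
profiler.disable()

# sort by cumulative time and show only the top 5 entries
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)
print(stream.getvalue())
```

Sorting by "cumulative" puts the functions that dominate total runtime at the top, which is usually where you want to look first.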
Upvotes: 2