Reputation: 25
We have an application that so far has been kept exclusively on the front end, but when testing it with large grid configurations we have hit several processing and memory constraints. Thus we're exploring the option of pushing resource demanding tasks to the backend.
Thus I'm currently running some benchmarks to get some indications of the performance difference that we might see. More specifically, given that a good portion of the performance bottleneck is on nested loops, I wrote a simple snippet in js and python to test the performance of handling nested loops and building an array of varying sizes.
To my surprise, it looks like js is consistently faster regardless of anything permutations I have tested.
JS snippet:
var timeTotal = 0;
var benchmarkTests = 100;
var testSizes = [50, 100, 500, 1000, 3000];
for (var a = 0; a < testSizes.length; a++) {
var minX = 0;
var maxX = testSizes[a];
var minY = 0;
var maxY = testSizes[a];
var cellDimensionsX = 0.991; // cell width
var cellDimensionsY = 1.652; // cell height
var cellDimensionsXHalf = cellDimensionsX / 2; // half cell width
var cellDimensionsYHalf = cellDimensionsY / 2; // half cell height
var maxCellsXCount = Math.floor(maxX / cellDimensionsX);
var maxCellsYCount = Math.floor(maxY / cellDimensionsY);
console.log("X:", maxCellsXCount, "| Y", maxCellsYCount, "| Total", maxCellsXCount * maxCellsYCount);
for (var k = 0; k < benchmarkTests; k++) {
var cellCoordsTime = new Date().getTime();
var cellCoords = {};
var index = 0;
for (var i = 0; i < maxCellsXCount; i++) {
var xCoord = (i * cellDimensionsX) + cellDimensionsXHalf;
for (var d = 0; d < maxCellsYCount; d++) {
cellCoords[index] = {
x: xCoord,
y: (d * cellDimensionsY) + cellDimensionsYHalf
};
index++;
}
}
var thisTime = new Date().getTime() - cellCoordsTime;
timeTotal += thisTime;
// console.log('cellCoords', thisTime, 'for grid with', index, 'cells');
}
console.log('Testing with a', testSizes[a], '*', testSizes[a], 'grid area. Total time', timeTotal, '. Avg', (timeTotal / benchmarkTests), 'for', benchmarkTests, 'tests');
}
Python snippet:
#!/usr/bin/python
import time
timeTotal = 0
benchmarkTests = 100
testSizes = [50, 100, 500, 1000, 3000]
for a in range(len(testSizes)):
minX = 0
maxX = testSizes[a]
minY = 0
maxY = testSizes[a]
cellDimensionsX = 0.991 # cell width
cellDimensionsY = 1.652 # cell height
cellDimensionsXHalf = cellDimensionsX / 2 # half cell width
cellDimensionsYHalf = cellDimensionsY / 2 # half cell height
maxCellsXCount = int(maxX / cellDimensionsX)
maxCellsYCount = int(maxY / cellDimensionsY)
print("X: %s | Y %s | Total %s" % (maxCellsXCount, maxCellsYCount, maxCellsXCount * maxCellsYCount))
for k in range(benchmarkTests):
start = time.time()
cellCoords = {}
index = 0
for i in range(maxCellsXCount):
xCoord = (i * cellDimensionsX) + cellDimensionsXHalf
for d in range(maxCellsYCount):
cellCoords[index] = {'x': xCoord, 'y': (d * cellDimensionsY) + cellDimensionsYHalf}
index += 1
thisTime = (time.time() - start) * 1000;
timeTotal = timeTotal + thisTime;
# print("Elapsed Time: %s for grid with %s cells" % (thisTime, index))
print("Testing with a %s*%s grid area. Total time %s. Avg %s for %s tests" % (testSizes[a], testSizes[a], timeTotal, (timeTotal / benchmarkTests), benchmarkTests))
When running these I get:
JS:
Testing with a 50 * 50 grid area. Total time 26 . Avg 0.26 for 100 tests
Testing with a 100 * 100 grid area. Total time 85 . Avg 0.85 for 100 tests
Testing with a 500 * 500 grid area. Total time 4539 . Avg 45.39 for 100 tests
Testing with a 1000 * 1000 grid area. Total time 23160 . Avg 231.6 for 100 tests
Testing with a 3000 * 3000 grid area. Total time 243760 . Avg 2437.6 for 100 tests
Python:
Testing with a 50*50 grid area. Total time 50.8642196655. Avg 0.508642196655 for 100 tests
Testing with a 100*100 grid area. Total time 262.931108475. Avg 2.62931108475 for 100 tests
Testing with a 500*500 grid area. Total time 6338.83333206. Avg 63.3883333206 for 100 tests
Testing with a 1000*1000 grid area. Total time 30769.4478035. Avg 307.694478035 for 100 tests
Testing with a 3000*3000 grid area. Total time 304995.391846. Avg 3049.95391846 for 100 tests
All the times are in ms and all the tests were ran on the same localhost.
I would have expected python to be significantly faster than js. Is there anything I'm missing?
UPDATE #1 (Moved everything into a function | Overall gain ~17%):
Testing with a 50*50 grid area. Total time 41.2473678589. Avg 0.412473678589 for 100 tests
Testing with a 100*100 grid area. Total time 174.555540085. Avg 1.74555540085 for 100 tests
Testing with a 500*500 grid area. Total time 5617.09475517. Avg 56.1709475517 for 100 tests
Testing with a 1000*1000 grid area. Total time 21199.390173. Avg 211.99390173 for 100 tests
Testing with a 3000*3000 grid area. Total time 255921.251535. Avg 2559.21251535 for 100 tests
UPDATE #2 (Swapped range for xrange | Overall additional gain after UPDATE #1 ~15% | Overall gain compared to the initial code ~30%):
Testing with a 50*50 grid area. Total time 38.7289524078. Avg 0.387289524078 for 100 tests
Testing with a 100*100 grid area. Total time 176.453590393. Avg 1.76453590393 for 100 tests
Testing with a 500*500 grid area. Total time 5346.49443626. Avg 53.4649443626 for 100 tests
Testing with a 1000*1000 grid area. Total time 21618.1008816. Avg 216.181008816 for 100 tests
Testing with a 3000*3000 grid area. Total time 213622.769356. Avg 2136.22769356 for 100 tests
UPDATE #3 (Swapped dict for list | Overall additional gain after UPDATE #2 ~35% | Overall gain compared to the initial code ~55%):
Testing with a 50*50 grid area. Total time 20.7185745239. Avg 0.207185745239 for 100 tests
Testing with a 100*100 grid area. Total time 100.9953022. Avg 1.009953022 for 100 tests
Testing with a 500*500 grid area. Total time 3033.61153603. Avg 30.3361153603 for 100 tests
Testing with a 1000*1000 grid area. Total time 12399.708271. Avg 123.99708271 for 100 tests
Testing with a 3000*3000 grid area. Total time 140118.921518. Avg 1401.18921518 for 100 tests
Matching JS with Python's UPDATE #3 (Swapped object for array | Overall loss compared to the initial code ~165%):
Testing with a 50 * 50 grid area. Total time 30 . Avg 0.3 for 100 tests
Testing with a 100 * 100 grid area. Total time 48 . Avg 0.48 for 100 tests
Testing with a 500 * 500 grid area. Total time 12694 . Avg 126.94 for 100 tests
Testing with a 1000 * 1000 grid area. Total time 81402 . Avg 814.02 for 100 tests
Testing with a 3000 * 3000 grid area. Total time 625615 . Avg 6256.15 for 100 tests
UPDATE #4 (Pulled Cython into the fight | Overall additional gain after UPDATE #3 ~26% | Overall gain compared to the initial code ~66%):
Testing with a 50*50 grid area. Total time 30. Avg 0.300 for 100 tests
Testing with a 100*100 grid area. Total time 68. Avg 0.680 for 100 tests
Testing with a 500*500 grid area. Total time 2475. Avg 24.750 for 100 tests
Testing with a 1000*1000 grid area. Total time 9924. Avg 99.240 for 100 tests
Testing with a 3000*3000 grid area. Total time 101697. Avg 1016.970 for 100 tests
UPDATE #5 (Typing variables | Overall additional gain after UPDATE #4 ~23% | Overall gain compared to the initial code ~74%):
Testing with a 50*50 grid area. Total time 5. Avg 0.048 for 100 tests
Testing with a 100*100 grid area. Total time 43. Avg 0.426 for 100 tests
Testing with a 500*500 grid area. Total time 1851. Avg 18.511 for 100 tests
Testing with a 1000*1000 grid area. Total time 8020. Avg 80.202 for 100 tests
Testing with a 3000*3000 grid area. Total time 78350. Avg 783.502 for 100 tests
(HOPEFULLY FINAL) UPDATE #6 (Cython related optimizations, including parallelizing it over 2 cores | Overall additional gain after UPDATE #5 ~88% | Overall gain compared to the initial code ~97%):
Testing with a 50*50 grid area. Total time 0.668. Avg 0.007 for 100 tests
Testing with a 100*100 grid area. Total time 1.584. Avg 0.016 for 100 tests
Testing with a 500*500 grid area. Total time 57.374. Avg 0.574 for 100 tests
Testing with a 1000*1000 grid area. Total time 521.210. Avg 5.212 for 100 tests
Testing with a 3000*3000 grid area. Total time 10113.633. Avg 101.136 for 100 tests
The currently version is on average between 30 and 35 times faster compared to the original implementation. So calling it quits for the time being.
FINAL UPDATE (Tested on hyperthreaded 8 Core machine (No longer localhost and no longer apples to apples comparison) | Further cython related optimizations, and caching the value of y | Overall additional gain after UPDATE #6 ~78% | Overall gain compared to the initial code ~99.3%):
Testing with a 50*50 grid area. Total time 0.498. Avg 0.005 for 100 tests
Testing with a 100*100 grid area. Total time 1.146. Avg 0.011 for 100 tests
Testing with a 500*500 grid area. Total time 22.856. Avg 0.229 for 100 tests
Testing with a 1000*1000 grid area. Total time 113.819. Avg 1.138 for 100 tests
Testing with a 3000*3000 grid area. Total time 2228.098ms. Avg 22.281 for 100 tests
Testing with a 10000*10000 grid area. Total time 29407.874ms. Avg 294.79 for 100 tests
Testing with a 20000*20000 grid area. Total time 157185.469ms. Avg 1571.855 for 100 tests
Upvotes: 2
Views: 2075
Reputation: 27321
Using more appropriate data types for Python (a tuple instead of creating {x: .., y: ..}
dicts, using a list instead of a dict with consecutive integer indices), using xrange instead of range, and wrapping everything into a function, giving the following code:
import time
def foo():
timeTotal = 0
benchmarkTests = 100
testSizes = [50, 100, 500, 1000, 3000]
for a in range(len(testSizes)):
minX = 0
maxX = testSizes[a]
minY = 0
maxY = testSizes[a]
cellDimensionsX = 0.991 # cell width
cellDimensionsY = 1.652 # cell height
cellDimensionsXHalf = cellDimensionsX / 2 # half cell width
cellDimensionsYHalf = cellDimensionsY / 2 # half cell height
maxCellsXCount = int(maxX / cellDimensionsX)
maxCellsYCount = int(maxY / cellDimensionsY)
print("X: %s | Y %s | Total %s" % (maxCellsXCount, maxCellsYCount, maxCellsXCount * maxCellsYCount))
for k in xrange(benchmarkTests):
start = time.time()
# cellCoords = {}
cellCoords = []
for i in xrange(maxCellsXCount):
xCoord = (i * cellDimensionsX) + cellDimensionsXHalf
for d in xrange(maxCellsYCount):
# cellCoords[index] = {'x': xCoord, 'y': (d * cellDimensionsY) + cellDimensionsYHalf}
# cellCoords[index] = (xCoord, (d * cellDimensionsY) + cellDimensionsYHalf)
cellCoords.append((xCoord, (d * cellDimensionsY) + cellDimensionsYHalf))
# index += 1
thisTime = (time.time() - start) * 1000;
timeTotal = timeTotal + thisTime;
# print("Elapsed Time: %s for grid with %s cells" % (thisTime, index))
print("Testing with a %s*%s grid area. Total time %s. Avg %s for %s tests" % (testSizes[a], testSizes[a], timeTotal, (timeTotal / benchmarkTests), benchmarkTests))
foo()
gives about 50% speedup:
X: 50 | Y 30 | Total 1500
Testing with a 50*50 grid area. Total time 33.9999198914. Avg 0.339999198914 for 100 tests
X: 100 | Y 60 | Total 6000
Testing with a 100*100 grid area. Total time 191.999912262. Avg 1.91999912262 for 100 tests
X: 504 | Y 302 | Total 152208
Testing with a 500*500 grid area. Total time 4790.99988937. Avg 47.9099988937 for 100 tests
X: 1009 | Y 605 | Total 610445
Testing with a 1000*1000 grid area. Total time 24529.9999714. Avg 245.299999714 for 100 tests
X: 3027 | Y 1815 | Total 5494005
Testing with a 3000*3000 grid area. Total time 201085.000038. Avg 2010.85000038 for 100 tests
compared to the original code on my machine:
X: 50 | Y 30 | Total 1500
Testing with a 50*50 grid area. Total time 94.0001010895. Avg 0.940001010895 for 100 tests
X: 100 | Y 60 | Total 6000
Testing with a 100*100 grid area. Total time 495.000123978. Avg 4.95000123978 for 100 tests
X: 504 | Y 302 | Total 152208
Testing with a 500*500 grid area. Total time 10732.0001125. Avg 107.320001125 for 100 tests
X: 1009 | Y 605 | Total 610445
Testing with a 1000*1000 grid area. Total time 49074.0001202. Avg 490.740001202 for 100 tests
X: 3027 | Y 1815 | Total 5494005
Traceback (most recent call last):
File "pyperf-orig.py", line 31, in <module>
cellCoords[index] = {'x': xCoord, 'y': (d * cellDimensionsY) + cellDimensionsYHalf}
MemoryError
Upvotes: 2
Reputation: 16515
One improvement is to put everything into a function, because, apparently, Python is not so good with global vars:
Running your code:
Testing with a 50*50 grid area. Total time 52.17409133911133. Avg 0.5217409133911133 for 100 tests
Testing with a 100*100 grid area. Total time 262.12358474731445. Avg 2.6212358474731445 for 100 tests
Testing with a 500*500 grid area. Total time 6206.289052963257. Avg 62.06289052963257 for 100 tests
Testing with a 1000*1000 grid area. Total time 30345.27611732483. Avg 303.4527611732483 for 100 tests
When your code is put inside a function, it becomes a lot faster:
Testing with a 50*50 grid area. Total time 37.11581230163574. Avg 0.3711581230163574 for 100 tests
Testing with a 100*100 grid area. Total time 176.71799659729004. Avg 1.7671799659729004 for 100 tests
Testing with a 500*500 grid area. Total time 4659.825325012207. Avg 46.59825325012207 for 100 tests
Testing with a 1000*1000 grid area. Total time 23246.346712112427. Avg 232.46346712112427 for 100 tests
Upvotes: 1