Creating a very large 2D array without a major impact to code run time in Python

Question

I've been doing competitive programming (USACO) for a couple of months now, in which there are time constraints you cannot exceed. I need to create a large matrix, or 2d array, the dimensions being 2500x2500, in which each value is [0,0]. Using list comprehension is taking too much time, and I needed an alternative (you cannot import modules so numpy isn't an option). I had initially done this:

grid = [[[0,0] for i in range(2500)] for i in range(2500)] but it was taking far too much time, so I tried:

grid= [[[0,0]]*2500]*2500,

which gives the same result initially, but whenever I try to change a value, for example: grid[50][120][0]= 1, it changes the 0th index position of every [0,0] to False in the entire matrix instead of the specific coordinate at the [50][120] position, and this isn't the case when I use list comprehension. Does anybody know what's happening here? And any solution that doesn't involve a crazy run time? I started python just a couple of months before competitive programming so I'm not super experienced.

LTJ · Accepted Answer

grid = [[[0,0] for i in range(2500)] for i in range(2500)]

takes around 2.1 seconds on my PC, timing with PowerShell's Measure-Command. Now if the data specifications are strict, there is no magical way to make this faster. However, if the goal is to make this representation generate faster there is a better solution: use tuple instead of list for the inner data (0, 0).

grid = [[(0, 0) for i in range(2500)] for i in range(2500)]

This snippet generates the same informational value in under quarter of the time (0.48 s). Now what you have to consider here is what comes next. When updating these values in the grid, you need to always create a new tuple to replace the old one - which will always be slower than just updating the list value in the original sample code. This is because tuple doesn't support item assignment operation. Replacing a single value is still as easy as grid[50][120] = (1, grid[50][120][1]).

Fast generation - slow replacement, might be handy if there aren't tons of value changes. Hope this helps.

Creating a very large 2D array without a major impact to code run time in Python

Answers (2)

Related Questions