Reputation: 651
Given a list of n integer arrays of variable lengths (each of length at most 500), my goal is to form a single matrix of shape (n, 500), where arrays shorter than 500 are padded at the front with a given constant. However, I noticed that np.pad, which is designed for exactly this kind of padding, is actually very slow compared to other methods; see the benchmark code below:
import random
import time

import numpy as np

def pad(arr):
    retval = np.empty((500,), dtype=np.int64)
    idx = 500 - len(arr)
    retval[:idx] = 100001  # pad value
    retval[idx:] = arr     # original array
    return retval
a = [np.random.randint(low=0, high=100000, size=(random.randint(5, 500),), dtype=np.int64) for _ in range(32)]
# approach 1: np.pad
t = time.time()
for _ in range(10000):
    b = np.array([np.pad(cur, pad_width=(500 - len(cur), 0), mode='constant', constant_values=100001) for cur in a])
print(time.time() - t)

# approach 2: np.concatenate
t = time.time()
for _ in range(10000):
    b = np.array([np.concatenate((np.full((500 - len(cur),), 100001), cur)) for cur in a])
print(time.time() - t)

# approach 3: assign to an empty array
t = time.time()
for _ in range(10000):
    b = np.array([pad(cur) for cur in a])
print(time.time() - t)
b1 = np.array([np.pad(cur, pad_width=(500 - len(cur), 0), mode='constant', constant_values=100001) for cur in a])
b2 = np.array([np.concatenate((np.full((500 - len(cur),), 100001), cur)) for cur in a])
b3 = np.array([pad(cur) for cur in a])
print(np.allclose(b1, b2))
print(np.allclose(b1, b3))
print(np.allclose(b2, b3))
Output:
5.376873016357422
1.297654151916504
0.5892848968505859
True
True
True
Why is np.pad so slow? (It is actually about 10 times slower than assigning into an empty array.) The custom pad() above could be optimized further by creating a single np.empty of shape (n, 500), which is even faster (sketched below), but for fairness of comparison I still pad row by row in the benchmark. I have also tried commenting out the other approaches and benchmarking each one separately, but the results are similar, so it probably isn't something like a caching issue.
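For completeness, a rough sketch of that single-allocation variant (this is only what I mean by it, not one of the timed approaches above):

def pad_all(arrs):
    # one allocation for the whole (n, 500) matrix instead of one per row
    out = np.empty((len(arrs), 500), dtype=np.int64)
    for i, cur in enumerate(arrs):
        out[i, :500 - len(cur)] = 100001  # pad value in front
        out[i, 500 - len(cur):] = cur     # original array at the back
    return out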
Upvotes: 1
Views: 692
Reputation: 11613
I know this isn't what you were asking about, but if you need a faster way, this is probably about as fast as you can get with NumPy:
t = time.time()
for _ in range(10000):
    b = np.full((len(a), 500), 100001)
    for i, v in enumerate(a):
        b[i, -len(v):] = v
print(time.time() - t)
It's faster because you allocate all the memory in advance and don't do any unnecessary copying.
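One small caveat to add (a platform assumption, not part of the original answer): np.full infers its dtype from the fill value, so on platforms where the default integer is 32-bit you may want to pass dtype explicitly to match the int64 arrays in the question:

b = np.full((len(a), 500), 100001, dtype=np.int64)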
Upvotes: 2
Reputation: 168967
Probably because the function is pretty complex to begin with and does a whole bunch of validation and preparation before it gets to the actual padding.
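A rough way to check that hypothesis (my sketch, not measured in this answer): if the per-call validation and preparation dominates, the amount of padding requested should barely affect the timing, whereas a copy-bound operation would scale with it:

import timeit
import numpy as np

arr = np.arange(400, dtype=np.int64)

# if the cost were mostly the copy, the second call (padding 10000 values)
# should take far longer than the first (padding 100); if per-call
# setup dominates, the two timings will be close
print(timeit.timeit(lambda: np.pad(arr, (100, 0), constant_values=100001), number=10000))
print(timeit.timeit(lambda: np.pad(arr, (10000, 0), constant_values=100001), number=10000))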
Then again, np.concatenate can't do, e.g., any of the various modes that np.pad can.
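For illustration (my example, not from the original answer), two of those modes, neither of which has a one-line np.concatenate equivalent:

import numpy as np

x = np.array([1, 2, 3])
print(np.pad(x, (2, 0), mode='edge'))     # repeat the edge value: [1 1 1 2 3]
print(np.pad(x, (2, 0), mode='reflect'))  # mirror the array:      [3 2 1 2 3]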
Upvotes: 1