seermer

Reputation: 651

np.pad significantly slower than concatenate or assignment

Given a list of n integer arrays of variable lengths (each length is less than 500), my goal is to form a single matrix of shape (n, 500), where arrays shorter than 500 are padded at the front with a given constant. However, I noticed that np.pad, which is designed for padding values, is actually very slow compared to other methods; see the benchmark code below:

import random
import time

import numpy as np


def pad(arr):
    retval = np.empty((500,), dtype=np.int64)
    idx = 500 - len(arr)
    retval[:idx] = 100001  # pad value
    retval[idx:] = arr  # original array
    return retval


a = [np.random.randint(low=0, high=100000, size=(random.randint(5, 500),), dtype=np.int64) for _ in range(32)]

# approach 1: np.pad
t = time.time()
for _ in range(10000):
    b = np.array([np.pad(cur, pad_width=(500 - len(cur), 0), mode='constant', constant_values=100001) for cur in a])
print(time.time() - t)

# approach 2: np.concatenate
t = time.time()
for _ in range(10000):
    b = np.array([np.concatenate((np.full((500 - len(cur),), 100001), cur)) for cur in a])
print(time.time() - t)

# approach 3: assign to an empty array
t = time.time()
for _ in range(10000):
    b = np.array([pad(cur) for cur in a])
print(time.time() - t)

b1 = np.array([np.pad(cur, pad_width=(500 - len(cur), 0), mode='constant', constant_values=100001) for cur in a])
b2 = np.array([np.concatenate((np.full((500 - len(cur),), 100001), cur)) for cur in a])
b3 = np.array([pad(cur) for cur in a])
print(np.allclose(b1, b2))
print(np.allclose(b1, b3))
print(np.allclose(b2, b3))

Output:

5.376873016357422
1.297654151916504
0.5892848968505859
True
True
True

Why is np.pad so slow? (It is actually about 10 times slower than assigning into an empty array.) The custom pad() above could be optimized further by creating a single np.empty of shape (n, 500), which is even faster, but for fairness of comparison I still did the padding per row. I have also tried commenting the other approaches out and benchmarking them one by one, but the results are similar, so it probably isn't a caching issue.
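For reference, a minimal sketch of the single-allocation variant mentioned above (assuming the same a list and pad value as in the benchmark; this is only an illustration, not part of the measured code):

b = np.empty((len(a), 500), dtype=np.int64)  # one allocation for the whole matrix
for i, cur in enumerate(a):
    idx = 500 - len(cur)
    b[i, :idx] = 100001  # pad value in front
    b[i, idx:] = cur     # original values at the back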

Upvotes: 1

Views: 692

Answers (2)

Bill

Reputation: 11613

I know this isn't what you were asking about, but if you need a faster way, this is probably about as fast as you can get with NumPy:

t = time.time()
for _ in range(10000):
    b = np.full((len(a), 500), 100001)
    for i, v in enumerate(a):
        b[i, -len(v):] = v
print(time.time() - t)

It's faster because you allocate all the memory in advance and don't do any unnecessary copying.
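For completeness, a quick check (assuming a and the pad() helper from the question are still in scope) that this produces the same matrix as the per-row approaches:

b4 = np.full((len(a), 500), 100001)
for i, v in enumerate(a):
    b4[i, -len(v):] = v
print(np.array_equal(b4, np.array([pad(cur) for cur in a])))  # expect True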

Upvotes: 2

AKX

Reputation: 168967

Probably because the function is pretty complex to begin with and does a whole bunch of validation and preparation before it gets to the actual padding.

Then again, np.concatenate can't handle any of the various modes that np.pad supports.
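One rough way to see that the cost is mostly per-call overhead rather than the copying itself (my own sketch, not measured in the question): time a single np.pad call on a small array against the equivalent slice assignment.

import timeit

import numpy as np

small = np.arange(5, dtype=np.int64)

def with_pad():
    return np.pad(small, pad_width=(495, 0), mode='constant', constant_values=100001)

def with_assign():
    out = np.empty(500, dtype=np.int64)
    out[:495] = 100001  # pad value
    out[495:] = small   # original values
    return out

# Both build the same 500-element array; the gap comes mostly from the
# Python-level validation and preparation inside np.pad.
print(timeit.timeit(with_pad, number=10000))
print(timeit.timeit(with_assign, number=10000))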

Upvotes: 1
