Bram Vanroy

Reputation: 28524

Numpy pad zeroes of given size

I have read most related questions here, but I cannot seem to figure out how to use np.pad in this case. Maybe it is not meant for this particular problem.

Let's say I have a list of Numpy arrays. Every array has the same length, e.g. 2. The list itself has to be padded so that it contains e.g. 5 arrays, and it can be converted into a numpy array as well. The padded elements should be arrays filled with zeroes. As an example:

arr = [array([0, 1]), array([1, 0]), array([1, 1])]
expected_output = array([array([0, 1]), array([1, 0]), array([1, 1]), array([0, 0]), array([0, 0])])

The following seems to work, but I feel there must be a better and more efficient way. In reality this is run hundreds of thousands if not millions of times so speed is important. Perhaps with np.pad?

import numpy as np

def pad_array(l, item_size, pad_size=5):
  s = len(l)

  if s < pad_size:
    zeros = np.zeros(item_size)
    for _ in range(pad_size-s):
      # not sure if I need a `copy` of zeros here?
      l.append(zeros)

  return np.array(l)

B = [np.array([0,1]), np.array([1,0]), np.array([1,1])]
AB = pad_array(B, 2)

print(AB)

Upvotes: 1

Views: 3421

Answers (2)

doca

Reputation: 1558

It seems like you want to pad zeros at the end of axis 0, speaking in numpy terms. So what you need is:

output = numpy.pad(arr, ((0,2),(0,0)), 'constant')

The trick is the pad_width parameter, which you need to specify as pad_width=((0,2),(0,0)) to get your expected output. This tells pad() to add 0 entries before and 2 entries after along axis 0, and 0 entries before and 0 entries after along axis 1. According to the documentation, the format of pad_width is ((before_1, after_1), … (before_N, after_N)).

mode='constant' tells pad() to pad with the value specified by parameter constant_values which defaults to 0.
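The 2 in pad_width above is hard-coded for a list of 3 arrays padded to 5. A minimal sketch of computing it from the input instead could look like the following. The helper name pad_rows and the parameter target_rows are made up for illustration, and it assumes a non-empty list whose items all have the same length, so that it converts to a 2-D array.

```python
import numpy as np


def pad_rows(arr, target_rows=5):
    # Hypothetical helper: pad with zero rows at the end of axis 0.
    # Assumes `arr` converts to a 2-D array (non-empty, equal-length items).
    arr = np.asarray(arr)
    # Number of rows still missing; never negative, so longer inputs pass through.
    missing = max(target_rows - arr.shape[0], 0)
    # (0, missing) pads only after the existing rows; (0, 0) leaves axis 1 alone.
    return np.pad(arr, ((0, missing), (0, 0)), mode="constant")


arr = [np.array([0, 1]), np.array([1, 0]), np.array([1, 1])]
padded = pad_rows(arr)
print(padded)
```

Unlike the function in the question, this does not append to the input list, so calling it repeatedly on the same list is safe.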

Upvotes: 2

Dani Mesejo

Reputation: 61930

You could re-write your function like this:

import numpy as np


def pad_array(l, item_size, pad_size=5):

    if pad_size < len(l):
        return np.array(l)

    s = len(l)
    res = np.zeros((pad_size, item_size))  # create a zero array of shape (pad_size, item_size)
    res[:s] = l  # set the first rows equal to the elements of l

    return res


B = [np.array([0, 1]), np.array([1, 0]), np.array([1, 1])]
AB = pad_array(B, 2)

print(AB)

Output

[[0. 1.]
 [1. 0.]
 [1. 1.]
 [0. 0.]
 [0. 0.]]

The idea is to create an array of zeroes and then fill the first rows with the values from the input list.
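Note that the output above is floating point because np.zeros defaults to float64. If the integer dtype of the input should be kept, one possible variant is sketched below. The name pad_array_keep_dtype is made up for illustration, and it assumes all items in the list share one dtype and have length item_size.

```python
import numpy as np


def pad_array_keep_dtype(l, item_size, pad_size=5):
    # Hypothetical variant: same idea, but the result keeps the input dtype.
    items = np.asarray(l)
    if len(items) >= pad_size:
        return items  # nothing to pad

    # Allocate the zero block with the dtype of the input instead of float64.
    res = np.zeros((pad_size, item_size), dtype=items.dtype)
    res[:len(items)] = items  # copy the existing rows into the top of the block
    return res


B = [np.array([0, 1]), np.array([1, 0]), np.array([1, 1])]
AB = pad_array_keep_dtype(B, 2)

print(AB)
```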

Upvotes: 1
