zachvac

Reputation: 730

Python Numpy - Create 2d array where length is based on 1D array

Sorry for the confusing title, but I'm not sure how to make it more concise. Here are my requirements:

arr1 = np.array([3,5,9,1])
arr2 = ?(arr1)

arr2 would then be:

[
[0,1,2,0,0,0,0,0,0],
[0,1,2,3,4,0,0,0,0],
[0,1,2,3,4,5,6,7,8],
[0,0,0,0,0,0,0,0,0]
]

The width doesn't need to vary based on the max; the shape is known in advance. To start, I've been able to create an array of zeros with that shape:

arr2 = np.zeros((len(arr1),max_len))

And then of course I could do a for loop over arr1 like this:

for i, element in enumerate(arr1):
    arr2[i,0:element] = np.arange(element)

but that would likely take a long time, since both dimensions are rather large (arr1 has a few million rows and max_len is around 500). Is there a clean, optimized way to do this in NumPy?

Upvotes: 3

Views: 273

Answers (3)

hilberts_drinking_problem

Reputation: 11602

I am adding a slight variation on @hpaulj's answer because you mentioned that max_len is around 500 and you have millions of rows. In this case, you can precompute a 500 by 500 matrix containing all possible rows and index into it using arr1:

import numpy as np
np.random.seed(0)

max_len = 500
arr = np.random.randint(0, max_len, size=10**5)

# generate all unique rows first, then index
# can be faster if max_len << len(arr)
# 53 ms
template = np.tril(np.arange(max_len)[None,:].repeat(max_len,0), k=-1)
res = template[arr,:]

# 173 ms
res1 = np.arange(max_len)[None,:].repeat(arr.size,0)
res1[res1>=arr[:,None]] = 0

assert (res == res1).all()
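
One caveat with the lookup-table idea: the template above has max_len rows, so it covers lengths 0 through max_len - 1. The question's example includes a length equal to the width (9), which would need one extra row. A minimal sketch of that variant, assuming lengths can range from 0 to max_len inclusive (the helper name build_template and the int16 dtype are my own choices, not part of the original answer):

```python
import numpy as np

def build_template(max_len, dtype=np.int16):
    """Return a (max_len + 1, max_len) table; row k holds 0..k-1, then zeros."""
    cols = np.arange(max_len, dtype=dtype)
    lengths = np.arange(max_len + 1)[:, None]
    # Keep the column index where it is below the row's length, else 0.
    return np.where(cols < lengths, cols, 0).astype(dtype)

arr1 = np.array([3, 5, 9, 1])
max_len = 9
template = build_template(max_len)
arr2 = template[arr1]  # fancy indexing copies one template row per entry of arr1
print(arr2)
```

The table is built once and stays small (about max_len squared entries), so the per-row work reduces to a copy.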

Upvotes: 0

Akshay Sehgal

Reputation: 19332

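Try this with itertools.zip_longest:

import itertools
import numpy as np

arr1 = np.array([3, 5, 9, 1])

# One range per row: range(3), range(5), range(9), range(1)
l = map(range, arr1)
# zip_longest returns an iterator; materialize it into a list,
# since NumPy's stacking functions expect a sequence
arr2 = np.column_stack(list(itertools.zip_longest(*l, fillvalue=0)))
print(arr2)
[[0 1 2 0 0 0 0 0 0]
 [0 1 2 3 4 0 0 0 0]
 [0 1 2 3 4 5 6 7 8]
 [0 0 0 0 0 0 0 0 0]]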
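
Note that zip_longest pads only to the longest range, so the width of the result is arr1.max() rather than a fixed max_len. If the width is fixed in advance, as in the question, the result can be padded afterwards. A sketch, with max_len = 12 chosen purely for illustration:

```python
import itertools
import numpy as np

arr1 = np.array([3, 5, 9, 1])
max_len = 12  # illustrative value; the question's real max_len is around 500

rows = map(range, arr1)
arr2 = np.column_stack(list(itertools.zip_longest(*rows, fillvalue=0)))
# Append zero columns on the right until the width reaches max_len.
arr2 = np.pad(arr2, ((0, 0), (0, max_len - arr2.shape[1])))
print(arr2.shape)
```

This still iterates over every element in Python, so it is likely to be much slower than a vectorized approach at a few million rows.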

Upvotes: 1

hpaulj

Reputation: 231510

Building on a 'padding' idea posted by @Divakar some years ago:

In [161]: res = np.arange(9)[None,:].repeat(4,0)
In [162]: res[res>=arr1[:,None]] = 0
In [163]: res
Out[163]: 
array([[0, 1, 2, 0, 0, 0, 0, 0, 0],
       [0, 1, 2, 3, 4, 0, 0, 0, 0],
       [0, 1, 2, 3, 4, 5, 6, 7, 8],
       [0, 0, 0, 0, 0, 0, 0, 0, 0]])
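
The same mask can also be applied in one step with np.where, which avoids building the repeated array before zeroing it. A sketch, assuming the values fit in a 16-bit integer (true for a max_len around 500). The smaller dtype matters at this scale, since a few million rows by 500 columns of 8-byte integers runs to several gigabytes:

```python
import numpy as np

arr1 = np.array([3, 5, 9, 1])
max_len = 9  # known in advance, per the question

cols = np.arange(max_len, dtype=np.int16)  # shape (max_len,)
# Broadcasting compares every column index against every row's length,
# giving a (len(arr1), max_len) boolean mask.
arr2 = np.where(cols < arr1[:, None], cols, np.int16(0))
print(arr2)
```

The result has shape (len(arr1), max_len) and dtype int16, a quarter of the memory of the default 64-bit integers.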

Upvotes: 2
