Nice

Reputation: 21

Produce an Array from a Lambda Function, dimension shape and dtype

I need to create a function that takes a lambda, a dimension shape and a NumPy dtype, and produces an array. I know there's np.fromfunction, which does exactly that, but I can't use it. One way of looking at it is that I need to hand-code my own fromfunction. The problem I'm having is that the lambda function can't be passed, as it's a function. I've tried using loops and using indices; I'm new to the latter idea, so I might not be doing it properly. Basically, I need to fill in the function below. The inputs are given, as is the expected result in the comment.

import numpy as np
def array_function(f, d, dtype=None):

    return x

print(array_function(lambda i,j: (i - j)**2, [4, 4]))

# Expected Result
#[[0. 1. 4. 9.]
# [1. 0. 1. 4.]
# [4. 1. 0. 1.]
# [9. 4. 1. 0.]]

Upvotes: 0

Views: 1271

Answers (3)

hpaulj

Reputation: 231385

For this lambda, fromfunction works fine:

In [1]: foo = lambda i,j: (i-j)**2                                              
In [2]: np.fromfunction(foo,(4,4))                                              
Out[2]: 
array([[0., 1., 4., 9.],
       [1., 0., 1., 4.],
       [4., 1., 0., 1.],
       [9., 4., 1., 0.]])

fromfunction generates a 'grid' of indices from the shape:

In [7]: np.indices((4,4))                                                       
Out[7]: 
array([[[0, 0, 0, 0],
        [1, 1, 1, 1],
        [2, 2, 2, 2],
        [3, 3, 3, 3]],

       [[0, 1, 2, 3],
        [0, 1, 2, 3],
        [0, 1, 2, 3],
        [0, 1, 2, 3]]])

and passes the two planes (1st dimension) to your function. Your function, as written, works with arrays such as these 2d grids. meshgrid and mgrid (and ogrid) generate similar indices.
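A minimal sketch of what that amounts to, assuming the function accepts whole index arrays. The name array_function comes from the question; the default dtype=float is my assumption, chosen to match the default of np.fromfunction:

```python
import numpy as np

def array_function(f, d, dtype=float):
    # One index plane per dimension, created directly in the requested dtype.
    grids = np.indices(tuple(d), dtype=dtype)
    # Unpack the planes as positional arguments: f(I, J, ...).
    return f(*grids)

result = array_function(lambda i, j: (i - j) ** 2, [4, 4])
print(result)
```

This only helps when f is written with array operations; a function that branches on scalar values needs one of the iterating approaches further down.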

But I could just as well create two arrays directly and pass them to foo:

In [8]: foo(np.arange(4)[:,None], np.arange(4))                                 
Out[8]: 
array([[0, 1, 4, 9],
       [1, 0, 1, 4],
       [4, 1, 0, 1],
       [9, 4, 1, 0]])

These two input arrays broadcast against each other just as the 2 planes in Out[7] do. They are, in effect, the (4,1)- and (4,)-shaped equivalents of those (4,4) planes.
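A small check of that claim (the names I, J and dense are mine, for illustration only): the two compact arrays broadcast to the same shape as the dense planes and give the same result.

```python
import numpy as np

foo = lambda i, j: (i - j) ** 2

I = np.arange(4)[:, None]   # shape (4, 1): row index, one per row
J = np.arange(4)            # shape (4,): column index, one per column

# Shape the pair broadcasts to when combined in an expression.
broadcast_shape = np.broadcast(I, J).shape
print(broadcast_shape)

# The dense planes from np.indices produce the same values.
dense = np.indices((4, 4))
same = np.array_equal(foo(I, J), foo(dense[0], dense[1]))
print(same)
```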

Note that in Python a lambda is just an anonymous function. Here I assigned it to a variable, giving it a name (of sorts). A def function could be used just as well.

So as long as your function works with the required 2d index arrays, you don't need any special coding.

If the function only works with scalar values of i and j, then you have to resort to something that iterates at a Python level (as opposed to using compiled numpy functions).

The list comprehension version:

In [6]: np.array([[foo(i,j) for j in range(4)] for i in range(4)])              
Out[6]: 
array([[0, 1, 4, 9],
       [1, 0, 1, 4],
       [4, 1, 0, 1],
       [9, 4, 1, 0]])

I rather like frompyfunc, which would be used as:

In [9]: f = np.frompyfunc(foo, 2,1)                                             
In [10]: f(np.arange(4)[:,None], np.arange(4))                                  
Out[10]: 
array([[0, 1, 4, 9],
       [1, 0, 1, 4],
       [4, 1, 0, 1],
       [9, 4, 1, 0]], dtype=object)

Note it returns an object dtype. This can be modified with an astype. It could also be passed to fromfunction if you are too 'lazy' to write your own broadcastable I and J arrays.
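A sketch of that combination, under the assumption that the function only handles scalars. scalar_only is a hypothetical example of such a function, not something from the question:

```python
import numpy as np

def scalar_only(i, j):
    # Hypothetical scalar-only function: the `if` fails on whole arrays.
    return 0 if i == j else (i - j) ** 2

f = np.frompyfunc(scalar_only, 2, 1)

# fromfunction builds the index grids; dtype=int keeps the indices integral.
obj = np.fromfunction(f, (4, 4), dtype=int)

# frompyfunc always returns object dtype, so convert afterwards.
result = obj.astype(float)
print(result)
```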

In my experience the frompyfunc approach is modestly faster than the list comprehension (up to about 2x). On the other hand, if foo works with whole arrays as in [8], the speedup is more like 10x. So, performance-wise, you'll be happiest if you can write functions that work with whole arrays rather than scalar indices.
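A rough way to measure this yourself. This is only a sketch: the size n = 200 and the repeat count are arbitrary choices of mine, and the ratios will vary with the function, the array size and the machine.

```python
import timeit
import numpy as np

foo = lambda i, j: (i - j) ** 2
n = 200
I, J = np.arange(n)[:, None], np.arange(n)
f = np.frompyfunc(foo, 2, 1)

def by_comprehension():
    # Python-level double loop, one scalar call per element.
    return np.array([[foo(i, j) for j in range(n)] for i in range(n)])

def by_frompyfunc():
    # Still one Python call per element, but the looping is done by numpy.
    return f(I, J).astype(int)

def by_broadcast():
    # A single call on whole arrays.
    return foo(I, J)

for fn in (by_comprehension, by_frompyfunc, by_broadcast):
    print(fn.__name__, timeit.timeit(fn, number=5))
```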

Upvotes: 1

Mark

Reputation: 92440

lambda function can't be passed

A lambda function is a function, and functions are first-class objects in Python, so it's no trouble to pass one as an argument to another function. You can just write a loop or comprehension over the two dimensions, build an array, then reshape:

import numpy as np
def array_function(f, d, dtype=None):
    a = np.array([f(i, j) for i in range(d[0]) for j in range(d[1])], dtype)
    return a.reshape(d)

print(array_function(lambda i,j: (i - j)**2, [4, 4]))

result:

[[0 1 4 9]  
 [1 0 1 4]  
 [4 1 0 1]  
 [9 4 1 0]]  

It's a little trickier if you want array_function to handle an arbitrary number of dimensions. One option is to create an array of the required shape and then enumerate over all of its elements, calling the function with each coordinate:

import numpy as np
def array_function(f, d, dtype=None):
    a = np.zeros(d, dtype=dtype)
    for coord, val in np.ndenumerate(a):
        a[coord] = f(*coord)    
    return a

# three dimensions    
print(array_function(lambda i, j, k: k + (i - j)**2, [4, 5, 2], float))
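As a quick sanity check of the loop approach (a sketch; iterating with np.ndindex and comparing against np.fromfunction are my additions), the two agree whenever the function also happens to accept arrays:

```python
import numpy as np

def array_function(f, d, dtype=None):
    a = np.zeros(d, dtype=dtype)
    # np.ndindex yields every coordinate tuple for the given shape.
    for coord in np.ndindex(*d):
        a[coord] = f(*coord)
    return a

g = lambda i, j, k: k + (i - j) ** 2
looped = array_function(g, [4, 5, 2], float)
vectorized = np.fromfunction(g, (4, 5, 2))
print(looped.shape, np.array_equal(looped, vectorized))
```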

Edit based on comment

You can build an iterator with starmap and product from itertools. I'm not sure an iterator buys you much with numpy since you usually want to know the size of the array upon creation. You can pass it a length, which isn't required but improves performance:

from itertools import product, starmap
import numpy as np
from operator import mul
from functools import reduce

def array_function(f, d, dtype=None):
    length = reduce(mul, d)
    iterator = starmap(f, product(*[range(x) for x in d]))

    a = np.fromiter(iterator, dtype, length)
    return a.reshape(d)

print(array_function(lambda i, j: (i - j)**2, [4, 4], float))

Upvotes: 0

gmds

Reputation: 19885

How about creating a np.ufunc from your lambda and using reduce to apply it over multiple dimensions?

from functools import reduce
import numpy as np

def apply(f, shape, dtype=None):
    ufunc = np.frompyfunc(f, 2, 1)
    ranges = (np.arange(dim) for dim in shape)
    return reduce(ufunc.outer, ranges).astype(dtype)

print(apply(lambda i, j: (i - j) ** 2, (4, 4)))

Output:

[[0. 1. 4. 9.]
 [1. 0. 1. 4.]
 [4. 1. 0. 1.]
 [9. 4. 1. 0.]]
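Because the ufunc here is built with two inputs, the reduce over ufunc.outer fits the two-dimensional case; with three ranges it would evaluate f(f(i, j), k) rather than f(i, j, k). A hedged sketch for any number of dimensions, where apply_nd is a name I made up, creates the ufunc with len(shape) inputs and feeds it open (sparse) index grids:

```python
import numpy as np

def apply_nd(f, shape, dtype=float):
    # One ufunc input per dimension.
    ufunc = np.frompyfunc(f, len(shape), 1)
    # sparse=True returns one broadcastable index array per dimension.
    grids = np.indices(shape, sparse=True)
    # frompyfunc yields object dtype, so convert at the end.
    return ufunc(*grids).astype(dtype)

two_d = apply_nd(lambda i, j: (i - j) ** 2, (4, 4))
three_d = apply_nd(lambda i, j, k: k + (i - j) ** 2, (4, 5, 2))
print(two_d)
print(three_d.shape)
```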

Upvotes: 0
