ched
ched

Reputation: 83

Vectorizing a for loop which contains an iterator (Numpy Array)

I am trying to make a python function run faster by taking advantage of the vectorization features in Numpy.

The aim of the function is that at the end of each iteration of the for loop, the location arr[i, 0] (i.e first entry of each row) will contain the number of rows in the 2D array (data), whose value if at position 0 is less than or equal to that or array i

# Data is a 2D array
def function(data)
    n = len(data)
    arr = np.zeros(shape=(n, 9))
    for i, sample in enumerate(data):
        arr[i][0] = np.count_nonzero(data[:, 0] <= data[i][0])

I am currently using a for loop. The runtime is horribly slow as this is being used in a machine learning algorithm. I have been told that I can further improve this by vectorizing the for loop.

I have tried this, in my attempt to use list comprehension:

arr[:, 0] = ( np.count_nonzero(data[:, x] >  data[i][x]) for i in gini_arr[:, 0] )

But I get a generator error:

TypeError: float() argument must be a string or a number, not 'generator'

How can I vectorize this for loop?

Upvotes: 1

Views: 1234

Answers (1)

Valdi_Bo
Valdi_Bo

Reputation: 30991

In your comment to my initial solution you wrote: in column 0, no element is less than 0. So this brings me to a conclusion that in your function there should be < instead of <=.

Another weird point is that you created a loop with one of control variables named sample but you never use this variable. A more natural solution is to iterate over a range(n) (you computed n earlier, as the number of rows in data).

The last detail to correct in your code is that your function does not return anything.

And one improvement: As you compute the count of elements meeting some condition, the result is by definition an array of integers, not float.

So the corrected content of your function (operating the old way) should be:

def function(data):
    n = len(data)
    arr = np.zeros((n, 9), dtype=int)
    for i in range(n):
        arr[i, 0] = np.count_nonzero(data[:, 0] < data[i, 0])
    return arr

How to compute this result in a vectorized way

You can get just the same result, when you define your function as:

def fn(data):
    dd = data[:, 0]  # The first column
    res = np.less(dd[np.newaxis, :], dd[:, np.newaxis]).sum(axis=1)
    return np.hstack([res[:, np.newaxis], np.zeros((dd.size, 8), dtype=int)])

Note that res (a 1-D array) contains the result of comparison of each element in dd with all elements in dd.

The final result is generated by:

  • conversion of res to an array with a single column,
  • "addition" of 8 columns filled with zeroes.

To test this function I created a test array:

data = np.arange(25).reshape(5,-1)
data[3,0] = 1

so that it contains:

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [ 1, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

Then I ran fn(data) getting:

array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
       [2, 0, 0, 0, 0, 0, 0, 0, 0],
       [3, 0, 0, 0, 0, 0, 0, 0, 0],
       [1, 0, 0, 0, 0, 0, 0, 0, 0],
       [4, 0, 0, 0, 0, 0, 0, 0, 0]])

For the above test data (5 rows) my function computed only 5 % faster (107 µs yours and 101 µs mine).

But for a bigger number of rows, e.g. 60, your function completed in 1.24 ms, whereas my function in 165 µs, so 7 times faster.

For yet biger number of rows, the advantage of my code should be yet more visible.

Edit following the comment as of 2020-09-07

Explanation of how np.newaxis works:

  1. dd (in fn function) contains array([ 0, 5, 10, 1, 20]) - the first column of data, but here it is a 1-D array.

  2. dd[np.newaxis, :] is a 2-D array, containing just the same numbers, in a single row: array([[ 0, 5, 10, 1, 20]]).

  3. dd[:, np.newaxis] is also a 2-D array, but this time the same numbers are the content of a single column:

    array([[ 0],
           [ 5],
           [10],
           [ 1],
           [20]])
    
  4. Now run np.less(dd[np.newaxis, :], dd[:, np.newaxis]).astype(int) (to facilitate the assessment of what has been computed, I added conversion to int):

    array([[0, 0, 0, 0, 0],
           [1, 0, 0, 1, 0],
           [1, 1, 0, 1, 0],
           [1, 0, 0, 0, 0],
           [1, 1, 1, 1, 0]])
    

    The above result is a 2-D array, the result of comparison (left_argument < right_argument) where:

    • row number is the number of element from "row argument",
    • column number is the number of element "column argument".

    So the whole array is the result of "each to each" comparison.

    Example: The first element in row (0) is not less than any element of the column, so the first row of the result contains all zeroes.

  5. And the last step is sum(axis=1), i.e. compute sum of the above array along each row.

This way the final result is: array([0, 2, 3, 1, 4]) and it will be the first column of the whole result of fn function.

If you still need more explanation, search the Web for np.newaxis. Even on SO you can find many explanations on this matter.

Upvotes: 2

Related Questions