Noel DSouza

Reputation: 71

Error replacing numbers in an array with predefined values in python

I'm trying to replace values in an array (0-99, repeating) with a new set based on the value of n. For example, if n=0, the values 0, 10, 20, ..., 90 should be replaced by 0, 1, 2, ..., 9 and all other values should become 10. The following code works for every value of n from 0 to 8, but for n=9 it fails with the message "long() argument must be a string or a number, not 'NoneType'". I've spent a lot of time debugging this but can't find the problem.

import numpy as np
arr1=[[19][29][ 0][11][ 1][86][90][28][23][31][39][96][82][17][71][39][ 8][97]]
n = 9
d = {}
for i, j in zip(range(n, 100, 10), range(10)):
    d[i] = j
arr2 = np.vectorize(d.get)(arr1)
arr2[arr2 == None] = 10

arr1 is the original array and arr2 is the new array.

output should be

arr2=[[ 1] [ 2] [10] [10] [10] [10] [10] [10] [10] [10] [ 3] [10] [10] [10] [10] [ 3] [10] [10]]

Upvotes: 0

Views: 284

Answers (2)

hpaulj

Reputation: 231665

Correction:

arr1=[[19],[29],[ 0],[11],[ 1],[86],[90],[28],[23],[31],[39],[96],[82],[17],[71],[39],[ 8],[97]]

d is:

{9: 0, 19: 1, 29: 2, 39: 3, 49: 4, 59: 5, 69: 6, 79: 7, 89: 8, 99: 9}

The error, with full traceback, is:

Traceback (most recent call last):
  File "stack53618793.py", line 8, in <module>
    arr2 = np.vectorize(d.get)(arr1)
  File "/usr/local/lib/python3.6/dist-packages/numpy/lib/function_base.py", line 1972, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "/usr/local/lib/python3.6/dist-packages/numpy/lib/function_base.py", line 2051, in _vectorize_call
    res = array(outputs, copy=False, subok=True, dtype=otypes[0])
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'

With n=8 d is {8: 0, 18: 1, 28: 2, 38: 3, 48: 4, 58: 5, 68: 6, 78: 7, 88: 8, 98: 9}. arr2 has a lot of None because that's the default for get.

vectorize performs a test calculation with the first element of arr1, and uses the result to set the return dtype.

With n=8, get(19) returns None, so the return dtype is set to object.

With n=9, get(19) returns integer 1 (it's in d), so the return dtype is int. That produces an error when another get returns None.
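The dtype inference described above can be reproduced in isolation. This is only a minimal sketch: the two-element inputs are made up for illustration, and it assumes a NumPy version that still infers otypes from a trial call on the first element.

```python
import numpy as np

n = 9
# Same mapping as in the question: {9: 0, 19: 1, ..., 99: 9}
d = {i: j for i, j in zip(range(n, 100, 10), range(10))}

# First element (8) is not in d: the trial call returns None,
# so vectorize settles on an object dtype and later values fit.
miss_first = np.vectorize(d.get)([8, 19])
print(miss_first.dtype)

# First element (19) is in d: the trial call returns an int,
# so the output dtype is int and the later None cannot be converted.
try:
    np.vectorize(d.get)([19, 8])
    raised = False
except TypeError as exc:
    raised = True
    print("TypeError:", exc)
```

Only the order of the two inputs differs between the calls, which is why the failure depends on what the first element of arr1 happens to be.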

One fix is to set the otypes.

arr2 = np.vectorize(d.get, otypes=[object])(arr1)

Another possibility is to replace get with a get that supplies a default value of 10:

arr2 = np.vectorize(lambda x: d.get(x,10))(arr1)

Then you don't need the None replacement step.

This vectorized get is probably not the fastest way of doing this replacement. But if you do use vectorize you need to watch out for traps like this automatic otypes.
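As one possible alternative that avoids vectorize altogether, the values 0-99 can index a lookup table directly. This is a sketch, not a benchmarked recommendation; the name table is introduced here for illustration and is not part of the original code.

```python
import numpy as np

arr1 = np.array([[19], [29], [0], [11], [1], [86], [90], [28], [23],
                 [31], [39], [96], [82], [17], [71], [39], [8], [97]])
n = 9

# One slot per possible input value 0..99, defaulting to 10.
table = np.full(100, 10)
# Slots n, n+10, ..., n+90 map to 0, 1, ..., 9.
table[n::10] = np.arange(10)

# Fancy indexing applies the mapping to every element and keeps arr1's shape.
arr2 = table[arr1]
print(arr2.ravel())
```

Because the table covers every valid input, no None replacement step and no dtype guessing are involved.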

When you ask about an error, you should include the full traceback, or at least enough so we know exactly where the error occurs. It wasn't obvious to me until I ran the test case.

Upvotes: 1

Tarifazo

Reputation: 4343

You can use np.putmask to replace specific values with a formula based on those values.

As for your case, you can use the modulus operator: it's easier and faster than using a dictionary. Does this represent your desired input/output?

import numpy as np
n = 9

arr1=np.random.randint(0, 100, size=20)
arr2 = arr1.copy()
np.putmask(arr2, (arr1-n)%10 == 0, arr1 % 10)

print(arr1)
print(arr2)

[69 70 63 52 27 96 0 40 2 90 36 24 17 90 67 58 74 50 11 58]

[ 9 70 63 52 27 96 0 40 2 90 36 24 17 90 67 58 74 50 11 58]

Edited for your desired output:

n = 9 
arr1=np.random.randint(0, 100, size=20)
arr2 = arr1.copy()
mask = (arr1-n)%10 == 0
np.putmask(arr2, mask , arr1 // 10)
np.putmask(arr2, ~mask , 10)
print(arr1)
print(arr2)

[28 72 87 31 87 3 34 96 61 14 25 79 74 25 38 87 38 8 6 8]

[10 10 10 10 10 10 10 10 10 10 10 7 10 10 10 10 10 10 10 10]
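The same masking idea can also be written as a single np.where call. The sketch below uses the values from the question (flattened to one dimension) instead of random input, so the result can be compared with the expected output; it assumes n is between 0 and 9, where arr1 % 10 == n selects the same elements as (arr1-n)%10 == 0.

```python
import numpy as np

arr1 = np.array([19, 29, 0, 11, 1, 86, 90, 28, 23,
                 31, 39, 96, 82, 17, 71, 39, 8, 97])
n = 9

# Where the last digit equals n, keep the tens digit; everywhere else use 10.
arr2 = np.where(arr1 % 10 == n, arr1 // 10, 10)
print(arr2)
```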

If you want to use a dictionary, set the default value in the .get method

arr2 = np.vectorize(lambda x: d.get(x,10))(arr1)

Upvotes: 2
