Akashdeep Saluja

Reputation: 3089

How to convert from boolean array to int array in python

I have a NumPy 2-D array in which one column holds Boolean values (True/False). I want to convert them to the integers 1 and 0 respectively. How can I do it?

For example, my data[0::,2] is Boolean. I tried

data[0::,2]=int(data[0::,2])

but it gives me this error:

TypeError: only length-1 arrays can be converted to Python scalars

The first 5 rows of my array are:

[['0', '3', 'True', '22', '1', '0', '7.25', '0'],
 ['1', '1', 'False', '38', '1', '0', '71.2833', '1'],
 ['1', '3', 'False', '26', '0', '0', '7.925', '0'],
 ['1', '1', 'False', '35', '1', '0', '53.1', '0'],
 ['0', '3', 'True', '35', '0', '0', '8.05', '0']]

Upvotes: 21

Views: 67036

Answers (5)

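This is an old question, but for reference: a bool array can be converted to int, and an int array to float. This assumes the column really has a bool dtype, not the strings 'True'/'False':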

data[0::,2]=data[0::,2].astype(int).astype(float)

Upvotes: 0

aslan

Reputation: 840

boolarrayvariable.astype(int) works:

import numpy as np

data = np.random.normal(0, 1, (1, 5))
threshold = 0
test1 = data > threshold     # boolean array
test2 = test1.astype(int)    # True -> 1, False -> 0

Output:

data = array([[ 1.766, -1.765,  2.576, -1.469,  1.69]])
test1 = array([[ True, False,  True, False,  True]], dtype=bool)
test2 = array([[1, 0, 1, 0, 1]])
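
To connect this with the error in the question, here is a minimal sketch of why int() fails where astype(int) works. The names `flags`, `int_failed` and `as_ints` are made up for illustration and do not come from the question.

```python
import numpy as np

flags = np.array([True, False, True])

# int() accepts a single value only, so a multi-element array is rejected.
try:
    int(flags)
    int_failed = False
except TypeError:
    int_failed = True

# astype(int) converts element by element and returns a new array.
as_ints = flags.astype(int)
print(int_failed, as_ints)
```

Note that astype returns a copy, so the result has to be assigned to something to be kept.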

Upvotes: 15

Mr. B

Reputation: 2696

If I do this on your raw data source, which is strings:

data = [['0', '3', 'True', '22', '1', '0', '7.25', '0'],
        ['1', '1', 'False', '38', '1', '0', '71.2833', '1'],
        ['1', '3', 'False', '26', '0', '0', '7.925', '0'],
        ['1', '1', 'False', '35', '1', '0', '53.1', '0'],
        ['0', '3', 'True', '35', '0', '0', '8.05', '0']]

data = [[eval(x) for x in y] for y in data]

and then follow that with:

data = [[float(x) for x in y] for y in data]
# or this if you prefer (needs `import numpy`):
arr = numpy.array(data)

then the problem is solved. You can even do it as a one-liner (I think this makes ints, though, and floats are probably needed): numpy.array([[eval(x) for x in y] for y in data])

I think the problem is that NumPy is keeping your numeric strings as strings, and since not all of your strings are numeric, you can't do a type conversion on the whole array. Also, if you try a type conversion on just the parts of the array holding "True" and "False", you're not really working with booleans but with strings. The only way I know to change that is the eval statement. Well, you could do this, too:

# Map the boolean strings to 1/0; every other entry is parsed as a float.
booltext_int = {'True': 1, 'False': 0}
clean = [[float(x) if x[-1].isdigit() else booltext_int[x]
          for x in y] for y in data]

This way you avoid eval, which is inherently insecure. That may not matter, though, if you are using a trusted data source.
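
If only the one column holds 'True'/'False', a comparison on that column is another eval-free option. This is a rough sketch on a cut-down copy of the question's data (four columns instead of eight); `is_true` and `numeric` are illustrative names, and it assumes the column contains nothing but the strings 'True' and 'False'.

```python
import numpy as np

data = np.array([['0', '3', 'True', '22'],
                 ['1', '1', 'False', '38']])

# Comparing the string column with 'True' gives a real boolean array.
is_true = data[:, 2] == 'True'

# Write the result back as '1'/'0' strings, because `data` has a string dtype.
data[:, 2] = is_true.astype(int).astype(str)

# Every entry is now a numeric string, so the whole array can be cast.
numeric = data.astype(float)
```

Be aware that anything other than 'True' (including a typo) ends up as 0 here, so the dictionary lookup above is stricter.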

Upvotes: 2

jamylak

Reputation: 133554

Using @kirelagin's idea with ast.literal_eval:

>>> import ast
>>> import numpy as np
>>> arr = np.array(
...     [['0', '3', 'True', '22', '1', '0', '7.25', '0'],
...      ['1', '1', 'False', '38', '1', '0', '71.2833', '1'],
...      ['1', '3', 'False', '26', '0', '0', '7.925', '0'],
...      ['1', '1', 'False', '35', '1', '0', '53.1', '0'],
...      ['0', '3', 'True', '35', '0', '0', '8.05', '0']])
>>> # np.float was removed in NumPy 1.24; the builtin float works the same here
>>> np.vectorize(ast.literal_eval, otypes=[float])(arr)
array([[  0.    ,   3.    ,   1.    ,  22.    ,   1.    ,   0.    ,
          7.25  ,   0.    ],
       [  1.    ,   1.    ,   0.    ,  38.    ,   1.    ,   0.    ,
         71.2833,   1.    ],
       [  1.    ,   3.    ,   0.    ,  26.    ,   0.    ,   0.    ,
          7.925 ,   0.    ],
       [  1.    ,   1.    ,   0.    ,  35.    ,   1.    ,   0.    ,
         53.1   ,   0.    ],
       [  0.    ,   3.    ,   1.    ,  35.    ,   0.    ,   0.    ,
          8.05  ,   0.    ]])

Upvotes: 1

kirelagin

Reputation: 13616

OK, the easiest way to change the type of any array to float is:

data.astype(float)

The issue with your array is that float('True') raises a ValueError, because 'True' can't be parsed as a float. So the best fix is to change your array generation code so that it produces floats (or, at least, strings with valid float literals) instead of bools.

In the meantime you can use this function to fix your array:

def boolstr_to_floatstr(v):
    if v == 'True':
        return '1'
    elif v == 'False':
        return '0'
    else:
        return v

And finally you convert your array like this:

new_data = np.vectorize(boolstr_to_floatstr)(data).astype(float)
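
As a possible alternative to the helper function, here is a minimal sketch that does the same replacement with np.where on the first two rows of the question's data. It is not part of the approach above, and the name `cleaned` is only illustrative.

```python
import numpy as np

data = np.array([['0', '3', 'True', '22', '1', '0', '7.25', '0'],
                 ['1', '1', 'False', '38', '1', '0', '71.2833', '1']])

# Swap the boolean strings for '1'/'0' and leave every other entry untouched.
cleaned = np.where(data == 'True', '1',
                   np.where(data == 'False', '0', data))

# Every entry is now a valid float literal, so the cast succeeds.
new_data = cleaned.astype(float)
```

Both versions still depend on every remaining string being a valid float literal; anything else makes astype(float) raise a ValueError.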

Upvotes: 25
