Nathan Krowitz

Reputation: 43

Non-looping way in NumPy to convert a string of letters into a boolean array (one entry per letter of the alphabet)

I have an array of strings, and I'd like to treat each string as a boolean array over the alphabet (A-Z), with a 1 at every position whose letter appears in the string.

My goal is to do this in a vectorized way and avoid any looping.

E.g.

Input:
A = np.array(['A'])
B = np.array(['AB'])
C = np.array(['AZ'])
D = np.array(['AZ','BAZ'])


Output:
A = np.array([1,0,0,0,...0]) 
B = np.array([1,1,0,0,...0])
C = np.array([1,0,0,0,...1])
D = np.array([[1,0,0,0,...1], [1,1,0,0,...1]])

Upvotes: 2

Views: 118

Answers (1)

Shubham Sharma

Reputation: 71707

map with MultiLabelBinarizer.transform

We can fit a MultiLabelBinarizer on the capital letters A-Z, then apply its transform method to each of the arrays A, B, C, and D. Each result is a 2D array with one row per input string and one column per letter.

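import string
from sklearn.preprocessing import MultiLabelBinarizer

# Fit on the 26 uppercase letters so the output columns are ordered A..Z
mlb = MultiLabelBinarizer().fit([*string.ascii_uppercase])
# A, B, C, D are the input arrays from the question
A, B, C, D = map(mlb.transform, (A, B, C, D))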

>>> A
array([[1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0]])

>>> B
array([[1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0]])
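
If you would rather stay in NumPy only, here is a rough sketch of one possible alternative (not part of the approach above). It assumes the input is a NumPy unicode string array (dtype kind 'U') and reinterprets its fixed-width buffer as code points. The function name letters_to_indicator is made up for illustration, and characters outside A-Z (including the zero padding of shorter strings) are simply ignored.

```python
import numpy as np

def letters_to_indicator(strings):
    """Hypothetical helper: one row per string, one column per letter A-Z."""
    arr = np.ascontiguousarray(strings).ravel()
    if arr.dtype.kind != "U":
        raise TypeError("expected an array of unicode strings")
    n = arr.size
    # Each character of a fixed-width unicode array occupies 4 bytes
    width = arr.dtype.itemsize // 4
    # Reinterpret the buffer as code points; shorter strings are padded with 0
    codes = arr.view(np.uint32).reshape(n, width).astype(np.int64)
    idx = codes - ord("A")
    # Keep only characters in A-Z (this also drops the zero padding)
    valid = (idx >= 0) & (idx < 26)
    rows, cols = np.nonzero(valid)
    out = np.zeros((n, 26), dtype=np.int64)
    out[rows, idx[rows, cols]] = 1
    return out

D = letters_to_indicator(np.array(["AZ", "BAZ"]))
```

With the question's input D, this should give a (2, 26) array whose first row has 1s at A and Z and whose second row has 1s at A, B, and Z. No Python-level loop runs over the strings or their characters.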

Upvotes: 1
