Reputation: 21
I have a boolean pandas DataFrame, as follows:
aaa = pd.DataFrame([[False,False,False], [True,True,True]])
I want to convert it to an array of binary numbers. For this DataFrame "aaa", the result should be [000,111].
How can I implement this conversion?
Any help will be greatly appreciated. Thanks
Upvotes: 2
Views: 2448
Reputation: 2109
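I would do one of the following:
aaa.astype(int).astype(str).sum(axis=1)
Summing the string columns along axis=1 concatenates them, giving '000' and '111'. Don't cast the result back to int afterwards, since that drops the leading zeros. Still, this is a bit too much retyping for my taste.
Another possibility is to use apply along the rows (axis=1; without it the join runs down each column instead):
aaa.astype(int).astype(str).apply(''.join, axis=1)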
But what seems cleanest to me is to obtain the desired number by multiplication, then convert it to binary:
aaa.dot([4, 2, 1]).map(bin)
Of course, if you don't want the '0b' at the beginning, strip it and pad with zfill so the leading zeros are kept:
aaa.dot([4, 2, 1]).map(lambda x: bin(x)[2:].zfill(3))
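The weights [4, 2, 1] are hard-coded for three columns. A minimal sketch that derives them from the frame's width instead, assuming the leftmost column is the most significant bit (the helper name `rows_to_bits` is made up for illustration, not part of pandas):

```python
import numpy as np
import pandas as pd

def rows_to_bits(df):
    """Return one fixed-width binary string per row of a boolean DataFrame."""
    n = df.shape[1]
    # Weights 2**(n-1), ..., 2, 1: the leftmost column is the most significant bit.
    weights = 1 << np.arange(n)[::-1]
    # Use the raw values so column labels don't need to align with the weights.
    numbers = df.to_numpy().astype(int) @ weights
    # Zero-pad to n digits so leading zeros survive.
    return pd.Series([format(int(x), f"0{n}b") for x in numbers], index=df.index)

aaa = pd.DataFrame([[False, False, False], [True, True, True]])
print(rows_to_bits(aaa).tolist())
```

Padding to the column count is what keeps a row of all False as '000' rather than '0'.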
Upvotes: 0
Reputation: 294546
You can multiply by a bit-shifted 1 to get powers of two, sum, then convert to binary
aaa.mul(1 << np.arange(3)[::-1]).sum(axis=1).apply(bin)
0      0b0
1    0b111
dtype: object
Notice how 1 << np.arange(3)[::-1] is successive powers of 2
array([4, 2, 1])
You can take this further by manipulating with str operations
aaa.mul(
    1 << np.arange(3)[::-1]
).sum(axis=1).apply(bin).str.replace('0b', '').str.zfill(3)
0    000
1    111
dtype: object
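As a quick sanity check on the shift arithmetic, the sketch below compares the two operand orders. Only `1 << k` yields 2**k; shifting the exponents themselves left by one merely doubles them, so the last weight comes out as 0 instead of 1. The variable names are illustrative.

```python
import numpy as np

exponents = np.arange(3)[::-1]    # 2, 1, 0

powers_of_two = 1 << exponents    # 1 shifted left by each exponent
doubled = exponents << 1          # each exponent multiplied by 2

print(powers_of_two.tolist())     # [4, 2, 1]
print(doubled.tolist())           # [4, 2, 0]
```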
Upvotes: 1
Reputation: 863741
You can convert to int and then to str, get the NumPy array with values, and then sum:
print (aaa.astype(int).astype(str).values.sum(axis=1))
['000' '111']
Upvotes: 1
Reputation: 19664
You can do:
aaa = pd.DataFrame([[False,False,False],
                    [True,True,True]])
aaa=aaa.astype(int)
Then aaa is
   0  1  2
0  0  0  0
1  1  1  1
If you want to get the array ['000','111'] you can do:
aaa = pd.DataFrame([[False,False,False],
                    [True,True,True]])
aaa=aaa.astype(int).astype(str)
[''.join(i) for i in aaa.values.tolist()]
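If actual integers are wanted rather than strings, each joined string can be parsed in base 2 with the built-in `int`. A small sketch building on the list comprehension above; `bit_strings` and `numbers` are illustrative names:

```python
import pandas as pd

aaa = pd.DataFrame([[False, False, False], [True, True, True]])

# Booleans -> 0/1 -> '0'/'1', then join each row into one string.
bit_strings = ["".join(row) for row in aaa.astype(int).astype(str).values.tolist()]

# int(s, 2) parses a string of binary digits into an integer.
numbers = [int(s, 2) for s in bit_strings]

print(bit_strings)  # ['000', '111']
print(numbers)      # [0, 7]
```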
Upvotes: 5