Reputation: 1197
Objective: to use NumPy in a way similar to Pandas' "select_dtypes".
Setting up a dataframe like the following:
>>> df = pd.DataFrame({'a': [1, 2] * 3,
...                    'b': [True, False] * 3,
...                    'c': [1.0, 2.0] * 3})
>>> df
   a      b    c
0  1   True  1.0
1  2  False  2.0
2  1   True  1.0
3  2  False  2.0
4  1   True  1.0
5  2  False  2.0
I am looking for something like this but with NumPy:
>>> df.select_dtypes(include=['float64'])
     c
0  1.0
1  2.0
2  1.0
3  2.0
4  1.0
5  2.0
Any help would be appreciated.
Upvotes: 2
Views: 840
Reputation: 24691
NumPy arrays are homogeneous: all of their elements share the same underlying type. They are essentially C-style arrays, so the data type has to be the same for every element. You can check it using the .dtype attribute, like so:
import numpy as np

a = np.array([1.5, 2, 3])  # int and float literals mixed together
print(a.dtype)
Would print float64, even though two of the elements were entered as ints.
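For instance, here is a minimal sketch of what happens when values like the question's three column types (int, bool, float) are mixed in a single array; NumPy silently upcasts everything to one common dtype:

import numpy as np

# int, bool and float values collapse into a single common dtype
mixed = np.array([1, True, 1.0])
print(mixed)        # [1. 1. 1.]
print(mixed.dtype)  # float64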
If you want to check whether a certain float could be an int (like 2 and 3 in the example above), you shouldn't do that, as floating-point precision might be an issue.
If you really insist, you can use np.isclose to get a boolean array indicating whether each float element is close enough to its floored int counterpart; those elements might be castable without too great a loss of precision:
# For the example above, i.e. a = np.array([1.5, 2, 3])
print(np.isclose(np.floor(a), a))
Would give you [False, True, True], meaning the second and third elements could be cast. Once again, I advise you not to do so.
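If you do go ahead despite the warning, a minimal sketch of how the np.isclose mask could drive the cast via plain boolean indexing:

import numpy as np

a = np.array([1.5, 2, 3])
mask = np.isclose(np.floor(a), a)

# Keep only the whole-number-like elements and cast them to int;
# 1.5 is dropped by the mask.
as_ints = a[mask].astype(int)
print(as_ints)  # [2 3]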
EDIT: If you have a boolean NumPy array that has been cast to np.float64, there is no way to get it back, as you cannot differentiate between a bool cast to float and an int cast to float when the int's value is either 0 or 1.
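A minimal sketch of why this is irreversible: a bool array and a 0/1 int array become indistinguishable once cast to float:

import numpy as np

from_bool = np.array([True, False]).astype(float)
from_int = np.array([1, 0]).astype(float)

# Both are [1. 0.] with dtype float64; the bool-vs-int
# distinction is gone.
print(np.array_equal(from_bool, from_int))  # True
print(from_bool.dtype, from_int.dtype)      # float64 float64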
Upvotes: 3