Reputation: 73
I am working on computing the correlation for a data set that contains boolean values. Ideally, I would replace every boolean with a 1 or a 0. How can I most efficiently go through my NumPy array and replace these values? Below is the code I have so far and its output:
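import numpy as np
import pandas as pd

def findCorrelation(csvFileName):
    data = pd.read_csv(csvFileName)
    data = data.values              # mixed int/bool columns become one object array
    df = pd.DataFrame(data=data)    # rebuilt frame keeps the object dtype
    npList = np.asarray(df)
    print(npList)
    print(df.corr())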
Output:
[[320 True]
[400 False]
[350 True]
[360 True]
[340 True]
[340 True]
[425 False]
[380 False]
[365 True]]
Empty DataFrame
Columns: []
Index: []
Success
Process finished with exit code 0
Upvotes: 4
Views: 5785
Reputation: 173
The method you're looking for is ndarray.astype (documentation).
Example:
import numpy as np
a = np.asarray([[320, True], [400, False], [350, True], [360, True], [340, True], [340, True], [425, False], [380, False], [365, True]]).astype(int)
print(a)
Output:
[[320 1]
[400 0]
[350 1]
[360 1]
[340 1]
[340 1]
[425 0]
[380 0]
[365 1]]
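To tie this back to the question, here is a minimal sketch of the same conversion applied to an object-dtype array (which is what data.values produces when a frame mixes integer and boolean columns), followed by the correlation. The column names "value" and "flag" are hypothetical, chosen only for illustration, and the array is typed in by hand rather than read from the CSV file.

```python
import numpy as np
import pandas as pd

# Object-dtype array holding Python ints and bools, standing in for data.values
raw = np.array([[320, True], [400, False], [350, True], [360, True], [340, True],
                [340, True], [425, False], [380, False], [365, True]], dtype=object)

# astype(int) converts every element: True -> 1, False -> 0
converted = raw.astype(int)

# Hypothetical column names, not taken from the original CSV
df = pd.DataFrame(converted, columns=["value", "flag"])
corr = df.corr()

print(converted.dtype)
print(corr)
```

With both columns numeric, corr() has two columns to work with instead of returning an empty frame. Calling astype(int) directly on a DataFrame should behave the same way if you would rather skip the NumPy round trip.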
Upvotes: 5