I'm examining the Accidental Drug Related Deaths dataset. The following is a list of all drugs:
20 Heroin 2529 non-null object
21 Cocaine 1521 non-null object
22 Fentanyl 2232 non-null object
23 FentanylAnalogue 389 non-null object
24 Oxycodone 607 non-null object
25 Oxymorphone 108 non-null object
26 Ethanol 1247 non-null object
27 Hydrocodone 118 non-null object
28 Benzodiazepine 1343 non-null object
29 Methadone 474 non-null object
30 Amphet 159 non-null object
31 Tramad 130 non-null object
32 Morphine_NotHeroin 42 non-null object
33 Hydromorphone 25 non-null object
34 Other 435 non-null object
35 OpiateNOS 88 non-null object
36 AnyOpioid 2466 non-null object
The dataset is sparse, with Y in place for each drug cause-of-death. For example, the following is deaths['Heroin'].head():
0 NaN
1 NaN
2 Y
3 Y
4 NaN
I'm trying to convert this to:
0 0
1 0
2 1
3 1
4 0
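For reference, the target above can be produced on a toy column as follows. This is only a sketch: the `toy` frame is made up to mirror the five rows shown, and it assumes every non-null entry in a drug column should count as 1.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the first five rows of deaths['Heroin'] (illustrative, not real data).
toy = pd.DataFrame({"Heroin": [np.nan, np.nan, "Y", "Y", np.nan]})

# notna() is True wherever a marker is present; astype(int) maps True/False to 1/0.
# Assumption: any non-null value (not only 'Y') should become 1.
indicator = toy["Heroin"].notna().astype(int)
print(indicator.tolist())
```

If only a literal Y should count, comparing with `toy["Heroin"].eq("Y")` before the `astype(int)` would be the stricter variant.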
To convert the Y to 1, I've used deaths = deaths.replace(to_replace={'Y':1}). I'm now attempting to change the NaN to 0. I'm trying to use np.nan_to_num(), but my code doesn't seem to do anything.
I'm using the following:
deaths.loc[:,'Heroin':'AnyOpioid'] = np.nan_to_num(deaths.loc[:,'Heroin':'AnyOpioid'])
This produces no change in the original dataset, with deaths['Heroin'].head() appearing as:
0 NaN
1 NaN
2 Y
3 Y
4 NaN
(after the prior deaths.replace() call).
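A minimal reproduction on a toy frame suggests the column dtype is worth checking before blaming .loc. The columns are listed as object above, and np.nan_to_num appears to return non-floating-point arrays unchanged. The `toy` frame and its values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Toy object-dtype column, mimicking the state after Y has been replaced by 1.
toy = pd.DataFrame({"Heroin": [np.nan, np.nan, 1, 1, np.nan]}, dtype=object)
print(toy["Heroin"].dtype)

# On an object array, nan_to_num hands back the data with the NaNs still in place.
result = np.nan_to_num(toy.loc[:, "Heroin":"Heroin"].to_numpy())
still_has_nan = bool(pd.isna(result).any())
print(still_has_nan)

# After casting to float, the same call does replace NaN with 0.
as_float = np.nan_to_num(toy["Heroin"].astype(float).to_numpy())
print(as_float.tolist())
```

On this toy data the first call leaves the NaN values alone, while the float version comes back as zeros and ones, which matches the behaviour described above.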
What is the mechanism that is causing this to happen? I'm assuming it's related to the .loc, but I'm not sure what to look at first or how to correct it. Removing the .loc gives me a TypeError: cannot do slice indexing on <class 'pandas.core.indexes.range.RangeIndex'> with these indexers [Heroin] of <class 'str'>.
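The TypeError can be reproduced on a toy frame, assuming the expression without .loc was deaths['Heroin':'AnyOpioid']. With plain square brackets a slice is applied to the rows, so the string labels are looked up in the RangeIndex rather than in the columns. The frame below is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical three-column frame with a default RangeIndex on the rows.
toy = pd.DataFrame({
    "Heroin": [np.nan, "Y"],
    "Cocaine": ["Y", np.nan],
    "AnyOpioid": ["Y", "Y"],
})

try:
    toy["Heroin":"AnyOpioid"]  # a slice inside [] selects rows, not columns
    raised = False
except TypeError as exc:
    raised = True
    print(exc)

# With .loc, the second position addresses columns, so the label slice works.
cols = toy.loc[:, "Heroin":"AnyOpioid"].columns.tolist()
print(cols)
```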