Reputation: 7733
I would like to replace multiple values in a data frame column, as shown below:
df['label'] = ['Sodium', 'Bicarbonate', 'White Blood Cells', 'Hemoglobin',
'Glucose', 'Lactate', 'pH', 'Potassium, Whole Blood',
'Sodium, Whole Blood', 'Lactate Dehydrogenase (LD)',
'Bilirubin, Direct', 'Alkaline Phosphatase',
'Alanine Aminotransferase (ALT)',
'Asparate Aminotransferase (AST)', 'Potassium', 'Phosphate',
'Creatinine', 'C-Reactive Protein', 'pCO2',
'Calculated Bicarbonate, Whole Blood', 'Bilirubin, Total',
'Albumin', 'Bilirubin, Indirect', 'Urine Volume', 'WBC Count',
'Urine Volume, Total', 'Phosphate, Body Fluid']
In the code below, I am trying to replace 'Sodium' and 'Sodium, Whole Blood' with just 'Sodium'.
I do the same for the rest of the measurements:
df['label'] = df['label'].replace(dict.fromkeys(['Sodium','Sodium, Whole Blood'], 'Sodium'))
df['label'] = df['label'].replace(dict.fromkeys(['Bicarbonate','Calculated Bicarbonate, Whole Blood'], 'Bicarbonate'))
df['label'] = df['label'].replace(dict.fromkeys(['Bicarbonate','Bilirubin, Indirect'], 'Bicarbonate'))
df['label'] = df['label'].replace(dict.fromkeys(['Bilirubin, Direct','Bilirubin, Total','Calculated Bicarbonate, Whole Blood'], 'Bilirubin'))
df['label'] = df['label'].replace(dict.fromkeys(['Urine Volume, Total','Urine Volume'], 'Urine Volume'))
df['label'] = df['label'].replace(dict.fromkeys(['White Blood Cells','WBC Count'], 'WBC'))
df['label'] = df['label'].replace(dict.fromkeys(['Potassium, Whole Blood','Potassium'], 'Potassium'))
df['label'] = df['label'].replace(dict.fromkeys(['Phosphate','Phosphate, Body Fluid'], 'Phosphate'))
Although the above code works fine, is there a more efficient way to do these replacements without repeating the same line of code multiple times?
Upvotes: 2
Views: 99
Reputation: 150785
One way is to build one big dictionary and call replace once:
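# add more of your groups here: (values to replace, replacement)
lst = [(['Sodium', 'Sodium, Whole Blood'], 'Sodium'),
       (['Bicarbonate', 'Calculated Bicarbonate, Whole Blood'], 'Bicarbonate'),
      ]
# flatten the groups into a single {old value: new value} dictionary
repl_dict = {}
for x, y in lst:
    repl_dict.update(dict.fromkeys(x, y))
# one pass over the column instead of one pass per group
df['label'] = df['label'].replace(repl_dict)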
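A variation on the same idea, as a rough sketch: keep the groups keyed by the target name and invert them into the flat dictionary, raising an error if one label ends up under two different targets (in a long chain of replace calls such an overlap is silently resolved by whichever call runs first). The `groups` contents and the `build_replacements` helper are hypothetical names chosen for illustration, and the last lines use a plain list rather than a DataFrame only to show what the lookup does.

```python
# Hypothetical grouping: target name -> variants to collapse into it.
# The entries are illustrative, picked from the labels in the question.
groups = {
    'Sodium': ['Sodium, Whole Blood'],
    'Potassium': ['Potassium, Whole Blood'],
    'Bilirubin': ['Bilirubin, Direct', 'Bilirubin, Total'],
    'Urine Volume': ['Urine Volume, Total'],
    'WBC': ['White Blood Cells', 'WBC Count'],
}


def build_replacements(groups):
    """Invert {target: [variants]} into a flat {variant: target} dict."""
    repl = {}
    for target, variants in groups.items():
        for variant in variants:
            # Refuse a variant that is listed under two different targets.
            if variant in repl and repl[variant] != target:
                raise ValueError(
                    f"{variant!r} maps to both {repl[variant]!r} and {target!r}"
                )
            repl[variant] = target
    return repl


repl_dict = build_replacements(groups)

# Plain-Python illustration of the lookup; labels without an entry are kept.
labels = ['Sodium', 'Sodium, Whole Blood', 'WBC Count', 'pH']
cleaned = [repl_dict.get(label, label) for label in labels]
print(cleaned)
```

With the data frame from the question, the resulting `repl_dict` would be passed to `df['label'].replace(repl_dict)` exactly as above. Values that already equal the target name do not need an entry, since replace leaves unmatched values unchanged.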
Upvotes: 3