Reputation: 763
There are a few questions and answers covering each half of this question, but I'm having trouble putting them together. Given the snippet below, how would one create a new column containing just the value between the brackets?
Household Income
'Over $200,000 ($250,000)
$160,000-$199,000 ($180,000)
NaN
I have a feeling it involves something along these lines:
s[s.find("(")+1:s.find(")")]
I'm just not sure how to apply it to:
df['Income'] = df['Household Income'].*some magic*
EDIT:
The expected result would be:
Income
250000
180000
NaN
Upvotes: 1
Views: 1540
Reputation: 863166
Use str.extract:
df['Household Income'] = df['Household Income'].str.replace(',','').str.extract(r"\(\$(.*)\)")
print (df)
Household Income
0 250000
1 180000
2 NaN
Finally, if you need to convert the result to numeric:
df['Household Income'] = (df['Household Income'].str.replace(',','')
                                                .str.extract(r"\(\$(.*)\)")
                                                .astype(float))
print (df)
Household Income
0 250000.0
1 180000.0
2 NaN
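If the goal is a separate Income column, as in the question, rather than overwriting Household Income, a minimal self-contained sketch could look like the following. The sample frame is rebuilt from the question's data. The narrower pattern [\d,]+, expand=False, and pd.to_numeric are my own choices here, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed to be plain strings plus a NaN).
df = pd.DataFrame({
    "Household Income": [
        "Over $200,000 ($250,000)",
        "$160,000-$199,000 ($180,000)",
        np.nan,
    ]
})

df["Income"] = pd.to_numeric(
    df["Household Income"]
    # Capture only digits and commas that follow "($" and precede ")".
    # expand=False returns a Series instead of a one-column DataFrame.
    .str.extract(r"\(\$([\d,]+)\)", expand=False)
    # Drop the thousands separators; regex=False treats "," literally.
    .str.replace(",", "", regex=False)
)

print(df["Income"])
```

Rows with no bracketed amount, including NaN, stay NaN in the new column, and the original Household Income column is left untouched.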
Upvotes: 2