Reputation: 1467
I have a pandas DataFrame df. I want to replace "↑ " (the arrow followed by a space) with +, and "↓ " (the arrow followed by a space) with -. For example, df.a[0] (value ↑ 0.69%) should become +0.69%.
df['last_month'] = df['last_month'].replace(r"↑ ","")
does not work. Why?
import pandas as pd
data = [{"a":"↑ 0.69%","b":"↓ 9.93%"},{"a":"↓ 0.46%","b":"↑ 3.3%"},{"a":"↓ 0.78%","b":"↓ 3.43%"}]
df = pd.DataFrame(data)
df
a b
0 ↑ 0.69% ↓ 9.93%
1 ↓ 0.46% ↑ 3.3%
2 ↓ 0.78% ↓ 3.43%
In my raw data, ↑ is a unicode string, so it didn't work. In the demo data, ↑ is a str (bytes), so
df['last_month'] = df['last_month'].replace(r"↑ ","")
actually works, as MaxU's answer does. But how do I do the replacement when the DataFrame values are unicode?
Upvotes: 1
Views: 2427
Reputation: 210832
If I understand correctly:
In [28]: df.replace([r'↑\s*', r'↓\s*'], ['+', '-'], regex=True)
Out[28]:
a b
0 +0.69% -9.93%
1 -0.46% +3.3%
2 -0.78% -3.43%
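As for why the attempt in the question has no effect: without regex=True, replace() compares the pattern against each whole cell value, so a substring such as "↑ " never matches. The following is a minimal Python 3 sketch of that difference; the Series s and the variable names are made up for illustration and are not from the original post.

```python
import pandas as pd

# Illustrative data, mirroring column "a" from the question.
s = pd.Series(["↑ 0.69%", "↓ 0.46%", "↓ 0.78%"])

# Without regex=True, replace() only matches entire cell values,
# so the substring "↑ " is left untouched.
unchanged = s.replace("↑ ", "+")

# With regex=True, each pattern is substituted wherever it occurs
# inside the string values.
replaced = s.replace([r"↑\s*", r"↓\s*"], ["+", "-"], regex=True)

print(unchanged.tolist())
print(replaced.tolist())
```

The same regex=True call should work on a single column (for example df['a']) as well as on the whole DataFrame.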
For Python 2.x:
In [80]: %paste
data = [{"a":u"↑ 0.69%","b":u"↓ 9.93%"},{"a":u"↓ 0.46%","b":u"↑ 3.3%"},{"a":u"↓ 0.78%","b":u"↓ 3.43%"}]
df = pd.DataFrame(data)
df
## -- End pasted text --
Out[80]:
a b
0 ↑ 0.69% ↓ 9.93%
1 ↓ 0.46% ↑ 3.3%
2 ↓ 0.78% ↓ 3.43%
In [81]: %paste
df = df.replace([u'↑\s*', u'↓\s*'], [u'+', u'-'], regex=True)
print(df)
## -- End pasted text --
a b
0 +0.69% -9.93%
1 -0.46% +3.3%
2 -0.78% -3.43%
Upvotes: 4
Reputation: 1467
I got it: df.replace([u'↑ ', u'↓ '], [u'+', u'-'], regex=True) works.
Upvotes: 0
Reputation: 294218
You can stack, then unstack, using the str accessor:
df.stack().str.replace("↑ ","+").str.replace("↓ ", "-").unstack()
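A self-contained Python 3 sketch of the same idea, reusing the demo data from the question. Passing regex=False is an addition here, not part of the one-liner above; it just makes the literal (non-regex) matching explicit, since the default for that parameter has changed between pandas versions. The names stacked and cleaned are illustrative.

```python
import pandas as pd

data = [
    {"a": "↑ 0.69%", "b": "↓ 9.93%"},
    {"a": "↓ 0.46%", "b": "↑ 3.3%"},
    {"a": "↓ 0.78%", "b": "↓ 3.43%"},
]
df = pd.DataFrame(data)

# stack() reshapes the frame into one Series indexed by (row, column),
# which makes the .str accessor available for every cell at once.
stacked = df.stack()

# Literal replacements, then unstack() restores the original layout.
cleaned = (
    stacked.str.replace("↑ ", "+", regex=False)
    .str.replace("↓ ", "-", regex=False)
    .unstack()
)

print(cleaned)
```

This only suits frames whose cells are all strings, because the .str accessor is applied to every stacked value.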
Upvotes: 2