Reputation: 11
I have a column with the headlines of articles. The headlines look like this (in Greek): [\n, [Μητσοτάκης: Έχει μεγάλη σημασία οι φωτισ..
How can I remove these characters: [\n, ?
I have tried this but nothing happened:
df['Title'].replace('\n', '', regex=True)
Upvotes: 1
Views: 4300
Reputation: 11063
.replace() does not change the dataframe by default; it returns a new object with the replacements applied. Use the inplace parameter.
>>> import pandas
>>> df = pandas.DataFrame([{"x": "a\n"}, {"x": "b\n"}, {"x": "c\n"}])
>>> df['x'].replace('\n', '', regex=True) # does not change df
0    a
1    b
2    c
Name: x, dtype: object
>>> df # df is unchanged
     x
0  a\n
1  b\n
2  c\n
>>> df['x'].replace('\n', '', regex=True, inplace=True)
>>> df # df is changed
   x
0  a
1  b
2  c
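As an alternative to inplace=True, the result can simply be assigned back to the column. The following is a minimal sketch that reuses the x column from the session above. In recent pandas versions, calling an inplace method on a column selected with df['x'] may emit a warning or leave df unchanged, so assignment is likely the safer pattern.

```python
import pandas as pd

df = pd.DataFrame({"x": ["a\n", "b\n", "c\n"]})

# Series.replace returns a new Series; assigning it back updates the column.
df["x"] = df["x"].replace("\n", "", regex=True)

print(df["x"].tolist())
```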
Upvotes: 2
Reputation: 25489
You're looking for
df['Title'].str.replace('\n', '')
Also remember that this replacement doesn't happen in place. To change the original dataframe, assign the result back:
df['Title'] = df['Title'].str.replace('\n', '')
The .str accessor on a column, df['Title'].str, provides vectorized string functions that operate on each value in the column. df['Title'].str.replace('\n', '') runs the str.replace() function on each element of the column.
Without regex=True, df['Title'].replace() only replaces values that match in their entirety with the given replacement.
For example,
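import pandas as pd
data = [{"x": "hello\n"}, {"x": "yello\n"}, {"x": "jello\n"}]
df = pd.DataFrame(data)
# df:
#          x
# 0  hello\n
# 1  yello\n
# 2  jello\n
# .str.replace removes the newline inside each value and returns a new Series
df["x"].str.replace('\n', '')
# returns:
# 0    hello
# 1    yello
# 2    jello
# .replace swaps values that match exactly and returns a new Series
df["x"].replace('yello\n', 'bello\n')
# returns:
# 0    hello\n
# 1    bello\n
# 2    jello\n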
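To make the difference concrete, here is a small sketch that runs the three variants discussed in this thread side by side on the same data. The variable names (s, whole, with_regex, with_str) are only for illustration.

```python
import pandas as pd

s = pd.Series(["hello\n", "yello\n", "jello\n"])

# Default Series.replace matches whole values only, so a bare "\n" matches nothing.
whole = s.replace("\n", "")

# With regex=True, the pattern is applied inside each string.
with_regex = s.replace("\n", "", regex=True)

# .str.replace works on substrings of each element.
with_str = s.str.replace("\n", "", regex=False)

print(whole.tolist())
print(with_regex.tolist())
print(with_str.tolist())
```

Only the first call leaves the values untouched, which matches the "nothing happened" symptom when regex=True is omitted. None of the three modifies s itself.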
Upvotes: 0