youngguv

Reputation: 101

How do I pad all punctuation with a whitespace for every row of text in a pandas dataframe?

I have a DataFrame with a text column, df['text'].

A sample value of df['text'] could be:

"The quick red.fox jumped over.the lazy brown, dog."

I want the output to be:

"The quick red . fox jumped over . the lazy brown , dog . "

I've tried using the str.replace() method, but I don't quite understand how to make it do what I'm looking for.

import pandas as pd

# read csv into dataframe
df=pd.read_csv('./data.csv')

#add a space before and after every punctuation
df['text'] = df['text'].str.replace('.',' . ')
df['text'].head()

# write dataframe to csv
df.to_csv('data.csv', index=False)

Upvotes: 1

Views: 340

Answers (3)

BENY

Reputation: 323356

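Try Series.replace with a dict of patterns and regex=True. The dot has to be escaped, otherwise it matches any character:

df['text'] = df['text'].replace({r'\.': ' . ', ', ': ' , '}, regex=True)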

Upvotes: 1

jezrael

Reputation: 863301

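To pad all punctuation, use a regex that captures each run of characters that are neither word characters nor whitespace, and refer to the captured group with \\1 in the replacement to add a space before and after it: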

df['text'] = df['text'].str.replace(r'([^\w\s]+)', ' \\1 ', regex=True)
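
As a rough, self-contained sketch of this approach on the sample sentence from the question (the one-row DataFrame and the names padded and normalized are made up for illustration, and the second replace is an optional extra step that is not part of the answer above):

```python
import pandas as pd

# Hypothetical one-row frame holding the sample sentence from the question
df = pd.DataFrame({"text": ["The quick red.fox jumped over.the lazy brown, dog."]})

# ([^\w\s]+) captures each run of characters that are neither word
# characters nor whitespace; r" \1 " puts a space on both sides of the run
padded = df["text"].str.replace(r"([^\w\s]+)", r" \1 ", regex=True)

# Optional: collapse the double space left where punctuation was already
# followed by a space (", " becomes " ,  " after padding)
normalized = padded.str.replace(r"\s+", " ", regex=True)

print(repr(padded.iloc[0]))
print(repr(normalized.iloc[0]))
```

Note that \w includes the underscore, so "_" is not padded by this pattern, and a run such as "..." is padded as a single unit rather than character by character.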

Upvotes: 1

Erfan

Reputation: 42926

You have to escape the dot with a backslash to match a literal period in .str.replace, since an unescaped . in a regex matches any character:

df['Text'].str.replace(r'\.', ' . ', regex=True).str.replace(',', ' , ')

0    The quick red . fox jumped over . the lazy brown ,  dog . 
Name: Text, dtype: object
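
If chaining one .str.replace per character gets long, one possible generalization (a sketch only, not part of the answer above) is to let re.escape do the escaping and build a single character class from string.punctuation. The one-row DataFrame and the names pattern and result are assumptions for illustration:

```python
import re
import string

import pandas as pd

# Hypothetical one-row frame holding the sample sentence from the question
df = pd.DataFrame({"Text": ["The quick red.fox jumped over.the lazy brown, dog."]})

# re.escape backslash-escapes the regex metacharacters in string.punctuation,
# so the resulting character class matches each ASCII punctuation mark literally
pattern = "([" + re.escape(string.punctuation) + "])"

# Pad every single punctuation character with a space on each side
result = df["Text"].str.replace(pattern, r" \1 ", regex=True)

print(repr(result.iloc[0]))
```

This pads each punctuation character individually and only covers ASCII punctuation; text with other symbols would need a different pattern.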

Upvotes: 1
