DataSwede

Reputation: 5591

Pandas to_csv with escape characters and other junk causing return to next line

I have a dataframe with a column that has a bunch of manually entered text, some of which contains various escape characters.

Currently, a couple of rows in the output break onto a new line. The biggest culprit is the <br/> tag in the middle and at the end of the text. I'm looking to clean the text just enough that no new line is created.

EDIT: Here are some examples of strings that are causing problems:

Example<br/>
Example sentence (number two)\r<br/>That caused an issue

Upvotes: 1

Views: 1515

Answers (1)

yemu

Reputation: 28259

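Try using converters with read_csv. Adapt the example below to your needs:

import pandas as pd

def remove_br(x):
    # Strip the literal <br/> tag from the field
    return x.replace('<br/>', '')

# Map each column name to the function that cleans it
convert_dict = {'col_name': remove_br}

df = pd.read_csv('file.csv', converters=convert_dict)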
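
Since the question is about writing with to_csv, another option is to clean the text column just before writing. The following is only a minimal sketch, assuming the troublesome pieces are `<br/>` variants plus real carriage-return or newline characters (that is, that the `\r` in the example denotes an actual CR character). The names `clean_text` and `_BREAKS` are made up for illustration.

```python
import re

# Matches <br>, <br/>, <br /> (any case) and literal CR/LF characters.
_BREAKS = re.compile(r"<br\s*/?>|[\r\n]+", flags=re.IGNORECASE)


def clean_text(value):
    """Replace HTML line breaks and raw newlines with a single space."""
    if not isinstance(value, str):
        return value  # leave NaN and other non-text values untouched
    cleaned = _BREAKS.sub(" ", value)
    # Collapse the runs of spaces left behind and trim the ends.
    return re.sub(r" {2,}", " ", cleaned).strip()


# The two problem strings from the question.
examples = [
    "Example<br/>",
    "Example sentence (number two)\r<br/>That caused an issue",
]
cleaned_examples = [clean_text(s) for s in examples]
```

With a dataframe, something like `df['col_name'] = df['col_name'].map(clean_text)` before calling `df.to_csv(...)` should apply the same cleanup, where `col_name` stands in for the real column name.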

Upvotes: 1
