Marco Scarselli

Reputation: 1224

pandas to_csv splits some rows into two lines

I have a problem with pandas to_csv.

The DataFrame itself looks correct, and to_excel works fine too.

But when I use .to_csv, some rows are split across two lines (I can see this in both WordPad and Excel).

For example:

line 1: provincia;comune;Ragione sociale;indirizzo;civico;destinazione;sup_coperta

line 2: AR;CHIUSI DELLA VERNA;ex sacci;LOC. CORSALONE STRADA REGIONALE

line 3: 71;;SITO DISMESSO;

My code:

    toscana.to_csv("toscana.csv", index=False, encoding="utf-8", sep=";")

EDIT: I have added some lines that show the problem (thanks to everyone for the comments!)

How can I remove line breaks inside values? I found a \r in a cell that was split across two CSV lines:

    Out[17]: u'IMPIANTI SPORTIVI: CIRCOLO CULTURALE RICREATIVO \rPESTELLO'
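A quick way to see which columns and rows carry embedded line breaks before exporting is sketched below. The sample rows and the names has_break, bad_columns and bad_rows are made up for illustration; only the second "Ragione sociale" value comes from the output above.

```python
import pandas as pd

# Hypothetical sample data; only the second "Ragione sociale" value
# is taken from the question.
toscana = pd.DataFrame({
    "provincia": ["AR", "AR"],
    "Ragione sociale": [
        "ex sacci",
        "IMPIANTI SPORTIVI: CIRCOLO CULTURALE RICREATIVO \rPESTELLO",
    ],
})

# Cast every column to text and flag cells containing \r or \n.
has_break = toscana.apply(lambda col: col.astype(str).str.contains(r"[\r\n]"))

# Columns with at least one affected cell, and the affected rows.
bad_columns = has_break.columns[has_break.any()].tolist()
bad_rows = toscana[has_break.any(axis=1)]

print(bad_columns)
print(bad_rows)
```

This only reports where the line breaks are; it does not change the data.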

I solved it with:

    # Python 2 code: unicode is the text type there
    def replace(x):
        if type(x) == str or type(x) == unicode:
            x = x.replace('\r', '')
        else:
            x = x[0].replace('\r', '')
        return x

    toscana["indirizzo"] = toscana["indirizzo"].map(lambda x: x.replace('"', ''))
    toscana["indirizzo"] = toscana["indirizzo"].map(lambda x: replace(x))

    toscana["Ragione sociale"] = toscana["Ragione sociale"].map(lambda x: x.replace('"', ''))
    toscana["Ragione sociale"] = toscana["Ragione sociale"].map(lambda x: replace(x))

Is there a smarter way to do this?

Upvotes: 4

Views: 2169

Answers (1)

Umesh Jiddi

Reputation: 21

You can use the pandas replace method (DataFrame.replace) for this instead of writing your own function.

Pandas Replace Method

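It supports regular expressions, so a pattern can use operators such as | for alternation ("or").

In the example we set regex=True and replace every match with an empty string.

Adding inplace=True modifies the DataFrame in place instead of returning a new one. Rows and columns keep their positions; only the matched characters are removed from the cell values.

The raw-string pattern r"\\t|\\n|\\r" matches a literal backslash followed by t, n or r (that is, the two-character text \t, \n or \r), while "\t|\n|\r" matches the actual tab, newline and carriage-return characters. Both patterns are replaced with the empty string "".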

df.replace(to_replace=[r"\\t|\\n|\\r", "\t|\n|\r"], value=["",""], regex=True, inplace=True)
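As a minimal sketch of how this plays out on data like the question's (the sample rows, the civico values and the csv_lines helper are made up for illustration), a single character class can stand in for the alternation when only real control characters need removing, and the export can be checked by counting the lines written:

```python
import io
import pandas as pd

# Hypothetical sample data; the second "Ragione sociale" value is the
# one quoted in the question, everything else is invented.
df = pd.DataFrame({
    "provincia": ["AR", "AR"],
    "Ragione sociale": [
        "ex sacci",
        "IMPIANTI SPORTIVI: CIRCOLO CULTURALE RICREATIVO \rPESTELLO",
    ],
    "civico": [71, 12],
})


def csv_lines(frame):
    """Write the frame as ';'-separated CSV in memory and return its lines."""
    buffer = io.StringIO()
    frame.to_csv(buffer, index=False, sep=";")
    return buffer.getvalue().splitlines()


# Before cleaning, the embedded \r adds an extra line break to the output.
lines_before = csv_lines(df)

# Remove real tab, newline and carriage-return characters from string cells.
# Numeric columns such as "civico" are left untouched by a regex replace.
cleaned = df.replace(to_replace=r"[\t\n\r]", value="", regex=True)

lines_after = csv_lines(cleaned)

print(len(lines_before), len(lines_after))
```

Returning a new DataFrame here, rather than using inplace=True, keeps the original data available for comparison; either form should remove the same characters.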

Upvotes: 1
