phillyooo

Reputation: 1703

Importing a CSV file with line breaks into R or Python pandas

I have a CSV file that includes line breaks within columns:

"id","comment","x"
1,"ABC\"xyz",123
2,"xyz\"abc",543
3,"abc
xyz",483

ID 3, for example, contains such a line break.

How can this be imported into Python or R? I also don't mind if those line breaks are replaced by a space, for example.

Upvotes: 9

Views: 17093

Answers (3)

Ali Faizan

Reputation: 502

You can also use the read_csv function from the Python pandas library. Make sure to specify the escape character.

import pandas as pd
df = pd.read_csv('path_to_csv', sep=',', escapechar='\\')

Note that the second backslash escapes the first one. That is ordinary Python string syntax and has nothing to do with pandas or CSV.
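To make this concrete, here is a minimal sketch that runs read_csv on the sample from the question and then replaces the embedded line break with a space, as the question allows. The SAMPLE string and the use of io.StringIO in place of a real file path are assumptions for illustration only.

```python
import io

import pandas as pd

# Sample data copied from the question. The raw string keeps each \" as a
# literal backslash followed by a double quote.
SAMPLE = r'''"id","comment","x"
1,"ABC\"xyz",123
2,"xyz\"abc",543
3,"abc
xyz",483
'''

# escapechar tells the parser that \" inside a quoted field is a literal quote.
df = pd.read_csv(io.StringIO(SAMPLE), escapechar="\\")

# Optional: flatten line breaks inside the comment column to single spaces.
df["comment"] = df["comment"].str.replace("\n", " ", regex=False)

print(df)
```

With a file on disk, the same call should work with the path passed in place of the StringIO object.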

Upvotes: 11

phillyooo

Reputation: 1703

The problem seemed to be not the line breaks, but rather the backslash-escaped double quotes within the columns: \".

Python: zvone's answer worked fine!

import csv

with open(filename, newline='') as f:  # newline='' is what the csv docs recommend
    reader = csv.reader(f)
    csv_rows = list(reader)

R: readr::read_csv worked without having to change any of the defaults.
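One caveat worth checking on your own data: a rough sketch comparing the csv module's default dialect with an explicit escapechar on the sample from the question. The parse helper and the SAMPLE string are hypothetical names used only for this illustration.

```python
import csv
import io

# Sample data copied from the question (raw string, so \" stays as written).
SAMPLE = r'''"id","comment","x"
1,"ABC\"xyz",123
2,"xyz\"abc",543
3,"abc
xyz",483
'''


def parse(text, **fmt):
    """Parse CSV text into a list of rows, forwarding format options to csv.reader."""
    return list(csv.reader(io.StringIO(text, newline=""), **fmt))


default_rows = parse(SAMPLE)                   # default dialect, no escapechar
escaped_rows = parse(SAMPLE, escapechar="\\")  # backslash treated as an escape

print(default_rows[1])
print(escaped_rows[1])
```

Both calls read four rows without raising an error and keep the line break in row 3, but as far as I can tell the default dialect leaves the backslash in the comment field (the quote ends up at the end of the value), while escapechar="\\" yields ABC"xyz. So "worked" may depend on whether you need the comment text to be exact.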

Upvotes: 4

zvone

Reputation: 19352

Python has a built-in CSV reader that handles this for you. See the csv module documentation.

import csv

with open(filename, newline='') as f:  # newline='' is what the csv docs recommend
    reader = csv.reader(f)
    csv_rows = list(reader)
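If you also want the line breaks replaced by spaces, as the question mentions, something along these lines might do. It is only a sketch: read_csv_flat is a hypothetical helper name, the UTF-8 encoding and the backslash escapechar are assumptions based on the sample data, and the temporary file exists just to make the example self-contained.

```python
import csv
import os
import tempfile


def read_csv_flat(path, encoding="utf-8"):
    """Read a CSV file and replace line breaks inside fields with spaces."""
    # newline="" leaves line endings untouched so the csv module can handle
    # line breaks that sit inside quoted fields.
    with open(path, newline="", encoding=encoding) as f:
        reader = csv.reader(f, escapechar="\\")
        return [
            [field.replace("\r\n", " ").replace("\n", " ") for field in row]
            for row in reader
        ]


# Demo: write the sample from the question to a temporary file and read it back.
SAMPLE = r'''"id","comment","x"
1,"ABC\"xyz",123
2,"xyz\"abc",543
3,"abc
xyz",483
'''

with tempfile.NamedTemporaryFile(
    mode="w", suffix=".csv", delete=False, newline="", encoding="utf-8"
) as tmp:
    tmp.write(SAMPLE)
    tmp_path = tmp.name

try:
    rows = read_csv_flat(tmp_path)
finally:
    os.remove(tmp_path)

for row in rows:
    print(row)
```

Note that every value comes back as a string, so columns such as id and x still need converting if you want numbers.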

Upvotes: 5
