Reputation: 1703
I have a CSV file that includes line breaks within columns:
"id","comment","x"
1,"ABC\"xyz",123
2,"xyz\"abc",543
3,"abc
xyz",483
The row with ID 3, for example, contains such a line break.
How can this be imported into Python or R? I don't mind if the line breaks are replaced by a space, for example.
Upvotes: 9
Views: 17093
Reputation: 502
You can also use the read_csv function from the Python pandas library. Make sure to specify the escape character.
import pandas as pd
df = pd.read_csv('path_to_csv', sep=',', escapechar='\\')
Note that the second backslash escapes the first one inside the Python string literal. This is ordinary Python string syntax and has nothing to do with pandas or CSV.
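To make this concrete, here is a minimal sketch that feeds the sample data from the question to read_csv through an in-memory buffer instead of a file. The SAMPLE string and the final replacement step are my own additions for illustration, not part of the original snippet, and I am assuming the line break should become a single space.

```python
import io

import pandas as pd

# Sample data copied from the question. The raw string keeps the
# backslashes exactly as they would appear in the file.
SAMPLE = r'''"id","comment","x"
1,"ABC\"xyz",123
2,"xyz\"abc",543
3,"abc
xyz",483
'''

# escapechar='\\' tells the parser that \" is a literal quote, not the end of the field.
df = pd.read_csv(io.StringIO(SAMPLE), sep=",", escapechar="\\")

# Optional: replace the embedded line break with a space, as the question allows.
df["comment"] = df["comment"].str.replace("\n", " ", regex=False)

print(df)
```

With a real file, passing the path instead of the io.StringIO buffer should behave the same way.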
Upvotes: 11
Reputation: 1703
The problem turned out to be not the line breaks, but rather the escaped double quotes within the columns: \".
Python: zvone's answer worked fine!
import csv
# newline='' lets the csv module handle line breaks inside quoted fields itself
with open(filename, newline='') as f:
    reader = csv.reader(f)
    csv_rows = list(reader)
R: readr::read_csv worked without having to change any of the defaults.
Upvotes: 4
Reputation: 19352
Python has a built-in CSV reader which handles that for you. See the csv module documentation.
import csv
# newline='' lets the csv module handle line breaks inside quoted fields itself
with open(filename, newline='') as f:
    reader = csv.reader(f)
    csv_rows = list(reader)
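One caveat about the reader above: the default dialect does not define an escape character, so the \" sequences in the question's data may not come out as plain quotes. The sketch below is a variation, not the original answer's code: it passes escapechar to csv.reader and, as the question permits, replaces embedded line breaks with a space. The SAMPLE string and the read_rows helper are hypothetical names introduced only for this example.

```python
import csv
import io

# Sample data copied from the question. The raw string keeps the
# backslashes exactly as they would appear in the file.
SAMPLE = r'''"id","comment","x"
1,"ABC\"xyz",123
2,"xyz\"abc",543
3,"abc
xyz",483
'''


def read_rows(f):
    """Parse CSV from an open text file, treating backslash as the escape character."""
    reader = csv.reader(f, escapechar="\\")
    # Replace line breaks inside fields with a single space.
    return [[cell.replace("\n", " ") for cell in row] for row in reader]


# io.StringIO stands in for open(filename, newline='') so the example is self-contained.
rows = read_rows(io.StringIO(SAMPLE, newline=""))
for row in rows:
    print(row)
```

All values come back as strings, including the header row, so convert the numeric columns yourself if you need them as numbers.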
Upvotes: 5