Reputation: 4576
I'm using pandas to stack two CSV files on top of each other; the files may have different column headers. The problem I'm having is that the output seems to split some rows onto a new line at random.
File 1:
ID, Height
0 , 1
1 , 2
2 , 3
File 2:
ID, Message
0 , "Long string message"
1 , "May include tabs, multiple lines \n
that go on for a while"
2 , "More of the same"
Result Should Be:
ID, Height, Message
0, 1, '',
1, 2, '',
2, 3, '',
0, '', "Long string message",
1, '', "May include tabs, multiple lines \n
that go on for a while",
2, '', "More of the same"
What I'm getting back is:
ID, Height, Message
0, 1, '',
1, 2, '',
2, 3, '',
0, '', "Long string message",
1, '', "May include tabs, multiple lines"
"that go on for a while", '', '',
2, '', "More of the same"
I'm getting it to work for the most part with the following:
import pandas as pd

first = pd.read_csv('file1.csv')
second = pd.read_csv('file2.csv')
# Stack the rows; columns missing from one file are filled with NaN
merged = pd.concat([first, second], axis=0, ignore_index=True)
merged.to_csv('test.csv')
It looks like an extra line in the message field causes the row to split onto a new line. How can I stop it from splitting rows on a newline inside the message field?
Upvotes: 1
Views: 3753
Reputation: 5136
From the short example you gave, it looks like a new row is being started at the newline character \n.
You could try first = pd.read_csv('file1.csv', delim_whitespace=True), or try changing the separator (sep), lineterminator, or delimiter parameters of read_csv.
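To check whether rows are really being split, it may help to reproduce the merge on small in-memory data. The sketch below is only an illustration: the sample strings stand in for file1.csv and file2.csv (written without spaces around the commas, which is an assumption), and the names csv_text, physical_lines, and round_trip are made up for this example.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-ins for file1.csv and file2.csv.
file1 = io.StringIO("ID,Height\n0,1\n1,2\n2,3\n")
file2 = io.StringIO(
    'ID,Message\n'
    '0,"Long string message"\n'
    '1,"May include tabs, multiple lines\nthat go on for a while"\n'
    '2,"More of the same"\n'
)

first = pd.read_csv(file1)
second = pd.read_csv(file2)

# Same concat call as in the question.
merged = pd.concat([first, second], axis=0, ignore_index=True)

# Write to a string instead of test.csv; index=False omits the index column.
csv_text = merged.to_csv(index=False)

# The raw text can contain more physical lines than records, because the
# newline inside the quoted Message field is written out as-is.
physical_lines = csv_text.splitlines()

# A CSV-aware reader treats the quoted, multi-line field as a single value.
round_trip = pd.read_csv(io.StringIO(csv_text))

print(len(merged), len(physical_lines), len(round_trip))
print(repr(round_trip.loc[4, "Message"]))
```

If the record count is the same after reading the output back, the extra line seen in a text editor is just the newline stored inside the quoted field, not a new row. If the real files do have a space between the comma and the opening quote, as in the sample in the question, the quote may not be recognized as the start of a quoted field; in that case passing skipinitialspace=True to read_csv is worth trying.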
Upvotes: 1