Austin

Reputation: 4576

Python Pandas to_csv getting extra rows

I'm using pandas to stack two CSV files on top of each other. The files may have different column headers. The problem is that the output seems to split rows onto a new line at random.

File 1:
ID, Height
0 , 1
1 , 2
2 , 3

File 2:

ID, Message
0 , "Long string message"
1 , "May include tabs, multiple lines \n
     that go on for a while"
2 , "More of the same"

Result Should Be:

ID, Height, Message
0,    1,     '',
1,    2,     '',
2,    3,     '',
0,    '',    "Long string message",
1,    '',    "May include tabs, multiple lines \n
              that go on for a while",
2,    '',    "More of the same"

What I'm getting back is:

ID, Height, Message
0,    1,     '',
1,    2,     '',
2,    3,     '',
0,    '',    "Long string message",
1,    '',    "May include tabs, multiple lines"
"that go on for a while", '', '',
2,    '',    "More of the same"

It works for the most part with the following:

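import pandas as pd

# Read both files; columns missing from one file become NaN after concat.
first = pd.read_csv('file1.csv')
second = pd.read_csv('file2.csv')

# Stack the rows and renumber the index.
merged = pd.concat([first, second], axis=0, ignore_index=True)
merged.to_csv('test.csv')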

It looks like a line break inside the message field causes the row to split onto a new line. How can I stop it from splitting rows on a newline inside the message field?

Upvotes: 1

Views: 3753

Answers (1)

MattR

Reputation: 5136

From the short example you gave, it looks like it is starting a new row at the newline character \n.

You could try first = pd.read_csv('file1.csv', delim_whitespace=True).

Otherwise, try changing the sep, delimiter, or lineterminator parameters of read_csv.
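It may also be worth checking whether the output really has extra rows. This is only a sketch under assumptions: the two DataFrames are hypothetical stand-ins for file1.csv and file2.csv, and an in-memory buffer replaces test.csv. It shows that a line break inside a quoted field adds a physical line to the file, while a CSV-aware reader such as read_csv still parses it as one record. If that is what is happening, the split may come from whatever is used to view or read test.csv rather than from to_csv itself.

```python
import io

import pandas as pd

# Hypothetical stand-ins for file1.csv and file2.csv from the question.
first = pd.DataFrame({"ID": [0, 1, 2], "Height": [1, 2, 3]})
second = pd.DataFrame({
    "ID": [0, 1, 2],
    "Message": [
        "Long string message",
        "May include tabs, multiple lines \nthat go on for a while",
        "More of the same",
    ],
})

merged = pd.concat([first, second], axis=0, ignore_index=True)

# Write to an in-memory buffer (assumption: used here instead of test.csv
# so the sketch runs without touching the disk).
buffer = io.StringIO()
merged.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

# Counting raw lines over-counts records, because the quoted Message field
# contains a line break.
physical_lines = csv_text.splitlines()

# A CSV-aware parser treats the quoted line break as part of the field.
round_trip = pd.read_csv(io.StringIO(csv_text))

print(len(physical_lines))
print(len(round_trip))
```

If the record count from read_csv matches what you expect, the file itself is fine, and the thing to change is how it is opened downstream.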

Upvotes: 1
