Reputation: 69
I have two dataframes in python:
df1
Column1 | Column2 | Column3 | Column4 | Column5 |
---|---|---|---|---|
1a | 123 | RJ | 12.1 | test1 |
2a | 432 | MT | 23.2 | test3 |
3a | 234 | DR | 34.3 | test5 |
df2
Column1 | Column3 | Column6 |
---|---|---|
1a | RJ | 695 |
2a | MT | 568 |
3a | DR | 232 |
And I wish to append them together and save them as CSV, separated by pipe.
But when I simply append, there are lots of columns with nulls:
df3 = df1.append(df2, ignore_index=True)
Column1 | Column2 | Column3 | Column4 | Column5 | Column6 |
---|---|---|---|---|---|
1a | 123 | RJ | 12.1 | test1 | |
2a | 432 | MT | 23.2 | test3 | |
3a | 234 | DR | 34.3 | test5 | |
1a | RJ | 695 | |||
2a | MT | 568 | |||
3a | DR | 232 |
Then when write into CSV give me this result:
df3.to_csv('df3.csv', sep='|', index=False) #I will also remove header ',header=False'
Column1 | Column2 | Column3 | Column4 | Column5|Column6|
1a| 123|RJ| 12.1| test1||
2a| 432|MT| 23.2| test3||
3a|234|DR| 34.3|test5||
1a| |RJ|| |695|
2a| |MT|| |568|
3a| |DR|| |232 |
But what I need as results is this output, ignoring nulls (don't worry about header):
Column1 | Column2 | Column3 | Column4 | Column5|Column6|
1a|123|RJ|12.1|test1
2a|432|MT|23.2|test3
3a|234|DR|34.3|test5
1a|RJ|695
2a|MT|568
3a|DR|232
Any Ideas? Thanks in advance.
Upvotes: 0
Views: 157
Reputation: 3010
So your desired output isn't really csv with a pipe separator because that format assumes that each line has the same number of fields and that you will include separators for blank/null/empty values.
You can convert each row to a pipe-delimited string with a lambda function and then write the results to disk in your desired format.
pipe_delim_rows = df.apply(lambda x: '|'.join([str(v) for v in x.values if v not in (np.nan, '', None)]), axis=1)
with open('file.txt', 'w') as f:
for item in pipe_delim_rows:
f.write(item + '\n')
Upvotes: 1