Reputation: 332
I have been trying to combine two csv files with alphanumeric data based on the column names they share, so that I can perform something like a join over them using terminal.
Here's what I tried: (my first column of both the files are identical)
join -t , -1 1 -2 1 file_1.csv file_2.csv > file_3.csv
The merge happens alright and my columns are combined but not in the format I want.
Issue: The file_3 consists of lines from both the files, though separated by commas, but in separate lines.
Example:
Columns from file_1
,Columns from file_2
Row1 from file_1
,Row1 from file_2
Row2 from file_1
,Row2 from file_2
What do I do to get the file_3 data in one single line for each row-merge? Any pointers to proceed would be appreciated.
file_1.csv: (sample data)
Id,Age,Employment,Education,Marital,Occupation,Income,Gender,Deductions,Hours,Adjusted
1,38,Private,College,Unmarried,Service,81838,Female,0,72,0
2,35,Private,Associate,Absent,Transport,72099,Male,0,30,0
3,32,Private,HSgrad,Divorced,Clerical,154676.74,Male,0,40,0
file_2.csv: (sample data)
Id,Adjusted,Predicted_Adjusted,Probability_0,Probability_1
1,0,0,0.952957896225136,0.0470421037748636 .
2,0,0,0.973664421132328,0.0263355788676716 .
3,0,0,0.966224074718457,0.0337759252815426
Erroneous join:
Id,Age,Employment,Education,Marital,Occupation,Income,Gender,Deductions,Hours,Adjusted
,Adjusted,Predicted_Adjusted,Probability_0,Probability_1
1,38,Private,College,Unmarried,Service,81838,Female,0,72,0
,0,0,0.952957896225136,0.0470421037748636
2,35,Private,Associate,Absent,Transport,72099,Male,0,30,0
,0,0,0.973664421132328,0.0263355788676716
3,32,Private,HSgrad,Divorced,Clerical,154676.74,Male,0,40,0
,0,0,0.966224074718457,0.0337759252815426
Expected output: Every two lines is actually one row and so the expected output should not have the rows split into two, but should represent a homogenous merge of two csv files i.e file_1 and file_2
Upvotes: 1
Views: 643
Reputation: 19982
Are the files with windows line-endings \r
?
You can try dos2unix file_1.csv
and dos2unix file_2.csv
?
Upvotes: 2
Reputation: 1081
This should work:
join -t , -1 1 -2 1 file_1.csv file_2.csv|paste -d' ' - - > file_3.csv
Upvotes: 1