Ashu
Ashu

Reputation: 49

Sed command to replace comma-separated values with pipe-separated values

My CSV file has records in the following format:

571283,1,"R","01/15/2002","IBMS,SL"

I want to convert them to the following format:

571283|1|R|01/15/2002|IBMS,SL

I tried this:

sed -e 's/ //g' -e 's/\"\,\"/\|/g' -e 's/\,\"/\|/g' -e 's/\"$//' test.csv

but the output I am getting is:

571283,1|R|01/15/2002|IBMS,SL

Please advise.

Upvotes: 1

Views: 7498

Answers (4)

DyLan
DyLan

Reputation: 1

sed -e 's/ //g' -e 's/\"\,\"/\|/g' -e 's/\,\"/\|/g' -e 's/\"$//' -e 's/\,/\|/g' btest.txt

Upvotes: 0

mklement0
mklement0

Reputation: 437953

Generally, using a bona fide CSV parser is the simplest and most robust choice when it comes to parsing CSV data.

For instance, python's CSV parsing provides a straightforward solution:

$ python -c 'import csv,sys; reader=csv.reader(sys.stdin)
for row in reader:
  print("|".join(row))' < test.csv
571283|1|R|01/15/2002|IBMS,SL

As a one-liner (in bash, ksh, or zsh):

python -c $'import csv,sys; reader=csv.reader(sys.stdin)\nfor row in reader:\n  print("|".join(row))' < test.csv

Upvotes: 0

White Feather
White Feather

Reputation: 2783

This works:

sed -e 's/,/|/' -e 's/\"\,\"/\|/g' -e 's/\,\"/\|/g' -e 's/\"$//' test.csv

Result is:

571283|1|R|01/15/2002|IBMS,SL

Your first sequence:

-e 's/ //g'

has to be changed in:

-e 's/,/|/'

Expanding to reply to your comment.

First of all you have to take care of the fact that sed is sequential, so the order of the transformations is important.

In your string:

Market Basket - WF Note A-2,RECM-PS Transfer,09/22/2015,"330930929, 330931800",, 

You have same characters that you want to transform in different ways. The use of g for global and the sequence of transformations are therefore very important.

Let us build the sequence:

first of all let us get rid of ",, that we want to transform in ||:

sed -e 's/\"\,\,/||/' test.csv

will give us:

Market Basket - WF Note A-2,RECM-PS Transfer,09/22/2015,"330930929, 330931800||

Then we do the same with ," that we want to become |:

sed -e 's/\"\,\,/||/' -e 's/\,\"/|/' test.csv

gives:

Market Basket - WF Note A-2,RECM-PS Transfer,09/22/2015|330930929, 330931800||

Now we still have 2 commas we want to transform in | but not the third, the simply way is to repeat twice that transformation:

sed -e 's/\"\,\,/||/' -e 's/\,\"/|/' -e 's/,/|/' -e 's/,/|/' test.csv

Market Basket - WF Note A-2|RECM-PS Transfer|09/22/2015|330930929, 330931800||

That is!

Advise:

  1. Think to your sequence of transformations strategy;
  2. Apply one transformation at the time and look at the result, remembering that that will be fed to the next one.

But finally I think you need to transform BOTH strings at the same time, so that:

Market Basket - WF Note A-2,RECM-PS Transfer,09/22/2015,"330930929, 330931800",,
571283,1,"R","01/15/2002","IBMS,SL"

becomes:

Market Basket - WF Note A-2|RECM-PS Transfer|09/22/2015|330930929, 330931800||
571283|1|R|01/15/2002|IBMS,SL

This is the sequence of transformations that does that:

sed  -e 's/\",\"/|/' -e 's/\"\,\,/||/' -e 's/\,\"/|/' -e 's/,/|/' -e 's/,/|/' -e 's/\"//g' test.csv

Regards

Upvotes: 1

anubhava
anubhava

Reputation: 785186

Using gnu-awk with FPAT:

awk -v FPAT='"[^"]+"|[^,]+' -v OFS='|' '{for(i=1; i<=NF; i++) gsub(/"/, "", $i)} 1' file

571283|1|R|01/15/2002|IBMS,SL

In case gnu-awk is unavailable use this perl command:

perl -pe 's/(?=(([^"]*"){2})*[^"]*$),/|/g; s/"//g' file

571283|1|R|01/15/2002|IBMS,SL

Upvotes: 4

Related Questions