Reputation: 276
I have a comma delimited file (CSV file) test.csv
as shown below.
FHEAD,1,2,3,,,,,,
FDEP,2,3,,,,,,,,
FCLS,3,,,4-5,,,,,,,
FDETL,4,5,6,7,8,
FTAIL,5,67,,,,,,
I wanted to remove the empty columns only from 2nd and 3rd row of the file i.e. were ever the records starts with FDEP
and FCLS
only in those rows I wanted to remove the empty columns (,,).
after removing the empty columns the same file test.csv
should look like
FHEAD,1,2,3,,,,,,
FDEP,2,3
FCLS,3,4-5
FDETL,4,5,6,7,8,
FTAIL,5,67,,,,,,
How can I do this in Unix???
Upvotes: 1
Views: 372
Reputation: 85767
Here's one way to do it, using sed:
sed '/^F\(DEP\|CLS\),/ { s/,\{2,\}/,/g; s/,$// }'
We use a range of /^F\(DEP\|CLS\),/
, i.e. the following command will only process lines matching ^F\(DEP\|CLS\),
. This regex matches beginning-of-string, followed by F
, followed by either DEP
or CLS
, followed by ,
. In other words, we look for lines starting with FDEP,
or FCLS,
.
Having found such a line, we first substitute (s
command) all runs (g
flag, match as many times as possible) of 2 or more (\{2,\}
) commas (,
) in a row by a single ,
. This squeezes ,,,
down to a single ,
.
Second, we substitute ,
at end-of-string by nothing. This gets rid of any trailing comma.
Upvotes: 1