Reputation: 1131
I need to remove any letters that occur after the first comma in a line
some.file
JAN,334X,333B,337A,338D,332Q,335H,331U
Expected Result:
JAN,334,333,337,338,332,335,331
Code:
sed -i 's/\[0-9][0-9][0-9].*,/[0-9][0-9][0-9],/g' some.file
What am I doing wrong?
Upvotes: 2
Views: 445
Reputation: 47089
No need for sed, coreutils will do:
paste -d, <(cut -d, -f1 data) <(cut -d, -f2- data | tr -d 'A-Z')
This takes .3 seconds on my computer when run on the data file generated in ceving's answer.
Upvotes: 2
Reputation: 23774
Try this
$ sed 's/,\([0-9]*\)[^,]*/,\1/g' <<<'JAN,334X,333B,337A,338D,332Q,335H,331U'
JAN,334,333,337,338,332,335,331
You need to capture the digits with parentheses (written \( and \) in basic regular expressions) in order to use the captured string in the replacement as \1. The g flag applies the substitution to every occurrence on the line.
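As a small sketch of what the g flag changes, the same substitution can be run with and without it on a shortened sample line. The variable names (line, first_only, all_fields) are only illustrative and not part of the original answer.

```shell
line='JAN,334X,333B,337A'

# Without g: only the first ",digits+trailing characters" field is rewritten.
first_only=$(printf '%s\n' "$line" | sed 's/,\([0-9]*\)[^,]*/,\1/')

# With g: every field that follows a comma is rewritten.
all_fields=$(printf '%s\n' "$line" | sed 's/,\([0-9]*\)[^,]*/,\1/g')

printf '%s\n%s\n' "$first_only" "$all_fields"
```

The first line printed still contains 333B and 337A; the second has all trailing letters removed.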
Comparison of the different answers
Test data:
$ > data; for ((x=1000000;x>0;x--)); do echo 'JAN,334X,333B,337A,338D,332Q,335H,331U' >> data; done
My answer is the slowest:
$ time sed 's/,\([0-9]*\)[^,]*/,\1/g' < data >/dev/null
real 0m16.368s
user 0m16.296s
sys 0m0.024s
Michael is a bit faster:
$ time sed ':;s/[A-Z],/,/2;t;s/[A-Z]$//' < data >/dev/null
real 0m9.669s
user 0m9.624s
sys 0m0.012s
But Sundeep is the fastest:
$ time sed 's/[A-Z]//4g' < data >/dev/null
real 0m4.905s
user 0m4.856s
sys 0m0.028s
Upvotes: 2
Reputation: 23667
Since the question is tagged linux, this GNU sed option comes in handy:
$ echo 'JAN,334X,333B,337A,338D,332Q,335H,331U' | sed -E 's/[A-Z](,|$)/\1/2g'
JAN,334,333,337,338,332,335,331
2g means replace from the 2nd match onwards until the end of the line. If the number of letters in the first column is known, this can be simplified to
$ echo 'JAN,334X,333B,337A,338D,332Q,335H,331U' | sed 's/[A-Z]//4g'
JAN,334,333,337,338,332,335,331
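To make the number-plus-g flag concrete, here is a minimal sketch on toy input. It assumes GNU sed, since combining a number with g is a GNU extension, and the variable names are only illustrative. The last command shows why the simplified form depends on the first column having exactly three letters.

```shell
# GNU sed: 2g keeps the first match and replaces the 2nd and all later ones.
from_second=$(printf '%s\n' 'aXbXcX' | sed 's/X/-/2g')

# Three-letter first column: letters 4 and onwards are the unwanted suffixes.
three_letters=$(printf '%s\n' 'JAN,334X,333B' | sed 's/[A-Z]//4g')

# Five-letter first column: the same command also eats part of the name.
five_letters=$(printf '%s\n' 'MARCH,334X' | sed 's/[A-Z]//4g')

printf '%s\n%s\n%s\n' "$from_second" "$three_letters" "$five_letters"
```

With a first column of varying length, the `s/[A-Z](,|$)/\1/2g` form is the safer choice, because it counts letter-before-separator matches rather than letters.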
Upvotes: 2
Reputation: 3363
You could also use a small loop (this is GNU sed):
sed ':;s/[A-Z],/,/2;t;s/[A-Z]$//'
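On each pass it deletes only the letter in the second letter-plus-comma match (in the sample data the first match is the N of JAN, which must stay), and loops until no second match remains. Finally, it deletes the letter at the line's end, if there is one.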
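The loop can also be written with a named label and one -e per command, which may be easier to read. This is only a sketch: the label name again and the function name strip_letters are made up for illustration.

```shell
strip_letters() {
  # :again              label marking the start of the loop
  # s/[A-Z],/,/2        delete the letter in the 2nd "letter + comma" match
  # t again             if that substitution happened, jump back to the label
  # s/[A-Z]$//          finally, delete a trailing letter at end of line
  sed -e ':again' -e 's/[A-Z],/,/2' -e 't again' -e 's/[A-Z]$//'
}

result=$(printf '%s\n' 'JAN,334X,333B,337A' | strip_letters)
printf '%s\n' "$result"
```

On this input the loop runs twice (removing X, then B) before the final substitution removes the trailing A.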
Upvotes: 4
Reputation: 4867
You should omit the *, and the first \ looks like a mistake, i.e.
sed -i 's/[0-9][0-9][0-9].,/[0-9][0-9][0-9],/g' some.file
but I think you also want to capture the number ...
sed -i 's/\([0-9][0-9][0-9]\).,/\1,/g' some.file
Would be helpful if you posted your actual output as well ...
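A quick check of the capturing command on a shortened sample line suggests that it leaves the last field untouched, because no comma follows it. The second command below is an assumed extension, not part of this answer: it adds one more expression anchored at the end of the line. Variable names are illustrative.

```shell
line='JAN,334X,333B,331U'

# Command from this answer: only fields followed by a comma are cleaned.
comma_only=$(printf '%s\n' "$line" | sed 's/\([0-9][0-9][0-9]\).,/\1,/g')

# Assumed extension: also strip one character after three digits at end of line.
with_tail=$(printf '%s\n' "$line" |
  sed -e 's/\([0-9][0-9][0-9]\).,/\1,/g' -e 's/\([0-9][0-9][0-9]\).$/\1/')

printf '%s\n%s\n' "$comma_only" "$with_tail"
```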
Upvotes: 2
Reputation: 14949
Some issues are:
No need to escape [.
Your replace value is wrong. Ex: s/regex/replace/g
Use this:
sed -e 's/\([0-9]\+\)[a-zA-Z],/\1,/g' -e 's/\([0-9]\+\)[a-zA-Z]$/\1/g' file
Upvotes: 2