Ryan

Reputation: 159

sed or awk - deleting strings between patterns

I have a CSV file with lines like this:

AAA,A-name,num1,num2,num3
BBB,B-name,num1,num2,num3
CCC.DDD,C-name,num1,num2,num3
EEE.FFF.GGGG,E-name,num1,num2,num3    
HHH.H-name,num1,num2,num3
...

Some lines have one identifier (like AAA); some have two (like CCC.DDD); some have three or more (like EEE.FFF.GGGG). Identifiers are not always three characters long. I need to remove all but the first identifier from each line, so that the first period and everything after it, up to but not including the first comma, is deleted, producing this:

AAA,A-name,num1,num2,num3
BBB,B-name,num1,num2,num3
CCC,C-name,num1,num2,num3
EEE,E-name,num1,num2,num3
HHH,H-name,num1,num2,num3
...

I've tried a few pattern-replace methods but am getting tripped up. Does anyone have the syntax I need?

Upvotes: 2

Views: 2184

Answers (3)

brandizzi

Reputation: 27090

Just remove everything from the first dot up to (but not including) the next comma. For the file

$ cat foo
AAA,A-name,num1,num2,num3
BBB,B-name,num1,num2,num3
CCC.DDD,C-name,num1,num2,num3
EEE.FFF.GGGG,E-name,num1,num2,num3    
HHH.H-name,num1,num2,num3

use this sed command:

$ sed 's/\.[^,]*//' foo
AAA,A-name,num1,num2,num3
BBB,B-name,num1,num2,num3
CCC,C-name,num1,num2,num3
EEE,E-name,num1,num2,num3    
HHH,num1,num2,num3

Note that on the last line this removes .H-name entirely, so the result differs from your expected output. That line looks like a typo in your example, though (a dot where a comma was intended).
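
One caveat worth checking: the pattern is not anchored to the first field, so a line whose identifier has no dot but which contains a dot later on (a decimal number, say) would have that later field trimmed instead. Below is a minimal sketch of an anchored variant, assuming the identifier is always the first comma-separated field. The function name strip_ids and the sample value 1.5 are made up for illustration and do not come from the question.

```shell
# Hypothetical helper: delete from the first dot up to the first comma,
# but only when that dot sits inside the first field.
strip_ids() {
  # ^\([^.,]*\)  captures the leading identifier (no dots, no commas)
  # \.[^,]*      matches the dot and the rest of the first field
  sed 's/^\([^.,]*\)\.[^,]*/\1/' "$@"
}

# Made-up sample lines with a decimal number in a later field.
printf '%s\n' 'CCC.DDD,C-name,1.5,num2' 'AAA,A-name,1.5,num2' | strip_ids
```

On these two lines the first should come out as CCC,C-name,1.5,num2 and the second should pass through unchanged, because the leading field AAA is followed by a comma rather than a dot.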

Upvotes: 2

Fredrik Pihl

Reputation: 45670

Using perl

$ perl -pe 's/\.[A-Z.]*?,/,/' input
AAA,A-name,num1,num2,num3
BBB,B-name,num1,num2,num3
CCC,C-name,num1,num2,num3
EEE,E-name,num1,num2,num3
HHH.H-name,num1,num2,num3

sed

$ sed 's/\.[A-Z.]*,/,/' input
AAA,A-name,num1,num2,num3
BBB,B-name,num1,num2,num3
CCC,C-name,num1,num2,num3
EEE,E-name,num1,num2,num3
HHH.H-name,num1,num2,num3

and awk

$ awk '/\./{sub(/\.[A-Z.]*,/, ",", $0)}{print}' input
AAA,A-name,num1,num2,num3
BBB,B-name,num1,num2,num3
CCC,C-name,num1,num2,num3
EEE,E-name,num1,num2,num3
HHH.H-name,num1,num2,num3

Upvotes: 1

Michael J. Barber

Reputation: 25052

sed 's/^\([^.]\{1,\}\)[^,]*/\1/'
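
As a rough reading of the command: the captured group takes the longest run of non-dot characters at the start of the line, the following [^,]* takes whatever is left up to the next comma, and only the captured part is kept. A small sketch to try it on lines copied from the question follows; the wrapper name first_id_only is hypothetical and exists only to make the command easy to reuse.

```shell
# Hypothetical wrapper around the command above.
first_id_only() {
  sed 's/^\([^.]\{1,\}\)[^,]*/\1/' "$@"
}

# Sample lines copied from the question.
printf '%s\n' \
  'AAA,A-name,num1,num2,num3' \
  'CCC.DDD,C-name,num1,num2,num3' \
  'EEE.FFF.GGGG,E-name,num1,num2,num3' |
  first_id_only
```

Lines without any dot are matched in full by the captured group, so they are written back unchanged. Because the group may also span commas, a dot in a later field of such a line would still be affected.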

Upvotes: 2
