Reputation: 836
I have a single-column CSV that looks something like this:
KFIG
KUNV
K~LK
K7RT
3VGT
Some of the data points are garbled in transmission. I need to keep only the entries that begin with a capital letter, where each of the other three characters can be a capital letter or a number. For example, in the list above I would have to delete K~LK and 3VGT.
I know that to keep only the entries made up of capital letters I can write
sed -n '/[A-Z]\{4,\}/p'
I just want to adjust this so that the last three characters can be capital letters or numbers. Any help would be appreciated.
Upvotes: 1
Views: 29
Reputation: 1689
Just use:
sed -n '/[A-Z][A-Z0-9]\{3,\}/p'
However, if these identifiers are really all there is in the file, I would propose the following command instead. It ensures that the whole line is matched, so it will reject, for example, identifiers more than 4 characters long:
sed -n '/^[A-Z][A-Z0-9]\{3\}$/p'
^ means "match a zero-length string at the beginning of the line";
\{3\} means "match exactly 3 occurrences of the previous atom", the previous atom being [A-Z0-9];
$ means "match a zero-length string at the end of the line".
Upvotes: 2
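As a quick check, here is a small sketch that runs the anchored command against the sample from the question. The variable names input and result are only for illustration, and LC_ALL=C is an added precaution (not part of the original command) so that [A-Z] is read as the plain ASCII range whatever the locale.

```shell
# Sample data from the question, including the two garbled entries.
input='KFIG
KUNV
K~LK
K7RT
3VGT'

# Keep only lines that consist of exactly one capital letter followed by
# three characters that are each a capital letter or a digit.
# LC_ALL=C is an assumption added here, not part of the original command.
result=$(printf '%s\n' "$input" | LC_ALL=C sed -n '/^[A-Z][A-Z0-9]\{3\}$/p')

printf '%s\n' "$result"
```

With this input, K~LK is dropped because ~ is neither a capital letter nor a digit, and 3VGT is dropped because it does not start with a capital letter.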