Reputation: 33

Sed script that removes numbers from file

I am given data in the following format:

comp.os.linux announce 0000002587 02190 m

comp.arch 00000 28874 y

utsa.cs.3423 00000000004 000000000001 y

I am supposed to process it so that it looks like:

comp.os.linux announce m

comp.arch y

utsa.cs.3423 y

I have tried s/^[0-9]//g, and it seems to work well, but the last line is missing the 4 numbers.

Upvotes: 1

Answers (2)

Cyrus

Reputation: 88999

With sed:

sed 's/ [0-9 ]\+[0-9]\+//' file

Output:

comp.os.linux announce m
comp.arch y
utsa.cs.3423 y
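
The `\+` in the command above is a GNU extension to basic regular expressions, so it may not work in other sed implementations. As a rough sketch of a more portable variant, the following uses `-E` (extended regular expressions, accepted by both GNU and BSD sed) and removes one run of space-separated, all-digit fields. The function name `strip_numeric_fields` is made up for illustration, and the pattern assumes the numeric fields are separated by single spaces and followed by at least one more field:

```shell
# Hypothetical helper: remove the first run of all-digit fields from each line.
# Assumes single-space separators and a non-numeric field after the numbers.
strip_numeric_fields() {
  # ( [0-9]+)+ matches one or more " <digits>" groups; the trailing space
  # is kept in the replacement so the neighboring fields stay separated.
  sed -E 's/( [0-9]+)+ / /' "$@"
}

printf '%s\n' \
  'comp.os.linux announce 0000002587 02190 m' \
  'comp.arch 00000 28874 y' \
  'utsa.cs.3423 00000000004 000000000001 y' | strip_numeric_fields
```

Because every digit run must be preceded by a space, the `3423` inside `utsa.cs.3423` is left alone.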

Upvotes: 1

heemayl

Reputation: 42137

With awk, print the first and last fields, and include the second field only if it consists solely of alphabetic characters:

awk '$2~/^[[:alpha:]]+$/ {print $1, $2, $NF; next} {print $1, $NF}' file.txt

If you insist on using sed:

sed -E 's/^([^[:blank:]]+)[[:blank:]]+([[:alpha:]]+)?.*[[:blank:]]([^[:blank:]]+)$/\1 \2 \3/'

For lines whose second field is not purely alphabetic, this leaves two spaces between the two output fields. You can append a second substitution to collapse them:

sed -E 's/^([^[:blank:]]+)[[:blank:]]+([[:alpha:]]+)?.*[[:blank:]]([^[:blank:]]+)$/\1 \2 \3/; s/  / /'

Example:

% cat file.txt
comp.os.linux announce 0000002587 02190 m
comp.arch 00000 28874 y
utsa.cs.3423 00000000004 000000000001 y

% awk '$2~/^[[:alpha:]]+$/ {print $1, $2, $NF; next} {print $1, $NF}' file.txt
comp.os.linux announce m
comp.arch y
utsa.cs.3423 y

% sed -E 's/^([^[:blank:]]+)[[:blank:]]+([[:alpha:]]+)?.*[[:blank:]]([^[:blank:]]+)$/\1 \2 \3/' file.txt
comp.os.linux announce m
comp.arch  y
utsa.cs.3423  y

% sed -E 's/^([^[:blank:]]+)[[:blank:]]+([[:alpha:]]+)?.*[[:blank:]]([^[:blank:]]+)$/\1 \2 \3/; s/  / /' file.txt
comp.os.linux announce m
comp.arch y
utsa.cs.3423 y
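
The commands above rely on the numbers sitting between the leading name fields and the final flag. As a possible alternative, here is a minimal awk sketch that drops every field made up only of digits, wherever it appears. The function name `drop_numeric_fields` is hypothetical, and the sketch assumes whitespace-separated fields and that single-space output is acceptable:

```shell
# Hypothetical helper: print each line without its all-digit fields.
drop_numeric_fields() {
  awk '{
    out = ""
    for (i = 1; i <= NF; i++) {
      if ($i ~ /^[0-9]+$/) continue          # skip fields that are only digits
      out = (out == "" ? $i : out " " $i)    # join kept fields with one space
    }
    print out
  }' "$@"
}

printf '%s\n' \
  'comp.os.linux announce 0000002587 02190 m' \
  'comp.arch 00000 28874 y' \
  'utsa.cs.3423 00000000004 000000000001 y' | drop_numeric_fields
```

A field such as `utsa.cs.3423` is kept because it is not made up entirely of digits.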

Upvotes: 1
