Reputation: 27

How to delete words which start with some specific pattern in a file in unix

I want to delete all words in my file that start with 3: and 4:

For Example -

Input is

13 1:12 2:14 3:11
10 1:9 2:7 4:10 5:2
16 3:7 8:24
7 4:7 6:54

Output should be

13 1:12 2:14
10 1:9 2:7 5:2
16 8:24
7 6:54

Can someone tell me whether this is possible with sed or awk?

Upvotes: 1

Answers (5)

Roger Lindsjö

Reputation: 11543

Assuming every word contains a : followed by at least one digit:

sed 's/ [34]:[0-9][0-9]*//g' inputfile

This matches a space, then 3 or 4, then a colon, then one or more digits. It replaces each match with nothing, and the g flag applies this to every match on the line.

Upvotes: 0

NeronLeVelu

Reputation: 10039

sed 's/[[:blank:]][34]:[^[:blank:]]\{1,\}//g' YourFile

POSIX compliant, assuming that (as in the sample) no line begins with a word starting with 3: or 4:.
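If that assumption does not hold, one possible workaround is to pad each line with a leading space, delete the matching words, and strip the padding again. This is only a sketch: the function name strip_34_words is made up for illustration, and it assumes words are separated by spaces or tabs.

```shell
# Hypothetical helper: reads lines on stdin, writes filtered lines to stdout.
strip_34_words() {
  # 1) add a leading space so a first word is also preceded by a blank
  # 2) delete every blank followed by 3: or 4: and the rest of that word
  # 3) remove the padding space again
  sed -e 's/^/ /' \
      -e 's/[[:blank:]][34]:[^[:blank:]]*//g' \
      -e 's/^ //'
}

printf '%s\n' '3:5 13 1:12 4:1 3:11' '16 3:7 8:24' | strip_34_words
# prints:
# 13 1:12
# 16 8:24
```

Because only the leading blank of each word is consumed, two matching words in a row are both removed and no trailing space is left behind.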

Upvotes: 0

potong

Reputation: 58420

This might work for you (GNU sed):

sed 's/\b[34]:\S*\s*//g' file

Looks for a word boundary and then either 3 or 4 followed by : and zero or more non-spaces followed by zero or more spaces and deletes them throughout the line.

Upvotes: 3

fedorqui

Reputation: 289725

With awk:

awk '{for (i=1; i<=NF; i++)
        {if (! sub("^[34]:", "", $i)) d=d$i" "}
        print d; d=""
     }' file

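It loops through the fields and stores in the variable d only those that do not start with 3: or 4:. The check relies on sub(), which returns the number of substitutions made (0 when the field does not match). After the loop, d is printed and reset for the next line. Each kept field is followed by a space, so every output line ends with a trailing space.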

For your given file:

$ awk '{for (i=1; i<=NF; i++) {if (! sub("^[34]:", "", $i)) d=d$i" "} print d; d=""}' file
13 1:12 2:14 
10 1:9 2:7 5:2 
16 8:24 
7 6:54
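A possible variant of the same loop, sketched here under the assumption that the trailing space is unwanted: it tests each field with a regex match instead of sub() and only writes a separator between kept fields. The function name drop_34_fields is made up for illustration.

```shell
# Hypothetical helper: keeps only fields that do not start with 3: or 4:.
drop_34_fields() {
  awk '{
    out = ""; sep = ""
    for (i = 1; i <= NF; i++) {
      if ($i !~ /^[34]:/) {   # keep the field unless it has the prefix
        out = out sep $i
        sep = " "             # separator goes only between kept fields
      }
    }
    print out
  }'
}

printf '%s\n' '13 1:12 2:14 3:11' '10 1:9 2:7 4:10 5:2' \
              '16 3:7 8:24' '7 4:7 6:54' | drop_34_fields
# prints:
# 13 1:12 2:14
# 10 1:9 2:7 5:2
# 16 8:24
# 7 6:54
```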

Upvotes: 1

user2772936

Reputation: 55

With sed

sed -r 's/ 3:[0-9]*| 4:[0-9]*//g'


$ cat input.txt
13 1:12 2:14 3:11
10 1:9 2:7 4:10 5:2
16 3:7 8:24
7 4:7 6:54

$ cat input.txt | sed -r 's/ 3:[0-9]*| 4:[0-9]*//g'
13 1:12 2:14
10 1:9 2:7 5:2
16 8:24
7 6:54

Explanation:

-r: Enables extended regular expressions (GNU sed), so | works as alternation without a backslash.
 3:[0-9]*: Matches a space, then 3, then :, then zero or more digits ([0-9] is any digit from 0 to 9, and * repeats the preceding expression zero or more times).
|: Means OR.
 4:[0-9]*: Same as above, but matches 4 instead of 3.
//: The empty replacement string. Anything placed between these two slashes, such as POTATO, would be written in place of each match; here nothing is written, so the match is deleted.
/g: Replaces every match on each line, not just the first.
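As a small illustration of the pieces explained above, the two alternation branches can also be folded into a single bracket expression. The sketch below uses -E, which both GNU and BSD sed accept for extended regular expressions; the variable names are chosen only for this example.

```shell
input='13 1:12 2:14 3:11
10 1:9 2:7 4:10 5:2'

# Alternation, as in the command above.
a=$(printf '%s\n' "$input" | sed -E 's/ 3:[0-9]*| 4:[0-9]*//g')

# Bracket expression: one branch covers both prefixes.
b=$(printf '%s\n' "$input" | sed -E 's/ [34]:[0-9]*//g')

printf '%s\n' "$a"
[ "$a" = "$b" ] && echo "both forms give the same result"
```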

Upvotes: 1
