Extract substring from a string using awk

Question

I have a string that can be in one of the following two formats:

dts12931212112 : some random message1 : abc, xyz
nodts : some random message2

I need to extract the substring that doesn't include the 'dts' part from these two strings, i.e. it should return:

some random message1 : abc, xyz
some random message2

I need to do this inside a bash script.

Can you help me with an awk command that does this for both kinds of strings?

Avinash Raj · Accepted Answer

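You can do this with awk's gsub function. The first command leaves a leading space in the output; the second also strips the blanks that follow the colon.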

$ awk '{gsub(/^[^:]*dts[^:]*:|:[^:]*dts[^:]*/, "")}1' file
 some random message1 : abc, xyz
 some random message2
$ awk '{gsub(/^[^:]*dts[^:]*:[[:blank:]]*|:[^:]*dts[^:]*/, "")}1' file
some random message1 : abc, xyz
some random message2

You could apply the same regex in sed as well, but you need to enable extended regular expressions with the -r / --regexp-extended option (-E also works).
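A minimal sketch of that sed variant, assuming a sed that accepts -E for extended regular expressions (GNU sed treats -E, -r and --regexp-extended as the same option). The function name strip_dts_sed is only for illustration, and the regex is the one from the second awk command:

```shell
# Hypothetical helper: reads lines on stdin and removes the dts part,
# using the same regex as the second awk command above.
strip_dts_sed() {
    sed -E 's/^[^:]*dts[^:]*:[[:blank:]]*|:[^:]*dts[^:]*//g'
}

# Feed the two sample lines from the question through the helper.
printf '%s\n' \
    'dts12931212112 : some random message1 : abc, xyz' \
    'nodts : some random message2' | strip_dts_sed
```

The g flag plays the role of gsub here: every match on the line is replaced, not just the first one.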

^ asserts that we are at the start of the line. [^:]* is a negated character class that matches any character except :, zero or more times. So ^[^:]*dts[^:]*: matches a substring at the start that contains dts, up to and including the first colon. It won't touch such a substring if it appears in the middle. The pattern :[^:]*dts[^:]* matches a middle or last substring that contains dts, together with the colon before it. Finally, replacing the matched characters with an empty string gives the desired output.
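Since the question asks for this inside a bash script, here is one way the second awk command could be wrapped when the string is held in a variable rather than in a file. This is only a sketch: the function name strip_dts and the variable names are made up for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper: takes the string as its first argument and
# prints it with the dts part removed.
strip_dts() {
    printf '%s\n' "$1" |
        awk '{gsub(/^[^:]*dts[^:]*:[[:blank:]]*|:[^:]*dts[^:]*/, "")}1'
}

# Capture the result in a variable with command substitution.
line='dts12931212112 : some random message1 : abc, xyz'
msg=$(strip_dts "$line")
echo "$msg"
```

Using printf '%s\n' rather than echo avoids surprises if the string happens to start with a dash or contains backslashes.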

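Update: a variant that removes any whitespace-delimited word containing dts, along with the adjacent spaces and colons: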

$ awk '{gsub(/^[^[:space:]]*dts[^[:space:]]*[[:space:]:]*|[[:space:]:]*[^[:space:]]*dts[^[:space:]]*/, "")}1' file
some random message1 : abc, xyz
some random message2
