Reputation: 27

How do I insert a string right after pattern in awk

I want to insert a string right after pattern match without specifying the field (if possible) since the regex can be at unknown position in file.txt.

file.txt

This is *regex* in 3rd field
*regex* in first field
another line with *regex*

I want the output to be something like this

This is regex *new string* in 3rd field
regex *new string* in first field
another line with regex *new string*

This is what I have tried so far

awk ' /regex/ { print $1, $2, $3 " new string", $4, $5, $6} ' file.txt

However, I know this isn't a suitable way of doing it.

I'm new to awk, by the way.

Upvotes: 1

Answers (4)

RARE Kpop Manifesto

Reputation: 2807

If `*regex*` actually exists in the input as a fixed string:

 echo 'This is *regex* in 3rd field
*regex* in first field
another line with *regex*' |

 nawk NF=NF FS='[*]regex[*]' OFS='regex new string'

This is regex new string in 3rd field
regex new string in first field
another line with regex new string
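One thing to watch with the bare `NF=NF` program: it is a pattern whose value is `NF`, so an empty input line (where `NF` is 0) is not printed at all. A minimal sketch of a variant that keeps empty lines, assuming the asterisks are literal in the input; the function name `insert_after_marker` is made up for illustration:

```shell
# Hypothetical helper: split each line on the literal text *regex* and
# rejoin the pieces with the replacement text as the output separator.
insert_after_marker() {
  # $1 = $1 forces awk to rebuild the record using OFS; the trailing 1
  # prints every line, including empty ones.
  awk -F'[*]regex[*]' -v OFS='regex *new string*' '{ $1 = $1 } 1'
}

printf '%s\n' 'This is *regex* in 3rd field' '' 'no match here' |
  insert_after_marker
# This is regex *new string* in 3rd field
#
# no match here
```

Lines without a match have a single field, so they pass through unchanged.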

If there are no edge cases to deal with, such as `"Petersburg"` appearing in the input, the same approach also works for mixed-case input:

echo 'Peter is a man
it was Peter who first invented spam
peter is a naughty boy' |

 gawk NF=NF FS='eter' OFS='eter Parker'

Peter Parker is a man
it was Peter Parker who first invented spam
peter Parker is a naughty boy

The following handles `"Petersburg"` and works in every case except when the match is exactly at the end of the line: there the second `gsub()` leaves an extra trailing space, which has to be removed with another call to `sub()`.

 echo 'Peter is a man
it was Peter who first invented spam Petersburg
peter is a naughty boy' |

 mawk 'gsub("[Pp]eter( |$)", "&\5Parker \5")^_ + gsub("[ ]?\5"," ")'

Peter Parker is a man
it was Peter Parker who first invented spam Petersburg
peter Parker is a naughty boy
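The end-of-line case can also be handled with a different technique from the one above: append a sentinel space to the line, match the name followed by a space, then strip the sentinel again. This is only a sketch; the function name `add_surname` is hypothetical, and the pattern still has no check on the left side, so a word such as "trumpeter" would match too.

```shell
# Hypothetical helper: insert "Parker" after Peter/peter when the name is
# followed by a space or by the end of the line.
add_surname() {
  awk '{
    $0 = $0 " "                      # sentinel so an end-of-line match has a space after it
    gsub(/[Pp]eter /, "&Parker ")    # & stands for the matched text, trailing space included
    sub(/ $/, "")                    # remove the sentinel again
  } 1'
}

printf '%s\n' 'it was Peter who first invented spam Petersburg' \
              'the boy is called peter' | add_surname
# it was Peter Parker who first invented spam Petersburg
# the boy is called peter Parker
```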

Upvotes: 1

Gilles Quénot

Reputation: 184975

Like this, looping over every field so that the position of the match does not need to be known in advance:

 awk '/regex/{for (i=1; i<=NF; i++) if ($i == "regex") $i=$i" new string"}1' file

Output

This is regex new string in 3rd field
regex new string in first field
another line with regex new string

Note: if the `*` are literal in your input, use `($i == "*regex*")`
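For the literal-asterisk case, the field loop might look like the sketch below. The function name `insert_by_field` is made up for illustration. Because `==` against a string constant is a string comparison, the `*` characters need no escaping. Assigning to a field makes awk rebuild the line with single spaces between fields, so any runs of whitespace in the input are collapsed.

```shell
# Hypothetical helper: replace any field that is exactly *regex*.
insert_by_field() {
  awk '{
    for (i = 1; i <= NF; i++)
      if ($i == "*regex*")           # string comparison, * is not special here
        $i = "regex *new string*"
  } 1'
}

printf '%s\n' 'This is *regex* in 3rd field' \
              '*regex* in first field' \
              'another line with *regex*' | insert_by_field
# This is regex *new string* in 3rd field
# regex *new string* in first field
# another line with regex *new string*
```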

Upvotes: 1

Ted Lyngmo

Reputation: 117178

You can use sub (or gsub if you want multiple replacements on the same line) to make a regex substitution on the whole line ($0):

awk '{ sub(/\*regex\*/, "regex *new string*") }1'

Output

This is regex *new string* in 3rd field
regex *new string* in first field
another line with regex *new string*

The same thing using sed:

sed 's/\*regex\*/regex *new string*/'   # sub
sed 's/\*regex\*/regex *new string*/g'  # gsub
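The difference between `sub` and `gsub` only shows up when a line contains the pattern more than once. A small sketch with a made-up input line; the function names `first_only` and `every_match` are hypothetical wrappers around the awk command above:

```shell
# sub replaces only the first match on each line, gsub replaces all of them.
first_only()  { awk '{ sub(/\*regex\*/, "regex *new string*") } 1'; }
every_match() { awk '{ gsub(/\*regex\*/, "regex *new string*") } 1'; }

line='*regex* and again *regex*'

printf '%s\n' "$line" | first_only
# regex *new string* and again *regex*

printf '%s\n' "$line" | every_match
# regex *new string* and again regex *new string*
```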

Upvotes: 3

Dave Pritlove

Reputation: 2687

The matched text can be referenced in a `gsub` replacement string with `&`, which lets you replace the match with a copy of itself followed by a space and the required new string:

test file (peter.txt)

Peter is a man
it was Peter who first invented spam
peter is a naughty boy

(note Peter occurs with both upper and lower case P)

awk command

awk '{gsub(/[Pp]eter/, "& new string", $0)}1' peter.txt

The & is substituted for the value that matched the regex

output

Peter new string is a man
it was Peter new string who first invented spam
peter new string is a naughty boy

The awk command could be refined to allow the replacement text to be passed as an argument:

awk -v 'txt=Parker' '{gsub(/[Pp]eter/, "& "txt, $0)}1' peter.txt

output

Peter Parker is a man
it was Peter Parker who first invented spam
peter Parker is a naughty boy
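Going one step further, the pattern can be passed in as a variable as well, since awk accepts a string as a dynamic regex. A minimal sketch; the wrapper name `append_after` and its argument order are assumptions for illustration only. Note that `-v` processes backslash escapes in the value, and an `&` inside the text would itself be expanded to the matched text.

```shell
# Hypothetical helper: append_after REGEX TEXT [FILE...]
# Inserts a space and TEXT after every match of REGEX.
append_after() {
  pat=$1
  txt=$2
  shift 2
  # "& " txt concatenates the matched text, a space, and the new string.
  awk -v pat="$pat" -v txt="$txt" '{ gsub(pat, "& " txt) } 1' "$@"
}

printf '%s\n' 'Peter is a man' 'peter is a naughty boy' |
  append_after '[Pp]eter' 'Parker'
# Peter Parker is a man
# peter Parker is a naughty boy
```

With file arguments it reads those instead of standard input, for example `append_after '[Pp]eter' 'Parker' peter.txt`.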

Upvotes: 2
