Sharad Shrestha

Reputation: 169

Replace each occurrence of a word that starts with # in a file

I want to replace every word or string that starts with # in a file.

For example, here is a file with the following lines:

$ cat file.txt
#apple sd#kf #banana adfe
#apple we#re #banana cow

Here is the expected output:

$ cat output.txt 
fruit=apple sd#kf fruit=banana adfe
fruit=apple we#re fruit=banana cow

When I use the following:

awk '{for (i=1; i<=NF; i++) $i~/^#/ && $i="fruit="$i }1'

I get the following output:

fruit=#apple sd#kf fruit=#banana adfe
fruit=#apple we#re fruit=#banana cow

When I use the following:

sed 's/^#/fruit=/g' 

I get:

fruit=apple sd#kf #banana adfe
fruit=apple we#re #banana cow

How do I get the expected result using awk, sed, or grep?

Upvotes: 1

Views: 183

Answers (4)

RARE Kpop Manifesto

Reputation: 2855

echo "${input_data...}" | 

{m,g}awk 'sub("^ ?",_, $!(NF=NF) )' FS='(^|[ \t])[#]' OFS=' fruit='    

_

fruit=apple sd#kf fruit=banana adfe
fruit=apple we#re fruit=banana cow
 

The idea is to let FS and OFS handle the bulk of the work, then use a single sub() call to trim the extra space created at the start of the line.

The question mark in the regex "^ ?" makes the leading space optional, so sub() still succeeds for rows that had no replacements at all and those rows still get printed.

No back-references needed.
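To see what each step contributes, here is a small sketch that splits the one-liner in two. It assumes gawk or mawk, as in the command above (other awks may treat ^ inside FS differently). The function names rebuild and trimmed are made up for illustration, and trimmed is a longhand spelling of the same idea, not the original command.

```shell
# Step 1: only rebuild the record. Every separator matched by FS
# (a "#" at the start of the line or after a blank) becomes OFS.
rebuild() {
  awk '{ $1 = $1 } 1' FS='(^|[ \t])[#]' OFS=' fruit='
}

# Step 2: rebuild, then drop the single leading space that OFS
# introduces when the line itself started with "#".
trimmed() {
  awk '{ $1 = $1; sub(/^ /, "") } 1' FS='(^|[ \t])[#]' OFS=' fruit='
}

printf '#apple sd#kf #banana adfe\n' | rebuild   # note the leading space
printf '#apple sd#kf #banana adfe\n' | trimmed
```

One side effect worth knowing: a line that begins with a space and contains no replacement also loses that space.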

Upvotes: 0

The fourth bird

Reputation: 163457

The ^ asserts the start of the string, which is why you only see a single replacement per line.

You might replace the ^ with \B to assert a non-word boundary.

sed -E 's/\B#/fruit=/g' file

Note that in this case it can also match a single # that stands on its own.

If there should be a word character following, you can use a capture group and match at least a single word character.

sed -E 's/\B#([[:alnum:]_])/fruit=\1/g' file
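A quick way to see the difference between the two commands is to run both on a line that contains a lone #. This is only a sketch: the sample line is made up (it is not from the question), and \B is a GNU sed extension, so GNU sed is assumed.

```shell
# Hypothetical sample line: the question's data plus a lone "#".
line='#apple # sd#kf #banana'

# Variant 1: every "#" that is not at a word boundary is replaced,
# including the lone one.
loose=$(printf '%s\n' "$line" | sed -E 's/\B#/fruit=/g')

# Variant 2: the "#" must also be followed by a word character,
# so the lone "#" is left alone.
strict=$(printf '%s\n' "$line" | sed -E 's/\B#([[:alnum:]_])/fruit=\1/g')

printf '%s\n' "$loose" "$strict"
```

In both variants the # inside sd#kf is untouched, because there is a word boundary between d and #.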

Upvotes: 5

markp-fuso

Reputation: 34916

One awk idea:

$ awk '{for (i=1;i<=NF;i++) if ($i ~ /^#[[:alpha:]]/) $i="fruit=" substr($i,2)}1' file.txt
fruit=apple sd#kf fruit=banana adfe
fruit=apple we#re fruit=banana cow

One sed idea:

$ sed -r 's/(^|[[:space:]])#([[:alpha:]])/\1fruit=\2/g' file.txt
fruit=apple sd#kf fruit=banana adfe
fruit=apple we#re fruit=banana cow
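The two ideas differ in how they treat whitespace, which only shows up when fields are separated by more than a single space. The sketch below wraps each command in a function (with_awk and with_sed are illustrative names) and feeds both a made-up line containing three spaces and a tab. It uses sed -E instead of -r, which GNU sed treats the same way.

```shell
# The awk idea: assigning to $i makes awk rebuild the record,
# joining the fields with OFS (a single space by default).
with_awk() {
  awk '{for (i=1;i<=NF;i++) if ($i ~ /^#[[:alpha:]]/) $i="fruit=" substr($i,2)}1'
}

# The sed idea: the text is edited in place, so the original
# separators between words are kept.
with_sed() {
  sed -E 's/(^|[[:space:]])#([[:alpha:]])/\1fruit=\2/g'
}

# Hypothetical input: three spaces, then a tab, between the words.
printf '#apple   sd#kf\t#banana\n' | with_awk
printf '#apple   sd#kf\t#banana\n' | with_sed
```

If the original spacing matters, the sed variant is the safer choice; awk only leaves spacing alone on lines where no field was changed.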

Upvotes: 3

potong

Reputation: 58473

This might work for you (GNU sed):

sed -E 's/(^|\s)#(\S)/\1fruit=\2/g' file

If a # is at the beginning of a line or after whitespace and is followed by a non-whitespace character, replace that # with fruit=.
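Because \S accepts any non-whitespace character, this also rewrites tokens such as #1, while a pattern restricted to letters would not. Here is a small sketch of that difference on a made-up line; any_nonspace and letters_only are illustrative names, and \s and \S assume GNU sed.

```shell
# "#" followed by any non-whitespace character (the command above).
any_nonspace() {
  sed -E 's/(^|\s)#(\S)/\1fruit=\2/g'
}

# Stricter comparison: "#" must be followed by a letter.
letters_only() {
  sed -E 's/(^|[[:space:]])#([[:alpha:]])/\1fruit=\2/g'
}

# Hypothetical input with a digit after "#", a lone "#" and an inner "#".
printf '#apple #1 # x#y\n' | any_nonspace
printf '#apple #1 # x#y\n' | letters_only
```

Neither variant touches the lone # or the one inside x#y; they only disagree on #1.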

Upvotes: 3
