Niwatori

Reputation: 23

How to replace a repeated string of variable length with another string in Bash?

I have files where missing data is inserted as '+'. So lines look like this:

substring1+++++substring2++++++++++++++substring3+substring4

I want to replace every run of more than 5 '+' characters with 'MISSING'. This makes the files more readable for my team and makes it easier to distinguish missing data from data entered as '+' (up to 5 are allowed). So far I have:

while read l; do
  echo "${l//['([+])\1{5}']/'MISSING'}"
done < /path/file.txt

but this replaces every single '+' with 'MISSING'. I need 'MISSING' to appear just once per run.

Thanks in advance.

Upvotes: 2

Views: 56

Answers (1)

Wiktor Stribiżew

Reputation: 627103

You can't use regex in Bash variable expansion: the pattern there is a glob pattern, so regex syntax such as \1 and {5} is not supported.
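If the loop has to stay in pure Bash, extended globs are a possible workaround. The sketch below assumes Bash with `shopt -s extglob` enabled and the question's threshold (runs longer than 5 are replaced); `replace_missing` is a hypothetical helper name, not something from the original post.

```shell
#!/usr/bin/env bash
# Pure-Bash sketch: an extglob pattern (not a regex) inside parameter expansion.
shopt -s extglob

# Hypothetical helper: prints its argument with long '+' runs replaced.
replace_missing() {
  local line=$1
  # Six literal '+' followed by zero or more further '+': a run of 6 or more.
  local run='++++++*(+)'
  # $run is left unquoted inside the braces so it is treated as a pattern.
  printf '%s\n' "${line//$run/MISSING}"
}

replace_missing 'substring1+++++substring2++++++++++++++substring3+substring4'
# => substring1+++++substring2MISSINGsubstring3+substring4
```

Extended globs have no counted quantifier, which is why the six mandatory '+' characters are spelled out.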

In your loop, you may use

sed 's/+\{6,\}/MISSING/g' <<< "$l"

Or, you may use sed directly on the file:

sed 's/+\{6,\}/MISSING/g' /path/file.txt

The +\{6,\} POSIX BRE pattern matches a literal + (an unescaped + is an ordinary character in BRE) 6 or more times (\{6,\}), i.e. any run longer than 5, so runs of up to 5 are left untouched.

See the sed demo:

sed 's/+\{6,\}/MISSING/g' <<< "substring1+++++substring2++++++++++++++substring3+substring4"
# => substring1+++++substring2MISSINGsubstring3+substring4
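To check the question's threshold (up to 5 '+' allowed, more than 5 replaced), a lower bound of \{6,\} can be tried on runs of exactly 5 and exactly 6. This is only a quick boundary check; `mark_missing` is a hypothetical wrapper name used for illustration.

```shell
# Boundary check for the "more than 5" threshold taken from the question.
# Hypothetical wrapper: filters stdin to stdout.
mark_missing() {
  # +\{6,\} in POSIX BRE: a literal '+' repeated six or more times.
  sed 's/+\{6,\}/MISSING/g'
}

printf '%s\n' 'a+b' 'a+++++b' 'a++++++b' 'a++++++++++++b++++++c' | mark_missing
# a+b
# a+++++b
# aMISSINGb
# aMISSINGbMISSINGc
```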

If you need to change the file itself, use any technique described at "sed edit file in place".
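One such technique, sketched under assumptions: write the result to a temporary file and copy it back only if sed succeeded. `replace_in_file` is a hypothetical helper name, the bound \{6,\} reflects the question's "more than 5" rule, and `mktemp` is assumed to be available (it is common but not required by POSIX).

```shell
# Hypothetical helper: edit a file "in place" through a temporary file.
replace_in_file() {
  file=$1
  tmp=$(mktemp) || return 1
  # Write the edited text to the temp file first; touch the original only on success.
  if sed 's/+\{6,\}/MISSING/g' "$file" > "$tmp"; then
    # Copying back (rather than mv) keeps the original file's permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
  else
    rm -f "$tmp"
    return 1
  fi
}

# Usage, with the path from the question:
# replace_in_file /path/file.txt
```

GNU sed also accepts `sed -i '...' file`, while BSD/macOS sed expects an explicit backup suffix argument, as in `sed -i '' '...' file`; the temp-file approach sidesteps that difference.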

Upvotes: 4
