Reputation: 23
I have files where missing data is inserted as '+'. So lines look like this:
substring1+++++substring2++++++++++++++substring3+substring4
I want to replace every run of more than 5 '+' characters with 'MISSING'. This makes the files more readable for my team and makes it easier to tell missing data apart from data entered as '+' (up to 5 is allowed). So far I have:
while read l; do
echo "${l//['([+])\1{5}']/'MISSING'}"
done < /path/file.txt
but this replaces every '+' with 'MISSING'. I need it to say 'MISSING' just once.
Thanks in advance.
Upvotes: 2
Views: 56
Reputation: 627103
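You can't use regex in Bash variable expansion. The pattern there is a glob pattern, so regex constructs such as the \1 backreference and the {5} quantifier are not interpreted the way you intend.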
In your loop, you may use
sed 's/+\{1,\}/MISSING/g' <<< "$l"
Or, you may use sed directly on the file:
sed 's/+\{1,\}/MISSING/g' /path/file.txt
The +\{1,\} POSIX BRE pattern matches a literal + (+) 1 or more times (\{1,\}).
See the sed demo online:
sed 's/+\{1,\}/MISSING/g' <<< "substring1+++++substring2++++++++++++++substring3+substring4"
# => substring1MISSINGsubstring2MISSINGsubstring3MISSINGsubstring4
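The question treats runs of up to five '+' as real data, so a variant that only touches longer runs may be closer to what was asked. A minimal sketch, assuming the cutoff is six or more consecutive '+' (the helper name mark_missing is made up for illustration):

```shell
# Hypothetical helper: replace each run of six or more '+' with MISSING.
# Runs of one to five '+' are left untouched.
# In a POSIX BRE, + is a literal character and \{6,\} means "six or more".
mark_missing() {
    sed 's/+\{6,\}/MISSING/g'
}

printf '%s\n' 'substring1+++++substring2++++++++++++++substring3+substring4' | mark_missing
# => substring1+++++substring2MISSINGsubstring3+substring4
```

Only the lower bound of the interval changes, so the same adjustment applies to either sed command shown earlier.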
If you need to make the changes in the same file, use any technique described at sed edit file in place.
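One such technique, sketched here under the assumption that mktemp is available, is to write the result to a temporary file and copy it back over the original. The function name replace_in_file is hypothetical:

```shell
# Hypothetical helper: apply the substitution to a file via a temporary copy.
# Copying back with cat (rather than mv) keeps the original file's permissions.
replace_in_file() {
    file=$1
    tmp=$(mktemp) || return 1
    # Overwrite the original only if sed succeeded.
    sed 's/+\{1,\}/MISSING/g' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

It would be called as replace_in_file /path/file.txt. GNU sed also has an -i option for in-place editing, but its syntax differs between sed implementations, so the temporary-file approach is the more portable choice.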
Upvotes: 4