Reputation: 2545
I am asking for your help with sed. I need to collapse duplicate underscores and remove underscores from the beginning and end of a string.
For example:
echo '[Lorem] ~ ipsum *dolor* sit metus !!!' | sed 's/[^ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789._()-]/_/g'
Produces:
_Lorem____ipsum__dolor__sit_metus____
But I need to further format this string to: Lorem_ipsum_dolor_sit_metus
In other words, remove any underscores from the beginning and end of the string, and reduce multiple consecutive underscores to just one, preferably using another pipe.
Do you have any idea how to do that?
Thank you.
Upvotes: 2
Views: 7398
Reputation: 360143
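All you need to do is add a "+" after your bracket expression, so that each run of disallowed characters becomes a single underscore. Then you can delete the beginning and ending ones. Also, as ladenedge suggested, you can use a character class to shorten your list. Note that "\+" is a GNU sed extension, and that the leading and trailing underscores are removed by two separate substitutions so the command still works when only one of them is present.
sed 's/[^[:alnum:].()-]\+/_/g;s/^_//;s/_$//'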
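As a minimal sketch of this single-pass approach, the command can be wrapped in a shell function. The name clean_name is hypothetical, and the sketch assumes a sed that accepts -E for extended regular expressions (both GNU and BSD sed do), so a plain "+" can be used. The leading and trailing underscores are stripped by separate substitutions, so an input with only one of them is handled too.

```shell
#!/bin/sh
# clean_name is a hypothetical helper name, not part of the original question.
clean_name() {
    # 1. Replace each run of disallowed characters with one underscore.
    # 2. Drop a leading underscore, if any.
    # 3. Drop a trailing underscore, if any.
    printf '%s\n' "$1" | sed -E 's/[^[:alnum:].()-]+/_/g; s/^_//; s/_$//'
}

clean_name '[Lorem] ~ ipsum *dolor* sit metus !!!'   # Lorem_ipsum_dolor_sit_metus
clean_name '  only leading junk'                      # only_leading_junk
```

Because the underscore itself is not in the allowed set, underscores already present in the input are folded into the same runs and collapsed as well.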
Upvotes: 1
Reputation: 67839
Just add the following right after the g in your sed command:
;s/__*/_/g;s/^_//;s/_$//
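Since the question asks for a separate pipe, the same three substitutions can also run as a second stage on the output of the original command. A minimal sketch, where squeeze_underscores and squeeze_underscores_tr are hypothetical function names and the tr -s variant is an alternative that this answer does not mention:

```shell
#!/bin/sh
# squeeze_underscores is a hypothetical name for the second pipeline stage.
squeeze_underscores() {
    # s/__*/_/g  collapses each run of underscores into one
    # s/^_//     removes a leading underscore
    # s/_$//     removes a trailing underscore
    sed 's/__*/_/g;s/^_//;s/_$//'
}

# Alternative sketch: let tr -s squeeze the runs, then trim with sed.
squeeze_underscores_tr() {
    tr -s '_' | sed 's/^_//;s/_$//'
}

# Feed both stages the intermediate string from the question.
echo '_Lorem____ipsum__dolor__sit_metus____' | squeeze_underscores      # Lorem_ipsum_dolor_sit_metus
echo '_Lorem____ipsum__dolor__sit_metus____' | squeeze_underscores_tr   # Lorem_ipsum_dolor_sit_metus
```

The pattern __* means "one underscore followed by zero or more underscores", which is the portable basic-regex spelling of "one or more" and works without any GNU extensions.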
Upvotes: 3