user4540334

Bash SED string replacement - removes characters before and after Regex

I have this simple Bash 3 script that scans all the files in a directory and replaces some old CSS classes with new ones.

export LC_ALL=C

ARRAY=(
    "a-oldclass:new-class"
    "m-oldclass:new-class"
)

for className in "${ARRAY[@]}" ; do
    REGEX=[^a-zA-Z0-9]${className%%:*}[^a-zA-Z0-9]
    CHANGE="s/${REGEX}/${className##*:}/g"

    find src -type f -exec sed -i '' "${CHANGE}" '{}' +
done

It is a combination of key:value pairs and a regular expression. The problem is that it also removes special characters before and after the matching pattern, like:

class="a-oldclass" => class=new-class (Quotes are gone)

class=" a-oldclass " => class="new-class" (spaces are gone)

I need this outcome:

class="a-oldclass m-oldclass" => class="new-class new-class".

[^a-zA-Z0-9] is necessary to avoid this scenario: I want to replace a-oldclass with new-class, but I don't want to touch the class data-oldclass. Since that string contains a-oldclass, it would be modified as well. With [^a-zA-Z0-9] on both sides I exclude this kind of scenario.

Upvotes: 0

Views: 98

Answers (1)

Quasímodo

Reputation: 4004

This should be the regular expression:

REGEX='\([^a-zA-Z0-9]\)'"${className%%:*}"'\([^a-zA-Z0-9]\)'
CHANGE="s/${REGEX}/\1${className##*:}\2/g"

This uses \( \) and \1 \2 to reproduce the matches before and after the classname.
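To see the capture groups at work without touching any files, here is a minimal sketch that pipes one sample line through sed. The sample input and the variable names old, new, regex, change and result are illustrative assumptions, not part of the original script.

```shell
# Hypothetical sample values, chosen only for this demonstration.
old='a-oldclass'
new='new-class'

# Capture the non-alphanumeric character on each side of the class name.
regex='\([^a-zA-Z0-9]\)'"${old}"'\([^a-zA-Z0-9]\)'

# \1 and \2 put the captured characters back around the new name.
change="s/${regex}/\1${new}\2/g"

# data-oldclass is left alone because the "t" before "a-oldclass" is alphanumeric.
result=$(printf '%s\n' 'class="a-oldclass" data-oldclass="x"' | sed "${change}")
printf '%s\n' "${result}"
```

The quotes around the class name survive, and data-oldclass is untouched.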

Additionally, I recommend against all-uppercase variable names, as they may conflict with the shell's built-in and environment variables.
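As a sketch of what that could look like with lowercase names, the loop below applies both mappings to a sample string instead of to files. The names text, boundary, mapping, old_name and new_name are assumptions made for this example, and the sample input is made up.

```shell
# Hypothetical sample input; the real script runs sed over files under src.
text='class="a-oldclass m-oldclass" data-oldclass="x"'
boundary='[^a-zA-Z0-9]'

for mapping in "a-oldclass:new-class" "m-oldclass:new-class"; do
    old_name=${mapping%%:*}   # part before the first colon
    new_name=${mapping##*:}   # part after the last colon

    # Re-insert the captured boundary characters around the new name.
    text=$(printf '%s\n' "${text}" |
        sed "s/\(${boundary}\)${old_name}\(${boundary}\)/\1${new_name}\2/g")
done

printf '%s\n' "${text}"
```

With this input, both old classes are replaced while the quotes, the space between them, and data-oldclass are preserved.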


In case you also need to match a class name that sits at the very end of a line (with no character after it), you can add a second pattern anchored with $:

REGEX='\([^a-zA-Z0-9]\)'"${className%%:*}"'\([^a-zA-Z0-9]\)'
CHANGE="s/${REGEX}/\1${className##*:}\2/g"
REGEXNL='\([^a-zA-Z0-9]\)'"${className%%:*}"'$'
CHANGENL="s/${REGEXNL}/\1${className##*:}/g"

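and change the sed command to the following (GNU sed syntax; with the BSD sed from the question, keep the empty suffix argument, i.e. sed -i '' -e ...):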

sed -i -e "${CHANGE}" -e "${CHANGENL}"
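The effect of the two expressions together can be checked on standard input, without -i, as in this rough sketch. The two sample lines and the lowercase variable names are assumptions for illustration only.

```shell
# Hypothetical sample values for this demonstration.
old='a-oldclass'
new='new-class'

regex='\([^a-zA-Z0-9]\)'"${old}"'\([^a-zA-Z0-9]\)'
change="s/${regex}/\1${new}\2/g"

# Same idea, but anchored at end of line, so there is no trailing character to capture.
regex_nl='\([^a-zA-Z0-9]\)'"${old}"'$'
change_nl="s/${regex_nl}/\1${new}/g"

# The first line ends with the class name; the second ends with data-oldclass.
result=$(printf '%s\n' 'class=a-oldclass' 'x data-oldclass' |
    sed -e "${change}" -e "${change_nl}")
printf '%s\n' "${result}"
```

The first line is rewritten by the end-of-line expression only, and the second line stays unchanged because the character before a-oldclass is alphanumeric.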

I bet there is a more elegant solution, but this sed command passed the --posix test.

Upvotes: 1
