Reputation: 495
I would like to surround multiple words with quotes. This is easy to do with sed and grouping, except that my words are located in an attribute of an XML tag.
<daddy>
<son name="blabla">
<belongs having="car cat doll" color="yellow" />
</son>
</daddy>
I want the having attribute to be postprocessed to "'car' 'cat' 'doll'". having is an attribute name that is used nowhere else, so there is no danger in matching on this word alone: it will always be part of a belongs tag.
I think this makes sed a reasonable starting point, without resorting to heavier tools such as XML readers.
My first attempt used the pattern as an address to filter the lines, and then tried to surround the words. But it surrounds every word on the matching line, not only the words inside the matched pattern, which is what I wanted:
sed "/having=\"[a-z ]\+\"/ s/\([a-z]\+\)/'\1'/g"
<daddy>
<son name="blabla">
<'belongs' 'having'="'car' 'cat' 'doll'" 'color'="'yellow'" />
</son>
</daddy>
My second attempt, with group matching, got me no further:
sed "s/having=\"\(\([a-z]\+\) \?\)*\"/having=\"'\2'\"/g"
<daddy>
<son name="blabla">
<belongs having="'doll'" color="yellow"/>
</son>
</daddy>
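The result of the second attempt can be reproduced in isolation. The sketch below assumes GNU sed (for `\+` and `\?`), and `last_capture` is a hypothetical helper name used only for illustration. It shows that a back-reference to a group inside a repeated group holds only the text of the last repetition, which is why only the last word survives.

```shell
# last_capture is a hypothetical helper; it assumes GNU sed (\+ and \? are GNU extensions).
# The inner group \2 is matched three times (car, cat, doll),
# but the back-reference only remembers the last repetition.
last_capture() {
    printf '%s\n' "$1" | sed 's/having="\(\([a-z]\+\) \?\)*"/having="[\2]"/'
}

last_capture 'having="car cat doll"'
# prints: having="[doll]"
```

A single substitution therefore cannot quote each word of a variable-length list this way; some form of repetition over the line is needed.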
Upvotes: 1
Views: 102
Reputation: 495
I decided to give up on using only sed. What I did is awful and prone to substitution errors, but I will diff my outputs afterwards.
#!/bin/bash
O=$IFS
# For every file passed as an argument
for f in "$@"
do
    # Split the command substitution below on newlines only
    IFS=$(echo -en "\n\b")
    # For every distinct value of a "having" attribute that is not processed yet
    for p in $(grep -Eo 'having="[^"]*"' "$f" | grep -Eo '".*"' | grep -v '&quot;' | sort -u)
    do
        # Surround each word with &quot;, escaping "&" so the outer sed inserts it literally
        r=$(echo "$p" | sed 's/\([a-z]\+\)/\\\&quot;\1\\\&quot;/g')
        # Replace this value on the lines containing "having"
        sed -i "/having/ s/$p/$r/" "$f"
    done
    IFS=$O
done
Upvotes: 0
Reputation: 10039
sed ":a
/having/ {
s/\(having=\"\( *'[^ '\"]\{1,\}'\)* *\)\([^ '\"]\{1,\}\)\([^\"]*\"\)/\1'\3'\4/
t a
}" YourFile
Each pass replaces one word (a run of characters that are not a space, a single quote, or a double quote) with itself surrounded by single quotes. The loop (t a) repeats the substitution: each time it skips the words that are already single-quoted after the opening double quote and quotes the next unquoted word. This is needed because the g flag does not help here: every match has to start at the opening double quote, so a single pass can match only once. The workaround is to capture all previously quoted words in one big group and to cycle until there is no unquoted word left.
I assume that the content is on one line (because sed works line by line by default) and on the same line as having.
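To see the loop at work on the sample line from the question, here is a minimal sketch. `quote_having` is a hypothetical wrapper name. The sketch anchors the pattern on `having="` so that other attributes on the same line stay untouched, and it assumes the attribute value sits on a single line.

```shell
# quote_having is a hypothetical wrapper; it reads XML lines on stdin.
# Each pass quotes the first still-unquoted word of the having attribute;
# "t a" jumps back to :a as long as the substitution changed something.
quote_having() {
    sed ":a
/having=\"/ {
s/\(having=\"\( *'[^ '\"]\{1,\}'\)* *\)\([^ '\"]\{1,\}\)\([^\"]*\"\)/\1'\3'\4/
t a
}"
}

printf '%s\n' '<belongs having="car cat doll" color="yellow" />' | quote_having
# prints: <belongs having="'car' 'cat' 'doll'" color="yellow" />
```

The first group grows by one quoted word per pass, so the loop ends once nothing but quoted words remains between `having="` and the closing double quote.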
Upvotes: 1