Reputation: 103
I'm dealing with a body of XML files containing unstructured texts with semantic markup for personal names.
For reasons to do with the stylesheet that will eventually show them via a web application, I need to replace:
<persName>Fred</persName>'s
<persName>Wilma</persName>'s
with
<persName>Fred's</persName>
<persName>Wilma's</persName>
I have a single line in a shell script, run in Git Bash for Windows, shown below. It runs without errors but has no effect. I suppose I'm missing something obvious, perhaps to do with escaping characters. Any help is appreciated.
sed -i "s/<\/persName>\'s/\'s<\/persName>/g" test.xml
Upvotes: 1
Views: 128
Reputation: 627083
You may use
sed -i "s,</persName>'s,'s</persName>,g" test.xml
Details
- s - the substitute command (we want to replace)
- , - a delimiter
- </persName>'s - the string to find
- , - a delimiter
- 's</persName> - the string to replace it with
- , - a delimiter
- g - replace multiple times if more than one match is found
The -i option makes the replacements directly in the file.
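Note that you do not have to escape ' when defining the sed command inside a double-quoted string. The backslash is most likely what breaks the original command: the shell passes \' through unchanged, and GNU sed treats \' as an anchor for the end of the pattern space rather than as a literal apostrophe, so the pattern never matches.
It is a good idea to use a delimiter character other than the common / if there are / characters inside the regex and/or the replacement pattern.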
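As a quick check of the command above, here is a minimal sketch that builds a throwaway sample file and runs the substitution on it. The file name sample.xml and its contents are invented for illustration. The in-place flag is written as -i.bak, which assumes a sed that accepts a backup suffix (GNU sed does) and keeps a copy of the original next to the edited file.

```shell
# Work in a scratch directory so no real files are touched.
tmpdir=$(mktemp -d)

# Hypothetical sample: two names with a possessive on one line.
printf '%s\n' \
  "<p><persName>Fred</persName>'s dog met <persName>Wilma</persName>'s cat.</p>" \
  > "$tmpdir/sample.xml"

# Comma as the delimiter, so the / in </persName> needs no escaping.
# The apostrophe is written as-is inside the double-quoted string.
# -i.bak edits in place and leaves the original in sample.xml.bak.
sed -i.bak "s,</persName>'s,'s</persName>,g" "$tmpdir/sample.xml"

cat "$tmpdir/sample.xml"
# prints: <p><persName>Fred's</persName> dog met <persName>Wilma's</persName> cat.</p>
```

Because the pattern matches only the closing tag plus 's, it handles several names on one line without touching the text between them.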
Upvotes: 4
Reputation: 51908
The comment on your question suggests an easier solution, but I guess there might be names where the suffix differs from 's, such as names ending in s. So I chose a solution that captures whatever follows the closing tag and moves it inside the element.
You can choose any separator you want for the search-and-replace command in sed. I've chosen #, so you don't have to escape the slashes in the text. The escaped parentheses capture what they enclose as groups, which the replacement refers to as \1 and \2.
sed 's#<persName>\(.*\)</persName>\(.*\)#<persName>\1\2</persName>#g' testfile
Result:
<persName>Fred's</persName>
<persName>Wilma's</persName>
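The .* groups above are greedy, so when several names share a line, or text follows the possessive, they capture more than the suffix. A narrower sketch, assuming the only suffixes that occur are 's and a bare ' (as in James'), captures just that suffix and moves it inside the closing tag. The sample sentence is invented for illustration.

```shell
# Hypothetical input: one name takes 's, the other a bare apostrophe.
line="<persName>Fred</persName>'s dog chased <persName>James</persName>' cat."

# \('s\{0,1\}\) captures an apostrophe plus an optional s directly
# after the closing tag; \1 puts that suffix back in front of the tag.
result=$(printf '%s\n' "$line" | sed "s#</persName>\('s\{0,1\}\)#\1</persName>#g")

printf '%s\n' "$result"
# prints: <persName>Fred's</persName> dog chased <persName>James'</persName> cat.
```

One caveat with this pattern: an apostrophe that opens a quotation immediately after a name would be moved inside the tag as well, so it is worth checking the output on real data.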
If you want to replace it in the file, you can use the -i parameter. But be sure to check the result first.
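One way to check the result first is to run the command without -i and compare its output with the original file. The following is a rough sketch of that workflow, assuming a sed that accepts a backup suffix after -i (GNU sed does); the scratch directory and the variable name script are made up for the example.

```shell
# Scratch copy of the two sample lines from the question.
tmpdir=$(mktemp -d)
printf '%s\n' "<persName>Fred</persName>'s" "<persName>Wilma</persName>'s" \
  > "$tmpdir/testfile"

# The sed expression from this answer, kept in one place.
script='s#<persName>\(.*\)</persName>\(.*\)#<persName>\1\2</persName>#g'

# Dry run: without -i the file is left alone and the result goes to stdout.
preview=$(sed "$script" "$tmpdir/testfile")
printf '%s\n' "$preview"

# Show which lines would change. diff exits non-zero when the
# inputs differ, so "|| true" keeps the script going.
sed "$script" "$tmpdir/testfile" | diff "$tmpdir/testfile" - || true

# Once the preview looks right, edit in place and keep a .bak backup.
sed -i.bak "$script" "$tmpdir/testfile"
```

Keeping the backup means a bad substitution can be undone by copying testfile.bak back over testfile.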
Upvotes: 1