Reputation: 20223
I have an issue I am trying to solve with sed. My goal is to quote the content after content= if it is not already quoted.
Here is the concrete example:
<meta name="ProgId" content=Word.Document>
<meta name="Generator" content="Microsoft Word 15">
I would like to add quotes around Word.Document so that the result is:
<meta name="ProgId" content="Word.Document">
<meta name="Generator" content="Microsoft Word 15">
I was trying with
sed -i 's@content="\(.*\)"@content="\1"/@g' "$1"
However this is not working.
Thank you.
Upvotes: 0
Views: 39
Reputation: 977
This should work:
sed -E 's/content=([^">]+)/content="\1"/'
Explanation:
This tells sed to replace whatever comes after content= and before >, but only if it doesn't start with ". The capture group lets the replacement reuse the matched content, surrounded by ".
Input:
<meta name="ProgId" content=Word.Document>
<meta name="Generator" content="Microsoft Word 15">
Output:
<meta name="ProgId" content="Word.Document">
<meta name="Generator" content="Microsoft Word 15">
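The question edits a file in place, so here is a minimal sketch of the same substitution applied with -i. It assumes a sed that accepts -E and an attached backup suffix (-i.bak), and it uses a throwaway temp file as a stand-in for the real document:

```shell
# Create a throwaway copy of the sample input (the temp file is only for illustration).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<meta name="ProgId" content=Word.Document>
<meta name="Generator" content="Microsoft Word 15">
EOF

# Edit in place, keeping the original as "$tmp.bak" in case the result is wrong.
sed -E -i.bak 's/content=([^">]+)/content="\1"/' "$tmp"

# Show the edited file, then clean up.
result=$(cat "$tmp")
printf '%s\n' "$result"
rm -f "$tmp" "$tmp.bak"
```

Keeping the backup suffix is a cautious choice: if the pattern touches more than intended, the original is still next to the edited file.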
Upvotes: 1
Reputation: 140890
There is no " after content= in the input, so you shouldn't match it. Instead, you could match up to a space or >.
sed 's@content=\([^"][^ >]*\)@content="\1"@'
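Stopping at a space matters when another attribute follows the unquoted value on the same line. A small sketch, using a made-up input line that is not from the question:

```shell
# Hypothetical sample: an unquoted content value followed by another attribute.
line='<meta content=Word.Document name=ProgId>'

# The value must not start with '"' and ends at the first space or '>'.
quoted=$(printf '%s\n' "$line" | sed 's@content=\([^"][^ >]*\)@content="\1"@')
printf '%s\n' "$quoted"

# An already-quoted value is left untouched, because its first character is '"'.
kept=$(printf '%s\n' '<meta name="Generator" content="Microsoft Word 15">' \
  | sed 's@content=\([^"][^ >]*\)@content="\1"@')
printf '%s\n' "$kept"
```

Here only Word.Document is wrapped in quotes, and name=ProgId stays outside the quoted value.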
Note that you should use XML-aware tools to parse XML documents.
Upvotes: 1