Reputation: 1365
Trying to reformat tags in an XML file with GNU sed v4.7 on Win10 (shoot me). sed is in the path and run from the Command Prompt, so some Windows command-line characters need to be escaped with ^.
sourcefile
BEGIN
...
<trn:description>V7906 03/11 ALFREDOCAMEL HATSWOOD 74564500125</trn:description>
...
END
(There are three spaces at the start of the line.)
Expected output:
BEGIN
...
<trn:description>V7906 03/11 Alfredocamel Hatswood 74564500125</trn:description>
...
END
I want Title Case, but this command converts the text to lower case in place:
sed -i 's/^<trn:description^>\(.*\)^<\/trn:description^>$/^<trn:description^>\L\1^<\/trn:description^>/g' sourcefile
This command changes to Title Case:
sed 's/.*/\L^&/; s/\w*/\u^&/g' sourcefile
Can this be brought together as a one-liner to edit the original sourcefile in-place?
I want to use sed because it is available on the system and the code is consistently structured. I'm aware I should use a tool like xmlstarlet as explained:
sed ... code can't distinguish a comment that talks about sessionId tags from a real sessionId tag; can't recognize element encodings; can't deal with unexpected attributes being present on your tag; etc.
Upvotes: 0
Views: 161
Reputation: 1365
Thanks to Whirlpool Forum members for the answer and discussion.
Matching only the text "within the tags" proved too hard in sed. Because the file is well formed, every word on the required lines is converted and the tag names are then put back in lower case:
sed -i.bak '/^<trn:description^>/s/\w\+/\L\u^&/g; s/^&.*;\^|Trn:Description/\L^&/g' filename
Explanation
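(The pieces below are shown without the ^ escapes.)
-i.bak edits the file in place and keeps a copy of the original with a .bak extension
/<trn:description>/ limits the first substitution to lines containing <trn:description>
s/\w\+/\L\u&/g converts every word on those lines to Title Case, tag names included
s/&.*;\|Trn:Description/\L&/g then lowercases any text starting with & and ending with ; or matching Trn:Description, which restores entities and the tag names
filename is the file to edit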
Note: ^ is the Windows command-line escape character and is not required in other implementations.
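For comparison, here is a minimal sketch of the same command in a POSIX shell, where single quotes protect <, >, & and | so no ^ escapes are needed. It assumes GNU sed (\w, \L and \u are GNU extensions). The file name sample.xml and the second description line with an &amp; entity are hypothetical, added only to show why the second substitution is there.

```shell
# Build a small test file (sample.xml is a hypothetical name).
# The first description line is from the question; the second is an
# invented example containing an entity.
printf '%s\n' \
  'BEGIN' \
  '   <trn:description>V7906 03/11 ALFREDOCAMEL HATSWOOD 74564500125</trn:description>' \
  '   <trn:description>SMITH &amp; SONS</trn:description>' \
  'END' > sample.xml

# 1st command: Title Case every word on description lines (tags included).
# 2nd command: lowercase entities (&...;) and the tag name again.
sed -i.bak '/<trn:description>/s/\w\+/\L\u&/g; s/&.*;\|Trn:Description/\L&/g' sample.xml

cat sample.xml
```

The original file is kept as sample.xml.bak. Note that &.*; is greedy, so on a line with more than one entity everything between the first & and the last ; would be lowercased.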
Upvotes: 0