Reputation: 11
I have a huge XML file with very long lines (5,000-10,000 characters per line) that contain the following text:
Pattern="abc"
and I want to replace it with
Pattern="def"
Because the lines are so long, I have no choice but to use awk. How can this be achieved? I tried the following, but it does not work:
CMD="{sub(\"Pattern=\"abc\"\",\"Pattern=\"def\"\"); print}"
echo "$CMD"
awk "$CMD" "Some File Name.xml"
Any help is highly appreciated.
Upvotes: 1
Views: 435
Reputation: 203491
I don't understand why you said "As the line sizes are huge, I have no choice but to use awk". AFAIK sed is no more limited on line length than awk is and since this is a simple substitution on a single line, sed is the better choice of tool:
$ cat file
Pattern="abc"
$ sed -r 's/(Pattern=")[^"]+/\1def/' file
Pattern="def"
If the pattern occurs multiple times on the line, add a "g" flag to the end of the substitution command (s/.../.../g).
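As a small illustration of the g flag, here is a sketch on a made-up line that contains two occurrences. The sample line and the variable names are assumptions for the demo, and it uses -E, which both GNU and BSD sed accept for extended regular expressions (-r is the GNU spelling):

```shell
# Made-up sample line with two Pattern attributes (not from the real file).
line='<a Pattern="abc" x="1"/><b Pattern="abc"/>'

# Without g only the first match on the line changes; with g every match does.
first_only=$(printf '%s\n' "$line" | sed -E 's/(Pattern=")[^"]+/\1def/')
all=$(printf '%s\n' "$line" | sed -E 's/(Pattern=")[^"]+/\1def/g')

printf '%s\n' "$first_only"
printf '%s\n' "$all"
```

The awk counterpart of the g flag is to call gsub() where the one-shot version calls sub().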
Since you mention in your comment being stuck with a sed that can't handle long lines, let's assume you can't install GNU tools so you'll need a non-GNU awk solution like this:
$ awk '{sub(/Pattern="[^"]+/,"Pattern=\"def")}1' file
Pattern="def"
If you LITERALLY mean you only want to replace Pattern="abc" then just do:
$ awk '{sub(/Pattern="abc"/,"Pattern=\"def\"")}1' file
Pattern="def"
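As for why the command in the question appears to do nothing: once the shell has removed the backslashes, awk receives sub("Pattern="abc"","Pattern="def"") and reads "Pattern="abc"" as the string "Pattern=" concatenated with an unset variable abc and an empty string, so it replaces Pattern= with Pattern=. A less fragile approach, sketched below, keeps the awk program in single quotes and passes the two strings with -v. The variable names old and new and the temporary test file are assumptions for the demo. Note that old is used as a regular expression, which is harmless here because it contains no regex metacharacters.

```shell
# Temporary one-line test file (made-up content, stands in for the real XML).
tmp=$(mktemp)
printf '%s\n' '<n Pattern="abc" Other="abc"/>' > "$tmp"

# The program is single-quoted, so the shell never touches the awk quotes;
# the search and replacement strings arrive as awk variables via -v.
out=$(awk -v old='Pattern="abc"' -v new='Pattern="def"' '{ gsub(old, new) } 1' "$tmp")

printf '%s\n' "$out"
rm -f "$tmp"
```

To change the real file, redirect the output to a new file and move it over the original afterwards, quoting the name because it contains spaces ("Some File Name.xml").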
Upvotes: 2
Reputation: 3756
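One suggestion with awk, which assumes Pattern="abc" is the first quoted value on the line (the quote is kept as the output separator so the other quotes on the line survive):
BEGIN {FS=OFS="\""}
/Pattern="abc"/{$2="def"}1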
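The field-splitting idea above can be extended to lines where Pattern="abc" is not the first quoted value. The following is only a sketch under some assumptions: the sample line is made up, attribute values contain no embedded double quotes, and the awk in use can cope with however many quote-separated fields a long line produces. With the quote as both input and output separator, every even-numbered field is an attribute value and the field before it ends with the attribute name.

```shell
# Made-up sample line: Pattern is the second attribute, and another
# attribute happens to share the value "abc".
line='<n Id="7" Pattern="abc" Other="abc"/>'

out=$(printf '%s\n' "$line" | awk '
BEGIN { FS = OFS = "\"" }          # split on quotes and put them back on output
{
    # Even fields are quoted values; the preceding field ends in the name.
    for (i = 2; i <= NF; i += 2)
        if ($i == "abc" && $(i - 1) ~ /Pattern=$/)
            $i = "def"
}
1')

printf '%s\n' "$out"
```

Only the value that follows Pattern= is changed, so Other="abc" is left alone. The check /Pattern=$/ would also match a longer attribute name that ends in Pattern, so tighten it if the file has such names.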
Upvotes: 2
Reputation: 7610
If you have bash, you can try this.
Create a test file with long lines (more than 10,000 characters):
for ((i=0; i<2500; ++i)); do s="x$s"; done
l="${s}Pattern=\"abc\"$s"
for i in {1..5}; do echo "$l$l"; done >infile
The script:
while IFS= read -r x; do printf '%s\n' "${x//Pattern=\"abc\"/Pattern=\"def\"}"; done <infile
This replaces all occurrences of Pattern="abc" with Pattern="def" in each line.
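Whichever replacement command is used, it is worth checking that the long lines came through untruncated. The sketch below is one way to do that. It rebuilds a similar test file with plain sh and awk so that it does not depend on bash, and it uses an awk gsub() as a stand-in for the replacement step; the directory and variable names are assumptions for the demo.

```shell
# Rebuild a test file with five lines of 10,026 characters each.
s=$(awk 'BEGIN { for (i = 0; i < 2500; i++) printf "x" }')
l="${s}Pattern=\"abc\"${s}"
dir=$(mktemp -d)
for i in 1 2 3 4 5; do printf '%s\n' "$l$l"; done > "$dir/infile"

# Replacement step; swap in whichever command you want to test.
awk '{ gsub(/Pattern="abc"/, "Pattern=\"def\"") } 1' "$dir/infile" > "$dir/outfile"

# Line lengths must be unchanged, since both strings are 13 characters long.
lengths=$(awk '{ print length($0) }' "$dir/outfile" | sort -u)
# gsub() with "&" rewrites nothing but returns the number of matches.
old_left=$(awk '{ n += gsub(/Pattern="abc"/, "&") } END { print n + 0 }' "$dir/outfile")
new_found=$(awk '{ n += gsub(/Pattern="def"/, "&") } END { print n + 0 }' "$dir/outfile")

printf 'lengths=%s old_left=%s new_found=%s\n' "$lengths" "$old_left" "$new_found"
rm -rf "$dir"
```

A single distinct line length, no remaining old strings, and two new strings per line (ten in total) indicate that nothing was cut off.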
Upvotes: 0