Reputation: 6729
I want to convert this piece of XML:
<v1:table>
<v1:tr>
<v1:td>Apples</v1:td>
<v1:td>Bananas</v1:td>
</v1:tr>
</v1:table>
into the following, by removing the namespace prefix (i.e. v1) with sed:
<table>
<tr>
<td>Apples</td>
<td>Bananas</td>
</tr>
</table>
Is it possible?
EDIT: I should add that the XML is stored in a file.
Upvotes: 3
Views: 1570
Reputation: 52181
Here's how you could do it with hxpipe and hxunpipe from the W3C HTML-XML-utils (packaged for many distributions):
$ hxpipe infile | sed 's/^\([()]\)v1:/\1/g' | hxunpipe
<table>
<tr>
<td>Apples</td>
<td>Bananas</td>
</tr>
</table>
hxpipe parses XML/HTML and turns it into an awk/sed-friendly line-based format:
$ hxpipe infile
(v1:table
-\n
(v1:tr
-\n
(v1:td
-Apples
)v1:td
-\n
(v1:td
-Bananas
)v1:td
-\n
)v1:tr
-\n
)v1:table
-\n
Here, lines starting with ( and ) are opening and closing tags, so removing the leading v1: from those lines (which is what the sed command above does) achieves the desired effect. Notice that text lines start with a -, so there can't be any false positives.
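To illustrate the point about false positives, here is a minimal sketch that runs only the sed step on a hand-written sample in the line-based format shown above, so it does not need hxpipe installed. The sample text line containing "v1:" and the variable names piped and stripped are made up for the illustration.

```shell
# Hand-written sample in hxpipe's line-based format (an assumption for
# illustration, not real hxpipe output). The text line deliberately
# contains "v1:" to show that it is left alone.
piped='(v1:td
-v1: Apples
)v1:td'

# Remove "v1:" only when it directly follows a leading "(" or ")".
stripped=$(printf '%s\n' "$piped" | sed 's/^\([()]\)v1:/\1/')

printf '%s\n' "$stripped"
```

The tag lines come out as (td and )td, while the text line -v1: Apples is unchanged, because it starts with - rather than ( or ).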
Upvotes: 4
Reputation: 785316
This sed command works for your example:
sed -E 's~(</?)v1:~\1~g' file
<table>
<tr>
<td>Apples</td>
<td>Bananas</td>
</tr>
</table>
Note, however, that sed is not the best tool for parsing HTML/XML. Consider using a proper HTML/XML parser instead.
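Since the question mentions that the XML is kept in a file, here is a rough sketch of writing the result back to that file with the same sed expression. It goes through a temporary file rather than an in-place flag, on the assumption that this is the more portable route; the names infile and outfile are placeholders, and the sample file is created only to keep the sketch self-contained.

```shell
# Placeholder files created with mktemp so the sketch runs on its own.
infile=$(mktemp)
outfile=$(mktemp)

cat > "$infile" <<'EOF'
<v1:table>
<v1:tr>
<v1:td>Apples</v1:td>
<v1:td>Bananas</v1:td>
</v1:tr>
</v1:table>
EOF

# Strip "v1:" only where it directly follows "<" or "</",
# write to a temporary file, then replace the original.
sed -E 's~(</?)v1:~\1~g' "$infile" > "$outfile"
mv "$outfile" "$infile"

result=$(cat "$infile")
printf '%s\n' "$result"

rm -f "$infile"
```

As with the one-liner above, this only handles the simple case in the example; text that happens to contain <v1: (for instance inside a comment or CDATA section) would be rewritten too.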
Upvotes: 1