Reputation: 573
I have a file full of lines extracted from an XML file using "gsed regexp -i FILENAME". Every line in the file has one of these two formats:
<field number='1' name='Account' type='STRING'W/>
<field number='2' name='AdvId' type='STRING'W>
I've inserted a 'W' at the end to represent optional whitespace. The order and number of properties are not necessarily the same in all lines throughout the file, although "number" always comes before "type".
What I'm searching for is a regular expression "regexp" that I can give to gnu sed so that this command:
gsed regexp -i FILENAME
gives me a file with lines looking like this:
1 STRING
2 STRING
I don't care about the amount of whitespace in the result as long as there is some after the number and a newline at the end of each line.
I'm sure it is possible, but I just can't figure out how in a reasonable amount of time. Can anyone help?
Thanks a lot, jules
Upvotes: 0
Views: 275
Reputation: 12333
I'm sure this can be optimized, but it works for me and answers your question:
sed "s/^.*number='\([0-9]*\)'.*type='\([^']*\)'.*$/\1 \2/" FILENAME
That said, I think the others are right: if you have an XML file, you should use an XML parser.
Upvotes: 1
Reputation: 75458
sed -ni "/<field .*>/s@^.*[[:space:]]number='\\([^']\\+\\).*[[:space:]]type='\\([^']\\+\\).*@\1 \2@p" FILENAME
Or if you don't mind contents of number and type to be optional:
sed -ni "/<field .*>/s@^.*[[:space:]]number='\\([^']*\\).*[[:space:]]type='\\([^']*\\).*@\1 \2@p" FILENAME
Just change [^']\\+ to [^']* according to your preference.
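As a quick check of the first command above, here is a minimal sketch that runs it (without -i, so the result goes to stdout) against a few sample lines. The extract_fields helper name and the third and fourth sample lines are made up for illustration, and GNU sed is assumed because of the \+ operator.

```shell
# Hypothetical helper: print "number type" for every <field ...> line in a file.
# Same expression as above, but without -i so the input file is left untouched.
extract_fields() {
    sed -n "/<field .*>/s@^.*[[:space:]]number='\\([^']\\+\\).*[[:space:]]type='\\([^']\\+\\).*@\\1 \\2@p" "$1"
}

# Sample input: the two lines from the question, one line with the
# attributes in a different order, and one line that is not a <field>.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<field number='1' name='Account' type='STRING' />
<field number='2' name='AdvId' type='STRING'>
<field required='N' number='3' type='INT' name='Qty'/>
<message name='Heartbeat'>
EOF

extract_fields "$sample"
rm -f "$sample"
```

The non-field line is dropped because of -n combined with the p flag, which prints only lines where the substitution succeeded.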
Upvotes: 0
Reputation: 666
You would be better off using an XML parser, but if you had to use sed:
sed -E "s/.*<field [^>]*number='([^']*)'[^>]*type='([^']*)'.*/\1 \2/" FILENAME
Upvotes: 0
Reputation: 89547
You can use this:
sed -r "s/<field [^>]*number='([0-9]+)'[^>]*type='([^']+)'[^>]*>/\1 \2/"
Upvotes: 0
Reputation: 2116
Simple cut should work for you:
cut -f2,6 -d"'" --output-delimiter=" "
If you really want sed:
sed -r "s/[^']*'([^']*)'.*type='([^']*)'.*/\1 \2/"
Upvotes: 0
Reputation: 241768
Using xsh, a Perl wrapper around XML::LibXML:
open file.xml ;
for //field echo @number @type ;
Upvotes: 2
Reputation: 272237
I think you're much better off using a command-line XML tool such as XMLStarlet. It integrates well with the shell and lets you perform XPath searches. Because it's XML-aware, it will handle character encodings, whitespace, etc. correctly.
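For illustration, a minimal sketch of what that could look like with XMLStarlet's sel subcommand. The list_fields helper name is made up, the input is assumed to be a well-formed XML file (not the already-extracted lines), and on some systems the binary is installed as xml rather than xmlstarlet.

```shell
# Hypothetical helper: print "number type" for every <field> element,
# wherever it appears in the document.
list_fields() {
    # -m iterates over the XPath matches, -v prints a value,
    # -o prints a literal string, -n prints a newline.
    xmlstarlet sel -t -m '//field' -v '@number' -o ' ' -v '@type' -n "$1"
}
```

It would be called as list_fields FILENAME. Since the attributes are selected by name, their order within each element does not matter.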
Upvotes: 1