Reputation: 51
I have an XML file like this:
<fruit><apple>100</apple><banana>200</banana></fruit>
<fruit><apple>150</apple><banana>250</banana></fruit>
Now I want to delete all the text in the file except the text inside the apple tags. That is, the file should contain:
100
150
How can I achieve this?
Upvotes: 2
Views: 219
Reputation: 28934
In this case, one can use the general technique for collecting pattern matches explained in my answer to the question "How to extract regex matches using Vim".
In order to collect and store all of the matches in a list, run the Ex command
:let t=[] | %s/<apple>\(.\{-}\)<\/apple>\zs/\=add(t,submatch(1))[1:0]/g
The command purposely does not change the buffer's contents, only collects the matched text. To set the contents of the current buffer to the newline-separated list of matches, use the command
:0pu=t | +,$d_
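For readers who want to see what each part does, here is the same pair of commands written out as a commented sketch. It assumes the lines are sourced from a script file (or typed one by one with a leading colon) while the buffer holds the two sample lines from the question; the `echo` step is only an optional check and is not part of the original commands.

```vim
" Step 1: collect the text of every <apple> element into the list t.
" \zs moves the start of the match to just after </apple>, so the text
" being replaced is empty, and add(...)[1:0] evaluates to an empty list,
" so nothing is inserted. The buffer text therefore stays the same.
let t = []
%s/<apple>\(.\{-}\)<\/apple>\zs/\=add(t, submatch(1))[1:0]/g

" Optional check: for the two sample lines this shows ['100', '150'].
echo t

" Step 2: put the list items above the first line, one item per line,
" then delete everything below them into the black hole register.
0put =t
+,$delete _
```

Because the matches are collected per occurrence rather than per line, this approach also picks up several apple elements that sit on the same line.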
Upvotes: -2
Reputation: 5277
I personally use this:
%s;.*<apple>\(\d*\)</apple>.*;\1;
Since the text contains '/', which is the default separator, using ';' as the separator makes the command clearer. I also found that the non-greedy match @Conspicuous Compiler mentioned should be
\{-}
instead of "{-}" in Vim. However, after I changed Conspicuous' solution to
%s/.*apple>(.\{-\})<\/apple.*/\1^M/g
my Vim said it can't find the pattern.
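A likely cause, offered as a guess: in Vim's default "magic" mode, an unescaped `(` or `)` matches a literal parenthesis, so the pattern above looks for parentheses that the file does not contain. Grouping needs `\(` and `\)`. A minimal corrected sketch follows; the trailing ^M is dropped on the assumption that one value per line is wanted, since it would add an extra line break after each match.

```vim
" Escaped \( \) form a capture group; \{-} makes the match non-greedy.
%s/.*apple>\(.\{-}\)<\/apple.*/\1/
```

With the two sample lines from the question, this leaves 100 and 150 on their own lines.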
Upvotes: 0
Reputation: 8125
:%s/.*apple>\(.*\)<\/apple.*/\1/
That should do what you need. Worked for me.
The pattern matches everything up to and including the opening apple tag, captures everything between the opening and closing apple tags, and then matches the rest of the line. The whole line is replaced with the first backreference, which is the text between the apple tags.
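One caveat: `:s` leaves lines that do not match the pattern untouched. The sample in the question has an apple element on every line, but if a real file also had lines without one, those lines would survive. A hedged sketch for that case, assuming such lines exist and should be removed:

```vim
" Assumption: some lines of the file have no <apple> element at all.
" Delete those lines first so no unrelated text is left behind.
v/<apple>/d _
" Then reduce each remaining line to the text inside its apple element.
%s/.*<apple>\(.\{-}\)<\/apple>.*/\1/
```

The `:v` command runs `d _` on every line that does not contain `<apple>`, and the substitution then works as described above.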
Upvotes: 5