Olivier Pons

Reputation: 15778

vim search replace including newline

I've been googling for 3 hours now without success. I have a huge file which is a concatenation of many XML files.

Thus I want to search and replace all occurrences of <?xml [whatever is between]<body> (including those words).

And then same for </body>[whatever is between]</html> (including those words).

The closest I have come is

:%s/<?xml \(.*\n\)\{0,180\}\/head>//g

FYI If I try this:

:%s/<?xml \(\(.*\)\+\n\)\+\/head>\n//g

I get E363: pattern uses more memory than 'maxmempattern'. I've tried to follow this without success.

Upvotes: 2

Views: 864

Answers (1)

Wiktor Stribiżew

Reputation: 626804

To match any number of characters, including newlines, between <?xml and </head> (the end marker your attempts use), you can use

:%s/<?xml \_.*<\/head>//g

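The \_.* matches any characters, including newlines. It is greedy, so in a file that contains several documents it runs from the first <?xml to the last </head>. To match as few characters as possible, use \_.\{-} instead: :%s/<?xml \_.\{-}<\/head>//g.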

See Vim wiki, Patterns including end-of-line section:

\_.
Any character including a newline

And from the Vim regex help, 4.3 Quantifiers, Greedy and Non-Greedy section:

\{-}
matches 0 or more of the preceding atom, as few as possible
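
If it helps to see the greedy versus non-greedy difference outside Vim, here is a rough sketch in Python. It is only an analogue, not Vim itself: it assumes that re.DOTALL with .* behaves like \_.* and that .*? behaves like \_.\{-}, and the SAMPLE text is a made-up stand-in for your concatenated file.

```python
import re

# Hypothetical sample: two concatenated documents, as described in the question.
SAMPLE = (
    '<?xml version="1.0"?>\n'
    "<html>\n<head><title>one</title>\n</head>\n"
    "<body>first</body>\n</html>\n"
    '<?xml version="1.0"?>\n'
    "<html>\n<head><title>two</title>\n</head>\n"
    "<body>second</body>\n</html>\n"
)

# re.DOTALL lets "." match newlines, playing the role of Vim's \_.
GREEDY = re.compile(r"<\?xml .*</head>", re.DOTALL)  # analogue of \_.*
LAZY = re.compile(r"<\?xml .*?</head>", re.DOTALL)   # analogue of \_.\{-}

# Greedy: one match spanning from the first "<?xml " to the last "</head>".
greedy_result = GREEDY.sub("", SAMPLE)

# Lazy: one match per document, each stopping at the nearest "</head>".
lazy_result = LAZY.sub("", SAMPLE)

print(repr(greedy_result))
print(repr(lazy_result))
```

With the greedy pattern the first document's body is swallowed along with both headers, while the lazy pattern removes each header separately and leaves both bodies in place. That is why the \{-} form is the one to use on a concatenated file.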

UPDATE

As far as escaping regex metacharacters is concerned, you can refer to the Vim Regular Expression Special Characters: To Escape or Not To Escape help page. You can see that } is missing from the list. Why? Because a regex engine can usually tell from context what kind of } it is: it knows whether the brace is preceded by \{ and parses the expression accordingly. Thus there is no need to escape the closing brace, which keeps the pattern "clean".

Upvotes: 4
