Reyan Tropia

Reputation: 140

Regex including the next occurrence of a word

The regex works, but the problem is that it also includes the next occurrence, instead of ending at the first occurrence and then starting again from the next one.

Regex : (?=<appView)\s{0,1}(.*)(?<=<\/appView>)

String: <appView></appView> <appView></appView>

But my problem is that it matches the whole string, like:

(Match 1)<appView></appView> <appView></appView>

I want it to match each group separately, but I can't make it work.

Desired output : (Match 1) <appView></appView> (Match 2)<appView></appView>

Upvotes: 0

Views: 42

Answers (2)

Sumurai8

Reputation: 20737

I strongly recommend switching from regex to an actual sequential XML parser. Regex is awful for parsing XML-based files, for example because of the problems below.
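As a rough illustration of the parser route, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The <root> wrapper is an assumption added only for this example, since an XML document needs a single top-level element and the string from the question has two.

```python
import xml.etree.ElementTree as ET

fragment = "<appView></appView> <appView></appView>"

# Assumption: wrap the fragment in a hypothetical <root> element so that
# the parser sees exactly one top-level element.
root = ET.fromstring(f"<root>{fragment}</root>")

# Each <appView> child is returned as its own element.
views = root.findall("appView")
tags = [view.tag for view in views]
print(len(views), tags)
```

Each element comes back separately, so there is no greedy or lazy matching to tune.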

That said, you can "fix" your regex by using ([^<>]*). This matches any run of characters other than < and >, which ensures that no other tags are nested inside. If you do this for all tags, you cannot match something like <appView><unclosedTag></appView>, because it is invalid. If you can be certain that the structure is correct, this is slightly less of an issue.
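A small Python sketch of the ([^<>]*) idea follows. It assumes the opening and closing tags are matched literally rather than through the lookarounds from the question, because a character class that excludes < cannot step over the tags themselves.

```python
import re

text = "<appView></appView> <appView></appView>"

# Assumption: literal tags around the group instead of lookarounds.
# [^<>]* refuses to cross any other tag, so each match stops at the
# first closing tag.
pattern = re.compile(r"<appView>([^<>]*)</appView>")

matches = [m.group(0) for m in pattern.finditer(text)]
print(matches)

# A tag inside the element blocks the match entirely.
broken = pattern.search("<appView><unclosedTag></appView>")
print(broken)
```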

Another problem with your approach is that if you have nested tags, like <appView> something <appView> something else </appView> else </appView>, you will end up with [replaced] else </appView>.
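The nested case can be reproduced with a short Python sketch. It assumes a lazy pattern with literal tags and a placeholder replacement string; the match stops at the first closing tag, which belongs to the inner element.

```python
import re

nested = "<appView> something <appView> something else </appView> else </appView>"

# Assumption: a lazy pattern with literal tags. The match starts at the
# outer opening tag but ends at the inner closing tag.
result = re.sub(r"<appView>.*?</appView>", "[replaced]", nested)
print(result)  # [replaced] else </appView>
```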

Upvotes: 0

mickmackusa

Reputation: 47894

\s{0,1} is equivalent to \s?. You need to use the lazy (.*?) instead of the greedy (.*).

Use this pattern: ~(?=<appView)\s?(.*?)(?<=</appView>)~


*Note: you don't have to escape / in the closing tag if you use something other than a slash as your pattern delimiter. I am using ~ at the beginning and end of my pattern to avoid escaping.
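For comparison, here is a sketch of the greedy and lazy versions side by side in Python. Python patterns have no delimiters, so the ~ characters are dropped; otherwise the patterns are the ones from the question and from this answer.

```python
import re

text = "<appView></appView> <appView></appView>"

# Greedy (.*) runs to the end of the string, so both elements end up
# in a single match.
greedy = re.findall(r"(?=<appView)\s?(.*)(?<=</appView>)", text)

# Lazy (.*?) stops as soon as the lookbehind for </appView> succeeds.
lazy = re.findall(r"(?=<appView)\s?(.*?)(?<=</appView>)", text)

print(greedy)  # one match covering the whole string
print(lazy)    # two separate matches
```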

Upvotes: 1
