Reputation: 5419
I have an XML file that accidentally contains a bunch of stray < and > characters, and I need to replace them with &lt; and &gt;. What kind of regex can select < and > while ignoring any string of the form <[any word]>? If that isn't possible, a regex that just ignores strings of the form <Abstract> would also be great.
Thanks
Upvotes: 2
Views: 201
Reputation: 5556
You can try this as a good start: /<(?![a-z\/])|(?<![a-z])>/g
See it working here: https://regex101.com/r/YPNEMU/1
It matches every occurrence of < that is not directly followed by a letter or /, and every occurrence of > that is not directly preceded by a letter.
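To try this outside regex101, here is a minimal Python sketch of the first pattern. The function name escape_stray and the sample string are made up for illustration, and re.IGNORECASE is my own assumption, added so that tags starting with an uppercase letter such as <Abstract> are left alone.

```python
import re

# First pattern from the answer. IGNORECASE is an assumption (not in the
# original /g flags) so that <Abstract> is treated as a tag.
STRAY = re.compile(r"<(?![a-z/])|(?<![a-z])>", re.IGNORECASE)


def escape_stray(text: str) -> str:
    """Replace each matched bracket with its XML entity."""
    return STRAY.sub(lambda m: "&lt;" if m.group() == "<" else "&gt;", text)


sample = "<Abstract>if a < b and c > d</Abstract>"
print(escape_stray(sample))
```

A stray bracket that touches a letter, as in a<b, is not matched by this pattern and stays unescaped.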
What remains is to also match a < or > that sits right next to a letter but does not actually open or close a tag.
[EDIT] improved regex
This one goes further and also matches occurrences of < that are directly followed by a letter but do not form a complete tag: /<(?![a-z\/][a-z\/ ]*?>)|(?<![a-z])>/g
See it working here: https://regex101.com/r/YPNEMU/2
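A quick way to see the difference between the first two patterns is to run both on a string where a stray < touches a letter. This is only a sketch: the names FIRST, IMPROVED and escape_with are hypothetical, and re.IGNORECASE is again an assumption of mine.

```python
import re

# Both patterns are copied from the answer; IGNORECASE is an added assumption.
FIRST = re.compile(r"<(?![a-z/])|(?<![a-z])>", re.IGNORECASE)
IMPROVED = re.compile(r"<(?![a-z/][a-z/ ]*?>)|(?<![a-z])>", re.IGNORECASE)


def escape_with(pattern: re.Pattern, text: str) -> str:
    """Replace every bracket matched by `pattern` with its XML entity."""
    return pattern.sub(lambda m: "&lt;" if m.group() == "<" else "&gt;", text)


sample = "<p>a<b</p>"
print(escape_with(FIRST, sample))     # the stray < next to b is missed
print(escape_with(IMPROVED, sample))  # the stray < next to b is escaped
```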
[EDIT] best solution
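I found it using (*SKIP)(*FAIL)!
/(<[a-z\/][^<>]*?>)(*SKIP)(*FAIL)|[<>]/g
The first alternative consumes anything that looks like a tag and then discards that match, so only the remaining < and > characters are matched by the second alternative.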
See it working here: https://regex101.com/r/YPNEMU/3
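(*SKIP)(*FAIL) is a PCRE feature, and Python's built-in re module does not support it. If you need the same effect there, one common substitute (not the pattern above, but the same idea) is to capture the tag in a group and decide in a replacement callback. The sketch below assumes that approach; escape_stray_brackets is a hypothetical name and re.IGNORECASE is my own addition.

```python
import re

# Group 1 captures a tag-like token; the second alternative matches any
# bracket that is left over. IGNORECASE is an added assumption.
TOKEN = re.compile(r"(<[a-z/][^<>]*?>)|[<>]", re.IGNORECASE)


def escape_stray_brackets(text: str) -> str:
    """Keep tag-like tokens untouched and escape every other < or >."""
    def repl(m: re.Match) -> str:
        if m.group(1) is not None:
            return m.group(1)  # looks like a tag: emit it unchanged
        return "&lt;" if m.group() == "<" else "&gt;"
    return TOKEN.sub(repl, text)


sample = "<Abstract>1 < 2 and 3 > 2, a<b</Abstract>"
print(escape_stray_brackets(sample))
```

Like the original pattern, this still treats something such as <b and c> as a tag, because it only checks the shape of the text between the brackets.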
Upvotes: 1