Reputation: 2064
I am trying to find a regular expression that I can use to strip out all the whitespace and newline characters next to the < and > of each tag.
Below is what I tried:
((\s|\n|\r)?<(\s|\n|\r)?)|(\s|\n|\r)?>(\s|\n|\r)
on this document:
< tag src="abc" testattribute >
<script > any script </script >
<tag2>what is this </tag2>
<tag>
I want the end result to be exactly this.
<tag src="abc" testattribute><script>any script</script><tag2>what is this</tag2><tag>
Upvotes: 0
Views: 73
Reputation: 70732
You can simply use \s here to match whitespace.
\s matches any whitespace character (\n, \r, \t, \f, and " "), so you do not need the (\s|\n|\r) alternation.
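If the regex flavor of your language supports lookaround assertions, you can use them to match only the whitespace that sits next to a < or >, and then replace every match with an empty string: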
(?<=<|>)\s*|(?<!>|<)\s*(?![^><])
Regular expression:
(?<=        look behind to see if there is:
  <           '<'
 |           OR
  >           '>'
)           end of look-behind
\s*         whitespace (\n, \r, \t, \f, and " ") (0 or more times)
|           OR
(?<!        look behind to see if there is not:
  >           '>'
 |           OR
  <           '<'
)           end of look-behind
\s*         whitespace (\n, \r, \t, \f, and " ") (0 or more times)
(?!         look ahead to see if there is not:
  [^><]       any character except: '>', '<'
)           end of look-ahead
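As a rough illustration, the pattern above can be applied with a plain substitute-with-nothing call. The sketch below assumes Python's re module, since the question does not name a language; the helper name strip_tag_whitespace and the constant TAG_WHITESPACE are made up for this example.

```python
import re

# Pattern copied from the answer: whitespace directly after '<' or '>',
# or whitespace directly before '<', '>' or the end of the input.
TAG_WHITESPACE = re.compile(r"(?<=<|>)\s*|(?<!>|<)\s*(?![^><])")


def strip_tag_whitespace(markup: str) -> str:
    """Remove whitespace adjacent to angle brackets (hypothetical helper)."""
    return TAG_WHITESPACE.sub("", markup)


# The sample document from the question.
document = (
    '< tag src="abc" testattribute >\n'
    "<script > any script </script >\n"
    "<tag2>what is this </tag2>\n"
    "<tag>"
)

print(strip_tag_whitespace(document))
# prints: <tag src="abc" testattribute><script>any script</script><tag2>what is this</tag2><tag>
```

Python accepts the alternation inside the look-behind here because both branches are exactly one character wide; other flavors may have different look-behind rules, so check yours before reusing the pattern. Whitespace inside text such as "what is this" is left alone because it is followed by a character other than < or >.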
Upvotes: 2