Reputation: 763
I need to find a string that contains "script", with any number of characters before or after it, enclosed in < and >. I can do this with: <*script.*>
I also want to match only when that string is NOT followed by a <
The closest I've come, so far, is with this: (<*script.*>)([^=?<*]*)$
However, that will fail for something like <script></script> because the last > isn't followed by a < (so it doesn't match).
How can I check whether only the first > is followed by a < or not?
For example,
<script> abc () ; </script>
MATCH
<< ScriPT >abc (”XXX”);//<</ ScriPT >
MATCH
<script></script>
DON'T MATCH
And, a case that I still am working on:
<script/script>
DON'T MATCH
Thanks!
Upvotes: 1
Views: 118
Reputation: 376
You were close with your regex. You just needed to make the .* before the first > non-greedy by adding a ? after the second *. Try this out:
(?i)<*\s*script.*?>[^<]+<*[^>]+>
There is an app called Expresso that really helps with designing Regex strings. Give it a shot.
Explanation: Without the non-greedy ?, the second * (the one before the first >) consumes as much as it can, so the match runs all the way to the end of the string and stops at the last > instead of the first one. That left the rest of your pattern with nothing to test.
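To see the difference, here is a minimal Python sketch that runs the greedy and non-greedy versions of the opening part against one of the sample strings from the question. The variable names are only for illustration.

```python
import re

# One of the sample inputs from the question.
sample = "<script> abc () ; </script>"

# Greedy: .* runs to the end and backs up only to the LAST >.
greedy = re.search(r"<*script.*>", sample)
# Non-greedy: .*? stops at the FIRST > it can reach.
lazy = re.search(r"<*script.*?>", sample)

print(greedy.group(0))  # <script> abc () ; </script>
print(lazy.group(0))    # <script>
```

With the non-greedy form, the match ends at the first >, so the remainder of the pattern gets to inspect what follows it.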
EDIT: Added (?i) at the beginning for case-insensitivity. If you want a JavaScript-specific case-insensitive regex, you would write it like this:
/<*\s*script.*?>[^<]+<*[^>]+>/i
I noticed you have parentheses in your regex to make groups, but you didn't specifically say you were trying to capture groups. Do you want to capture what's between the <script> and </script>? If so, that would be:
/<*\s*script.*?>([^<]+)<*[^>]+>/i
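As a quick check, the capturing pattern can be run against the four sample strings from the question. This is a sketch in Python rather than JavaScript, so the /i flag becomes re.IGNORECASE; PATTERN, cases and results are names made up for the example.

```python
import re

# Same pattern as above, with the JavaScript /i flag expressed as re.IGNORECASE.
PATTERN = re.compile(r"<*\s*script.*?>([^<]+)<*[^>]+>", re.IGNORECASE)

# The four sample strings from the question, in order.
cases = [
    "<script> abc () ; </script>",
    "<< ScriPT >abc (”XXX”);//<</ ScriPT >",
    "<script></script>",
    "<script/script>",
]

# True where the pattern is found somewhere in the string.
results = [PATTERN.search(text) is not None for text in cases]
for text, matched in zip(cases, results):
    print("MATCH" if matched else "DON'T MATCH", "|", text)
```

The first two strings match and the last two do not, which lines up with the MATCH / DON'T MATCH labels in the question.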
Upvotes: 2
Reputation: 5274
If I understand what you are looking for, give this a try:
regex = r"<\s*script\s*>([^<]+)<"
Here is an example in Python:
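import re

# Two sample inputs: one with text between the tags, one without.
textlist = ["<script>show this</script>", "<script></script>"]
# Raw string so \s reaches the regex engine unchanged; the trailing <
# requires the captured text to be followed by a <.
regex = r"<\s*script\s*>([^<]+)<"
for text in textlist:
    thematch = re.search(regex, text, re.IGNORECASE)
    if thematch:
        print("match found:")
        print(thematch.group(1))
    else:
        print("no match sir!")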
Explanation: start with a <, then possible spaces, the word script, possible spaces, and a >; then capture one or more non-< characters and make sure they are followed by a <.
Hope that helps!
Upvotes: 1
Reputation: 1642
This would be better solved by using the JavaScript methods substring() and/or indexOf().
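A rough sketch of that idea, written in Python to match the other example in this thread, with str.find() standing in for JavaScript's indexOf(). The function name script_has_body is hypothetical, and the check is simplified: it assumes the first "script" in the string is the opening tag and does not verify the < in front of it.

```python
def script_has_body(text: str) -> bool:
    """Return True if the first > after "script" is followed by a non-< character."""
    lowered = text.lower()  # case-insensitive search, like the /i flag
    tag_start = lowered.find("script")
    if tag_start == -1:
        return False  # no "script" at all
    tag_end = lowered.find(">", tag_start)
    if tag_end == -1:
        return False  # the tag is never closed
    next_index = tag_end + 1
    # There must be a character after the >, and it must not be a <.
    return next_index < len(lowered) and lowered[next_index] != "<"


for sample in ["<script> abc () ; </script>", "<script></script>", "<script/script>"]:
    print(script_has_body(sample), "|", sample)
```

This trades the pattern for a couple of index lookups, at the cost of being less strict about what surrounds the word "script".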
Upvotes: -1