Reputation: 23
I am very new to Regex, and I searched a long time for the equivilants in javascript, I would love it is somebody responded with a detailed explanation of the regex in javascript, converted from python.
import re
regex = r"""
^(
(?P<ShowNameA>.*[^ (_.]) # Show name
[ (_.]+
( # Year with possible Season and Episode
(?P<ShowYearA>\d{4})
([ (_.]+S(?P<SeasonA>\d{1,2})E(?P<EpisodeA>\d{1,2}))?
| # Season and Episode only
(?<!\d{4}[ (_.])
S(?P<SeasonB>\d{1,2})E(?P<EpisodeB>\d{1,2})
| # Alternate format for episode
(?P<EpisodeC>\d{3})
)
|
# Show name with no other information
(?P<ShowNameB>.+)
)
"""
test_str = ("archer.2009.S04E13\n"
"space 1999 1975\n"
"Space: 1999 (1975)\n"
"Space.1999.1975.S01E01\n"
"space 1999.(1975)\n"
"The.4400.204.mkv\n"
"space 1999 (1975)\n"
"v.2009.S01E13.the.title.avi\n"
"Teen.wolf.S04E12.HDTV.x264\n"
"Se7en\n"
"Se7en.(1995).avi\n"
"How to train your dragon 2\n"
"10,000BC (2010)")
matches = re.finditer(regex, test_str, re.MULTILINE | re.VERBOSE)
for matchNum, match in enumerate(matches):
matchNum = matchNum + 1
print ("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum = matchNum, start = match.start(), end = match.end(), match = match.group()))
for groupNum in range(0, len(match.groups())):
groupNum = groupNum + 1
print ("Group {groupNum} found at {start}-{end}: {group}".format(groupNum = groupNum, start = match.start(groupNum), end = match.end(groupNum), group = match.group(groupNum)))
Upvotes: 1
Views: 7198
Reputation: 2627
Sadly there is no easy way to covert Python regex to Javascript regex because Python regex is much more robust than Javascript regex.
Javascript is missing functional things like negative look behinds and recursion, but it misses many more syntactical tools like verbose syntax and named capturing groups.
regular capture group = ()
named capture group = (?P<ThisIsAName>)
verbose regex = 'find me #this regex ignores comments and whitespace'
non verbose regex = 'this treats whitespace literally'
So if we convert your named capture groups to regular (numbered) capture groups
And if we convert the verbose syntax into regular syntax.
Then that regex would be valid Javascript regex, which, in Javascript would look like:
regex =
/^((.*[^ (_.])[ (_.]+((\d{4})([ (_.]+S(\d{1,2})E(\d{1,2}))?|(?<!\d{4}[ (_.])S(\d{1,2})E(\d{1,2})|(\d{3}))|(.+))/
// group 2 = ShowNameA
// group 4 = ShowYearA
// group 6 = SeasonB
// group 7 = EpisodeC
// group 8 = ShowNameB
As you can see the Javascript version is pretty ugly because it does not have the verbose syntax or named capture groups. However in this case is functional equivalent.
Javascript does not have a direct equivalent to findall so you'll have to make/find an equivalent to that. Here is an article explaining several such ways. https://www.activestate.com/blog/2008/04/javascript-refindall-workalike
In the future I also highly recommend going to regexr.com to learn regex, specifically javascript regex.
Upvotes: 3