Reputation: 9273
I am trying to match Roman numerals in test strings like:
Series Name.disk_V.Episode_XI.Episode_name.avi
Series Name.Season V.Episode XI.Part XXV.Episode_name.avi
and a real-world example in which the XIII should not match:
XIII: The Series season II episode V.mp4
Following the logic in this fantastic thread and many experiments in an online regex debugger I came up with this:
(?<=d|dvd|disc|disk|s|se|season|e|ep|episode)[\s._-]\KM{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})(?=[\s._-])
The last example returns two matches, "II" and "V", ignoring the XIII in the name part. Yay!
So then I tried it in a Swift playground:
let file = "Series Name.disk_V.Episode_XI.Episode_name.avi"
let p = #"(?<=d|dvd|disc|disk|s|se|season|e|ep|episode)[\s._-]\KM{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})(?=[\s._-])"#
let r = try NSRegularExpression(pattern: p, options: [.caseInsensitive])
let nsString = file as NSString
let results = r.matches(in: file, options: [], range: NSMakeRange(0, nsString.length))
The pattern parses without error but returns no matches. I found that it works if I remove the \K, although that leaves the leading separator in the match. According to this thread, Obj-C (which I assume means NSRegex) supports \K, so I'm not sure why this fails.
There are a number of similar-sounding threads here on SO, but they invariably have to do with patterns that fail to parse, mostly due to escaping. That is not the case here: the pattern parses fine, and I can see it is correct (i.e., no doubled backslashes) if you print(r). It just doesn't match.
Can anyone offer some insight or an alternative regex that does not use \K?
Upvotes: 1
Views: 44
Reputation: 9273
TheFourthBird's idea is the solution. I modified the pattern by removing the \K and making the entire roman section a named group:
(?<=d|dvd|disc|disk|s|se|season|e|ep|episode)[\s._-](?<roman>M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3}))(?=[\s._-])
To use it, set everything up as in the question, then read the named group from each match like this:
for result in results {
let nameRange = result.range(withName: "roman")
print(nsString.substring(with: nameRange))
}
Output:
V
XI
Bingo!
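For anyone who wants the whole thing in one place, here is a minimal sketch that combines the setup from the question with the named-group loop above. The function name romanNumerals(in:) is just something I made up for illustration, and skipping empty captures is my own addition (every part of the numeral pattern is optional, so the group can in principle match an empty string).

```swift
import Foundation

// Hypothetical helper (the name is not from any library): returns the text
// captured by the named group "roman" for every match in the file name.
func romanNumerals(in file: String) throws -> [String] {
    // Same pattern as above: no \K, numeral wrapped in a named group.
    let pattern = #"(?<=d|dvd|disc|disk|s|se|season|e|ep|episode)[\s._-](?<roman>M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3}))(?=[\s._-])"#
    let regex = try NSRegularExpression(pattern: pattern, options: [.caseInsensitive])

    // NSRegularExpression works in UTF-16 offsets, so go through NSString.
    let nsString = file as NSString
    let fullRange = NSRange(location: 0, length: nsString.length)

    return regex.matches(in: file, options: [], range: fullRange).compactMap { result -> String? in
        let range = result.range(withName: "roman")
        // Assumption: empty or missing captures are not useful, so drop them.
        guard range.location != NSNotFound, range.length > 0 else { return nil }
        return nsString.substring(with: range)
    }
}

let numerals = try romanNumerals(in: "Series Name.disk_V.Episode_XI.Episode_name.avi")
print(numerals)
```

The separator is still part of the overall match, but since only the named group is read, it never shows up in the result.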
Upvotes: 1