timbit

Reputation: 353

Ignore groups while matching multiple Regex Patterns in Scala

My goal is to match several regex patterns against the same string in an elegant way. As I understand it, this kind of regex matching requires groups, and to use the match syntax below I have to account for each of those groups explicitly in the case SomePattern(_,_) statement (e.g. two groups require two _ in the case statement).

import scala.util.matching.UnanchoredRegex

val regexPattern1 = "(Some)|(Pattern)".r.unanchored
val regexPattern2 = "(That)|(Other)|(pattern)".r.unanchored
val regexPattern3 = "(A)|(Whole)|(Different)|(One)".r.unanchored

"Some string to match patterns against" match {
    case regexPattern1(_,_) => 1
    case regexPattern2(_,_,_) => 2
    case regexPattern3(_,_,_,_) => 3
}

Now, I have the following considerations:

  1. As may have become evident from the use of underscores, I don't care about catching the specific groups of the pattern, just any first match.
  2. My actual patterns are very complex, so for manageability, I would prefer to keep them as separate UnanchoredRegex objects, instead of treating them as different capture groups in the same regex pattern.
  3. Because of this complexity (nested groups), it can be hard to keep track of the number of capture groups to put in the case SomePattern(_,_,...n) statement. If I get the count wrong, the case simply fails to match, without any error. This makes it annoying to update or tweak my patterns, and to subsequently debug the regex matching.
  4. I like the conciseness and elegance of the syntax above, which matches once against several patterns, so I would prefer to retain it rather than write a separate match/if clause for each pattern.

Now, for my question: Is there a way to retain the syntax above while dispensing with the (for my purpose) useless _,_,_,... part, so that each case matches on any first hit instead?

Upvotes: 1

Views: 517

Answers (1)

Kolmar

Reputation: 14224

The Regex class implements pattern matching through unapplySeq, which means you can ignore every group, however many there are, with a _* pattern:

"Some string to match patterns against" match {
    case regexPattern1(_*) => 1
    case regexPattern2(_*) => 2
    case regexPattern3(_*) => 3
}
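
For completeness, here is a minimal self-contained sketch of the same idea, reusing the patterns from the question. The object name FirstMatchDemo and the helpers classify and wrongArity are hypothetical names added for illustration, as is the fallback case. It also shows the silent failure the question describes when the number of placeholders does not equal the number of groups.

```scala
object FirstMatchDemo {
  // Patterns copied from the question.
  val regexPattern1 = "(Some)|(Pattern)".r.unanchored
  val regexPattern2 = "(That)|(Other)|(pattern)".r.unanchored
  val regexPattern3 = "(A)|(Whole)|(Different)|(One)".r.unanchored

  // Hypothetical helper: returns the number of the first pattern found in the input.
  // _* accepts any number of groups, so the cases stay valid when the patterns change.
  def classify(input: String): Option[Int] = input match {
    case regexPattern1(_*) => Some(1)
    case regexPattern2(_*) => Some(2)
    case regexPattern3(_*) => Some(3)
    case _                 => None // added fallback, avoids a MatchError when nothing matches
  }

  // Hypothetical helper: one placeholder against a two-group pattern.
  // This compiles, but the case never matches because the group count differs.
  def wrongArity(input: String): Boolean = input match {
    case regexPattern1(_) => true
    case _                => false
  }
}
```

With these definitions, FirstMatchDemo.classify("Some string to match patterns against") should give Some(1), while FirstMatchDemo.wrongArity applied to the same string gives false even though the regex itself finds "Some".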

Upvotes: 3
