Thiago Alexandre

Reputation: 21

Scala list of regular expressions with pattern matching

I have some regular expressions stored in variables and a match expression that returns a string depending on which pattern matched (shown below). I'd like to put these regular expressions into an Array or a List of Regex so that they can be indexed, and still be able to get the corresponding groups and return the appropriate result.

Existing code:

def parseString(str : String) : String = {

    val ptrn2 = """/foo/bar/find/(apple|orange)/(\w+)$""".r
    val ptrn3 = """/foo/bar/(fruit|vegetable)/(banana|carrot)/(\w+)$""".r
    val ptrn4 = """/foo/bar/(volume|size)/(\d+)/(\d+)/(\d+)$""".r
    // more variables

    val response =  str match {
        case ptrn2(a,b) => "/foo/bar/"+ a +"/{id}"
        case ptrn3(a,b,c) => "/foo/bar/"+ a +"/" + b + "/{ref}"
        case ptrn4(a,b,c,d) => "/foo/bar/"+ a +"/" + (b.toInt*c.toInt*d.toInt)
        // more case statements
        case _ => str
     }
     return response 
}

I tried to use the syntax below to access a specific index, passing variables to get the groups, but this is incorrect. What's wrong with that?

 val ptrn : List[Regex] = List(new Regex("regex here"),
  new Regex("regex2 here"),
  new Regex("regex3 here"))

 val response =  str match {
        case ptrn(0)(a,b) => "/foo/bar/"+ a +"/{id}"
        case ptrn(1)(a,b,c) => "/foo/bar/"+ a +"/" + b + "/{ref}"
        case ptrn(2)(a,b,c,d) => "/foo/bar/"+ a +"/" + (b.toInt*c.toInt*d.toInt)
        case _ => str
 }

There must be a way to access this via Arrays/List in the match block or it would be even better if a Map returned the appropriate result. Does anyone have any idea how to solve this in Scala?

Upvotes: 0

Views: 1644

Answers (2)

Tzach Zohar

Reputation: 37822

In a Scala pattern, the extractor (the part written before the parentheses) must be a stable identifier, as defined in the Scala Language Specification. In the first case, where each regular expression is stored in a val, each extractor is a stable identifier. An element of a list, such as ptrn(0), is not: it is a method call (apply).
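To illustrate the stable-identifier rule, here is a minimal sketch that keeps the regular expressions in a List but binds each element to a val before matching. The names StableIdentifierSketch, findPattern and kindPattern are made up for this example, and only two of the question's patterns are included. Note that this gives up indexing inside the match block itself.

```scala
import scala.util.matching.Regex

object StableIdentifierSketch {
  val ptrn: List[Regex] = List(
    """/foo/bar/find/(apple|orange)/(\w+)$""".r,
    """/foo/bar/(fruit|vegetable)/(banana|carrot)/(\w+)$""".r
  )

  // `ptrn(0)` is a call to `apply`, not a stable identifier, so
  // `case ptrn(0)(a, b) =>` is rejected by the compiler.
  // Binding each element to a val gives it a name usable as an extractor.
  private val findPattern = ptrn(0)
  private val kindPattern = ptrn(1)

  def parseString(str: String): String = str match {
    case findPattern(a, _)    => s"/foo/bar/$a/{id}"
    case kindPattern(a, b, _) => s"/foo/bar/$a/$b/{ref}"
    case _                    => str
  }
}
```

The vals are indexed once, outside the match, so the patterns can still live in a single list.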

I don't think you can achieve this with pattern matching directly; you'll have to use the parts of Regex's API that don't go through the extractor syntax. You can also reduce verbosity by building a list that holds not only the regular expressions but also what to do with each one of them.

Here's one potential implementation:

// instead of a simple list of regular expressions, make this a list of Tuples of (regex, builder),
// where the builder is a function from the matched groups (List[String]) to the desired result (String)
val ptrn = List(
  (
    """/foo/bar/find/(apple|orange)/(\w+)$""".r, 
    (groups: List[String]) => s"/foo/bar/${groups.head}/{id}"
  ),
  (
    """/foo/bar/(fruit|vegetable)/(banana|carrot)/(\w+)$""".r, 
    (groups: List[String]) => s"/foo/bar/${groups.head}/${groups(1)}/{ref}"
  ),
  (
    """/foo/bar/(volume|size)/(\d+)/(\d+)/(\d+)$""".r, 
    (groups: List[String]) => s"/foo/bar/${groups.head}/${groups(1).toInt * groups(2).toInt * groups(3).toInt}"
  )
)

// for some input:
val str = "/foo/bar/fruit/banana/split"

// First, flatMap to tuples of (Regex.Match, builder);
// flatMap "filters out" the ones that didn't match, because None results are dropped
val res = ptrn.flatMap {
  case (reg, builder) => reg.findFirstMatchIn(str).map(m => (m, builder))
}.headOption.map { // then, take the first match and apply the builders to the matched groups
  case (m, builder) => builder.apply(m.subgroups)
}.getOrElse(str)   // if no match found, use the original String

println(res) // prints /foo/bar/fruit/banana/{ref} 
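The same idea can be packaged as a function. The sketch below is one possible variant, not the only way to do it: the names RouteRewriter, Builder and rules are made up for illustration. It differs from the code above in two ways. It walks the rules through an iterator, so regexes after the first hit are not evaluated, and it calls Regex.unapplySeq, which (like a `case` pattern) only succeeds when the whole string matches, whereas findFirstMatchIn also accepts a match in the middle of the input.

```scala
import scala.util.matching.Regex

object RouteRewriter {
  // Hypothetical alias: turns the captured groups into the result string
  type Builder = List[String] => String

  val rules: List[(Regex, Builder)] = List(
    ("""/foo/bar/find/(apple|orange)/(\w+)$""".r,
      (g: List[String]) => s"/foo/bar/${g.head}/{id}"),
    ("""/foo/bar/(fruit|vegetable)/(banana|carrot)/(\w+)$""".r,
      (g: List[String]) => s"/foo/bar/${g.head}/${g(1)}/{ref}"),
    ("""/foo/bar/(volume|size)/(\d+)/(\d+)/(\d+)$""".r,
      (g: List[String]) => s"/foo/bar/${g.head}/${g(1).toInt * g(2).toInt * g(3).toInt}")
  )

  def parseString(str: String): String =
    rules.iterator
      // unapplySeq returns Some(groups) only if the entire string matches
      .map { case (reg, build) => reg.unapplySeq(str).map(build) }
      // stop at the first rule that produced a result
      .collectFirst { case Some(result) => result }
      // no rule matched: return the input unchanged
      .getOrElse(str)
}
```

Adding a new route is then a matter of appending one (regex, builder) pair to rules; the order of the list decides which rule wins when several could match.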

Upvotes: 1

Nicolas Cailloux

Reputation: 448

I don't have time to test this right now, so someone may give you a more precise answer. I think the problem is related to how Scala does pattern matching: a pattern relies on unapply, whereas ptrn(0) is a call to apply. Please try:

val response =  str match {
    case p(a,b) if p == ptrn(0) => "/foo/bar/"+ a +"/{id}"
    case p(a,b,c) if p == ptrn(1) => "/foo/bar/"+ a +"/" + b + "/{ref}"
    case p(a,b,c,d) if p == ptrn(2) => "/foo/bar/"+ a +"/" + (b.toInt*c.toInt*d.toInt)
    case _ => str
}
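As written, `p` in the snippet above is not bound to anything, so the patterns have no extractor to call. One way to keep the index visible inside the match block is a small helper extractor, sketched below. The names IndexedMatchSketch and Indexed are hypothetical and not part of any library; the extractor reports the index of the first regex in ptrn that matches the whole string, along with its groups.

```scala
import scala.util.matching.Regex

object IndexedMatchSketch {
  val ptrn: List[Regex] = List(
    """/foo/bar/find/(apple|orange)/(\w+)$""".r,
    """/foo/bar/(fruit|vegetable)/(banana|carrot)/(\w+)$""".r,
    """/foo/bar/(volume|size)/(\d+)/(\d+)/(\d+)$""".r
  )

  // Hypothetical helper extractor: yields (index, groups) for the first
  // regex in `ptrn` that matches the entire input string.
  object Indexed {
    def unapply(s: String): Option[(Int, List[String])] =
      ptrn.iterator.zipWithIndex
        .map { case (reg, i) => reg.unapplySeq(s).map(groups => (i, groups)) }
        .collectFirst { case Some(hit) => hit }
  }

  def parseString(str: String): String = str match {
    case Indexed(0, List(a, _))       => s"/foo/bar/$a/{id}"
    case Indexed(1, List(a, b, _))    => s"/foo/bar/$a/$b/{ref}"
    case Indexed(2, List(a, b, c, d)) => s"/foo/bar/$a/${b.toInt * c.toInt * d.toInt}"
    case _                            => str
  }
}
```

Because Indexed is an object, it is a stable identifier and can appear in a pattern, while the list lookup happens inside its unapply method.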

Upvotes: 0
