reikje

Reputation: 3064

Why does Regex pattern matching sometimes not work in Scala?

I am trying to extract the hostname from a URL in Scala 2.11.8. For some reason the pattern matching approach doesn't work, and I cannot figure out why :(

val HOSTNAME = "^http[s]:\\/?\\/?([^:\\/\\s]+)".r
val text = "https://foo-bar.hostname.com/"

// evaluates to None
val host: Option[String] = {
  text match {
    case HOSTNAME(h) => Some(h)
    case _ =>
      None
  }
}

// evaluates to Some(foo-bar.hostname.com)
val host: Option[String] = {
  val matcher = HOSTNAME.findAllIn(text)
  if (matcher.hasNext && matcher.groupCount > 0) {
    Some(matcher.group(1))
  } else {
    None
  }
}

Upvotes: 1

Views: 320

Answers (1)

Tzach Zohar

Reputation: 37832

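In Scala, a Regex used as an extractor in a pattern match is anchored by default: the whole input has to match the pattern, not just part of it. Your pattern stops before the trailing slash, so the match fails, whereas findAllIn searches for a match anywhere in the input. If you make the regex unanchored, the pattern match works: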

val HOSTNAME = "^http[s]:\\/?\\/?([^:\\/\\s]+)".r.unanchored

The result is Some(foo-bar.hostname.com), which I assume is what you are trying to match.
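To make the difference concrete, here is a minimal sketch that runs the question's pattern in both modes. The object name AnchoringDemo and the helper hostOf are made up for illustration; only .r, .unanchored and the extractor syntax come from the standard library.

```scala
import scala.util.matching.Regex

object AnchoringDemo {
  // The pattern from the question, unchanged.
  val anchored: Regex = "^http[s]:\\/?\\/?([^:\\/\\s]+)".r
  // Same pattern, but an extractor match may stop before the end of the input.
  val unanchored: Regex = anchored.unanchored

  // Hypothetical helper: the extractor-style match from the question.
  def hostOf(pattern: Regex, url: String): Option[String] = url match {
    case pattern(h) => Some(h)
    case _          => None
  }

  def main(args: Array[String]): Unit = {
    // Trailing slash is not covered by the pattern, so the anchored match fails.
    println(hostOf(anchored, "https://foo-bar.hostname.com/"))   // None
    // Without the trailing slash the whole input matches.
    println(hostOf(anchored, "https://foo-bar.hostname.com"))    // Some(foo-bar.hostname.com)
    // Unanchored: a match anywhere in the input is enough.
    println(hostOf(unanchored, "https://foo-bar.hostname.com/")) // Some(foo-bar.hostname.com)
  }
}
```

The second call shows that the anchored regex itself is fine; it is only the unmatched trailing slash that makes the original match fall through to None.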

Alternatively, extend the pattern so that it also matches everything after the next slash:

val HOSTNAME = "^http[s]:\\/?\\/?([^:\\/\\s]+)\\/.*".r

This returns the same result.
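One caveat: the pattern above needs a slash directly after the hostname, so an input without a trailing slash, or one with a port such as :8080, would not match. If that matters, a possible variation is to let .* swallow whatever follows the host. This is only a sketch: the object name OptionalTail and the helper hostOf are hypothetical, and unlike the original pattern it also accepts plain http.

```scala
object OptionalTail {
  // Assumed variation: "s" is optional and ".*" consumes anything after the host,
  // so the anchored extractor match can cover the whole input.
  val Hostname = "^https?://([^:/\\s]+).*".r

  // Hypothetical helper wrapping the extractor match.
  def hostOf(url: String): Option[String] = url match {
    case Hostname(h) => Some(h)
    case _           => None
  }

  def main(args: Array[String]): Unit = {
    println(hostOf("https://foo-bar.hostname.com/")) // Some(foo-bar.hostname.com)
    println(hostOf("https://foo-bar.hostname.com"))  // Some(foo-bar.hostname.com)
    println(hostOf("http://example.org:8080/path"))  // Some(example.org)
    println(hostOf("ftp://example.org"))             // None
  }
}
```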

Lastly, if you just want to parse standard URIs, you can use java.net.URI:

import java.net.URI
URI.create(text).getHost // returns foo-bar.hostname.com
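Two things to keep in mind with java.net.URI: URI.create throws an IllegalArgumentException for strings it cannot parse, and getHost returns null when the URI has no host component. A minimal sketch that folds both cases into the Option[String] the question asks for could look like this (UriHost and hostOf are hypothetical names):

```scala
import java.net.URI
import scala.util.Try

object UriHost {
  // Hypothetical helper: None for unparsable input or for URIs without a host.
  def hostOf(url: String): Option[String] =
    Try(URI.create(url)).toOption      // parsing failures become None
      .flatMap(uri => Option(uri.getHost)) // a null host becomes None

  def main(args: Array[String]): Unit = {
    println(hostOf("https://foo-bar.hostname.com/")) // Some(foo-bar.hostname.com)
    println(hostOf("not a url"))                     // None (spaces are illegal in a URI)
    println(hostOf("foo-bar.hostname.com"))          // None (parsed as a path, no host)
  }
}
```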

Upvotes: 6
