T D

Reputation: 1150

Regex to extract only the date from a string

I have strings in the pattern below and want to extract only the date from each one.

199.120.110.23 - - [01/Jul/1995:00:00:01 -0400] "GET /medium/1/ HTTP/1.0" 200 6245
199.120.110.22 - - [01/Jul/1995:00:00:06 -0400] "GET /medium/2/ HTTP/1.0" 200 3985
199.120.110.21 - - [01/Jul/1995:00:00:09 -0400] "GET /medium/3/stats/stats.html HTTP/1.0" 200 4085

Expected output

01/Jul/1995
01/Jul/1995
01/Jul/1995

Currently I extract it in two steps:

  1. Extract everything between the square brackets: \[(.*?)\]
  2. Take the first 11 characters of the step 1 output: ^.{1,11}
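
For reference, the two steps above can be sketched in Scala roughly as follows. The names `line`, `bracketRE`, and `prefixRE` are only illustrative, and this assumes each line contains a single bracketed timestamp.

```scala
// Sketch of the two-step approach; all names here are illustrative.
val line = """199.120.110.23 - - [01/Jul/1995:00:00:01 -0400] "GET /medium/1/ HTTP/1.0" 200 6245"""

val bracketRE = """\[(.*?)\]""".r // step 1: text between the square brackets
val prefixRE  = """^.{1,11}""".r  // step 2: first 11 characters of that text

val date: Option[String] =
  bracketRE.findFirstMatchIn(line)  // None if the line has no [...] section
    .map(_.group(1))                // "01/Jul/1995:00:00:01 -0400"
    .flatMap(inner => prefixRE.findFirstIn(inner))
// date: Option[String] = Some(01/Jul/1995)
```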

Can this be done in one step?

Upvotes: 0

Views: 85

Answers (2)

jwvh

Reputation: 51271

If you aren't on Scala 2.13 yet, a standard regex pattern still does it in one step: match the opening bracket and capture everything up to the first colon.

val dateRE = "\\[([^:]+):".r.unanchored
List(
  """199.120.110.23 - - [01/Jul/1995:00:00:01 -0400] "GET /medium/1/ HTTP/1.0" 200 6245""",
  """199.120.110.22 - - [01/Jul/1995:00:00:06 -0400] "GET /medium/2/ HTTP/1.0" 200 3985""",
  """199.120.110.21 - - [01/Jul/1995:00:00:09 -0400] "GET /medium/3/stats/stats.html HTTP/1.0" 200 4085"""
) collect { case dateRE(date) => date }
//res0: List[String] = List(01/Jul/1995, 01/Jul/1995, 01/Jul/1995)
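
If the log may contain malformed lines, a stricter pattern might be safer. The following is only a sketch: `strictDateRE` and `extractDate` are hypothetical names, and it assumes dates always look like dd/Mon/yyyy immediately after the opening bracket.

```scala
import scala.util.matching.Regex

// Hypothetical stricter pattern: two digits, a three-letter month, four digits,
// directly after '[' and followed by ':'.
val strictDateRE: Regex = """\[(\d{2}/[A-Za-z]{3}/\d{4}):""".r

// Returns the date if the line contains one, None otherwise.
def extractDate(line: String): Option[String] =
  strictDateRE.findFirstMatchIn(line).map(_.group(1))

val sample = """199.120.110.23 - - [01/Jul/1995:00:00:01 -0400] "GET /medium/1/ HTTP/1.0" 200 6245"""
extractDate(sample)         // Some(01/Jul/1995)
extractDate("no date here") // None
```

Returning an Option keeps non-matching lines visible to the caller, whereas collect silently drops them.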

Upvotes: 1

Mario Galic

Reputation: 48430

In Scala 2.13, consider pattern matching with interpolated string patterns, for example:

List(
  """199.120.110.23 - - [01/Jul/1995:00:00:01 -0400] "GET /medium/1/ HTTP/1.0" 200 6245""",
  """199.120.110.22 - - [01/Jul/1995:00:00:06 -0400] "GET /medium/2/ HTTP/1.0" 200 3985""",
  """199.120.110.21 - - [01/Jul/1995:00:00:09 -0400] "GET /medium/3/stats/stats.html HTTP/1.0" 200 4085"""
) collect { case s"${head}[${day}/${month}/${year}:${tail}" => s"$day/$month/$year" }

outputs

res1: List[String] = List(01/Jul/1995, 01/Jul/1995, 01/Jul/1995)

Upvotes: 2
