Surender Raja

Reputation: 3599

Regex throws an error when it contains a pipe symbol in Scala code

Please look at the code below. It extracts details such as id, mm, and yr from the file name using a regex.

If I give the regex only one extension, it works fine, but if I use the pipe symbol to handle multiple file extensions in a single regex, it throws an error.

scala> val regex = "(?i)(\\d{3})(\\d{2})(\\d{2}).txt".r
regex: scala.util.matching.Regex = (?i)(\d{3})(\d{2})(\d{2}).txt

scala> val regex(id, mm, yr) = fileName
 id: String = 457
 mm: String = 11
 yr: String = 20

 scala> val regex = "(?i)(\\d{3})(\\d{2})(\\d{2})".r
 regex: scala.util.matching.Regex = (?i)(\d{3})(\d{2})(\d{2})

 scala> val fileName = "1234567.TXT"
 fileName: String = 1234567.TXT

 scala> val regex = "(?i)(\\d{3})(\\d{2})(\\d{2}).txt".r
 regex: scala.util.matching.Regex = (?i)(\d{3})(\d{2})(\d{2}).txt

scala>  val regex(id, mm, yr) = fileName
id: String = 123
mm: String = 45
yr: String = 67

scala>  val regex = "(?i)(\\d{3})(\\d{2})(\\d{2}).dat".r
regex: scala.util.matching.Regex = (?i)(\d{3})(\d{2})(\d{2}).dat

 scala>  val fileName = "8889990.dat"
 fileName: String = 8889990.dat

 scala>  val regex(id, mm, yr) = fileName
  id: String = 888
  mm: String = 99
  yr: String = 90

If I write the regex like below, it throws an error:

  scala> val regex = "(?i)(\\d{3})(\\d{2})(\\d{2}).txt|(?i)(\\d{3})(\\d{2})(\\d{2}).dat".r
  regex: scala.util.matching.Regex = (?i)(\d{3})(\d{2})(\d{2}).txt|(?i)(\d{3})(\d{2})(\d{2}).dat

  scala> val fileName = "8889990.dat"
  fileName: String = 8889990.dat

 scala> val regex(id, mm, yr) = fileName
 scala.MatchError: 8889990.dat (of class java.lang.String)
 ... 49 elided

Upvotes: 0

Views: 95

Answers (1)

The fourth bird

Reputation: 163362

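This happens because the alternation gives the pattern 6 capture groups instead of 3, so an extractor with only 3 variables no longer matches. You could, for example, extend it to bind all six groups: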

val regex(idTxt, mmTxt, yrTxt, idDat, mmDat, yrDat) = fileName
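With six groups, the three that belong to the alternative that did not match should come back as null, so they still need to be merged by hand. A minimal sketch of that, assuming the pattern from the question; the object name `SixGroupDemo` and the helper `parts` are made up for illustration, and a `match` is used instead of a `val` pattern so that a non-matching name gives `None` rather than a `MatchError`:

```scala
object SixGroupDemo {
  // The pattern from the question: two alternatives, three capture groups each.
  val regex = "(?i)(\\d{3})(\\d{2})(\\d{2}).txt|(?i)(\\d{3})(\\d{2})(\\d{2}).dat".r

  // Hypothetical helper: returns (id, mm, yr) from whichever alternative matched.
  def parts(fileName: String): Option[(String, String, String)] =
    fileName match {
      case regex(idTxt, mmTxt, yrTxt, idDat, mmDat, yrDat) =>
        // Groups of the alternative that did not participate are null,
        // so fall back from the txt groups to the dat groups.
        Some((
          Option(idTxt).getOrElse(idDat),
          Option(mmTxt).getOrElse(mmDat),
          Option(yrTxt).getOrElse(yrDat)
        ))
      case _ => None
    }
}
```

This works, but the null handling is the price of duplicating the groups on both sides of the pipe.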

But since the number format is the same in both cases, you can keep a single set of groups and use an alternation only for the extension, either txt or dat. (Note that the dot should be escaped to match it literally.)

val regex = "(?i)(\\d{3})(\\d{2})(\\d{2})\\.(?:txt|dat)".r
val fileName = "8889990.dat"
val regex(id, mm, yr) = fileName

Output

id: String = 888
mm: String = 99
yr: String = 90
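To illustrate why escaping the dot matters, here is a small sketch comparing the escaped pattern with an unescaped variant. The object name `DotEscapeDemo`, the helper `parse`, and the file name `8889990xdat` are assumptions made up for this example:

```scala
import scala.util.matching.Regex

object DotEscapeDemo {
  // Unescaped dot: matches any single character before the extension.
  val unescaped: Regex = "(?i)(\\d{3})(\\d{2})(\\d{2}).(?:txt|dat)".r
  // Escaped dot: matches only a literal '.'.
  val escaped: Regex = "(?i)(\\d{3})(\\d{2})(\\d{2})\\.(?:txt|dat)".r

  // Hypothetical helper: Some((id, mm, yr)) on a full match, None otherwise.
  def parse(re: Regex, fileName: String): Option[(String, String, String)] =
    fileName match {
      case re(id, mm, yr) => Some((id, mm, yr))
      case _              => None
    }
}
```

With these definitions, `DotEscapeDemo.parse(DotEscapeDemo.unescaped, "8889990xdat")` should still return the three parts, while the escaped pattern rejects that name and accepts only names such as `8889990.dat` or `1234567.TXT`.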

Upvotes: 2
