Reputation: 23729
Exercise: given a string that contains a name, then a space or newline, then an email address, then optionally a newline and more text separated by newlines, capture the name and the domain of the email.
So I created the following:
val regexp = "^([a-zA-Z]+)(?:\\s|\\n)\\w+@(\\w+\\.\\w+)(?:.|\\r|\\n)*".r
def fun(str: String): String = {
  val result = str match {
    case regexp(name, domain) => name + ' ' + domain
    case _ => "invalid"
  }
  result
}
And started testing:
scala> val input = "oleg [email protected]"
scala> fun(input)
res17: String = oleg email.com
scala> val input = "oleg\n[email protected]"
scala> fun(input)
res18: String = oleg email.com
scala> val input = """oleg
| [email protected]
| 7bdaf0a1be3"""
scala> fun(input)
res19: String = oleg email.com
scala> val input = """oleg
| [email protected]
| 7bdaf0a1be3
| """
scala> fun(input)
res20: String = invalid
Why doesn't the regexp capture the string with the newline at the end?
Upvotes: 0
Views: 211
Reputation: 163362
The part (?:\\s|\\n) can be shortened to \s, because \s already matches a newline. Since the multi-line inputs still have a space before the email, you can use \s+ to match one or more whitespace characters.
Matching any character with (?:.|\\r|\\n)* is very inefficient because of the alternation. You can use either [\S\s]* or the inline modifier (?s), which makes the dot match a newline as well.
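To illustrate the difference, here is a small sketch that can be pasted into the REPL. The value name text and its contents are made up for the example; String.matches is the standard Java method, which requires the whole string to match.

```scala
// Hypothetical multi-line input that ends with a newline.
val text = "a\nb\n"

// By default the dot does not match line terminators, so a full match fails.
val plainDot = text.matches(".*")

// With the inline (?s) modifier the dot also matches newlines.
val dotAll = text.matches("(?s).*")

// [\s\S] matches any character, newlines included, without a modifier.
val charClass = text.matches("[\\s\\S]*")

println(s"$plainDot $dotAll $charClass") // prints "false true true"
```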
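If you only need the name and the domain of the email, the pattern does not have to match what comes after it, because only the 2 capturing groups are used in the output. Keep in mind that a Scala pattern match (case regexp(...)) requires the regex to match the entire string, so the shortened pattern has to be used with .unanchored or with findFirstMatchIn: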
^([a-zA-Z]+)\s+\w+@(\w+\.\w+)
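A minimal sketch of how this shorter pattern could be used as an extractor, assuming it is made unanchored so that a match on the start of the input is enough. The names prefixRegexp, extract and sample, as well as the address in sample, are hypothetical and not from the question.

```scala
// The pattern only describes the start of the input, so .unanchored is used:
// the extractor then succeeds on a partial match instead of a full one.
val prefixRegexp = """^([a-zA-Z]+)\s+\w+@(\w+\.\w+)""".r.unanchored

def extract(str: String): String = str match {
  case prefixRegexp(name, domain) => s"$name $domain"
  case _                          => "invalid"
}

// Hypothetical input with extra text and a trailing newline.
val sample = "oleg\nuser@email.com\n7bdaf0a1be3\n"
println(extract(sample)) // prints "oleg email.com"
```

Whatever follows the email is simply ignored here, so a trailing newline makes no difference.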
If you do want to match all that follows, you can use:
val regexp = """(?s)^([a-zA-Z]+)\s+\w+@(\w+\.\w+).*""".r
def fun(str: String): String = {
  val result = str match {
    case regexp(name, domain) => name + ' ' + domain
    case _ => "invalid"
  }
  result
}
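Note that the pattern \w+@(\w+\.\w+) is very limited for matching an email address: \w does not match characters such as a dot, hyphen or plus sign in the part before the @, and the group captures only two dot-separated labels of the domain.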
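As a rough illustration of those limits, the sketch below runs the same pattern against two made-up addresses. The function name describe and both sample addresses are hypothetical.

```scala
val regexp = """(?s)^([a-zA-Z]+)\s+\w+@(\w+\.\w+).*""".r

def describe(str: String): String = str match {
  case regexp(name, domain) => s"$name $domain"
  case _                    => "invalid"
}

// A dot in the local part is not matched by \w+, so the whole match fails.
println(describe("oleg first.last@email.com")) // prints "invalid"

// Only the first two labels of the domain end up in the capture group;
// the remaining ".com" is swallowed by the trailing .*
println(describe("oleg user@mail.example.com")) // prints "oleg mail.example"
```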
Upvotes: 2