Reputation: 275
I need to extract the first, middle, and last name from a string based on special characters.
First name condition: name_str contains a comma (",") and ends with a space, any single character, and a period (".").
For example:
name_str: SMITH, ANNE MARIE J.
First name: ANNE MARIE
Middle name condition: name_str contains a comma (",") and ends with a space, any single character, and a period ("."). In that case, take the substring from the last space to the end, i.e. the single character together with the ".".
For example:
name_str: SMITH, ANNE MARIE J.
Middle name: J.
Last name: SMITH
I tried the code below to get the first name, but I still need to add a condition that checks whether name_str ends with a space, any character, and a period ("."):
if (",.".forall(name_str.contains(_)))
  name_str.substring(name_str.indexOf(",") + 1, name_str.indexOf(" ")).trim
Upvotes: 1
Views: 625
Reputation: 51271
You could write a simple regex for each of the name formats you expect to parse.
val nameRE1 = "([^,]+),(.+) (.\\.)".r
val nameRE2 = "([^,]+),(.+)".r
val nameRE3 = "(.+) (.\\.) (.+)".r
val nameRE4 = "([^,]+) (.+)".r
List( "SMITH, ANNE MARIE J."
    , "Michael J. Fox"
    , "Van Halen, Eddie"
    , "Jo Blow"
    ).map{
  case nameRE1(ln, fn, mi) => List(fn.strip, mi, ln.strip)
  case nameRE2(ln, fn)     => List(fn.strip, "", ln.strip)
  case nameRE3(fn, mi, ln) => List(fn.strip, mi, ln.strip)
  case nameRE4(fn, ln)     => List(fn.strip, "", ln.strip)
  case nameX               => List(nameX)
}
//res0: List[List[String]] = List(List(ANNE MARIE, J., SMITH)
//                              , List(Michael, J., Fox)
//                              , List(Eddie, "", Van Halen)
//                              , List(Jo, "", Blow))
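If you would rather get a typed result than a List[String], the same four patterns can be wrapped in a small function. This is only a sketch: ParsedName, NameParser and parseName are hypothetical names introduced here for illustration, and .trim is used in place of .strip purely so that it does not require Java 11.

```scala
// Hypothetical container for the three name parts (not part of the answer above).
final case class ParsedName(first: String, middle: String, last: String)

object NameParser {
  // The same four patterns as above, tried from most to least specific.
  private val LastFirstMiddle = "([^,]+),(.+) (.\\.)".r
  private val LastFirst       = "([^,]+),(.+)".r
  private val FirstMiddleLast = "(.+) (.\\.) (.+)".r
  private val FirstLast       = "([^,]+) (.+)".r

  // Returns None when the input fits none of the expected formats.
  def parseName(s: String): Option[ParsedName] = s match {
    case LastFirstMiddle(ln, fn, mi) => Some(ParsedName(fn.trim, mi, ln.trim))
    case LastFirst(ln, fn)           => Some(ParsedName(fn.trim, "", ln.trim))
    case FirstMiddleLast(fn, mi, ln) => Some(ParsedName(fn.trim, mi, ln.trim))
    case FirstLast(fn, ln)           => Some(ParsedName(fn.trim, "", ln.trim))
    case _                           => None
  }
}
```

Returning an Option makes the "no format matched" case explicit, so the caller has to decide what to do with input such as a single word with no comma or space.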
Upvotes: 1
Reputation: 163207
Matching names can be really difficult, because names can contain a wide range of characters. For the format described in your question, you might use a broad pattern that approximates it.
The pattern matches the last name part before the comma, the first name part after the comma, and the single-character-plus-dot part at the end.
^([^\s,][^,]*),\h*([^\s,].*?)\h+([^\s.]\.(?:[^\s.]\.)*)$
^ - Start of string
( - Capture group 1
[^\s,][^,]* - Match a single non-whitespace char except for a comma, followed by any chars except a comma
) - Close group 1
,\h* - Match a comma and optional horizontal whitespace
( - Capture group 2
[^\s,].*? - Match a single non-whitespace char except for a comma, then any chars, as few as possible
) - Close group 2
\h+ - Match 1+ horizontal whitespace chars
( - Capture group 3
[^\s.]\. - Match a single non-whitespace char except for a dot, then match a dot
(?:[^\s.]\.)* - Optionally repeat the same in case of multiple single characters each followed by a dot
) - Close group 3
$ - End of string
See a regex demo or a Scala demo
val s = "SMITH, ANNE MARIE J."
val regex =
  """^([^\s,][^,]*),\h*([^\s,].*?)\h+([^\s.]\.(?:[^\s.]\.)*)$"""
    .r("lastname", "firstname", "middlename")
regex.findFirstMatchIn(s) match {
  case Some(m) => println(
    s"Lastname: ${m.group("lastname")}, " +
    s"Firstname: ${m.group("firstname")}, " +
    s"Middlename: ${m.group("middlename")}"
  )
  case None => println("No match.")
}
Output
Lastname: SMITH, Firstname: ANNE MARIE, Middlename: J.
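The same pattern can also be used as an extractor in a pattern match, which makes it easy to see how it handles several initials and input that does not fit the format. This is a minimal sketch; InitialsDemo and split are hypothetical names used only for illustration.

```scala
object InitialsDemo {
  // Same pattern as above; triple quotes avoid double-escaping the backslashes.
  private val NameRe =
    """^([^\s,][^,]*),\h*([^\s,].*?)\h+([^\s.]\.(?:[^\s.]\.)*)$""".r

  // Returns (lastname, firstname, middlename) when the whole input fits the
  // pattern, and None otherwise.
  def split(s: String): Option[(String, String, String)] = s match {
    case NameRe(last, first, middle) => Some((last, first, middle))
    case _                           => None
  }
}
```

With this, InitialsDemo.split("DOE, JOHN A.B.") should yield the middle part "A.B." because of the optional repeated group, while a string without a trailing initial, such as "Van Halen, Eddie", gives None.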
Upvotes: 1