Henrique

Reputation: 221

How can I get the first name and first surname from a string?

For example, I have this string:

String fullName = "Andre Santos Silva";

The first name is "Andre" and the surname I want is "Santos", so I should return "Andre Santos".

But some cases are harder, for example:

String fullName = "Andre di Santos Silva";

Here the first name is "Andre" and the surname I want is "di Santos", so the result must be "Andre di Santos".

Another example:

String fullName = "Andre e Santos Silva";

Here the first name is "Andre" and the surname is "e Santos", so the result must be "Andre e Santos".

How can I extract the first name and the "first" surname from such a string?

Upvotes: 1

Views: 1536

Answers (1)

rzwitserloot

Reputation: 102842

Completely impossible.

The concepts 'first name' and 'last name' are not global, whereas names themselves are. Even if you decide to just toss a middle finger to a significant chunk of the global population and act like those people don't matter, the remaining places that do have a first/last name scheme roughly fitting your evident idea of how the entire world names things are not consistent enough to let you determine first and last name from an input string, unless you throw some quite significant pattern-matching artificial intelligence at it.

SOLUTION: Stop worrying about it. There is no such thing as first and last name, there's just name. If you have some bass ackwards old timey system that must know, tell the developers of it to get with the program. If you can't tell them, then ask whatever you're getting this input from to give it to you already separated into 'first' and 'last' name. If you can't do that either, you're completely hosed; tell whoever gave you the instruction to build this software that it is not possible, and that the next step isn't technical/development, it's political/organizational: convince the suppliers to change the process so the input is provided in first/last name form, or convince the ones you are passing this data to, to stop wanting it in first/last name form.

Here are some example names to show why the world doesn't work the way you think it does. Please be the computer algorithm and explain to me exactly what the first and last names are for each of these full names. These are official names that, for example, appear in passports where relevant.

  • IN: Prince Harry, Duke of Sussex. (Correct OUT: Henry, Mountbatten-Windsor, which clearly cannot possibly be derived from that name!)
  • IN: Ivan Ivanovich (Correct OUT: Ivan, and there is no last name here. That's a patronymic, which is not the same thing. People with Russian-origin names usually do have an actual last name, in the sense that their parent or parents also had that name, a thing you can call a 'family name'. But they don't commonly use it, and if they have to enter their full name in a form, you're likely getting first name + patronymic, and that's all).
  • IN: Nanna Bryndís Hilmarsdóttir (Correct OUT: Nanna Bryndís, Hilmarsdóttir - probably. But if you expect her father, mother, or hypothetical children to also have that last name, no, they won't, and calling that their 'family name' is wrong. This too is a patronymic, but unlike in e.g. Russia, family names aren't a thing in Iceland, as far as I know - the patronymic is for all intents and purposes the last name. It's just... not a family name).
  • IN: Kim Jong-il. (Correct OUT: Jong-il Kim or possibly Yuri Kim or maybe Yuri Irsenovich Kim - note that the first substring in the input is the last name. This is common in many Asian cultures, including Korea (both of them), China, and many more).
  • IN: José Antonio Gómez Iglesias (OUT: Well, if this is a Spanish person, which the name certainly suggests, then the right breakdown is José Antonio and Gómez Iglesias, but it is rare yet possible that the correct breakdown is José Antonio Gómez and Iglesias. There is absolutely no way to be sure. The first is by far the most likely, but that's based on the fact that the name 'sounds Spanish'. This is where that whole 'you need a quite complicated AI ruleset to figure this out' point comes in. It would need to match this behaviour: check the name against a giant neural net or other database to guesstimate that it is highly likely to be Spanish in origin, and that Gómez is a common surname).
  • IN: Johannes Vennegoor of Hesselink. (Correct OUT: Johannes, Vennegoor of Hesselink. Sort under 'V' if sorting on last name).
  • IN: Jan Willem Vergeer (Correct OUT: Jan Willem, Vergeer. Contrast this with the previous example. Completely impossible to separate out using basic string algorithms. The only way is to use an AI to determine that Jan Willem is a common Dutch first name, and that the official spelling is usually without a hyphen).
  • IN: Andries de Witt (Correct OUT: Andries, de Witt, but de is an interstitial. If sorting, you must sort on W, and not d. In systems that can't handle this, it is common to split this out as Andries and Witt, de instead, and e.g. Dutch phonebooks take the latter approach).
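
To make the failure concrete, here is a minimal sketch of the kind of naive heuristic the question seems to be asking for: take the first token, then keep appending tokens until one is not a connecting particle. Everything in it is an assumption made for illustration, not an established method: the class name NaiveNameSplitter, the method name firstNameAndFirstSurname, and especially the PARTICLES list, which is hypothetical, incomplete, and only covers a few Portuguese/Italian-style particles.

```java
import java.util.Locale;
import java.util.Set;

// Illustrative only: a naive best-effort guess, NOT a correct name parser.
final class NaiveNameSplitter {

    // Hypothetical, incomplete list of lowercase connecting particles.
    private static final Set<String> PARTICLES =
            Set.of("de", "di", "da", "do", "dos", "das", "e");

    private NaiveNameSplitter() {
    }

    // Returns the first token plus every following token up to and
    // including the first one that is not a known particle.
    static String firstNameAndFirstSurname(String fullName) {
        String[] parts = fullName.trim().split("\\s+");
        if (parts.length < 2) {
            return fullName.trim();
        }
        StringBuilder result = new StringBuilder(parts[0]);
        for (int i = 1; i < parts.length; i++) {
            result.append(' ').append(parts[i]);
            if (!PARTICLES.contains(parts[i].toLowerCase(Locale.ROOT))) {
                break; // first non-particle token: assume it is the surname
            }
        }
        return result.toString();
    }
}
```

This happens to produce "Andre Santos", "Andre di Santos" and "Andre e Santos" for the three inputs in the question, but it falls apart on the list above: "Jan Willem Vergeer" comes back as "Jan Willem" (the surname is dropped entirely), "Johannes Vennegoor of Hesselink" comes back as "Johannes Vennegoor", and "Kim Jong-il" is returned with the family name treated as the first name. Growing the particle list fixes none of that, which is the point.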

Upvotes: 6
