Reputation:
Consider a table with full human names:
create table names (full_name varchar not null);
insert into names (full_name)
values ('Jane Marie Doe'), ('John Doe');
In postgres, what is the simplest (most readable) way to split the names into first and last?
first_name | last_name
------------+-----------
Jane Marie | Doe
John | Doe
(2 rows)
This is what I came up with, but it seems really complicated. Surely there's a simpler way?
select
array_to_string(ary[1:len - 1], ' ') as first_name,
ary[len] as last_name
from (
select ary, array_length(ary, 1) as len
from (
select regexp_split_to_array(full_name, E'\\s+') as ary
from names
) sub1
) sub2;
Upvotes: 4
Views: 9846
Reputation: 699
The top answer has the correct regex for last name. For first name it is not accurate for the sample the OP provided like "Jane Marie Doe". You need a regex with a capturing group:
SELECT regexp_replace(full_name,'(.+)\s\S+$','\1') as first_name
SELECT regexp_replace(full_name,'.+[\s]','') as last_name
Upvotes: 0
Reputation: 449
A simple way to get first name and the rest of the name separately:
select regexp_replace('John Martin Doe','\\s.*','') as first_name, regexp_replace('John Martin Doe','[a-zA-Z]+','') as last_name;
Upvotes: 0
Reputation: 4306
You could try
SELECT regexp_split_to_array(full_name, E'\\s\\S+') as ary
instead, as that should split on the last space consistently. Then you know there will only be two members of ary
EDIT: On further reflection, I think the best approach might be to use
SELECT regexp_replace(full_name,'\s\S+','') as first_name
SELECT regexp_replace(full_name,'.+[\s]','') as last_name
Upvotes: 7
Reputation: 324485
For the record, the current design is correct, and you shouldn't change it.
If you attempt to do so the only vaguely right way to do it is ask each user what their name parts are. If you wish you can attempt to guess this for simple-looking two part names, but even then you don't know if "kim seolhyun" is "Kim, Seolhyun" or "Seolhyun, Kim".
A company I worked with did a name split recently. They did it in the least-awful way I've yet seen:
~~M^2
? Good for you, you're weird.de Vinh
are fineBasically, any attempt to use regex to split names is wrong. If you really insist on doing it, do it only for names that "look" simple, e.g.
SELECT regexp_matches('Fred Smith', '^([A-Z][[:alpha:]]+) ([A-Z][[:alpha:]]+)$');
which will only match names that have two obvious parts, both beginning with a capital letter, and contain only letters (accented or not). Even then, it doesn't solve the "given family" or "family given" problem. Other names won't match, and should be left as-is for the user to change.
Upvotes: 4
Reputation: 49260
One more way to do it with string functions. Find the first space from the end and select it as last name
and then replace
the last name
string in the original string with ''
.
with lname as
(select name,
reverse(substr(reverse(name),1,strpos(reverse(name),' '))) as last_name
from names)
select replace(name,last_name,'') as first_name, last_name
from lname
Upvotes: 1