Reputation: 21
I've used this:
String <- unlist(str_split(Invname,"[ ]",n=2))
To split the names that I have into Surnames and First Names, since the surnames come first. But I cannot figure out how to reassign the split Invname
into two lists, so that I can use only the surnames for the rest of my project. Right now I have this:
" [471] "KRUEGER" "MARCUS" "
And I would like to have the left side only assigned to a new variable, so that I can work further with mining the surnames for information.
Upvotes: 2
Views: 210
Reputation: 13
If only names were so straightforward! If there were few complications between strings then yes the answers below are good options. In my experience with name lists we get hyphenated names (both in "first" and "last"), "Middle" names, Titles and shortened name versions (Dr., Mr, Md), and many other variants. I first try to clean the strings before any splitting.
Here is just one idea using dplyr (explicit code provided for clarity)
Invnames <- c("Krueger Markus","Doe John","Tatum Jayson", "Taylor - Cline Jeff", "Davis - Freud Melvin- John")
df <- as.data.frame(Invnames, Invnames = Invnames) %>%
mutate(Invnames2 = gsub("- ","-",Invnames)) %>%
mutate(Invnames2 = gsub(" -","-",Invnames2)) %>%
mutate(surname = gsub(" .*", "", Invnames2))
Upvotes: 0
Reputation: 887611
With base R
, we can make use of read.table/read.csv
to separate the string into columns
read.table(text = Invnames, header = FALSE, col.names = c("Surnames", "Firstnames"))
# Surnames Firstnames
#1 Krueger Markus
#2 Doe John
#3 Tatum Jayson
Invnames <- c("Krueger Markus","Doe John","Tatum Jayson")
Upvotes: 1
Reputation: 757
Again using data from an earlier answer with dplyr
this time
library(tidyverse)
Invnames <- c("Krueger Markus","Doe John","Tatum Jayson")
Invnames <- data.frame(Invnames)
Invnames %>%
separate(Invnames, c('Surname', 'FirstName'), sep=" ")
Surname FirstName
1 Krueger Markus
2 Doe John
3 Tatum Jayson
Upvotes: 1
Reputation: 13319
Here is some sample data and a suggested solution. Data modified from @Rui Barradas' answer:
Invnames <- c("Krueger.$Markus","Doe.John","Tatum.Jayson")
sapply(strsplit(Invnames,"\\W"),"[")
Upvotes: 1
Reputation: 76605
Using the data in nate.edwinton's answer, there is no need to unlist
.
Invnames <- c("Krueger Markus","Doe John","Tatum Jayson")
String <- stringr::str_split(Invnames, "[ ]", n = 2)
Surnames <- sapply(String, '[', 1)
Firstnames <- sapply(String, '[', 2)
data.frame(Surnames, Firstnames)
# Surnames Firstnames
#1 Krueger Markus
#2 Doe John
#3 Tatum Jayson
Upvotes: 2
Reputation: 5281
As mentioned in the comments, it would be easier to help if you provided some data. Anyway, here might be a solution:
Assuming that Invnames
is a vector of where for every first name there is (exactly) one last name, you could do the following
# data
Invnames <- c("Krueger Markus","Doe John","Tatum Jayson")
# extraction
String <- unlist(stringr::str_split(Invnames,"[ ]",n=2))
# saving first and last names
lastNames <- String[seq(1,length(String),2)]
firstNames <- String[seq(2,length(String),2)]
# yields
> cbind(lastNames,firstNames)
lastNames firstNames
[1,] "Krueger" "Markus"
[2,] "Doe" "John"
[3,] "Tatum" "Jayson"
Upvotes: 1