Reputation: 3505
I have an R data frame (df) that includes a factor column, Team:
Team
Baltimore Orioles
Kansas City Chiefs
...
I want to create a new column, Nickname, that contains only the last word of each team name:
Nickname
Orioles
Chiefs
As a first stage, I have tried splitting the factor like this
df$Nickname <- strsplit(as.character(df$Team), " ")
which produces a list of character fields which I can reference thus
>df$Nickname[1]
[[1]]
[1] "Baltimore" "Orioles"
and
>str(df$Nickname[1])
List of 1
$ : chr [1:2] "Baltimore" "Orioles"
but then I do not know how to proceed. Trying to get the length
length(df$Nickname[1])
gives 1, which flummoxes me.
Upvotes: 1
Views: 3235
Reputation: 179388
Use a regular expression:
text <- c("Baltimore Orioles","Kansas City Chiefs")
gsub("^.*\\s", "", text)
[1] "Orioles" "Chiefs"
The regex breaks down as follows:
^ means the start of the string
.* means any character, repeated zero or more times; it is greedy, so it matches as far as the last whitespace
\\s means a single whitespace character
gsub finds this pattern and replaces it with an empty string, leaving you with the last word of each string.
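To apply this to the factor column from the question, something like the following should work. It is only a sketch: the two-row df is made up for illustration, and sub is used in place of gsub because the pattern is anchored at the start and can match at most once per string.

```r
# Hypothetical data frame mirroring the one described in the question
df <- data.frame(Team = factor(c("Baltimore Orioles", "Kansas City Chiefs")))

# Convert the factor to character, then strip everything up to and
# including the last whitespace character
df$Nickname <- sub("^.*\\s", "", as.character(df$Team))

df$Nickname
```

A team name with no whitespace at all is left unchanged, since the pattern has nothing to match.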
Upvotes: 7
Reputation: 5691
You just need to unlist the split strings and take the last element:
full <- c("Baltimore Orioles","Kansas City Chiefs")
getlast <- function(x){
  parts <- unlist(strsplit(x, split = " "))
  parts[length(parts)]
}
sapply(full,getlast)
> Baltimore Orioles Kansas City Chiefs
> "Orioles" "Chiefs"
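This also explains the length of 1 in the question: single brackets on a list return a sub-list, while double brackets return the element itself. The sketch below shows the difference and then builds the new column directly from the split list. The df here is made up for illustration, and vapply is a swapped-in alternative to sapply that returns an unnamed character vector.

```r
# Hypothetical data frame mirroring the one described in the question
df <- data.frame(Team = factor(c("Baltimore Orioles", "Kansas City Chiefs")))

# One character vector of words per team
parts <- strsplit(as.character(df$Team), " ")

length(parts[1])    # single brackets: a list holding one element
length(parts[[1]])  # double brackets: the character vector itself

# Take the last word of each split; character(1) declares that each
# call must return exactly one string
df$Nickname <- vapply(parts, function(p) p[length(p)], character(1))

df$Nickname
```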
Upvotes: 4