Reputation: 3515

String manipulation in R to create dataframe column

I have an R data frame (df) which includes a factor column, Team:

Team
Baltimore Orioles
Kansas City Chiefs
...

I want to create a new column, Nickname, containing only the last word of each team name:

Nickname
Orioles
Chiefs

As a first stage, I have tried splitting the factor like this

df$Nickname <- strsplit(as.character(df$Team), " ")

which produces a list of character fields which I can reference thus

>df$Nickname[1]

[[1]]
[1] "Baltimore" "Orioles"

and

>str(df$Nickname[1])

List of 1
 $ : chr [1:2] "Baltimore" "Orioles"

but then I do not know how to proceed. Trying to get the length

length(df$Nickname[1])

gives 1 - which flummoxes me

Upvotes: 1

Answers (3)

David

Reputation: 104

How about this?

require(plyr)
ldply(df$Nickname)

Upvotes: 0

Andrie

Reputation: 179558

Use a regular expression:

text <- c("Baltimore Orioles","Kansas City Chiefs")

gsub("^.*\\s", "", text)
[1] "Orioles" "Chiefs"

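The regex matches:

^ the start of the string
.* any character, repeated zero or more times (greedy, so it extends as far as it can)
\\s a single whitespace character

Because .* is greedy, the match runs from the start of the string through the last whitespace character.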

gsub finds this pattern and replaces it with an empty string, leaving you with the last word of each string.
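
Applied to the data frame from the question, this might look like the sketch below. The two-row `df` is a made-up stand-in for the real data, and `sub` is used instead of `gsub` on the assumption that one replacement per string is enough, since the pattern is anchored at the start and can match at most once.

```r
# Hypothetical stand-in for the asker's data frame
df <- data.frame(Team = factor(c("Baltimore Orioles", "Kansas City Chiefs")))

# Convert the factor to character, then strip everything up to
# and including the last whitespace character
df$Nickname <- sub("^.*\\s", "", as.character(df$Team))

df$Nickname
# [1] "Orioles" "Chiefs"
```

A team name with no whitespace at all would not match the pattern and would be returned unchanged.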

Upvotes: 7

tim riffe

Reputation: 5691

You just need to unlist the split strings and take the last one:

    full <- c("Baltimore Orioles","Kansas City Chiefs")
    getlast <- function(x){
        # split one string on spaces and keep the final piece
        parts <- unlist(strsplit(x, split = " "))
        parts[length(parts)]
    }
    sapply(full,getlast)
    #  Baltimore Orioles Kansas City Chiefs 
    #          "Orioles"           "Chiefs" 
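
The same idea can be applied directly to the split list from the question. This is only a sketch: the two-row `df` is a made-up stand-in, and `vapply` is swapped in for `sapply` so the result is an unnamed character vector. It also shows why `length(df$Nickname[1])` gave 1: single brackets return a list holding one element, while double brackets return the character vector itself.

```r
# Hypothetical stand-in for the asker's data frame
df <- data.frame(Team = factor(c("Baltimore Orioles", "Kansas City Chiefs")))

# strsplit returns a list with one character vector per team
parts <- strsplit(as.character(df$Team), " ")

length(parts[1])    # 1: `[` returns a list of length one
length(parts[[1]])  # 2: `[[` returns the vector "Baltimore" "Orioles"

# Keep the last piece of each split; character(1) tells vapply to
# expect exactly one string per element
df$Nickname <- vapply(parts, function(p) p[length(p)], character(1))

df$Nickname
# [1] "Orioles" "Chiefs"
```

Note that `vapply` would stop with an error on an empty team name, because splitting `""` yields a zero-length vector.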

Upvotes: 4
