Split String In R Based On Character Location

Question

I'm trying to split these strings in R (column entries) into three separate columns:

João Moutinho Monaco, 30,  M(C) 
Clinton N'Jie Marseille, 23,  FW
Frederic Sammaritano Dijon, 30,  AM(LR)

to become

Player                Team           Pos
João Moutinho         Monaco         30,  M(C) 
Clinton N'Jie         Marseille      23,  FW
Frederic Sammaritano  Dijon          30,  AM(LR)

I can find the location of the characters using gregexpr and nchar, but I'm not sure how to use strsplit with those positions. Or is there a package that makes this easier?

akrun · Accepted Answer

We can read the vector into a data.frame with read.csv after using gsub to insert a ";" delimiter between the three fields

read.csv(text=gsub("^(\\S+\\s+\\S+)\\s+(\\S+),\\s+(.*)", 
       "\\1;\\2;\\3", v1), sep=";", header=FALSE, 
       col.names = c("Player", "Team", "Pos"), stringsAsFactors=FALSE)
#                Player      Team         Pos
#1        João Moutinho    Monaco   30,  M(C)
#2        Clinton N'Jie Marseille     23,  FW
#3 Frederic Sammaritano     Dijon 30,  AM(LR)
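
To see what read.csv actually parses, it can help to look at the intermediate strings that gsub produces. This is only a minimal sketch: v1 is repeated from the "data" section so the snippet runs on its own, and `delimited` is an illustrative name, not part of the original answer.

```r
v1 <- c("João Moutinho Monaco, 30,  M(C)", "Clinton N'Jie Marseille, 23,  FW",
        "Frederic Sammaritano Dijon, 30,  AM(LR)")

# Group 1: first two words (player), group 2: next word (team),
# group 3: everything after the first comma; rejoin them with ';'
delimited <- gsub("^(\\S+\\s+\\S+)\\s+(\\S+),\\s+(.*)", "\\1;\\2;\\3", v1)

writeLines(delimited)
# João Moutinho;Monaco;30,  M(C)
# Clinton N'Jie;Marseille;23,  FW
# Frederic Sammaritano;Dijon;30,  AM(LR)
```

Note that this pattern assumes every player name is exactly two words, which is why the update below uses a different anchor.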

Update

If the player names vary in length (e.g. three words) but the "Team" name is always a single word immediately before the first ',', we can anchor the pattern on the team instead

read.csv(text= sub("(\\s+[A-Za-z]+),(\\s+\\d+),(.*)", ";\\1;\\2\\3", v2), 
      header=FALSE, sep=";", col.names = c("Player", "Team", "Pos"), stringsAsFactors=FALSE)
#                Player       Team         Pos
#1        João Moutinho     Monaco    30  M(C)
#2        Clinton N'Jie  Marseille      23  FW
#3 Frederic Sammaritano      Dijon  30  AM(LR)
#4       Angel Di María        PSG   28 M(CLR)
#5    Jean Michael Seri       Nice     25 M(C)
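
Since the question asks about splitting by character location, here is a hedged sketch of a purely position-based alternative in base R. It is not part of the original answer; it assumes, as above, that the team is the single word before the first comma, and the names `comma`, `left`, `space`, and `split_df` are illustrative. v2 is repeated so the snippet is self-contained.

```r
v2 <- c("João Moutinho Monaco, 30,  M(C)", "Clinton N'Jie Marseille, 23,  FW",
        "Frederic Sammaritano Dijon, 30,  AM(LR)",
        "Angel Di María PSG, 28, M(CLR)", "Jean Michael Seri Nice, 25, M(C)")

# Character position of the first comma in each string
comma <- as.integer(regexpr(",", v2, fixed = TRUE))

# Everything before the first comma is "Player Team"
left <- substr(v2, 1, comma - 1)

# Position of the last space in that part, i.e. just before the team name
space <- as.integer(regexpr(" [^ ]*$", left))

split_df <- data.frame(
  Player = substr(left, 1, space - 1),
  Team   = substr(left, space + 1, nchar(left)),
  Pos    = trimws(substr(v2, comma + 1, nchar(v2))),  # drop leading blanks
  stringsAsFactors = FALSE
)
split_df
```

Unlike the sub/read.csv version, this keeps the comma inside "Pos" (e.g. "28, M(CLR)") and trims the leading whitespace, so pick whichever output shape suits the data.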

data

v1 <- c("João Moutinho Monaco, 30,  M(C)", "Clinton N'Jie Marseille, 23,  FW", 
                    "Frederic Sammaritano Dijon, 30,  AM(LR)")
v2 <- c(v1, "Angel Di María PSG, 28, M(CLR)","Jean Michael Seri Nice, 25, M(C)")
