Juan Avilez

Reputation: 43

How to remove all characters after a specific character in a string in R?

My data frame is:

df <- data.frame(player = c("Taiwo Awoniyi/e5478b87", "Jacob Bruun Larsen/4e204552", "Andi Zeqiri/d01231f0"), goals = c(2,5,7))

I want to remove the "/" and the ID that follows it in the "player" column, so that I ideally end up with:

df <- data.frame(player = c("Taiwo Awoniyi", "Jacob Bruun Larsen", "Andi Zeqiri"), goals = c(2,5,7))

I am unsure how to approach this, since the player names vary greatly in length and the ID after the slash is different for each player.

Upvotes: 4

Views: 1283

Answers (4)

jay.sf

Reputation: 72828

Using base R, replace the slash and everything after it with an empty string:

transform(df, player=gsub('/.+', '', player))
#               player goals
# 1      Taiwo Awoniyi     2
# 2 Jacob Bruun Larsen     5
# 3        Andi Zeqiri     7
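
As a quick check that this kind of pattern does not depend on how long the name or the ID is, here is a minimal sketch on a plain character vector. Only the first string comes from the question; the other entries (no slash, a trailing slash, several slashes) are made-up edge cases, and the names `x`, `res_plus` and `res_star` are just for illustration.

```r
# First entry is from the question; the rest are hypothetical edge cases.
x <- c("Taiwo Awoniyi/e5478b87", "No Slash Here", "Trailing Slash/", "A/b/c")

# '/.+' needs at least one character after the slash,
# so a bare trailing "/" is left in place.
res_plus <- gsub("/.+", "", x)

# '/.*' also removes a bare trailing slash. sub() is enough here because
# the match runs from the first "/" to the end of the string.
res_star <- sub("/.*", "", x)

print(res_plus)
print(res_star)
```

Strings without a slash come back unchanged in both cases, so either pattern should be safe to run on a column where only some rows carry an ID.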

Upvotes: 4

TarJae

Reputation: 78927

We could use tidyr's separate() with extra = 'drop', which splits the column on "/" and silently discards every piece after the first (many thanks to Onyambu):

library(dplyr)
library(tidyr)

df %>% 
  separate(player, "player", sep="/", extra = 'drop')
#               player goals
# 1      Taiwo Awoniyi     2
# 2 Jacob Bruun Larsen     5
# 3        Andi Zeqiri     7

Upvotes: 5

Chris Ruehlemann

Reputation: 21400

You can capture the substring you want to keep with a negated character class that matches any character except /, and then refer back to it with \\1 in the replacement:

library(dplyr)

df %>%
  mutate(player = sub("([^/]+).*", "\\1", player))
#               player goals
# 1      Taiwo Awoniyi     2
# 2 Jacob Bruun Larsen     5
# 3        Andi Zeqiri     7

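More simply, you can just remove the / and everything that follows it. (Removing only slashes and digits with a class such as [/0-9] is not enough here, because the IDs also contain letters.)

df %>%
  mutate(player = sub("/.*", "", player))

In base R syntax:

df$player <- sub("/.*", "", df$player)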

Upvotes: 5

NelsonGon

Reputation: 13319

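Using dplyr for the pipe and mutate(), we can use gsub() to remove the / and everything after it (the backslash escape before / is optional, since / is not a regex metacharacter in R):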

library(dplyr)

df %>% 
  mutate(player = gsub("\\/.*", "", player))
#               player goals
# 1      Taiwo Awoniyi     2
# 2 Jacob Bruun Larsen     5
# 3        Andi Zeqiri     7
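
If you would rather avoid regular expressions altogether, one possible alternative is to find the position of the first slash and cut the string there. This is only a sketch in base R on a plain character vector; the variable names `x`, `pos` and `player` are chosen for illustration, and the same expression could be assigned back to df$player.

```r
# Player strings copied from the question.
x <- c("Taiwo Awoniyi/e5478b87", "Jacob Bruun Larsen/4e204552", "Andi Zeqiri/d01231f0")

# Position of the first literal "/" in each string (-1 when there is none).
# as.integer() drops the extra attributes that regexpr() attaches.
pos <- as.integer(regexpr("/", x, fixed = TRUE))

# Keep everything before the slash; leave strings without a slash untouched.
player <- ifelse(pos > 0, substr(x, 1, pos - 1), x)

print(player)
```

With fixed = TRUE the slash is matched literally, so no escaping is involved at all.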

Upvotes: 4
