Rubicon

Reputation: 155

Split column in dataframe in R at a '%' character

I have not been able to adapt the existing answers on this forum about splitting a dataframe column in two.

I have this dataframe (which, funnily enough, has already been split to get it into this shape) that I need to split at the % symbol.


The end result I would like is for the 32%, 35%, 54%... 55% values to be deleted, leaving just two columns of data. The data is scraped from a website.

Thank you

Upvotes: 1

Views: 70

Answers (3)

MysticRenge

Reputation: 395

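# Split each value at "%" and keep the part after it; trimws() drops any leading space
df$Long <- trimws(sapply(strsplit(as.character(df$Long), split = "%", fixed = TRUE), `[`, 2))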
   Long Short
1  239   497
2  142   269
3  216   186
4   96    52
5   93   184
6  160   142
7   96    79

Upvotes: 2

gatsky

Reputation: 1285

I would use tidyr and dplyr for this:

library(dplyr)
library(tidyr)

data.frame(Long = c("32% 239", "35% 142", "54% 216"), Short = c(497, 269, 186), stringsAsFactors = FALSE) %>%
    # Split Long at the space into the percentage and the value
    separate(Long, c("Long_percent", "Long_2"), sep = " ") %>%
    # Drop the percentage column
    select(-Long_percent)

Or you can also use a regex, which could be useful if the data is not so well formed:

data.frame(Long = c("32% 239", "35% 142", "54% 216"), Short = c(497, 269, 186), stringsAsFactors = FALSE) %>%
    # Remove the run of digits and "%" together with the space that follows it
    mutate(Long = gsub("[0-9%]+ ", "", Long, perl = TRUE))
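
As a rough sketch of the same regex idea in base R, assuming the values look like "32% 239" (the sample data below is made up for illustration, not the asker's real data), an anchored pattern removes only the leading percentage and as.numeric() converts what is left:

```r
# Hypothetical sample data modelled on the question
df <- data.frame(Long = c("32% 239", "35% 142", "54% 216"),
                 Short = c(497, 269, 186),
                 stringsAsFactors = FALSE)

# Remove everything from the start of the string up to and including "%",
# plus any whitespace after it, then convert the remainder to numeric
df$Long <- as.numeric(sub("^[^%]*%[[:space:]]*", "", df$Long))
```

Anchoring the pattern at the start of the string means only the first percentage is touched, which may be safer if the scraped values are less regular than in this sample.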

Upvotes: 1

Carles Mitjans

Reputation: 4866

This should do it:

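df$Long <- paste0(unlist(lapply(strsplit(as.character(df$Long), "%"), `[[`, 1)), "%")

It splits each string in the Long column at "%" and takes the first element of each split, then appends "%" to the end of the resulting vector. Note that this keeps the percentage part (e.g. "32%"); to keep the number after it instead, take the second element and leave out the paste0() call.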
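
To see what each piece of the split contains, here is a small sketch on made-up values (the vector x is hypothetical and stands in for the Long column):

```r
# Hypothetical values in the same shape as the Long column
x <- c("32% 239", "35% 142")

# strsplit() returns one character vector per input string
parts <- strsplit(x, "%", fixed = TRUE)

# First element of each split: the percentage without its "%" sign
first <- sapply(parts, `[[`, 1)

# Second element of each split: the number after the "%", with the
# leading space trimmed off
second <- trimws(sapply(parts, `[[`, 2))
```

Here first holds the text before each "%" and second holds the text after it, so picking index 1 or 2 decides which half of the column survives.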

Upvotes: 1
