Reputation: 4055
I have data gathered through Amazon's Mechanical Turk that has a column called "LifetimeApprovalRate". The column contains values like these:
head(ES$LifetimeApprovalRate)
[1] "100% (32/32)" "50% (16/32)" "100% (11/11)" "100% (4/4)"
I would like to create three new variables using this information:
ES$rate: "100%" "50%" "100%" "100%"
ES$approve: "32" "16" "11" "4"
ES$total: "32" "32" "11" "4"
I am afraid just about anything I try produces monstrous lists that are difficult to turn into anything useful.
Upvotes: 1
Views: 1528
Reputation: 887193
You can try strsplit, splitting on runs of parentheses, slashes, and spaces:
nm1 <- c('rate', 'approve', 'total')
ES[nm1] <- do.call(rbind,
strsplit(as.character(ES$LifetimeApprovalRate),'[()/ ]+'))
ES[nm1[-1]] <- lapply(ES[nm1[-1]], as.numeric)
ES
# LifetimeApprovalRate rate approve total
#1 100% (32/32) 100% 32 32
#2 50% (16/32) 50% 16 32
#3 100% (11/11) 100% 11 11
#4 100% (4/4) 100% 4 4
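To see why this avoids the unwieldy lists mentioned in the question, it may help to look at the intermediate objects. The following is only a sketch in base R; the names x, parts, and m are made up for illustration and the values are copied from the question.

```r
# Sample values copied from the question (x is a hypothetical name).
x <- c("100% (32/32)", "50% (16/32)", "100% (11/11)", "100% (4/4)")

# Split on any run of '(', ')', '/' or space. The result is a list
# holding one character vector per input string.
parts <- strsplit(x, "[()/ ]+")
parts[[1]]

# Stack the list elements row-wise into a character matrix:
# one row per input string, one column per piece.
m <- do.call(rbind, parts)
dim(m)
```

Because every string splits into the same number of pieces, rbind yields a regular matrix whose columns can be assigned to the data frame in a single step.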
A similar option using the devel version of data.table, i.e. v1.9.5, is below. Instructions to install the devel version are here. Here, we use tstrsplit to split the column 'LifetimeApprovalRate' and assign the output columns to new columns ('nm1'). There is also the option type.convert=TRUE to convert the column classes.
library(data.table) # v1.9.5+
setDT(ES)[, (nm1):=tstrsplit(LifetimeApprovalRate,'[()/ ]+', type.convert=TRUE)]
# LifetimeApprovalRate rate approve total
#1: 100% (32/32) 100% 32 32
#2: 50% (16/32) 50% 16 32
#3: 100% (11/11) 100% 11 11
#4: 100% (4/4) 100% 4 4
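As a rough illustration of what the class conversion does, here is a base R sketch. It assumes that type.convert=TRUE in tstrsplit behaves like utils::type.convert with as.is = TRUE applied to each output column; the vectors rate and approve are hypothetical stand-ins for the split pieces.

```r
# Pieces as they come out of the split: all character
# (rate and approve are hypothetical stand-ins).
rate    <- c("100%", "50%", "100%", "100%")
approve <- c("32", "16", "11", "4")

# type.convert() converts a vector only when every value fits the
# target type, so the counts become integer ...
approve_conv <- utils::type.convert(approve, as.is = TRUE)
class(approve_conv)

# ... while the rates stay character because of the "%" sign.
rate_conv <- utils::type.convert(rate, as.is = TRUE)
class(rate_conv)
```

Under that assumption, this would explain why 'rate' keeps its "%" strings in the output above while 'approve' and 'total' become numbers.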
# Example data used in the code above
ES <- structure(list(LifetimeApprovalRate = structure(c(2L, 4L, 1L,
3L), .Label = c("100% (11/11)", "100% (32/32)", "100% (4/4)",
"50% (16/32)"), class = "factor")), .Names = "LifetimeApprovalRate",
row.names = c(NA, -4L), class = "data.frame")
Upvotes: 4
Reputation: 173577
tidyr's separate is also handy for this sort of thing:
library(tidyr)
dat <- data.frame(x = 1:4, y = c("100% (32/32)", "50% (16/32)", "100% (11/11)", "100% (4/4)"))
separate(dat, y, c("rate", "approve", "total"), sep = "[()/ ]+", extra = "drop")
#   x rate approve total
# 1 1 100%      32    32
# 2 2  50%      16    32
# 3 3 100%      11    11
# 4 4 100%       4     4
Upvotes: 4