I am trying to split the group column of a data frame of hiring data into 2 columns (group and decision), with the new group based on the percentage.
group                      percentage
0 hired                    60%
0 hired next_month         65%
0 or 1 hired               68%
0 or 1 hired next_month    70%
1 hired                    79%
1 or 2 employee            80%
2 retired                  85%
2 or 3 fired               92%
3 not-retired              96%
I want 2 columns, group and decision, in place of the original group column. The percentage and decision values should stay as they are. The new group should be 0 if the percentage is between 60% and 69% (see the 3rd row), 1 if it is between 70% and 79% (see the 4th row), 2 if it is between 80% and 89%, and 3 if it is between 90% and 99%. The output should be:
group  decision            percentage
0      hired               60%
0      hired next_month    65%
0      hired               68%
1      hired next_month    70%
1      hired               79%
2      employee            80%
2      retired             85%
3      fired               92%
3      not-retired         96%
My code:
library(dplyr)   # %>% and mutate()
library(tidyr)   # extract()
library(readr)   # parse_number()
df1 <- structure(list(
  group = c("0 hired", "0 hired next_month ", "0 or 1 hired",
            "0 or 1 hired next_month", "1 hired", "1 or 2 employee",
            "2 retired", "2 or 3 fired", "3 not-retired"),
  percentage = c("60%", "65%", "68%", "70%", "79%", "80%", "89%", "90%", "96%") ),
  .Names = c("group", "percentage"), class = "data.frame", row.names = c(NA, -9L))
df2 <- df1 %>%
  extract(group, into = c('group', 'decision'),
          "^(\\d+).*(hired|hired next_month|employee|retired|fired|not-retired)") %>%
  mutate(group = replace(group, parse_number(percentage) >= 100, 3))
Can anyone help? Thanks in advance.
Upvotes: 0
Views: 56
Reputation: 37641
You can do this in base R like this:
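# Start from the percentage column, which is kept unchanged
df2 = data.frame(percentage = df1$percentage)
# Drop everything up to the last digit (plus following spaces) to get the decision
df2$decision = sub(".*\\d\\s*", "", df1$group)
# Bin the numeric percentage into (59,69], (69,79], (79,89], (89,100];
# cut() numbers the bins 1 to 4, so subtract 1 to get groups 0 to 3
df2$group = as.numeric(cut(as.numeric(sub("%", "", df1$percentage)),
    breaks = c(59, 69, 79, 89, 100))) - 1
# Reverse the column order to group, decision, percentage
df2 = df2[,3:1]
df2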
  group         decision percentage
1     0            hired        60%
2     0 hired next_month        65%
3     0            hired        68%
4     1 hired next_month        70%
5     1            hired        79%
6     2         employee        80%
7     2          retired        89%
8     3            fired        90%
9     3      not-retired        96%
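One note on the attempt in the question: parse_number(percentage) >= 100 is never true for this data, so that replace() call leaves group untouched. As a possible variation on the code above, here is a minimal sketch that gets the group by integer division instead of cut(). It assumes every percentage falls in the 60% to 99% range and that the bands are always 10 points wide, as in the question. The names pct and res are only illustrative, and trimws() is added to drop the trailing space in "0 hired next_month ".

```r
# Same input as in the question, rebuilt here so the sketch runs on its own.
df1 <- data.frame(
  group = c("0 hired", "0 hired next_month ", "0 or 1 hired",
            "0 or 1 hired next_month", "1 hired", "1 or 2 employee",
            "2 retired", "2 or 3 fired", "3 not-retired"),
  percentage = c("60%", "65%", "68%", "70%", "79%", "80%", "89%", "90%", "96%"),
  stringsAsFactors = FALSE
)

# Strip the "%" sign and convert to a number.
pct <- as.numeric(sub("%", "", df1$percentage, fixed = TRUE))

res <- data.frame(
  # Integer division by 10 gives the tens digit: 60-69 -> 0, 70-79 -> 1,
  # 80-89 -> 2, 90-99 -> 3 (assumes no value lies outside 60-99).
  group = pct %/% 10 - 6,
  # Drop everything up to the last digit, then trim stray whitespace.
  decision = trimws(sub(".*\\d\\s*", "", df1$group)),
  percentage = df1$percentage,
  stringsAsFactors = FALSE
)
res
```

If the bands ever stop being evenly spaced, cut() with explicit breaks, as in the code above, is the more flexible choice.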
Upvotes: 1