Reputation: 603
I am selecting the rows of my data that meet either of two conditions, using the "filter" function:
Subset_data <- filter(Data, company_type == 3 & annualturnover %in% c(1,2,3) | company_type == 2 & annualturnover %in% c(1,2))
Now I want to add a column which has the value "0" when the row fulfills the first condition (company_type == 3 & annualturnover %in% c(1,2,3)), and the value "1" if the second condition is fulfilled (company_type == 2 & annualturnover %in% c(1,2)).
How can I do that efficiently (no looping if possible)?
Upvotes: 1
Views: 383
Reputation: 72803
You could simply use ifelse, then exclude the NA cases.
dat$cat <- with(dat, ifelse(company_type == 3 & annualturnover %in% 1:3, 0,
                     ifelse(company_type == 2 & annualturnover %in% 1:2, 1, NA)))
dat <- dat[!is.na(dat$cat), ]
dat
#   company_type annualturnover cat
# 2            3              3   0
# 3            2              2   1
(Using @JonSpring's data.)
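For anyone who wants to run this without the other answer open, here is a minimal self-contained sketch. It assumes dat is simply the fake data from that answer, rebuilt here as a plain data.frame; that construction is my assumption, not part of the original answer.

```r
# Assumption: the fake data from the other answer, rebuilt as a base data.frame
dat <- data.frame(company_type   = c(1L, 3L, 2L, 2L),
                  annualturnover = c(2L, 3L, 2L, 3L))

# 0 for the first condition, 1 for the second, NA when neither holds
dat$cat <- with(dat, ifelse(company_type == 3 & annualturnover %in% 1:3, 0,
                     ifelse(company_type == 2 & annualturnover %in% 1:2, 1, NA)))

# Dropping the NA rows does the same job as the original filter() call
dat <- dat[!is.na(dat$cat), ]
dat
```

Because rows matching neither condition get NA, the final subsetting step replaces the separate filter() call from the question.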
Upvotes: 1
Reputation: 66480
library(dplyr)
Subset_data <- Data %>%
  filter(company_type == 3 & annualturnover %in% c(1,2,3) |
         company_type == 2 & annualturnover %in% c(1,2)) %>%
  mutate(category = case_when(
    company_type == 3 & annualturnover %in% c(1,2,3) ~ 0L,
    company_type == 2 & annualturnover %in% c(1,2) ~ 1L,
    TRUE ~ NA_integer_))
Subset_data
## A tibble: 2 x 3
#  company_type annualturnover category
#         <int>          <int>    <int>
#1            3              3        0
#2            2              2        1
Using this fake data:
Data <- tribble(
  ~company_type, ~annualturnover,
  1L, 2L,
  3L, 3L,
  2L, 2L,
  2L, 3L)
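One possible variation, offered as a sketch rather than a better answer: since case_when already returns NA for rows that match neither condition, you could build the column first and filter on it, so each condition is written only once. The fake data is repeated so the block runs on its own.

```r
library(dplyr)

# Same fake data as above, repeated so the sketch is self-contained
Data <- tribble(
  ~company_type, ~annualturnover,
  1L, 2L,
  3L, 3L,
  2L, 2L,
  2L, 3L)

Subset_data <- Data %>%
  # Label each row: 0 for the first condition, 1 for the second, NA otherwise
  mutate(category = case_when(
    company_type == 3 & annualturnover %in% c(1, 2, 3) ~ 0L,
    company_type == 2 & annualturnover %in% c(1, 2) ~ 1L,
    TRUE ~ NA_integer_)) %>%
  # Keep only the rows that matched one of the two conditions
  filter(!is.na(category))

Subset_data
```

This should give the same two rows as the filter-then-mutate version; the trade-off is that the selection rule now lives implicitly in the NA handling instead of in an explicit filter condition.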
Upvotes: 2