Reputation: 546
I have a data.frame with the cloumns: id, name, year, rop, des, cax, pcld.
I need to identify the id with the value "1" in at least one column (rop, des, cax e pcld) for all years (2014, 2015, 2016, 2017). After that, I need to identify those that have been discontinued over the years (Ex: 2014 = 1, 2015 = 1, 2016 = 0).
So, as a first step, I thought about transposing the data.frame, to look like this:
id | name | conta | 2014 | 2015 | 2016 | 2017
3243 | teste| rop | 1 | 0 | 0 | 1
3243 | teste| des | 0 | 1 | 0 | 0
3243 | teste| cax | 0 | 0 | 1 | 0
3243 | teste| pcld | 0 | 0 | 0 | 0
In the next step, id "3243" would be identified. It has at least one value "1" in all years (columns).
I have been trying to transpose:
library(magrittr)
dados %>% dplyr::select(id, name, year, rop, des, cax, pcld) %>%
tidyr::pivot_longer(
cols = -c('id', 'name'),
names_to = 'Conta'
)
Data:
structure(list(
id = c(1111, 1111, 1111, 1111, 22222,22222, 22222, 22222, 33333, 33333,3243,3243,3243,3243),
name = c("empresa", "empresa", "empresa", "empresa","firma", "firma", "firma", "firma", "loja", "loja","teste","teste","teste","teste"),
year = c(2014, 2015, 2016, 2017, 2014, 2015, 2016, 2017, 2014,2015,2014, 2015, 2016, 2017),
rop = c(1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1),
des = c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0),
cax = c(1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0),
pcld = c(1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0)),
row.names = c(NA,14L), class = "data.frame")
There is a better way to do it?
Upvotes: 0
Views: 456
Reputation: 1250
Try this:
df %>% group_by(id,year) %>%
rowwise() %>%
mutate(C=max(rop,des,cax,pcld)) %>%
group_by(id) %>%
filter(sum(C)==4) %>%
select(-C) %>%
ungroup
Upvotes: 2
Reputation: 887941
if we want to subset the 'id' where at least one of the column from 'rop' to 'pcld' have 1 in its column for all rows, then do a group by 'id' and filter
by creating a logical vector with rowSums
, and wrap with all
library(dplyr)
dados %>%
group_by(id) %>%
filter(all(rowSums(select(cur_data(), rop:pcld)) > 0)) %>%
ungroup
Or another option is if_any
(dplyr
version 1.0.4
) with all
dados %>%
group_by(id) %>%
filter(all(if_any(rop:pcld, as.logical)))
i.e. if_any
loops over the columns rop
to pcld
, convert the binary to logical (as.logical
), returns TRUE/FALSE as a vector if any of the columns for a row have 1, then wrap the logical vector with all
so that it returns a single TRUE (when all rows have at least one 1 for any column) or FALSE (when any row doesn't have a 1 for all the columns) to subset the 'id'
Upvotes: 1