Rose
Rose

Reputation: 137

Subsetting rows in R, and filtering out rows if column has all-zero values for subset

This question is a variant on the question I asked (very) recently here. (Sorry to ask two similar questions. I realised after asking that the question I asked wasn't quite right, but I thought I'd leave the original in case it's of use to people in future, and keep this question separate.)

I have a set of data that looks like this, modified slightly from the previous question:

  Category     Item Shop1 Shop2 Shop3
1    Fruit   Apples     4     6     0
2    Fruit  Oranges     0     2     7
3      Veg Potatoes     0     1     0
4      Veg   Onions     0     2     8
5      Veg  Carrots     0     1     3
6    Dairy  Yoghurt     1     5     9
7    Dairy     Milk     0     1     0
8    Dairy   Cheese     0     0     7

I want to filter my data so that I only have Categories that are sold by all shops - if a shop doesn't have any sales for a whole Category, then I'd like to filter it out. In this example, the Veg category would be filtered out because Shop1 has no Veg sales.

In an attempt to resolve this, I tried changing the suggestions I was given for my previous question from using FUN = any to FUN = all but that didn't work and threw errors each time, and I'm unsure what else to try.

I'd appreciate any help you can offer.

Upvotes: 0

Views: 1961

Answers (4)

s_baldur
s_baldur

Reputation: 33498

One more solution using data.table using two steps.

# Data
dt <- data.table(Category = c(rep("Fruit", 2), rep("Veg", 3), rep("Dairy", 3)),
                 Item     = c("Apples", "Oranges", "Potatoes", "Onions", 
                              "Carrots", "Yoghurt", "Milk", "Cheese"),
                 Shop1    = c(4, rep(0, 4), 1, rep(0, 2)),
                 Shop2    = c(6, 2, 1, 2, 1, 5, 1, 0),
                 Shop3    = c(0, 7, 0, 8, 3, 9, 0, 7))

filt <- dt[, any(sum(Shop1) == 0, sum(Shop2) == 0, sum(Shop3) == 0), 
           by = Category]
filt
       Category    V1
1:    Fruit FALSE
2:      Veg  TRUE
3:    Dairy FALSE

dt[Category %in% filt[V1 == FALSE, Category]]

   Category    Item Shop1 Shop2 Shop3
1:    Fruit  Apples     4     6     0
2:    Fruit Oranges     0     2     7
3:    Dairy Yoghurt     1     5     9
4:    Dairy    Milk     0     1     0
5:    Dairy  Cheese     0     0     7

Upvotes: 0

fmic_
fmic_

Reputation: 2436

Here is an example using dplyr. You first group_by the Category variable, and only keep records that have more than 0 sale.

library(tidyverse)
d <- data_frame(
    Category = c(rep("Fruit", 2), rep("Veg", 3), rep("Dairy", 3)),
    Item = c("Apples", "Oranges", "Potatoes", "Onions", "Carrots", "Yoghurt", "Milk", "Cheese"),
    Shop1 = c(4, rep(0, 4), 1, rep(0, 2)),
    Shop2 = c(6, 2, 1, 2, 1, 5, 1, 0),
    Shop3 = c(0, 7, 0, 8,  3, 9, 0, 7)
)

d %>%
    group_by(Category) %>%
    filter(sum(Shop1)  > 0 & sum(Shop2) > 0 &  sum(Shop3) > 0) %>%
    ungroup()

Upvotes: 3

quant
quant

Reputation: 4482

Here is a solution using library(data.table)

dt <- data.table(category=c("Fruit","Fruit","Veg","Veg","Veg","Dairy","Dairy","Dairy"),
                            item=c("apples","oranges","potatoes","onions","carrots","yoghurt","milk","cheese"),
                            shop1=c(4,0,0,0,0,1,0,0),
                            shop2=c(6,2,1,2,1,5,1,0),
                            shop3=c(0,7,0,8,3,9,0,7))
dt_m <- melt(dt,id.vars = c("category","item"))
dt_m[,counts:=sum(value),by=.(category,variable)]
dt_m <- dt_m[counts>0]
dt_m[,counts:=NULL]
dt <- dcast.data.table(dt_m,category+item~variable,value.var = "value")
dt <- na.omit(dt)

Or using dplyr

dt %>% melt(id.vars = c("category","item")) %>% group_by(category,variable) %>%
  mutate(counts=sum(value)) %>% filter(counts>0) %>% mutate(counts=NULL) %>% 
  dcast(category+item~variable,value.var = "value") %>% na.omit()

Upvotes: 1

Sotos
Sotos

Reputation: 51582

Here is one idea with colSums,

ind <- colSums(sapply(split(df[3:5], df$Category), function(i) colSums(i) == 0)) == 0
df[df$Category %in% names(ind)[ind],]

Which gives,

 Category    Item Shop1 Shop2 Shop3
1    Fruit  Apples     4     6     0
2    Fruit Oranges     0     2     7
6    Dairy Yoghurt     1     5     9
7    Dairy    Milk     0     1     0
8    Dairy  Cheese     0     0     7

Upvotes: 4

Related Questions