ynitSed
ynitSed

Reputation: 145

R - countifs like excel

I have a CSV like this, saved as an object in R named df1.

 X Y Z Year
 0 2 4 2014
 3 1 3 2014
 5 4 0 2014
 0 3 0 2014
 2 1 0 2015

I want to:

  1. Count each column where there are no "0" for year 2014. For example, for column X, the count = 2 (not 3 because I want 2014 data only). For column Y the count is 4. For column Z the count is 1.

  2. Sum all the counts for each column

This is what I tried:

count_total <- sum(df1$x != 0 &
                   df1$y != 0 &
                   df1&z != 0 &
                   df1$Year == 2014)
count_total

I want the output to be simply be 1 (i.e. the 2nd row in df has no 0's)

However, this does not align with my countifs on excel. In excel, it's like this:

=COUNTIFS('df1'!$A$2:$A$859,"<>0",'df1'!$B$2:$B$859,"<>0",
          'df1'!$C$2:$C$859,"<>0",'df1'!$D$2:$D$859,2014)

Wondering if I mistyped something on R? I'm a dyplr user but can't find anything particularly useful on google.

Thank you very much!

Upvotes: 2

Views: 1290

Answers (4)

Stephan
Stephan

Reputation: 2246

dplyrapproach:

library(dplyr)
df1 %>% 
  group_by(Year) %>% 
  summarise_at(vars(X:Z), function (x) sum(x != 0))

Output:

# A tibble: 2 x 4
#    Year     X     Y     Z
#   <int> <int> <int> <int>
# 1  2014     2     4     2
# 2  2015     1     1     0

Upvotes: 3

milan
milan

Reputation: 4970

Alternative using summaryBy.

library(doBy)
summaryBy(list(c('X','Y','Z'), c('Year')), df1, FUN= function(x) sum(x!=0), keep.names=T)

  Year X Y Z
1 2014 2 4 2
2 2015 1 1 0

When needed use colSums as explained before.

Upvotes: 2

Rui Barradas
Rui Barradas

Reputation: 76402

You can do this with aggregate then colSums to get the totals by column.

agg <- aggregate(. ~ Year, df1, function(x) sum(x != 0))
agg
#  Year X Y Z
#1 2014 2 4 2
#2 2015 1 1 0

colSums(agg[-1])
#X Y Z 
#3 5 2

Data.

df1 <- read.table(text = "
X Y Z Year
 0 2 4 2014
 3 1 3 2014
 5 4 0 2014
 0 3 0 2014
 2 1 0 2015
",header = TRUE)

Upvotes: 3

Ronak Shah
Ronak Shah

Reputation: 388817

One way is using rowSums on subset of data

sum(rowSums(subset(df1, Year == 2014) == 0) == 0)
#[1] 1

Upvotes: 4

Related Questions