Count occurrences of a specific value in each column where the count exceeds a given frequency in R

Question

I'm trying to count how often a value occurs in each column, keeping only the columns where that count is above a certain number. My data has multiple columns, and I want code that reports the frequency of "0" in each column where that frequency is greater than 3.

My dataset looks like this:

```
a   b   c   d   e   f   g   h
0   1   0   1   1   1   1   1
2   0   0   0   0   0   0   0
0   1   2   2   2   1   0   1
0   0   0   0   1   0   0   0
1   0   2   1   1   0   0   0
1   1   0   0   1   0   0   0
0   1   2   2   2   2   2   2
```

The output I need is:
```
Variable     Frequency
a            4 
c            4 
f            4
g            5
h            4
```

This shows the number of "0" values in each column of the data frame, for the columns where that number is greater than 3.

Thank you.

Ronak Shah · Accepted Answer

You can use colSums to count the number of 0s in each column, then subset the counts that are greater than 3.

```
subset(stack(colSums(df == 0, na.rm = TRUE)), values > 3)
```
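To make the one-liner easier to follow, here is a minimal sketch that splits it into steps and names the result columns Variable and Frequency as in the requested output. The names zero_counts, threshold, kept and result are illustrative and not part of the original answer; the data is rebuilt from the table in the question so the block runs on its own.

```r
# Sample data copied from the table in the question
df <- read.table(header = TRUE, text = "
a b c d e f g h
0 1 0 1 1 1 1 1
2 0 0 0 0 0 0 0
0 1 2 2 2 1 0 1
0 0 0 0 1 0 0 0
1 0 2 1 1 0 0 0
1 1 0 0 1 0 0 0
0 1 2 2 2 2 2 2
")

# Step 1: df == 0 is a logical matrix; colSums() counts the TRUEs per column
zero_counts <- colSums(df == 0, na.rm = TRUE)

# Step 2: keep only the columns whose count is above the cutoff
threshold <- 3  # illustrative name for the cutoff used in the question
kept <- zero_counts[zero_counts > threshold]

# Step 3: turn the named vector into a two-column data frame
result <- data.frame(
  Variable  = names(kept),
  Frequency = unname(kept),
  row.names = NULL
)
result
```

With the question's data, result should contain the columns a, c, f, g and h with counts 4, 4, 4, 5 and 4.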

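A tidyverse way would be:

```
library(dplyr)

df %>%
  # count the zeros in every column (one-row result)
  summarise(across(everything(), ~ sum(.x == 0, na.rm = TRUE))) %>%
  # reshape to one row per column
  tidyr::pivot_longer(cols = everything()) %>%
  # keep only the counts above 3
  filter(value > 3)

#  name  value
#  <chr> <int>
#1 a         4
#2 c         4
#3 f         4
#4 g         5
#5 h         4
```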
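Both approaches hard-code the value 0 and the cutoff 3. If either needs to vary, the base R version could be wrapped in a small function. This is only a sketch: count_value_over and its arguments are hypothetical names, not functions from base R or dplyr, and the data is again rebuilt from the question's table.

```r
# Hypothetical helper: count how often `value` occurs in each column and
# keep the columns where that count is strictly greater than `threshold`.
count_value_over <- function(data, value = 0, threshold = 3) {
  counts <- colSums(data == value, na.rm = TRUE)
  kept <- counts[counts > threshold]
  data.frame(
    Variable  = names(kept),
    Frequency = unname(kept),
    row.names = NULL
  )
}

# Sample data copied from the table in the question
df <- read.table(header = TRUE, text = "
a b c d e f g h
0 1 0 1 1 1 1 1
2 0 0 0 0 0 0 0
0 1 2 2 2 1 0 1
0 0 0 0 1 0 0 0
1 0 2 1 1 0 0 0
1 1 0 0 1 0 0 0
0 1 2 2 2 2 2 2
")

zeros <- count_value_over(df)                           # 0s occurring more than 3 times
ones  <- count_value_over(df, value = 1, threshold = 2) # 1s occurring more than 2 times
zeros
ones
```

The defaults reproduce the question's case, so count_value_over(df) should match the output shown above.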

data

```
df <- structure(list(a = c(0L, 2L, 0L, 0L, 1L, 1L, 0L), b = c(1L, 0L, 
1L, 0L, 0L, 1L, 1L), c = c(0L, 0L, 2L, 0L, 2L, 0L, 2L), d = c(1L, 
0L, 2L, 0L, 1L, 0L, 2L), e = c(1L, 0L, 2L, 1L, 1L, 1L, 2L), f = c(1L, 
0L, 1L, 0L, 0L, 0L, 2L), g = c(1L, 0L, 0L, 0L, 0L, 0L, 2L), h = c(1L, 
0L, 1L, 0L, 0L, 0L, 2L)), class = "data.frame", row.names = c(NA, -7L))
```
