Reputation: 209
I have a dataframe like this:
ID S1 C
1 1 2 3
2 1 2 3
3 3 1 1
4 6 2 5
5 6 7 5
What I need is the number of rows per group ID
where S1 <= C
. This is the desired output.
ID Obs
1 1 2
2 3 1
3 6 1
Even though the question was answered below, I have a follow up question: Is it possible to do the same for multiple columns (S1, S2, ..). For example for the dataframe below:
ID S1 S2 C
1 1 2 2 3
2 1 2 2 3
3 3 1 1 1
4 6 2 2 5
5 6 7 7 5
And then get:
ID S1.Obs S2.Obs
1 1 2 2
2 3 1 1
3 6 1 1
Upvotes: 0
Views: 68
Reputation: 886938
An option with data.table
library(data.table)
setDT(df)[S1 <=C, .(Obs = .N), ID]
# ID Obs
#1: 1 2
#2: 3 1
#3: 6 1
df <- structure(list(ID = c(1L, 1L, 3L, 6L, 6L), S1 = c(2L, 2L, 1L, 2L, 7L),
C = c(3L, 3L, 1L, 5L, 5L)), class = "data.frame", row.names = c("1", "2", "3", "4", "5"))
Upvotes: 0
Reputation: 35554
A base
R solution with aggregate()
.
aggregate(Obs ~ ID, transform(df, Obs = S1 <= C), sum)
# ID Obs
# 1 1 2
# 2 3 1
# 3 6 1
A dplyr
solution
library(dplyr)
df %>%
filter(S1 <= C) %>%
count(ID, name = "Obs")
# ID Obs
# 1 1 2
# 2 3 1
# 3 6 1
Data
df <- structure(list(ID = c(1L, 1L, 3L, 6L, 6L), S1 = c(2L, 2L, 1L, 2L, 7L),
C = c(3L, 3L, 1L, 5L, 5L)), class = "data.frame", row.names = c("1", "2", "3", "4", "5"))
Extension
If you want to apply this rule on multiple columns such as S1, S2, S3:
df %>%
group_by(ID) %>%
summarise(across(starts_with("S"), ~ sum(.x <= C)))
Upvotes: 1
Reputation: 1362
data <- data.frame(
ID = c(1, 1, 3, 6, 6),
S1 = c(2, 2, 1, 2, 7),
C = c(3, 3, 1, 5, 5)
)
library(dplyr)
data.filtered <- data[data$S1 <= data$C,]
data.filtered %>% group_by(ID) %>%
summarize(Obs = length(ID))
Upvotes: 1