Reputation: 183
I have a data frame with 2 columns: person and points. In my actual dataset there are more than 1000 persons.
My goal: I need to find persons that have more than 126 points.
df1:
person points
abc
abc 1
abc
abc 2
abc1
abc1 1
abc1
I have used this code:
df1 <- read.csv("df1.csv")
points_to_numeric <- as.numeric(df1$points)
person_filtered <- df1 %>%
group_by(person) %>%
dplyr::filter(sum(points_to_numeric, na.rm = T)>126)%>%
distinct(person) %>%
pull()
person_filtered
When I run this code, I get 800 unique persons. But if I change the condition to find persons with fewer than 126 points, I also get 800 unique persons. So the filter does not seem to work.
Upvotes: 0
Views: 43
Reputation: 101044
You can try the code below:
subset(aggregate(.~person,df1,sum), points > 126)
or
subset(df1, ave(points, person, FUN = sum) > 126)
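The sample data in the question has blank points, which read.csv would typically read as NA in a numeric column. The two calls above treat NA differently: aggregate() with a formula drops NA rows by default (na.action = na.omit), while ave() with plain sum gives NA for any group that contains one. Here is a minimal sketch of that difference, assuming a made-up data frame (the values are illustrative, not from the question):

```r
# Toy data (made up for illustration); NA mimics blank cells in the CSV
df1 <- data.frame(
  person = c("abc", "abc", "abc", "abc1", "abc1"),
  points = c(NA, 100, 50, 30, NA)
)

# The formula interface drops NA rows before summing (na.action = na.omit)
totals <- aggregate(points ~ person, df1, sum)
over <- subset(totals, points > 126)

# ave() needs NA handled explicitly, otherwise groups with NA sum to NA
group_sum <- ave(df1$points, df1$person,
                 FUN = function(x) sum(x, na.rm = TRUE))
rows_over <- df1[group_sum > 126, ]
```

Here over holds one row per qualifying person with the total, while rows_over keeps all original rows of those persons.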
Upvotes: 0
Reputation: 5956
Using summarise() is more idiomatic for this use case.
library(tidyverse)
person_filtered <- df1 %>%
group_by(person) %>%
summarise(totalPoints=sum(points, na.rm=TRUE)) %>%
filter(totalPoints > 126)
Upvotes: 0
Reputation: 131
Tidyverse solution. Returns a vector with the persons with more than 126 points.
library(tidyverse)
person_filtered <- df1 %>%
group_by(person) %>%
dplyr::filter(sum(points, na.rm = TRUE)>126) %>%
distinct(person) %>%
pull()
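A likely reason the code in the question misbehaves is that points_to_numeric is a standalone vector created outside the pipeline. Inside the grouped filter(), sum() therefore runs over all rows instead of each person's rows, and the condition comes out the same for every group. The sketch below illustrates this, assuming a made-up data frame in which points is stored as text (the values are illustrative, not from the question):

```r
library(dplyr)

# Toy data (made up for illustration)
df1 <- data.frame(
  person = c("abc", "abc", "abc1", "abc1"),
  points = c("100", "50", "30", NA)
)

# Standalone vector: it knows nothing about the groups in the pipeline
points_to_numeric <- as.numeric(df1$points)

# Sums all rows (180) for every group, so every person passes
wrong <- df1 %>%
  group_by(person) %>%
  filter(sum(points_to_numeric, na.rm = TRUE) > 126) %>%
  distinct(person) %>%
  pull()

# Converting the column inside the pipeline keeps the sum per person
right <- df1 %>%
  mutate(points = as.numeric(points)) %>%
  group_by(person) %>%
  filter(sum(points, na.rm = TRUE) > 126) %>%
  distinct(person) %>%
  pull()
```

With this toy data, wrong contains both persons while right contains only "abc".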
Upvotes: 2