San

Reputation: 183

data frame columns: how can I use loops in this case?

I have a data frame with 2 columns: person and points. In my actual dataset there are more than 1000 persons.

My goal: I need to find persons that have more than 126 points.

df1:

person    points
abc
abc       1
abc
abc       2
abc1
abc1      1
abc1

I have used this code:

df1 <- read.csv("df1.csv")
points_to_numeric <- as.numeric(df1$points)

person_filtered <- df1 %>%
  group_by(person) %>%
  dplyr::filter(sum(points_to_numeric, na.rm = T) > 126) %>%
  distinct(person) %>%
  pull()

person_filtered

When I run this code, I get 800 unique persons. But when I ask how many persons have fewer than 126 points, I also get 800 unique persons, so the filter does not seem to work.

Upvotes: 0

Views: 43

Answers (3)

ThomasIsCoding

Reputation: 101044

Maybe you can try the code below

subset(aggregate(.~person,df1,sum), points > 126)

or

subset(df1, ave(points, person, FUN = function(x) sum(x, na.rm = TRUE)) > 126)

Upvotes: 0

emilliman5

Reputation: 5956

Use of summarise is more idiomatic for this use case.

library(tidyverse)

person_filtred <- df1 %>%
  group_by(person) %>%
  summarise(totalPoints=sum(points, na.rm=TRUE)) %>%
  filter(totalPoints > 126)

Upvotes: 0

Random

Reputation: 131

Tidyverse solution. Returns a vector with the persons with more than 126 points.

library(tidyverse)

person_filtred <- df1 %>%
  group_by(person) %>%
  dplyr::filter(sum(points, na.rm = TRUE) > 126) %>%
  distinct(person) %>%
  pull()
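
The likely reason the code in the question keeps every person is that points_to_numeric is a free-standing vector, so group_by() does not split it and sum() adds up all rows for each group. A minimal sketch on a toy data frame (the values are made up for illustration, not taken from your file) shows the difference:

```r
library(dplyr)

# Hypothetical toy data, not the asker's real file
df1 <- data.frame(
  person = c("abc", "abc", "abc", "abc1", "abc1"),
  points = c(100, NA, 50, 20, NA)
)

# Free-standing vector: group_by() does not split it
points_to_numeric <- as.numeric(df1$points)

# sum() sees all five values for every group (total 170),
# so the condition is TRUE for both persons
outside <- df1 %>%
  group_by(person) %>%
  filter(sum(points_to_numeric, na.rm = TRUE) > 126) %>%
  distinct(person) %>%
  pull(person)

# Referring to the column gives per-person sums: abc = 150, abc1 = 20
inside <- df1 %>%
  group_by(person) %>%
  filter(sum(points, na.rm = TRUE) > 126) %>%
  distinct(person) %>%
  pull(person)
```

Here outside contains both persons, while inside contains only abc.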

Upvotes: 2
