firmo23

Reputation: 8404

Count how many values of a variable changed according to two other variables in the same dataframe

I have the dataframe below and I want to count how many students (Name) changed their grade (Val) from 2018 to 2019. The result should be 1 as only Bob changed it.

Name <- c("bb", "Bob", "df", "asd", "Bob", "df", "asd", "jkl")
Year <- c(2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019)
Val <- c(9, 4, 6, 7, 9, 6, 7, 7)
gr <- data.frame(Name, Year, Val)

  Name Year Val
1   bb 2018   9
2  Bob 2018   4
3   df 2018   6
4  asd 2018   7
5  Bob 2019   9
6   df 2019   6
7  asd 2019   7
8  jkl 2019   7

Upvotes: 0

Views: 44

Answers (2)

G. Grothendieck

Reputation: 269471

First create an array giving the standard deviation of the values for each Name. This will be NA if there are not two values, 0 if the two values are the same, and > 0 if the two values differ. which(... > 0) gives the positions of the values > 0 (dropping the NA entries), and its length is the count.

length(which(tapply(gr$Val, gr$Name, sd) > 0))
## [1] 1
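
To make the intermediate step visible, here is a minimal sketch that stores the per-Name standard deviations and also lists which names changed. The variable names sds, changed, n_changed and n_changed_alt are only illustrative, and the sum(..., na.rm = TRUE) form is an equivalent way to count, not part of the one-liner above.

```r
# Rebuild the data frame from the question.
gr <- data.frame(
  Name = c("bb", "Bob", "df", "asd", "Bob", "df", "asd", "jkl"),
  Year = c(2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019),
  Val  = c(9, 4, 6, 7, 9, 6, 7, 7)
)

# Standard deviation of Val per Name: NA for a single value,
# 0 for two equal values, > 0 for two differing values.
sds <- tapply(gr$Val, gr$Name, sd)

# Names whose values differ; which() drops the NA entries.
changed <- names(which(sds > 0))
n_changed <- length(changed)

# Equivalent count that handles the NA entries explicitly.
n_changed_alt <- sum(sds > 0, na.rm = TRUE)
```

With the sample data, changed should contain only "Bob", so both counts are 1.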

Upvotes: 1

markus

Reputation: 26343

You can group by Name and check whether Val is different between the years.

sum(with(gr, ave(Val, Name, FUN = function(x) x[1]) != Val))
# [1] 1

Step by step

For each Name, replace every value with that Name's first value. Because the rows are ordered by Year, this is the 2018 value for everyone who appears in both years.

with(gr, ave(Val, Name, FUN = function(x) x[1]))
# [1] 9 4 6 7 4 6 7 7

Then check where these values differ from the actual values; TRUE marks a row whose grade changed.

with(gr, ave(Val, Name, FUN = function(x) x[1]) != Val)
# [1] FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE

And finally calculate the sum of this vector.
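
Note that x[1] is the 2018 value only when each Name's rows are ordered by Year. If that ordering cannot be assumed, one alternative is to compare the two years explicitly. The sketch below swaps in base R's merge() instead of ave(); the names y2018, y2019, both and n_changed are only illustrative.

```r
# Rebuild the data frame from the question.
gr <- data.frame(
  Name = c("bb", "Bob", "df", "asd", "Bob", "df", "asd", "jkl"),
  Year = c(2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019),
  Val  = c(9, 4, 6, 7, 9, 6, 7, 7)
)

# Split by year, keeping only the columns needed for the comparison.
y2018 <- gr[gr$Year == 2018, c("Name", "Val")]
y2019 <- gr[gr$Year == 2019, c("Name", "Val")]

# Inner join on Name: only students present in both years remain.
both <- merge(y2018, y2019, by = "Name", suffixes = c("_2018", "_2019"))

# Count the students whose grade differs between the two years.
n_changed <- sum(both$Val_2018 != both$Val_2019)
```

Students who appear in only one year (bb and jkl here) are dropped by the join, so they never count as a change, and the result does not depend on row order.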

Upvotes: 1
