I have the dataframe below and I want to count how many students (Name) changed their grade (Val) from 2018 to 2019. The result should be 1, as only Bob changed it.
Name<-c("bb","Bob","df","asd","Bob","df","asd","jkl")
Year<-c(2018,2018,2018,2018,2019,2019,2019,2019)
Val<-c(9,4,6,7,9,6,7,7)
gr<-data.frame(Name,Year, Val)
  Name Year Val
1   bb 2018   9
2  Bob 2018   4
3   df 2018   6
4  asd 2018   7
5  Bob 2019   9
6   df 2019   6
7  asd 2019   7
8  jkl 2019   7
First create an array giving the standard deviation of the values for each Name. This will be NA if there are not two values, 0 if the two values are the same, and > 0 if the two values differ. which(... > 0) gives the positions of the values > 0 (NA entries are dropped), and we take its length to get the count.
length(which(tapply(gr$Val, gr$Name, sd) > 0))
## [1] 1
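To make the three cases concrete, here is a minimal sketch that inspects the intermediate result for the sample data. The names `sds` and `changed` are introduced here for illustration only and are not part of the original answer.

```r
Name <- c("bb", "Bob", "df", "asd", "Bob", "df", "asd", "jkl")
Year <- c(2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019)
Val  <- c(9, 4, 6, 7, 9, 6, 7, 7)
gr   <- data.frame(Name, Year, Val)

# Per-name standard deviation of Val (sds is a hypothetical helper name)
sds <- tapply(gr$Val, gr$Name, sd)

is.na(sds[["bb"]])   # TRUE: bb has only a 2018 row
sds[["asd"]] == 0    # TRUE: same grade in both years
sds[["Bob"]] > 0     # TRUE: 4 in 2018, 9 in 2019

# which() skips the NA entries, so only names whose values differ remain
changed <- names(which(sds > 0))
length(changed)
```

Keeping the names rather than only the count also shows which students changed their grade.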
You can group by Name and check whether Val is different between the years.
sum(with(gr, ave(Val, Name, FUN = function(x) x[1]) != Val))
# [1] 1
Step by step:
For each name, replace every value with the first value recorded for that name, i.e. the 2018 value when the name appears in both years.
with(gr, ave(Val, Name, FUN = function(x) x[1]))
#[1] 9 4 6 7 4 6 7 7
Then check, row by row, whether these values differ from the original Val.
with(gr, ave(Val, Name, FUN = function(x) x[1]) != Val)
# [1] FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE
Finally, calculate the sum of this logical vector; each TRUE counts as 1.
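Because x[1] picks whichever row comes first for a name, this approach assumes the 2018 rows precede the 2019 rows. If the row order is not guaranteed, one possible alternative (a sketch, not part of the original answer) is to join the two years on Name with merge. The names `both` and `n_changed` are hypothetical.

```r
gr <- data.frame(
  Name = c("bb", "Bob", "df", "asd", "Bob", "df", "asd", "jkl"),
  Year = c(2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019),
  Val  = c(9, 4, 6, 7, 9, 6, 7, 7)
)

# Inner join of the 2018 and 2019 rows on Name; names present in only
# one year (bb, jkl) drop out of the result
both <- merge(gr[gr$Year == 2018, ], gr[gr$Year == 2019, ],
              by = "Name", suffixes = c("_2018", "_2019"))

# Count the names whose grade differs between the two years
n_changed <- sum(both$Val_2018 != both$Val_2019)
n_changed
# [1] 1
```

The join compares each student's two grades directly, so the result does not depend on how the rows are sorted.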