Reputation: 659
I have a data set that shows the visit ID and the subject name:
visit<-c(1,2,3,1,2,1,1,2,3,1,2,3)
subject<-c("A","A","A","B","B","C","D","D","D","E","E","E")
data<-data.frame(visit=visit,subject=subject)
I attempted to work out the latest visit ID for each subject:
tapply(visit,subject,max)
And I get this output:
A B C D E
3 2 1 3 3
I am wondering if there is any way that I can change the output such that it becomes:
A 3
B 2
C 1
D 3
E 3
Thank you
Upvotes: 0
Views: 111
Reputation: 193507
You can easily do this in base R with stack:
stack(tapply(visit, subject, max))
# values ind
# 1 3 A
# 2 2 B
# 3 1 C
# 4 3 D
# 5 3 E
(Note: In this case, the values for "visit" and "subject" aren't actually coming from your data.frame; tapply is picking up the standalone vectors of the same names. Just thought you should know!)
(Second note: You could also do data.frame(as.table(tapply(visit, subject, max))), but that is less obvious than using stack, so it may lead to less readable code later on.)
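To follow up on the first note, here is a minimal sketch of the same stack approach that reads the columns from the data.frame itself by wrapping the call in with(). The object name latest and the renamed columns (subject, visit) are my own choices for illustration, not part of the question.

```r
# Build the data frame directly, so no standalone vectors are involved.
data <- data.frame(
  visit   = c(1, 2, 3, 1, 2, 1, 1, 2, 3, 1, 2, 3),
  subject = c("A", "A", "A", "B", "B", "C", "D", "D", "D", "E", "E", "E")
)

# with() evaluates the expression using the columns of `data`.
latest <- with(data, stack(tapply(visit, subject, max)))

# stack() names its columns "values" and "ind"; put the subject first
# and give both columns more descriptive names (names assumed here).
latest <- setNames(latest[, c("ind", "values")], c("subject", "visit"))
latest
```

Note that stack() returns the grouping column as a factor; wrap it in as.character() if you need plain strings.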
Upvotes: 1
Reputation: 161
It may feel dirty, but using the base function as.matrix (or matrix for that matter) will give you what you need.
> as.matrix(tapply(visit,subject,max))
[,1]
A 3
B 2
C 1
D 3
E 3
Upvotes: 1
Reputation: 14413
And a dplyr solution would be:
library(dplyr)
data %>% group_by(subject) %>% summarize(max = max(visit))
## Source: local data frame [5 x 2]
## subject max
## 1 A 3
## 2 B 2
## 3 C 1
## 4 D 3
## 5 E 3
Upvotes: 2
Reputation: 886938
You can try aggregate:
aggregate(visit~subject, data, max)
# subject visit
#1 A 3
#2 B 2
#3 C 1
#4 D 3
#5 E 3
Or from tapply:
res <- tapply(visit,subject,max)
data.frame(subject=names(res), visit=res)
Or data.table:
library(data.table)
setDT(data)[, list(visit=max(visit)), by=subject]
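One detail about the tapply variant: because res is a named array, the resulting data.frame may carry the subject labels as row names as well as in the subject column. If you would rather have plain 1..n row names, a small sketch (the name out is just for illustration) is:

```r
visit <- c(1, 2, 3, 1, 2, 1, 1, 2, 3, 1, 2, 3)
subject <- c("A", "A", "A", "B", "B", "C", "D", "D", "D", "E", "E", "E")

res <- tapply(visit, subject, max)

# as.vector() drops the names from the values, and row.names = NULL
# asks data.frame() for default integer row names.
out <- data.frame(subject = names(res), visit = as.vector(res), row.names = NULL)
out
```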
Upvotes: 3