Reputation: 23
I have a data frame that contains information on patients and their attendance status at hospitals. I want to create a new dataframe from the original that will list their number of appointments, number of missed and number attended
For example, my starting data table looks like:
Patient Status
A Attended
A Attended
A Attended
B Did not attend
C Did not attend
D Attended
D Did not attend
I would like the output table to look like
Patient Number of appointments Number attended Number missed
A 3 3 0
B 1 0 1
C 1 0 1
D 2 1 1
I tried having a look through previous answers, but couldnt quite find my answer. I think I might have to do something with aggregate function but I couldnt figure it out
Upvotes: 2
Views: 96
Reputation: 2359
abc <- data.frame(Patient = c(rep("A",3),"B","C",rep("D",2)),
Status = c(rep("Attended",3),
rep("Did not attend",2),"Attended","Did not attend"))
abc[,c("Patient","Status")] <-apply(abc[,c("Patient","Status")] ,2, as.character)
Patient Status
A Attended
A Attended
A Attended
B Did not attend
C Did not attend
D Attended
D Did not attend
for( i in unique(abc$Patient)){
abc$Number_of_appointments[abc$Patient==i] <-length(abc$Status[abc$Patient==i])
abc$Number_attended[abc$Patient==i] <- sum(grepl("Attended",abc$Status[abc$Patient==i])%in%"TRUE")
abc$Number_missed[abc$Patient==i] <- sum(!grepl("Attended",abc$Status[abc$Patient==i])%in%"TRUE")
}
abc$Status <-NULL
abc1<-abc[!duplicated(abc$Patient),]
Patient Status Number_of_appointments Number_attended Number_missed
A Attended 3 3 0
B Did not attend 1 0 1
C Did not attend 1 0 1
D Attended 2 1 1
Upvotes: 0
Reputation: 887048
We can do
library(data.table)
setDT(df1)[,.(Number_of_appointments = .N,
Number_Attended = sum(Status== "Attended"),
Number_Missed = sum(Status == "Did not attend")) , Patient]
Or use dcast
dcast(setDT(df1)[, N:=.N, Patient], Patient+N~Status, value.var='Status', length)
# Patient N Attended Did not attend
#1: A 3 3 0
#2: B 1 0 1
#3: C 1 0 1
#4: D 2 1 1
Upvotes: 3