Reputation: 5
I have two data frames in R. The first one is a list of patients:
Patient 1
Patient 2
Patient 3
The second one is a list of procedures and their costs, per patient:
Procedure 1 - Patient 1 - Cost
Procedure 2 - Patient 1 - Cost
Procedure 3 - Patient 1 - Cost
Procedure 1 - Patient 2 - Cost
Procedure 1 - Patient 3 - Cost
Etc.
I want to sum the costs per patient and store the result in a new column of the first data frame (i.e., total expenditure per patient).
How can I do this?
Upvotes: 0
Views: 46
Reputation: 550
It seems you need to aggregate the procedure costs per patient and then merge the totals into the patient data.
Here’s some example data:
patient_df <- structure(list(patient_id = 1:3, gender = structure(c(2L, 1L,
2L), .Label = c("F", "M"), class = "factor")), class = "data.frame", row.names = c(NA,
-3L))
print(patient_df)
##   patient_id gender
## 1          1      M
## 2          2      F
## 3          3      M
procedure_df <- structure(list(procedure_id = c(1, 2, 3, 1, 2, 1), patient_id = c(1,
1, 1, 2, 2, 3), cost = c(10, 5, 12, 10, 5, 10)), class = "data.frame", row.names = c(NA,
-6L))
print(procedure_df)
##   procedure_id patient_id cost
## 1            1          1   10
## 2            2          1    5
## 3            3          1   12
## 4            1          2   10
## 5            2          2    5
## 6            1          3   10
Let’s aggregate the procedure data to get one total per patient:
library(dplyr)
total_costs <- procedure_df %>%
  group_by(patient_id) %>%
  summarize(total_cost = sum(cost)) %>%
  ungroup()
print(total_costs)
## # A tibble: 3 x 2
##   patient_id total_cost
##        <dbl>      <dbl>
## 1          1         27
## 2          2         15
## 3          3         10
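If you would rather not load dplyr for this step, the same per-patient sum can be sketched in base R with aggregate(). This is only an illustration: the data frame below is a hand-typed copy of the example data so the block runs on its own, and total_costs_base is a name made up here.

```r
# Hand-typed copy of the example procedure data (assumption: same values as above)
procedure_df <- data.frame(
  procedure_id = c(1, 2, 3, 1, 2, 1),
  patient_id   = c(1, 1, 1, 2, 2, 3),
  cost         = c(10, 5, 12, 10, 5, 10)
)

# Sum cost within each patient_id; the result has columns patient_id and cost
total_costs_base <- aggregate(cost ~ patient_id, data = procedure_df, FUN = sum)

# Rename the summed column so it matches the dplyr version
names(total_costs_base)[names(total_costs_base) == "cost"] <- "total_cost"

print(total_costs_base)
```

The result should hold the same three totals as the dplyr version, just as a plain data.frame instead of a tibble.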
Then merge the totals into the patient data:
patient_costs <- left_join(patient_df, total_costs, by = "patient_id")
print(patient_costs)
##   patient_id gender total_cost
## 1          1      M         27
## 2          2      F         15
## 3          3      M         10
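One thing to watch for: a patient with no rows in the procedure table still appears after a left join, but with NA as the total. Whether that should become 0 depends on your data, so treat the following as a sketch under that assumption. It uses base R merge() with all.x = TRUE (the base counterpart of a left join), and patient 4 is a hypothetical patient added only to show the effect.

```r
# Hypothetical data: patient 4 has no procedures recorded
patient_df <- data.frame(patient_id = c(1, 2, 3, 4))
procedure_df <- data.frame(
  patient_id = c(1, 1, 1, 2, 2, 3),
  cost       = c(10, 5, 12, 10, 5, 10)
)

# Total cost per patient
total_costs <- aggregate(cost ~ patient_id, data = procedure_df, FUN = sum)
names(total_costs)[names(total_costs) == "cost"] <- "total_cost"

# all.x = TRUE keeps every patient, even those without a match
patient_costs <- merge(patient_df, total_costs, by = "patient_id", all.x = TRUE)

# Assumption: no recorded procedures means zero expenditure
patient_costs$total_cost[is.na(patient_costs$total_cost)] <- 0

print(patient_costs)
```

If a missing total should stay distinguishable from a true zero, skip the last replacement and keep the NA.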
Upvotes: 1