Reputation: 145
I have a data set with one variable for systolic blood pressure and one for diastolic blood pressure. I want to create a single categorical variable of blood pressure levels. This requires combining ranges of values from both variables, which is proving difficult.
ID Systolic Diastolic
1 130 80
2 118 76
3 120 80
4 115 74
5 184 107
6 114 69
7 95 72
This is the closest I've gotten, but I don't believe I'm on the right path. Can someone point me in the right direction?
df$BPLevel[Systolic < 120 | Diastolic < 80] <- "Normal"
df$BPLevel[120 < Systolic < 139 | 80 < Diastolic < 89] <- "Prehypertension"
df$BPLevel[Systolic >= 140 | Diastolic >= 90] <- "Hypertension"
df$BPLevel[Systolic == "." | Diastolic == "."] <- "Missing"
Upvotes: 0
Views: 278
Reputation: 692
In situations like this, my first attempt is usually dplyr's case_when() function.
library(dplyr)

df <- data.frame(ID = 1:7,
                 Systolic = c(130, 118, 120, 115, 184, 114, 95),
                 Diastolic = c(80, 76, 80, 74, 107, 69, 72))

# case_when() checks the conditions in order and uses the first one that is TRUE
df <- df %>%
  mutate(BPLevel = case_when(
    Systolic < 120 | Diastolic < 80 ~ "Normal",
    between(Systolic, 120, 139) | between(Diastolic, 80, 89) ~ "Prehypertension",
    Systolic >= 140 | Diastolic >= 90 ~ "Hypertension",
    TRUE ~ "Missing"
  ))
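One thing to watch: case_when() uses the first condition that matches, so with the | in the "Normal" line a reading such as 150/70 is labelled "Normal" because the diastolic value is below 80. If the intent is that the higher of the two readings decides the category (an assumption on my part; the question does not say), a sketch that checks the most severe category first could look like this. The bp data frame and its values are made up for illustration, and missing readings are assumed to be stored as NA rather than ".".

```r
library(dplyr)

# Hypothetical readings chosen to exercise each branch
bp <- data.frame(
  Systolic  = c(118, 130, 150, 184, NA),
  Diastolic = c(76, 80, 70, 107, 72)
)

bp <- bp %>%
  mutate(BPLevel = case_when(
    # Handle missing values first so an NA never falls into a real category
    is.na(Systolic) | is.na(Diastolic) ~ "Missing",
    # Either reading in the high range is enough
    Systolic >= 140 | Diastolic >= 90  ~ "Hypertension",
    # Not hypertensive, but at least one reading is elevated
    Systolic >= 120 | Diastolic >= 80  ~ "Prehypertension",
    # Everything left has both readings below the thresholds
    TRUE                               ~ "Normal"
  ))
```

Because each branch only sees rows that the earlier branches did not match, no upper bounds are needed, and 120 and 80 fall into "Prehypertension" here. Adjust the thresholds if your definition differs.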
The only other thing: in your example above, what should happen if Systolic = 120 or Diastolic = 80? The ranges in your code exclude those values, but the dplyr::between() function I used includes 120 and 80. Check ?dplyr::between for more details.
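To make the boundary behaviour concrete, here is a small check on a few made-up values. It compares between(), which is inclusive at both ends, with the strict range written in the question. R does not accept a chained comparison like 120 < Systolic < 139, so the strict version has to be written as two comparisons joined with &.

```r
library(dplyr)

# Hypothetical systolic values around the boundaries
x <- c(119, 120, 130, 139, 140)

# between() is inclusive at both ends: same as x >= 120 & x <= 139
inclusive <- between(x, 120, 139)

# Strict version of the range from the question, written as two comparisons
strict <- x > 120 & x < 139

data.frame(x, inclusive, strict)
```

With the strict version, 120 and 139 match neither "Normal" nor "Prehypertension", so they would end up in whatever the fallback category is.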
Does this help solve your problem?
Upvotes: 3