Kareem Abdalla
Kareem Abdalla

Reputation: 1

How to pass column values for each row in data frame into custom function in R?

Thanks in advance to the community for the wealth of knowledge you all provide.

I wrote the below function to calculate a point total for men as done in the Framingham Heart Study (total points used to determine a 10 yr risk score). What I'm trying to do now is apply this function to a data frame with a set of test patients to get the point total for each individual (i.e. by row). I'm assuming I need a for loop as I've tried apply(), mapply(), etc. and I'm not having any luck with that approach. Any thoughts on how to get the function to pull in the individual values for each attribute into the function by patient? I'm new to coding so I apologize if this is a very ignorant question (I tried searching with very limited success).

risk_men <- function(total_cholesterol, age, taking_bp_medication, systolic_bp, smoke, diabetes, hdl_cholesterol)
{
    tot_points = 0
    
    # Adding points based off of total cholesterol
    if(total_cholesterol < 160) tot_points = tot_points + 0
    else if(total_cholesterol >= 160 & total_cholesterol < 199) tot_points = tot_points + 1
    else if(total_cholesterol >= 200 & total_cholesterol < 239) tot_points = tot_points + 2
    else if(total_cholesterol >= 240 & total_cholesterol < 279) tot_points = tot_points + 3
    else if(total_cholesterol >= 280) tot_points = tot_points + 4
    
    # Adding or subtracting points based off of age
    if(age >= 30 & age < 34) tot_points = tot_points + 0
    else if(age >= 35 & age < 39) tot_points = tot_points + 2
    else if(age >= 40 & age < 44) tot_points = tot_points + 5
    else if(age >= 45 & age < 49) tot_points = tot_points + 6
    else if(age >= 50 & age < 54) tot_points = tot_points + 8
    else if(age >= 55 & age < 59) tot_points = tot_points + 10
    else if(age >= 60 & age < 64) tot_points = tot_points + 11
    else if(age >= 65 & age < 69) tot_points = tot_points + 12
    else if(age >= 70 & age < 74) tot_points = tot_points + 14
    else if(age >= 75) tot_points = tot_points + 15
    
    # Adding points for systolic blood pressure stratified by whether or not pt on bp meds
    if(taking_bp_medication == 0)
    {
        if(systolic_bp < 120) tot_points = tot_points - 2
        else if(systolic_bp >= 120 & systolic_bp < 129) tot_points = tot_points + 0
        else if(systolic_bp >= 130 & systolic_bp < 139) tot_points = tot_points + 1
        else if(systolic_bp >= 140 & systolic_bp < 159) tot_points = tot_points + 2
        else if(systolic_bp >= 160) tot_points = tot_points + 3
    } else if(taking_bp_medication == 1)
    {
        if(systolic_bp < 120) tot_points = tot_points + 0
        else if(systolic_bp >= 120 & systolic_bp < 129) tot_points = tot_points + 2
        else if(systolic_bp >= 130 & systolic_bp < 139) tot_points = tot_points + 3
        else if(systolic_bp >= 140 & systolic_bp < 159) tot_points = tot_points + 4
        else if(systolic_bp >= 160) tot_points = tot_points + 5
    }

    # Adding points for smoking
    if(smoke == 0) tot_points = tot_points + 0
    else if(smoke == 1) tot_points = tot_points + 4
    
    # Adding points for diabetes
    if(diabetes == 0) tot_points = tot_points + 0
    else if(diabetes == 1) tot_points = tot_points + 3
    
    # Adding or subtracting points based on HDL cholesterol levels
    if(hdl_cholesterol >= 60) tot_points = tot_points - 2
    else if(hdl_cholesterol >= 50 & hdl_cholesterol < 59) tot_points = tot_points - 1
    else if(hdl_cholesterol >= 45 & hdl_cholesterol < 49) tot_points = tot_points + 0
    else if(hdl_cholesterol >= 35 & hdl_cholesterol < 44) tot_points = tot_points + 1
    else if(hdl_cholesterol < 35) tot_points = tot_points + 2
    
    return(tot_points)
}

The patients data frame with the test patients is shown here:

id sex total_cholesterol age systolic_bp diastolic_bp smoke bmi diabetes taking_bp_medication hdl_cholesterol ldl_cholesterol
1 2 323 39 131.5 85 1 24.79 0 0 NA NA
2 1 264 49 127.5 81 0 25.16 0 0 68 152
3 1 200 57 117.5 80 0 25.41 0 0 NA NA
4 1 260 41 137.5 80 1 26.89 0 0 NA NA
5 2 312 62 162.5 93.5 0 25.33 0 1 NA NA
6 1 260 41 120 72.5 1 26.36 0 0 46 221

Upvotes: 0

Views: 170

Answers (2)

Gregor Thomas
Gregor Thomas

Reputation: 145785

Try something like this:

risk_men_vectorized <- function(total_cholesterol, age, taking_bp_medication, systolic_bp, smoke, diabetes, hdl_cholesterol)
{
   
    chol_points = cut(total_cholesterol, breaks = c(0, 160, 200, 240, 280, Inf), labels = 0:4, right FALSE)
    age_points = cut(age, breaks = c(0, seq(35, 70, by = 5), Inf), labels = c(0, 2, 5, 6, 8, 10, 11, 12, 14, 15), right = FALSE)
    
    # Adding points for systolic blood pressure stratified by whether or not pt on bp meds
    if(taking_bp_medication == 0) {
        bp_points = cut(systolic_bp, breaks = c(0, 120, 130, 140, 160, Inf), labels = c(-2, 0, 1, 2, 3), right = FALSE)
    } else if(taking_bp_medication == 1) {
        bp_points = cut(systolic_bp, breaks = c(0, 120, 130, 140, 160, Inf), labels = c(0, 2, 3, 4, 5), right = FALSE)
    }

    hdl_points = cut(hdl_cholesterol , breaks = c(0, 35, 45, 50, 60, Inf), labels = c(2, 1, 0, -1, -2), right FALSE)

    ## cut returns factors so we need to convert to integer before adding
    tot_points = as.integer(as.character(chol_points)) +
        as.integer(as.character(age_points)) +
        as.integer(as.character(bp_points)) +|
        as.integer(as.character(hdl_points)) +|    

    # Adding points for smoking
    ## we don't need an if() if the action is "do nothing" (like add 0)
    if(smoke == 1) tot_points = tot_points + 4
    
    # Adding points for diabetes
    if(diabetes == 1) tot_points = tot_points + 3
    
    return(tot_points)
}

This function is vectorized, so you can call it with:

your_data$points = with(your_data, risk_men_vectorized(total_cholesterol, age, taking_bp_medication, systolic_bp, smoke, diabetes, hdl_cholesterol))

Upvotes: 1

Sirius
Sirius

Reputation: 5429

Are you comfortable that your function's arguments will always match the column names of your data.frame?

If so rowwise might be your friend (I would also add triple dots ... to your function's signature in case things change in the future)

your.data %>% rowwise() %>% do.call( what=risk_men )

this would iterate over your data.frame row by row, giving each row (which incidently also is a list) do do.call which takes it as named arguments to the function risk_men. Which should be what you ask for.

Upvotes: 0

Related Questions