RikT

Reputation: 81

r - apply function to individual cells across rows of a dataframe

I have data containing serial temperature measurements on individual patients (each patient represented by a row) arranged in a dataframe:

df<-data.frame(H0=c(35.4, 36.0, 36.0, 36.4), H1=c(NA, 34.0, 33.4, NA), 
           H2=c(NA, 33.5, NA, 34.2), H3=c(32.9, NA, 34.0, NA),
           H4=c(NA, 33.1, NA, NA), H5=c(33.2, NA, NA, 32.8))

The target temperature is 33.0. I have written a simple function that returns the difference per measurement from the target temperature:

sumfun <- function(x) {
  if (x >= 33 & !is.na(x)) {
    x - 33
  } else if (x < 33 & !is.na(x)) {
    33 - x
  } else {
    0
  }
}

This works as expected on single values in the console.

What I would like to achieve is a column on the end of the dataframe that contains the sum of the overshoot/undershoot measurements per patient.

df$cumushoot<-rowSums(df[,1:6]-33, na.rm=TRUE)

This is close to what I want, except that I would like to add (not subtract) the undershoot values: it returns 2.5 for the first patient, where I would like 2.7. If I try:

df$cumushoot1<-rowSums(sumfun(df[,1:6])) 

it returns:

Warning message:
In if (x >= 33 & !is.na(x)) { :
the condition has length > 1 and only the first element will be used

Upvotes: 1

Views: 110

Answers (1)

Henrik

Reputation: 67778

This may be a possibility. The - 33 is recycled across the entire data frame; you then take the absolute value of the differences (abs) and sum each row:

df$dif <- rowSums(abs(df - 33), na.rm = TRUE)
#     H0   H1   H2   H3   H4   H5 dif
# 1 35.4   NA   NA 32.9   NA 33.2 2.7
# 2 36.0 34.0 33.5   NA 33.1   NA 4.6
# 3 36.0 33.4   NA 34.0   NA   NA 4.4
# 4 36.4   NA 34.2   NA   NA 32.8 4.8
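
As for the warning in the question: if () tests only a single condition, so it cannot be applied to a whole data frame at once. If you would rather keep a function of your own, a minimal sketch of a vectorised version might look like the following. The names sumfun_vec, target, shoot and cumushoot are made up here for illustration, and I am assuming that NA should count as 0, as in the original sumfun:

```r
# Example data from the question
df <- data.frame(H0 = c(35.4, 36.0, 36.0, 36.4), H1 = c(NA, 34.0, 33.4, NA),
                 H2 = c(NA, 33.5, NA, 34.2), H3 = c(32.9, NA, 34.0, NA),
                 H4 = c(NA, 33.1, NA, NA), H5 = c(33.2, NA, NA, 32.8))

# Vectorised variant of sumfun (hypothetical name).
# ifelse() evaluates its test element by element, unlike if ().
sumfun_vec <- function(x, target = 33) {
  ifelse(is.na(x), 0, abs(x - target))
}

# sapply() applies the function column by column and returns a numeric matrix
shoot <- sapply(df[, 1:6], sumfun_vec)

# Sum the absolute deviations per patient (row)
cumushoot <- rowSums(shoot)
```

This should give the same per-patient totals as the rowSums(abs(df - 33), na.rm = TRUE) one-liner, which remains the shorter option.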

Upvotes: 3
