DonCharlie

Reputation: 53

How to set a function that makes operations between columns (within a dataframe)

Problem

I'm writing a function that describes how a time series changes from one time point to the next. It should say whether the value in a given column is more than, less than, or equal to the value in the previous column, and print the result. The result could go in the same data frame or in a different object. I'm doing this to reshape the data for survival analysis.

What has been done

I already wrote an if-else ladder that looks like the one below, where x is column i of a data frame and y is the column just before it (i-1). However, I am clueless about how to define the first line of the function so that it actually does this for each column of the data frame (counting from the second one), and so that it doesn't crash on the last column.

func_name <- function (x, columns) {
if (x == NA) {
print("gone")
} else if (x < y) {
print("less")
} else if (x > y) {
print("more")
} else if (x = y) {
print("same")
} else {
print ("")
}
}

What is being expected

Ideally, it would transform something like this:

Id <- c(1,2,3)
Time1 <- c(3,3,4)
Time2 <- c(2,5,4)
Time3 <- c(1,5,8)
df <- data.frame(Id,Time1,Time2,Time3)
df

Into something like this:

Id <- c(1,2,3)
Time1 <- c(3,3,4)
Time2 <- c("Less","More","Same")
Time3 <- c("Less","Same","More")
df2 <- data.frame(Id,Time1,Time2,Time3)
df2

Any help is highly appreciated!

Solutions: Both @Andrew's and @Cole's solutions solve the problem!

Upvotes: 1

Views: 114

Answers (2)

Cole

Reputation: 11255

Here's the use of mapply with an anonymous function inside:

df <- data.frame(Id,Time1,Time2,Time3)

df[, 3:4] <- mapply(function(x, y) ifelse(y < x , 'Less', ifelse(y > x, 'More', 'Same'))
                    , df[, 2:3]
                    , df[, 3:4])
df

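mapply walks along the columns of both inputs in parallel and applies the function to each pair. In other words, I am comparing df[, 2] with df[, 3], and then df[, 3] with df[, 4]. The right-hand side is evaluated in full before the assignment, so both comparisons use the original values. I could have also done something like: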

fx_select <- function(x, y) {
ifelse(y < x, 'Less', ifelse(y > x, 'More', 'Same'))
}

df[, 3:4] <- mapply(fx_select, df[, 2:3], df[, 3:4])

And here's one more approach:

df[3:4] <- lapply(sign(df[2:3] - df[3:4]) + 2,
       function(x) c('More', 'Same', 'Less')[x]
       )
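
To make the indexing trick in that last approach easier to follow, here is a small sketch on a single pair of columns. The vectors prev and cur are illustrative names that simply hold the Time1 and Time2 values from the question; sign() returns -1, 0, or 1, so adding 2 turns it into a position in the label vector.

```r
# Illustrative vectors: Time1 (previous) and Time2 (current) from the question
prev <- c(3, 3, 4)
cur  <- c(2, 5, 4)

# sign(prev - cur) is 1 when the value dropped, -1 when it rose, 0 when unchanged;
# adding 2 maps these to positions 3, 1, 2 in the label vector
idx <- sign(prev - cur) + 2
labels <- c('More', 'Same', 'Less')[idx]
labels
```

Note that an NA in either column gives an NA index, so this version returns NA rather than a "gone" label.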

Upvotes: 2

Andrew

Reputation: 5138

This sounds like what you are looking for. It is not a custom function, but it can be adapted if you need one. Hope this helps!

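# Select the columns you need. NOTE: used [-1] to remove starting time column
cols <- grep("Time", names(df), fixed = TRUE)[-1]

# Use case_when with your conditions (requires the dplyr package).
# [[ ]] extracts each column as a plain vector; i - 1 assumes the
# previous time point sits in the column immediately to the left.
df[cols] <- lapply(cols, function(i) dplyr::case_when(
  is.na(df[[i]]) ~ "Gone",
  df[[i]] > df[[i - 1]] ~ "More",
  df[[i]] < df[[i - 1]] ~ "Less",
  df[[i]] == df[[i - 1]] ~ "Same"
))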

df
  Id Time1 Time2 Time3
1  1     3  Less  Less
2  2     3  More  Same
3  3     4  Same  More
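
If a reusable function is needed, one possible adaptation is sketched below in base R. The name describe_changes and its arguments are hypothetical, and it assumes, as above, that each column in cols has its previous time point immediately to its left.

```r
# Hypothetical helper: label each column in `cols` relative to the column before it.
describe_changes <- function(data, cols) {
  original <- data  # always compare against the untouched numeric values
  for (i in cols) {
    cur  <- original[[i]]
    prev <- original[[i - 1]]
    # A missing current value is "Gone"; if only the previous value is
    # missing, the comparison is NA and the label stays NA.
    data[[i]] <- ifelse(is.na(cur), "Gone",
                 ifelse(cur > prev, "More",
                 ifelse(cur < prev, "Less", "Same")))
  }
  data
}

# Usage with the data from the question
df <- data.frame(Id = c(1, 2, 3), Time1 = c(3, 3, 4),
                 Time2 = c(2, 5, 4), Time3 = c(1, 5, 8))
df2 <- describe_changes(df, cols = 3:4)
df2
```

Keeping a copy of the original data inside the function means the result does not depend on the order in which the columns are processed.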

Upvotes: 3
