egnha

Reputation: 1197

Using dplyr to apply a function of several columns of an R data frame

Using dplyr’s “verbs,” how can I apply a (general) function to a column of an R data frame, if that function depends on multiple columns of the data frame?

Here’s a concrete example of the type of situation that I face. I have a data frame like this:

df <- data.frame(
    d1 = c('2016-01-30 08:40:00 UTC', '2016-03-06 09:30:00 UTC'),
    d2 = c('2016-01-30 16:20:00 UTC', '2016-03-06 13:20:00 UTC'),
    tz = c('America/Los_Angeles', 'America/Chicago'), stringsAsFactors = FALSE)

I want to convert the UTC times to local times, to get a data frame like this:

                   d1                  d2                  tz
1 2016-01-30 00:40:00 2016-01-30 08:20:00 America/Los_Angeles
2 2016-03-06 03:30:00 2016-03-06 07:20:00     America/Chicago

To do this, I would like to apply the following function, which converts UTC time to local time using the lubridate library, to the date columns:

library(lubridate)

# Parse a UTC timestamp and render it in the time zone tz
getLocTime <- function(d, tz) {
    as.character(with_tz(ymd_hms(d), tz))
}

Using dplyr, it seems that the transformation

df %>% mutate(d1 = getLocTime(d1, tz), d2 = getLocTime(d2, tz))

should do the trick. However, it fails with the complaint Error in eval(expr, envir, enclos): invalid 'tz' value.

The only way I've managed to do the conversion to local time is with the rather ungainly assignment

df[c('d1', 'd2')] <- lapply(c('d1', 'd2'),
                            function(x) unlist(Map(getLocTime, df[[x]], df$tz)))

Is there in fact a natural way to perform this transformation using dplyr idioms?

Upvotes: 0

Views: 1586

Answers (1)

thothal

Reputation: 20399

As mentioned by lukeA, the problem occurs because getLocTime is not vectorized over tz: with_tz expects a single time zone, but mutate passes it the whole tz column. So either you vectorize the function as proposed, or you apply it rowwise:

 df %>% rowwise() %>% mutate(d1 = getLocTime(d1, tz), d2 = getLocTime(d2, tz))

which makes sure that getLocTime is called with a single value from each column rather than with whole vectors. I leave it up to you to determine which approach is faster.
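
For comparison, here is a rough sketch of the two non-rowwise routes, assuming the df and getLocTime from the question. The names getLocTimeV, res_v and res_g are made up for illustration, and this is only one way to do it. The first variant wraps getLocTime in mapply so that each date is paired with its own time zone; the second groups by tz so that with_tz sees a single zone per group.

```r
library(dplyr)
library(lubridate)

df <- data.frame(
    d1 = c('2016-01-30 08:40:00 UTC', '2016-03-06 09:30:00 UTC'),
    d2 = c('2016-01-30 16:20:00 UTC', '2016-03-06 13:20:00 UTC'),
    tz = c('America/Los_Angeles', 'America/Chicago'), stringsAsFactors = FALSE)

# Function from the question: handles one time zone per call
getLocTime <- function(d, tz) {
    as.character(with_tz(ymd_hms(d), tz))
}

# Variant 1 (hypothetical helper): pair each date with its own time zone
getLocTimeV <- function(d, tz) {
    mapply(getLocTime, d, tz, USE.NAMES = FALSE)
}
res_v <- df %>%
    mutate(d1 = getLocTimeV(d1, tz), d2 = getLocTimeV(d2, tz))

# Variant 2: one group per time zone, so tz[1] is the zone of the whole group
res_g <- df %>%
    group_by(tz) %>%
    mutate(d1 = getLocTime(d1, tz[1]), d2 = getLocTime(d2, tz[1])) %>%
    ungroup()
```

The grouped variant calls getLocTime once per distinct time zone instead of once per row, so it may scale better when there are many rows and few zones, but I have not benchmarked it.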

Upvotes: 3
