Reputation: 41
I am new to R and writing functions. I've spent hours trying to figure this out and searching Google, but can't seem to find anything. Hopefully you can help? I want to use lapply() to analyze the data below using the ts() function.
My code looks like this:
library(dplyr)
#group out different sites
mylist <- data %>%
group_by(Site)
mylist
#Write ts() function
alpha_function = function(x) {
ts_alpha = ts(x$Temperature, frequency=12, start=c(0017, 7, 20))
return(data.frame(ts_alpha))
}
#Run list through lapply()
results = lapply(mylist, alpha_function())
But I get this error: argument "x" is missing with no default.
I have a data set that looks like:
Site(factor) Date(POSIXct) Temperature(num)
1 0017-03-04 2.73
2 0017-03-04 3.73
3 0017-03-04 2.71
4 0017-03-04 2.22
5 0017-03-04 2.89
etc.
I have over 3,000 temperature readings at different dates for 5 different sites.
Thanks in advance!
Upvotes: 0
Views: 328
Reputation: 12839
A recommended approach when working with dplyr and the tidyverse is to keep things in data frames:
library(tidyverse)
library(zoo)
dat %>%
  # nest(data = -Site) is the tidyr >= 1.0 spelling of nest(-Site)
  nest(data = -Site) %>%
  mutate(data = map(data, ~ zoo(.x$Temperature, .x$Date)))
# # A tibble: 5 x 2
# Site data
# <fct> <list>
# 1 a <S3: zoo>
# 2 b <S3: zoo>
# 3 c <S3: zoo>
# 4 d <S3: zoo>
# 5 e <S3: zoo>
Or, if we must have ts rather than zoo objects, we can use as.ts(zoo(...)).
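For context on the ts route: a ts object keeps no per-observation time stamps, only a start, an end and a frequency, so it suits readings that are evenly spaced. The sketch below uses base R only (no zoo) to check the spacing and build a ts. The data values are made up, and treating one day as the seasonal period (48 half-hourly readings) is an assumption for illustration, not something the question specifies.

```r
# Hypothetical half-hourly readings for a single site (values are made up).
dates <- seq(as.POSIXct("2017-03-04 12:00:00", tz = "UTC"),
             by = "30 min", length.out = 96)
temperature <- 5 + sin(seq_along(dates) / 48 * 2 * pi)

# Check that consecutive time stamps are evenly spaced (in seconds).
step_secs <- unique(diff(as.numeric(dates)))
is_regular <- length(step_secs) == 1
stopifnot(is_regular)

# Assumption: one day is the seasonal period, so the frequency is the
# number of observations per day (86400 seconds / 1800 seconds = 48).
obs_per_day <- 86400 / step_secs
temp_ts <- ts(temperature, frequency = obs_per_day)

frequency(temp_ts)
```

If the readings are not evenly spaced, keeping them as zoo objects (which store the actual time stamps) is likely the safer choice.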
In case we still prefer regular lists, we can use base split() and lapply():
dat %>%
  split(.$Site) %>%
  lapply(function(.x) zoo(.x$Temperature, .x$Date))
# List of 5
# $ a:‘zoo’ series from 2017-03-04 12:00:00 to 2017-05-06 00:30:00
# Data: num [1:3000] 5.37 5.49 5.32 5.44 5.43 ...
# Index: POSIXct[1:3000], format: "2017-03-04 12:00:00" ...
# $ b:‘zoo’ series from 2017-03-04 12:00:00 to 2017-05-06 00:30:00
# Data: num [1:3000] 5.36 5.22 5.15 5.41 5.41 ...
# Index: POSIXct[1:3000], format: "2017-03-04 12:00:00" ...
# $ c:‘zoo’ series from 2017-03-04 12:00:00 to 2017-05-06 00:30:00
# Data: num [1:3000] 6.08 6.11 6.22 6.13 6.03 ...
# Index: POSIXct[1:3000], format: "2017-03-04 12:00:00" ...
# $ d:‘zoo’ series from 2017-03-04 12:00:00 to 2017-05-06 00:30:00
# Data: num [1:3000] 5.06 4.96 5.23 5.16 5.29 ...
# Index: POSIXct[1:3000], format: "2017-03-04 12:00:00" ...
# $ e:‘zoo’ series from 2017-03-04 12:00:00 to 2017-05-06 00:30:00
# Data: num [1:3000] 5.1 5.08 5.14 5.13 5.22 ...
# Index: POSIXct[1:3000], format: "2017-03-04 12:00:00" ...
Here, dat is generated as follows:
n_sites <- 5
n_dates <- 3000
set.seed(123)
dat <- tibble(
  Site = factor(rep(letters[1:n_sites], each = n_dates)),
  Date = rep(seq.POSIXt(as.POSIXct("2017-03-04 12:00:00"), by = "30 min", length.out = n_dates), times = n_sites),
  Temperature = as.vector(replicate(n_sites, runif(1, 5, 6) + cumsum(rnorm(n_dates, 0, 0.1))))
)
Upvotes: 1
Reputation: 2406
I'm not exactly an R guy, but I would wager this line:
results = lapply(mylist, alpha_function())
should be:
results = lapply(mylist, alpha_function)
What you have calls alpha_function at the point where you are trying to supply it to lapply, when what you (most likely) want is to pass a reference to the function without calling it. (The error you are getting indicates that alpha_function needs an x parameter when it is called as alpha_function().)
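A minimal base R sketch of the difference between passing a function and calling it. The toy data frame is made up for illustration. One further point: group_by() returns a grouped data frame rather than a list, so lapply() over it would iterate over columns. The sketch therefore uses split() to get one data frame per site. The start argument is left out here because ts() documents it as a single number or a pair of numbers.

```r
# Small stand-in for the asker's data (values are made up).
toy <- data.frame(
  Site = factor(rep(c("a", "b"), each = 4)),
  Temperature = c(2.73, 3.73, 2.71, 2.22, 2.89, 3.10, 2.95, 3.02)
)

# Same idea as the question's function; frequency = 12 is kept from the question.
alpha_function <- function(x) {
  ts(x$Temperature, frequency = 12)
}

# split() returns a genuine list with one data frame per site.
by_site <- split(toy, toy$Site)

# Pass the function itself (no parentheses); lapply() calls it once per element.
results <- lapply(by_site, alpha_function)

names(results)     # site labels
class(results$a)   # class of the first site's series

# With parentheses, alpha_function() is evaluated first, with no argument,
# which triggers the "missing" error before lapply() can do anything.
msg <- tryCatch(lapply(by_site, alpha_function()), error = conditionMessage)
msg
```

Each element of results is a ts object named after its site, so results$a or results[["a"]] retrieves a single site's series.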
Upvotes: 2