user3021648

Reputation: 67

Writing a Function to Lapply Over Large List

I have a large time-series data frame covering several days. I wrote some code that works for one day at a time, and now I want to adapt it so that it runs for all days. For each day, the data frame has one column containing that day's sunrise time and another containing its sunset time. I want to use these times to split each day into day hours and night hours. My sunrise and sunset columns look like this and differ from day to day:

Sunrise              Sunset
2010-01-19 08:55:12 2010-01-19 17:26:34

I have used split to divide the data frame by date, which gives a large list of 10 elements (one per day):

# Splits data frame by date 
sepdays<- split(df, df$Date)

# Function to split each day into day and night hours
daynight <- function(){
rise <- as.character(df$Sunrise[1])
rise <- substr(rise, 12,19)
set <- as.character(df$Sunset[1])
set <- substr(set, 12,19)
day <- df[df$Time>rise & df$Time<set, ]
df.night1<-df[df$Time<rise,]
df.night2<-df[df$Time>set,]
night <- merge.data.frame(df.night1,df.night2, sort = TRUE, all.x = TRUE, all.y=TRUE)
return(table(day$Activity))
}

# Apply function over list of days
lapply(sepdays,daynight)

When I run lapply I get an unused argument error:

Error in FUN(X[[1L]], ...) : unused argument (X[[1]])

I am also not sure whether this is the best way to get the specific sunrise and sunset times for each matching day. I realise that my function has no arguments, but I am new to R and not really sure what I'm doing.

Here is what my data looks like:

Date       Time      Activity  Sunrise              Sunset
2010-01-19 23:58:00  1         2010-01-19 08:55:12  2010-01-19 17:26:34
2010-01-19 23:59:00  1         2010-01-19 08:55:12  2010-01-19 17:26:34
2010-01-19 00:00:00  0         2010-01-20 08:54:13  2010-01-20 17:28:11
2010-01-19 00:01:00  0         2010-01-20 08:54:13  2010-01-20 17:28:11
2010-01-20 00:02:00  1         2010-01-20 08:54:13  2010-01-20 17:28:11
2010-01-20 00:03:00  0         2010-01-20 08:54:13  2010-01-20 17:28:11
2010-01-20 00:04:00  1         2010-01-20 08:54:13  2010-01-20 17:28:11

I would like my output to contain a table of activity for each date, e.g.:

2010-01-19
1  0
2  0

2010-01-20
1  0
2  3

Upvotes: 0

Views: 979

Answers (1)

Roland

Reputation: 132576

I'm not quite sure, because your question is pretty vague, but I think you could do this:

DF <- read.table(text="Date,       Time,      Activity,  Sunrise,              Sunset
2010-01-19, 23:58:00,  1,         2010-01-19 08:55:12,  2010-01-19 17:26:34
2010-01-19, 23:59:00,  1,         2010-01-19 08:55:12,  2010-01-19 17:26:34
2010-01-19, 00:00:00,  0,         2010-01-19 08:55:12,  2010-01-19 17:26:34
2010-01-19, 00:01:00,  0,         2010-01-19 08:55:12,  2010-01-19 17:26:34
2010-01-19, 09:01:00,  0,         2010-01-19 08:55:12,  2010-01-19 17:26:34
2010-01-20, 00:02:00,  1,         2010-01-20 08:54:13,  2010-01-20 17:28:11
2010-01-20, 00:03:00,  0,         2010-01-20 08:54:13,  2010-01-20 17:28:11
2010-01-20, 00:04:00,  1,         2010-01-20 08:54:13,  2010-01-20 17:28:11", header=TRUE, sep=",")

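# Parse the timestamps. format= is passed by name because the second
# positional argument of as.POSIXct() is tz, not format.
DF$datetime <- as.POSIXct(paste(DF$Date, DF$Time), format="%Y-%m-%d %H:%M:%S", tz="GMT")
DF$date <- as.Date(DF$datetime)
DF$Sunrise <- as.POSIXct(DF$Sunrise, format="%Y-%m-%d %H:%M:%S", tz="GMT")
DF$Sunset <- as.POSIXct(DF$Sunset, format="%Y-%m-%d %H:%M:%S", tz="GMT")

# TRUE for rows that fall strictly between that row's sunrise and sunset
DF$day <- (DF$datetime > DF$Sunrise) & (DF$datetime < DF$Sunset)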

#        Date      Time Activity             Sunrise              Sunset            datetime   day       date
#1 2010-01-19  23:58:00        1 2010-01-19 08:55:12 2010-01-19 17:26:34 2010-01-19 23:58:00 FALSE 2010-01-19
#2 2010-01-19  23:59:00        1 2010-01-19 08:55:12 2010-01-19 17:26:34 2010-01-19 23:59:00 FALSE 2010-01-19
#3 2010-01-19  00:00:00        0 2010-01-19 08:55:12 2010-01-19 17:26:34 2010-01-19 00:00:00 FALSE 2010-01-19
#4 2010-01-19  00:01:00        0 2010-01-19 08:55:12 2010-01-19 17:26:34 2010-01-19 00:01:00 FALSE 2010-01-19
#5 2010-01-19  09:01:00        0 2010-01-19 08:55:12 2010-01-19 17:26:34 2010-01-19 09:01:00  TRUE 2010-01-19
#6 2010-01-20  00:02:00        1 2010-01-20 08:54:13 2010-01-20 17:28:11 2010-01-20 00:02:00 FALSE 2010-01-20
#7 2010-01-20  00:03:00        0 2010-01-20 08:54:13 2010-01-20 17:28:11 2010-01-20 00:03:00 FALSE 2010-01-20
#8 2010-01-20  00:04:00        1 2010-01-20 08:54:13 2010-01-20 17:28:11 2010-01-20 00:04:00 FALSE 2010-01-20

table(DF[,c("date", "Activity", "day")])

#, , day = FALSE
#
#            Activity
#date         0 1
#  2010-01-19 2 2
#  2010-01-20 1 2
#
#, , day = TRUE
#
#            Activity
#date         0 1
#  2010-01-19 1 0
#  2010-01-20 0 0

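Comparing the whole datetime column against Sunrise and Sunset in one vectorised step is easier to read and much more efficient than splitting the data frame and applying a function to each day.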
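
If you would still like one table per date, as in the question, the lapply approach can be made to work. The error occurs because lapply calls FUN(X[[i]], ...), passing each list element as the first argument, while daynight was defined with no parameters. Below is a minimal sketch of a version that takes the per-day data frame as an argument. The parameter name d, the helper variables, and the "day"/"night" labels are my own choices, and the example data is copied from the answer above; it assumes Sunrise and Sunset are constant within each date.

```r
# Example data in the same shape as DF above (values copied from the answer).
DF <- data.frame(
  Date     = c(rep("2010-01-19", 5), rep("2010-01-20", 3)),
  Time     = c("23:58:00", "23:59:00", "00:00:00", "00:01:00", "09:01:00",
               "00:02:00", "00:03:00", "00:04:00"),
  Activity = c(1, 1, 0, 0, 0, 1, 0, 1),
  Sunrise  = c(rep("2010-01-19 08:55:12", 5), rep("2010-01-20 08:54:13", 3)),
  Sunset   = c(rep("2010-01-19 17:26:34", 5), rep("2010-01-20 17:28:11", 3)),
  stringsAsFactors = FALSE
)

# lapply() passes each element of the list as the first argument,
# so the function needs a parameter (here `d`) to receive it.
daynight <- function(d) {
  fmt <- "%Y-%m-%d %H:%M:%S"
  when    <- as.POSIXct(paste(d$Date, d$Time), format = fmt, tz = "GMT")
  # Assumption: sunrise/sunset are the same for every row of one date,
  # so the first row is representative.
  sunrise <- as.POSIXct(d$Sunrise[1], format = fmt, tz = "GMT")
  sunset  <- as.POSIXct(d$Sunset[1],  format = fmt, tz = "GMT")

  is_day <- when > sunrise & when < sunset
  # Fixed factor levels keep both columns even when one period is empty.
  period <- factor(ifelse(is_day, "day", "night"), levels = c("day", "night"))

  table(Activity = d$Activity, period = period)
}

# Split by date, then tabulate each day separately.
sepdays <- split(DF, DF$Date)
result  <- lapply(sepdays, daynight)
result
```

The result is a named list with one Activity-by-period table per date, so result[["2010-01-19"]] gives the counts for that day alone. The counts should agree with the three-way table above; the vectorised version is still likely to be the better choice for large data.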

Upvotes: 1
