Reputation: 2101
I have a data frame of time series data for many locations (rows as dates and columns as locations).
Dates <- c(1950, 1960, 1970, 1980)
Well1 <- c(25, 30, 40, 28)
Well2 <- c(26, 29, 38, 25)
Well3 <- c(20, 25, 35, 19)
Inputs <- cbind.data.frame(Dates, Well1, Well2, Well3)
I have a data frame of new dates for each location.
Well1new <- c(1955, 1965, 1975, 1985)
Well2new <- Well1new + 1
Well3new <- Well2new + 1
NewDates <- cbind.data.frame(Well1new, Well2new, Well3new)
I need to interpolate each location's series in Inputs at that location's new dates and return the results as a data frame. I can calculate it easily for one location at a time:
approx(Inputs$Dates, Inputs$Well1, NewDates$Well1new, rule = 2)$y
[1] 27.5 35.0 34.0 28.0
approx(Inputs$Dates, Inputs$Well2, NewDates$Well2new, rule = 2)$y
[1] 27.8 34.4 30.2 25.0
approx(Inputs$Dates, Inputs$Well3, NewDates$Well3new, rule = 2)$y
[1] 23.5 32.0 23.8 19.0
In reality, though, I will have thousands of locations. I tried to use apply to loop over the columns in NewDates, but I could not work out how to index the matching Inputs columns. I would also like to avoid for loops, as speed is a concern (or is apply no faster than a for loop?).
Upvotes: 0
Views: 786
Reputation: 93938
Take advantage of Map to loop over both objects.
Map(approx, xout=NewDates, x=Inputs["Dates"], y=Inputs[-1], rule=2)
Output:
#$Well1new
#$Well1new$x
#[1] 1955 1965 1975 1985
#
#$Well1new$y
#[1] 27.5 35.0 34.0 28.0
#...
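Since the question asks for a data frame rather than a nested list, here is a minimal sketch of one way to keep only the interpolated y values and bind them together. It rebuilds the question's example data so it runs on its own; the result name Interpolated is just an illustrative choice, and the sketch assumes the columns of NewDates are in the same order as the well columns of Inputs.

```r
# Example data from the question
Inputs <- data.frame(
  Dates = c(1950, 1960, 1970, 1980),
  Well1 = c(25, 30, 40, 28),
  Well2 = c(26, 29, 38, 25),
  Well3 = c(20, 25, 35, 19)
)
Well1new <- c(1955, 1965, 1975, 1985)
NewDates <- data.frame(
  Well1new = Well1new,
  Well2new = Well1new + 1,
  Well3new = Well1new + 2
)

# Pair each column of new dates with the matching column of values,
# keeping only the interpolated y values from approx()
res <- Map(
  function(xout, y) approx(Inputs$Dates, y, xout, rule = 2)$y,
  NewDates, Inputs[-1]
)

# The list takes its names from NewDates, so each element becomes a column
Interpolated <- as.data.frame(res)
Interpolated
```

The columns should match the per-well results shown in the question, for example 27.5 35.0 34.0 28.0 for Well1new.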
Upvotes: 1
Reputation: 301
You can use lapply to perform the calculation for all your wells as follows. The performance of lapply vs for loops is a long debate with mixed opinions. I personally still like to use lapply where feasible.
all.wells <- names(Inputs)[-1]
lapply(all.wells, function(x) {
  approx(Inputs$Dates, Inputs[, x], NewDates[, paste0(x, "new")], rule = 2)$y
})
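The lapply call above returns an unnamed list. If a data frame is wanted, a possible variation is to use vapply, which binds the per-well results into a matrix with one column per well. This is only a sketch: it rebuilds the question's example data, assumes every column in NewDates is named after its well with a "new" suffix, and uses Interpolated as an illustrative result name.

```r
# Example data from the question
Inputs <- data.frame(
  Dates = c(1950, 1960, 1970, 1980),
  Well1 = c(25, 30, 40, 28),
  Well2 = c(26, 29, 38, 25),
  Well3 = c(20, 25, 35, 19)
)
Well1new <- c(1955, 1965, 1975, 1985)
NewDates <- data.frame(
  Well1new = Well1new,
  Well2new = Well1new + 1,
  Well3new = Well1new + 2
)

all.wells <- names(Inputs)[-1]

# vapply checks that each well yields nrow(NewDates) numeric values
# and returns a matrix whose columns are named after the wells
interp <- vapply(
  all.wells,
  function(w) {
    approx(Inputs$Dates, Inputs[[w]], NewDates[[paste0(w, "new")]], rule = 2)$y
  },
  numeric(nrow(NewDates))
)

Interpolated <- as.data.frame(interp)
Interpolated
```

Looking columns up by name rather than by position means the wells do not have to appear in the same order in both data frames.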
Upvotes: 0