I need to interpolate annual values from data recorded at 5-year intervals. So far I have found how to do it for one observation using approx(), but I have a large data set, and when I try to use ddply() to apply it to each row, I keep getting error messages no matter what I try in the last line of code.
For example:
town <- data.frame(name = c("a","b","c"), X1990 = c(100,300,500), X1995=c(200,400,700))
d1990 <-c(1990)
d1995 <-c(1995)
town_all <- cbind(town,d1990,d1995)
library(plyr)
Input <- data.frame(town_all)
x <- c(town_all$X1990, town_all$X1995)
y <- c(town_all$d1990, town_all$d1995)
approx_frame <- function(df) (approx(x=x, y=y, method="linear", n=6, ties="mean"))
ddply(Input, town_all$X1990, approx_frame)
Also, if you know of a function that does geometric interpolation, that would be great. (I was only able to find examples of the spline and constant methods.)
I would first put the data in long format (each column corresponds to a variable, so one column for 'year' and one for 'value'). I use data.table here, but the same approach works with dplyr or any other split-apply-combine method. The interp function below does geometric interpolation, with a constant growth rate calculated separately for each interval.
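Written out, the constant-rate rule for a single interval looks like this. Here $N$ and $P$ are the values at the start and end of an interval of length $T$ years (they correspond to N, P and tt in the code), and $t$ is my own symbol for the number of years elapsed since the start of the interval:

```latex
r = \frac{\ln P - \ln N}{T}, \qquad
v(t) = N e^{r t} = N \left(\frac{P}{N}\right)^{t/T}, \quad t = 0, 1, \dots, T
```

For town a between 1990 and 1995, $N = 100$, $P = 200$ and $T = 5$, so each year's value is the previous year's value times $2^{1/5} \approx 1.149$.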
## Sample data (added one more year)
towns <- data.frame(name=c('a', 'b', 'c'),
                    x1990=c(100, 300, 500),
                    x1995=c(200, 400, 700),
                    x2000=c(555, 777, 999))
## First, transform data from wide -> long format, clean year column
library(data.table) # or use reshape2::melt
towns <- melt(as.data.table(towns), id.vars='name', variable.name='year') # wide -> long
towns[, year := as.integer(sub('[[:alpha:]]', '', year))] # convert years to integers
## Function to interpolate at constant rate for each interval
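interp <- function(yrs, values) {
    tt <- diff(yrs)               # interval lengths in years
    N <- head(values, -1L)        # value at the start of each interval
    P <- tail(values, -1L)        # value at the end of each interval
    r <- (log(P) - log(N)) / tt   # constant growth rate within each interval
    ## values for the first `time` years of an interval (the end year is excluded)
    const_rate <- function(N, r, time) N*exp(r*(0:(time-1L)))
    list(year=seq.int(min(yrs), max(yrs), by=1L),
         value=c(unlist(Map(const_rate, N, r, tt)), tail(P, 1L)))  # append the final year's value
}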
## geometric interpolation for each town
res <- towns[, interp(year, value), by=name]
## Plot
library(ggplot2)
ggplot(res, aes(year, value, color=name)) +
    geom_line(lwd=1.3) + theme_bw() +
    geom_point(data=towns, cex=2, color='black') + # add the original data points that were interpolated between
    scale_color_brewer(palette='Pastel1')
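For what it's worth, the same thing can probably be done in base R with approx(), which is closer to what the question was attempting. This is only a sketch under a couple of assumptions: annual_interp is a name I made up for illustration, the year columns are assumed to be named like X1990, and it relies on the fact that geometric interpolation is just linear interpolation of log(value), transformed back with exp().

```r
## Hypothetical helper (not part of the answer above): interpolate one
## town's values to annual steps, linearly or geometrically.
annual_interp <- function(years, values, geometric = FALSE) {
    out_years <- seq.int(min(years), max(years), by = 1L)
    if (geometric) {
        ## linear interpolation on the log scale = constant growth rate per interval
        y <- exp(approx(years, log(values), xout = out_years)$y)
    } else {
        y <- approx(years, values, xout = out_years)$y
    }
    data.frame(year = out_years, value = y)
}

## Wide data as in the question
town <- data.frame(name = c("a", "b", "c"),
                   X1990 = c(100, 300, 500),
                   X1995 = c(200, 400, 700))

## Recover the years from the column names (assumes an "X" prefix)
years <- as.integer(sub("^X", "", names(town)[-1]))

## Interpolate row by row, then stack the per-town results
res_list <- lapply(seq_len(nrow(town)), function(i) {
    out <- annual_interp(years, unlist(town[i, -1]), geometric = TRUE)
    data.frame(name = town$name[i], year = out$year, value = out$value)
})
res_base <- do.call(rbind, res_list)
```

With geometric = FALSE the helper does plain linear interpolation instead, so town a would go 100, 120, 140, 160, 180, 200 between 1990 and 1995.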