Reputation: 1013
Vector V1 contains 56 observations of X, and vector BS contains a bootstrapped sample of V1 of length 100000. I would like to interpolate linearly between points in BS to fill in any missing values. For example, V1 contains no 0.27 values, and hence neither does BS, but BS does contain some 0.26 and 0.28 values. I would like the interpolation to create a few 0.27 values and add those to BS, and likewise for any other missing values between the two extremes of the vector.
V1 <- c(0.18, 0.2, 0.24, 0.35, -0.22, -0.17, 0.28, -0.28, -0.14, 0.03, 0.87, -0.2, 0.06, -0.1, -0.72, 0.18, 0.01, 0.31, -0.36, 0.61, -0.16, -0.07, -0.13, 0.01, -0.09, 0.26, -0.14, 0.08, -0.62, -0.2, 0.3, -0.21, -0.11, 0.05, 0.06, -0.28, -0.27, 0.17, 0.42, -0.05, -0.15, 0.05, -0.07, -0.22, -0.34, 0.16, 0.34, 0.1, -0.12, 0.24, 0.45, 0.37, 0.61, 0.9, -0.25, 0.02)
BS <- sample(V1, 100000, replace=TRUE)
The approxfun functions do not seem to help, as they are for interpolating between data sets. I have found a few questions/answers covering interpolation between different data sets, but not within one data set. Thank you for your help.
EDIT: please note I do not want to fit a normal distribution (or any other) to create those points.
Upvotes: 2
Views: 3147
Reputation: 35324
You can use approx() (or approxfun()) to do this by treating BS as the y-coordinate and using sequential x-coordinates:
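set.seed(1L); BS <- sample(V1, 1e5L, replace=TRUE);
# x = positions 1..length(BS), y = BS; n gives one interpolated point per adjacent pair
res <- approx(seq_along(BS), BS, n=length(BS)*2L-1L)$y;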
The specification of n here is important. It ensures that exactly one interpolated value will be produced halfway between each adjacent pair of input values.
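To see why n = 2*length - 1 gives exactly one midpoint per adjacent pair, here is a minimal sketch on a three-element vector. The vector y and the names orig and mid are made up for illustration and are not part of the code above.

```r
# Hypothetical toy input: three values, so two adjacent pairs
y <- c(0.26, 0.28, 0.18)

# n = 2*3 - 1 = 5 output points at x = 1, 1.5, 2, 2.5, 3
res <- approx(seq_along(y), y, n = length(y) * 2L - 1L)$y

# Odd positions reproduce the inputs; even positions are the midpoints
orig <- res[seq(1L, length(res), by = 2L)]
mid  <- res[seq(2L, length(res), by = 2L)]
print(orig)
print(mid)
```

With any other n, the output x-coordinates would generally not line up with the input positions, so the original values would not be preserved.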
Here's a plot of an excerpt of the result, centered around the first occurrence of an adjacent pair of 0.26 and 0.28:
# index in BS of the first 0.26 immediately followed by 0.28
i <- which(BS[-length(BS)]==0.26 & BS[-1L]==0.28)[1L];
# BS[i] sits at position 2*i-1 in res
j <- i*2L-1L;
xlim <- c(j-6L,j+8L);
ylim <- c(-1,1);
xticks <- seq(xlim[1L],xlim[2L]);
yticks <- seq(ylim[1L],ylim[2L],0.05);
# empty plot with a light grid and custom axes
plot(NA,xlim=xlim,ylim=ylim,xlab='res index',ylab='y',axes=FALSE,xaxs='i',yaxs='i');
abline(v=xticks,col='lightgrey');
abline(h=yticks,col='lightgrey');
axis(1L,xticks,cex.axis=0.7);
axis(2L,yticks,sprintf('%.02f',round(yticks,2L)),las=1L,cex.axis=0.7);
# original input values (every second res index)
x <- seq(xlim[1L],xlim[2L],2L); y <- BS[seq(i-3L,len=8L)];
points(x,y,pch=16L,col='red',xpd=NA);
# all values in res, including the interpolated ones
x <- seq(xlim[1L],xlim[2L]); y <- res[x];
points(x,y,pch=4L,cex=1.2,col='blue',xpd=NA);
text(x+0.24,y+0.03,y,cex=0.7,xpd=NA);
legend(xlim[1L]+1.5,0.87,c('input value','interpolated'),col=c('red','blue'),pch=c(16L,4L));
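The same thing can be checked without a plot. The sketch below is self-contained and uses a short stand-in vector v instead of V1 (an assumption for illustration only); it compares the interpolated values against midpoints computed by hand and confirms that a 0.27 appears. Note that which midpoints get created depends on the random ordering of the bootstrap sample, since only neighbouring draws are interpolated.

```r
set.seed(1L)
v  <- c(0.18, 0.26, 0.28, -0.22)           # hypothetical stand-in for V1
bs <- sample(v, 1000L, replace = TRUE)      # small bootstrap sample

res <- approx(seq_along(bs), bs, n = length(bs) * 2L - 1L)$y

# Midpoints of each adjacent pair, computed directly
midpoints <- (bs[-length(bs)] + bs[-1L]) / 2

odd  <- seq(1L, length(res), by = 2L)       # positions holding the inputs
even <- seq(2L, length(res), by = 2L)       # positions holding interpolated values

inputs_kept   <- isTRUE(all.equal(res[odd], bs))
midpoints_ok  <- isTRUE(all.equal(res[even], midpoints))
has_027       <- any(abs(res - 0.27) < 1e-9)  # tolerance for floating point
```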
Upvotes: 4