apply substituting for loop

Question

I'm trying to learn how to use functions and apply instead of for loops, as they supposedly take less time. Can anyone give me advice on how to change the following code to reduce the running time?

The goal is to make RF have the same dimensions as Dates, but holding the corresponding Euribor returns instead of the dates in Dates. The dates are serial-number dates in both Euribor and Dates (class: numeric).

Example data (the output of this code is similar to my input):

Dates=matrix(NA,4,10)
Dates[1,1:8]=seq(3610,3617,1)
Dates[2,1:10]=seq(3613,3622,1)
Dates[3,1:5]=seq(3615,3619,1)
Dates[4,1:7]=seq(3616,3622,1)

Euribor=matrix(0,2,51)
Euribor[1,]=seq(3600,3650,1)
Euribor[2,]=rnorm(51)

This solution returns the correct output, but takes a very long time with a 4500x4700 matrix.

RF = matrix(0,nrow(Dates),ncol(Dates))
for (i in 1:nrow(Dates)){
  In=grep(Dates[i,1],Euribor[1,])
  end=sum(!is.na(Dates[i,]))
  RF[i,1:end]=as.matrix(Euribor[2,In:(In+end-1)])
}

Thank you in advance for any help.

Kristofersen · Accepted Answer

Dates=matrix(NA,4,10)
Dates[1,1:8]=seq(3610,3617,1)
Dates[2,1:10]=seq(3613,3622,1)
Dates[3,1:5]=seq(3615,3619,1)
Dates[4,1:7]=seq(3616,3622,1)

Euribor=matrix(0,2,51)
Euribor[1,]=seq(3600,3650,1)
Euribor[2,]=rnorm(51)

RF = matrix(0,nrow(Dates),ncol(Dates))
for (i in 1:nrow(Dates)){
  In=grep(Dates[i,1],Euribor[1,])
  end=sum(!is.na(Dates[i,]))
  RF[i,1:end]=as.matrix(Euribor[2,In:(In+end-1)])
}

RF2 = matrix(Euribor[2,match(c(Dates), Euribor[1,])], nrow = nrow(Dates), ncol = ncol(Dates))

So, RF2 is the fast way to do this. It should match RF everywhere except the unused cells, which are NA in RF2 and 0 in RF.

> RF
           [,1]        [,2]       [,3]       [,4]       [,5]       [,6]       [,7]      [,8]       [,9]     [,10]
[1,] -0.0819133 -0.08336513  0.6926775  1.0500598 -0.5244457  1.1804117  1.7349849 1.3002456  0.0000000 0.0000000
[2,]  1.0500598 -0.52444574  1.1804117  1.7349849  1.3002456 -0.7438148 -1.2804350 0.9480801 -0.7692101 0.3189216
[3,]  1.1804117  1.73498487  1.3002456 -0.7438148 -1.2804350  0.0000000  0.0000000 0.0000000  0.0000000 0.0000000
[4,]  1.7349849  1.30024557 -0.7438148 -1.2804350  0.9480801 -0.7692101  0.3189216 0.0000000  0.0000000 0.0000000
> RF2
           [,1]        [,2]       [,3]       [,4]       [,5]       [,6]       [,7]      [,8]       [,9]     [,10]
[1,] -0.0819133 -0.08336513  0.6926775  1.0500598 -0.5244457  1.1804117  1.7349849 1.3002456         NA        NA
[2,]  1.0500598 -0.52444574  1.1804117  1.7349849  1.3002456 -0.7438148 -1.2804350 0.9480801 -0.7692101 0.3189216
[3,]  1.1804117  1.73498487  1.3002456 -0.7438148 -1.2804350         NA         NA        NA         NA        NA
[4,]  1.7349849  1.30024557 -0.7438148 -1.2804350  0.9480801 -0.7692101  0.3189216        NA         NA        NA

We can replace the NAs with 0s like this:

RF2[is.na(RF2)] = 0
> RF2
           [,1]        [,2]       [,3]       [,4]       [,5]       [,6]       [,7]      [,8]       [,9]     [,10]
[1,] -0.0819133 -0.08336513  0.6926775  1.0500598 -0.5244457  1.1804117  1.7349849 1.3002456  0.0000000 0.0000000
[2,]  1.0500598 -0.52444574  1.1804117  1.7349849  1.3002456 -0.7438148 -1.2804350 0.9480801 -0.7692101 0.3189216
[3,]  1.1804117  1.73498487  1.3002456 -0.7438148 -1.2804350  0.0000000  0.0000000 0.0000000  0.0000000 0.0000000
[4,]  1.7349849  1.30024557 -0.7438148 -1.2804350  0.9480801 -0.7692101  0.3189216 0.0000000  0.0000000 0.0000000
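
Rather than comparing the printed matrices by eye, the two results can be checked with `all.equal`. The following is a minimal sketch that rebuilds the example data; the `set.seed` call and the `agree` variable are additions for repeatability and are not part of the original example.

```r
set.seed(1)  # added so the run is repeatable; not in the original example

Dates <- matrix(NA, 4, 10)
Dates[1, 1:8]  <- seq(3610, 3617, 1)
Dates[2, 1:10] <- seq(3613, 3622, 1)
Dates[3, 1:5]  <- seq(3615, 3619, 1)
Dates[4, 1:7]  <- seq(3616, 3622, 1)

Euribor <- matrix(0, 2, 51)
Euribor[1, ] <- seq(3600, 3650, 1)
Euribor[2, ] <- rnorm(51)

# Loop version from the question
RF <- matrix(0, nrow(Dates), ncol(Dates))
for (i in 1:nrow(Dates)) {
  In  <- grep(Dates[i, 1], Euribor[1, ])
  end <- sum(!is.na(Dates[i, ]))
  RF[i, 1:end] <- Euribor[2, In:(In + end - 1)]
}

# Vectorised version, with the NAs replaced by 0
RF2 <- matrix(Euribor[2, match(c(Dates), Euribor[1, ])],
              nrow = nrow(Dates), ncol = ncol(Dates))
RF2[is.na(RF2)] <- 0

# TRUE when both matrices hold the same values in the same positions
agree <- isTRUE(all.equal(RF, RF2))
```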

Edit: I figured I should probably explain how this works. Essentially all we need is the index in Euribor where each value of Dates is found. I figured the easiest way to do this was to collapse Dates into a vector, match its values against the dates in row 1 of Euribor, and then take the values in row 2 at the matched positions.
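
As a small illustration of the lookup step, here is a sketch with toy vectors. The names `keys`, `vals` and `wanted` are made up for this example and are not from the question. `match` returns the position of each element of its first argument in its second, or NA when there is no match.

```r
# Toy lookup table: keys play the role of Euribor[1,], vals of Euribor[2,]
keys <- c(3600, 3601, 3602, 3603)
vals <- c(0.10, 0.20, 0.30, 0.40)

# Dates to look up; NA stands for an unused cell in Dates
wanted <- c(3602, 3600, NA)

idx <- match(wanted, keys)   # positions of wanted in keys: 3 1 NA
looked_up <- vals[idx]       # indexing with NA yields NA: 0.3 0.1 NA
```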

Collapsing Dates into a vector with c() goes column by column, and matrix() also fills column by column by default, so rebuilding the matrix with the same dimensions puts every value back in its original position.
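
The round trip can be seen on a tiny matrix. This is only a sketch; `m`, `v`, `m_back` and `same` are names chosen for the illustration.

```r
# A 2 x 3 matrix whose rows are 1 2 3 and 4 5 6
m <- matrix(1:6, nrow = 2, byrow = TRUE)

# c() flattens column by column, not row by row: 1 4 2 5 3 6
v <- c(m)

# matrix() fills column by column by default, so the layout is restored
m_back <- matrix(v, nrow = nrow(m), ncol = ncol(m))

same <- identical(m, m_back)   # TRUE
```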

Finally, we can just swap out all the NAs at the end, and that part is pretty easy.

Since we've removed the need for the for loop, this will be much faster. I'm not sure how we could use an apply function here. There probably is a way, but it's not needed to speed this up.
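
For what it's worth, a row-wise variant with `apply` might look like the sketch below. It uses small stand-in data with made-up values in the same layout as the example, and `RF3` is a name chosen for the illustration. Since `apply` still visits the rows one at a time, there is no reason to expect it to beat the single vectorised `match` call. Note that `apply` over rows returns one column per row, hence the `t()`.

```r
# Small stand-in data (hypothetical values, same layout as the example)
Dates <- rbind(c(3601, 3602, 3603),
               c(3602, 3603, NA))
Euribor <- rbind(3600:3605,
                 c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6))

# Look up each row of Dates separately, then transpose back to rows
RF3 <- t(apply(Dates, 1, function(d) Euribor[2, match(d, Euribor[1, ])]))
RF3[is.na(RF3)] <- 0

# The single vectorised match() call gives the same matrix
RF2 <- matrix(Euribor[2, match(c(Dates), Euribor[1, ])],
              nrow = nrow(Dates), ncol = ncol(Dates))
RF2[is.na(RF2)] <- 0

same <- isTRUE(all.equal(RF3, RF2))
```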
