Reputation: 387
I would like to speed up this code in R.
The input is an array 3x3x3 containing integer number and based on the neighbors, if they are zero, replace them for the respective number.
The output is the array "mask_roi" with the new values.
###### Start here
list_neig = array(0, dim = c(3,3,3))
mask_roi = array(sample(c(0,1,2),27,replace=T), dim = c(3,3,3))
values_mask = array(1:27, dim = c(3,3,3))
values_mask_melted = melt(values_mask, varnames=c("x","y","z"))
### Tranform the 3D Matrix in a data.table wit 4 columns position and value
image_melted <- melt(mask_roi, varnames=c("x","y","z")) # 4 columns: x, y, z, value
image_melted$box = rownames(image_melted)
image_melted_non_zeros<-image_melted[!(image_melted$value==0),]
box_neigbors = vector("list", nrow(image_melted))
for (i in 1:(nrow(image_melted_non_zeros))){
cat(i,"\n")
x = image_melted_non_zeros[i,1]
y = image_melted_non_zeros[i,2]
z = image_melted_non_zeros[i,3]
box_neigbors[[image_melted_non_zeros[i,5]]] <- list(nearestNeighbors(values_mask, elem = c(x,y,z), dist = 1,dim = c(3,3,3)))
}
I have the "box_neighbors" vector done, just included it here to show how to get it, we need to make faster from here to the end. The idea is, check all voxel different of zero and check all his neighbors. If his neighbor is zero, he will have the same value, if not zero keep it the original.
for (i in 1:(nrow(image_melted_non_zeros))){
cat(i,"\n")
x = image_melted_non_zeros[i,1]
y = image_melted_non_zeros[i,2]
z = image_melted_non_zeros[i,3]
number_of_nei = length(box_neigbors[[image_melted_non_zeros[i,5]]][[1]] )
value_vozel = mask_roi[x,y,z] # it will give this new value
for (j in 1:number_of_nei){
nei_number = box_neigbors[[image_melted_non_zeros[i,5]]][[1]][j]
xx = image_melted[nei_number,1]
yy = image_melted[nei_number,2]
zz = image_melted[nei_number,3]
value_nei = mask_roi[xx,yy,zz]
if(value_nei == 0){
mask_roi[xx,yy,zz] = value_vozel
}
}
}
I need do this for 256x256x256 array not 3x3x3.
Thanks a lot!
nearestNeighbors <- function(ary, elem, dist, dims){
usedims <- mapply(function(el, d) {
seq(max(1, el - dist), min(d, el + dist))
}, elem, dims, SIMPLIFY=FALSE)
df <- as.matrix(do.call('expand.grid', usedims))
ndist <- sqrt(apply(df, 1, function(x) sum((x - elem)^2)))
ret <- df[which(ndist > 0 & ndist <= dist),,drop = FALSE]
return(ary[ret])
}
Upvotes: 3
Views: 376
Reputation: 68
I put together an implementation that uses K-d Trees. It can process a 256x256x256 array in ~13 seconds running on a MacBookPro with 16GB RAM and 2.3 GHz i.7 processor. You didn't give any specific benchmark, but I think 13s is reasonable enough to post an answer. I've outlined my steps below. Please let me know if I have misunderstood part of the question.
Setup:
We have a box of side length n filled with points. A point in the box is determined by coordinates i,j,k which can range from 1 to n. In total, the box contains n^3 unique points. Each point has an associated integer value 0, 1 or 2.
The Problem:
With box having n = 256. For each point P having a 0 value, find its k nearest NON-ZERO-VALUED neighbor and update P with that neighbor's value. After the update every point in the box should be non-zero.
Solution:
Our box has 16,777,216 (256^3) points so brute force methods are out. Luckily this is exactly what K-d Trees are useful for https://en.wikipedia.org/wiki/K-d_tree. There are a few R libraries focused on metric data structures. I am using FNN for this example since I think it has a more robust API than alternatives https://cran.r-project.org/web/packages/FNN/index.html.
The Code:
The box is represented as a matrix with column names (i, j, k, value). Each row represents a single point in the box.
set.seed(256)
library(FNN)
len = 256
values = c(0, 1, 2)
createBox = function(n, vals) {
index = 1:len^3
value = sample(vals, length(index), replace = T)
box = as.matrix(cbind(index, index, index, value))
dimnames(box) = list(NULL, c("i", "j", "k", "value"))
box
}
box= createBox(len, values)
The knnx.index function accepts the box matrix and a query matrix (subset of box matrix) as arguments and returns the nearest neighbor indices for each point in the query.
updateZeroValuedPoints = function(box, kval) {
zeroPointIndx = which(box[ , "value"] == 0)
nonZeroPoints = box[-1 * zeroPointIndx, ]
zeroPoints = box[zeroPointIndx, ]
nnIdx = knnx.index(nonZeroPoints, zeroPoints, k = kval, algorithm = "kd_tree")
zeroPoints[, "value"] = nonZeroPoints[nnIdx[ , ncol(nnIdx)], "value"]
zeroPoints
}
Once you have the neighbor indices it is a straightforward swap to update the values, no for loops required.
system.time(updateZeroValuedPoints(box, 1))
# > system.time(updateZeroValuedPoints(box, 1))
# user system elapsed
# 13.517 1.162 14.676
Hopefully this is useful and somewhere near your performance expectations.
Upvotes: 1