Reputation: 63
Maybe somebody can help me with this:
I have a 4-dimensional spatiotemporal stars dataset, which contains annual values of a variable for four different tree species:
library(stars)
# 4-dimensional stars data: x,y,time, species
dat <- st_as_stars(array(data = runif(2000), dim = c(10, 10, 5, 4))) |>
setNames("data_val") |>
st_set_dimensions(names= c("x","y", "year", "tree_species")) |>
st_set_dimensions("year", values = 2018:2022) |>
st_set_dimensions("tree_species",values = c("bu", "fi", "ki", "ei"))
plot(dat) # (only values of the first slice of tree_speciess ("bu") displayed)
As a second object I have a species identity map, which contains information on which pixel belongs to which species:
# 2-d species map (x,y)
species_map <- st_as_stars(matrix(sample(c("bu", "fi", "ki", "ei"),size = 100, replace = TRUE),
ncol = 10, byrow=T)) |>
setNames("tree_species") |>
st_set_dimensions(c(1,2), names= c("x","y"))
species_map$tree_species <- factor(species_map$tree_species)
plot(species_map, col = c("#fdc086","#ffff99","#7fc97f","#beaed4"))
Based on this species_map, I want to extract pixel by pixel (x/y) the timeseries from the data that matches tree_species
. The resulting stars-object would be a 3-d array (as the 4th data dimension (tree_species) was collapsed), with the data-values in (x,y,year) coming from the respective species entry. So, in the upper-left corner of the resulting stars-array (species ei,ki,ei,bu) there would be the data-values from the 4th-dimension indexes (4,3,4,1).
All my attempts so far failed... I tried to use ifelse()
, which resulted in the right dimensionality, however it did not obtain the right results. I also tried to match values, but this also failed:
dat$data_val[, , , species_map$tree_species]
I guess my problem can be solved easily with some base R commands.
Thank you for your help.
All the best Paul
Upvotes: 0
Views: 110
Reputation: 63
Based on the previous posts, I managed to solve this myself. Thank you for your help!
First we need to translate tree_species
to the 4th dimension index in dat
(like in Allan's for loop, see above), and then use JMenezes' indexmatrix, that just needed some little adjustments:
# create tree_species index grid
# (-> where to find the right tree_species in dat)
species_map$tree_species_ind <- matrix(match(species_map$tree_species,
st_get_dimension_values(dat, 4)),
nrow= 10, ncol = 10)
# create index matrix
indexmatrix <- expand.grid(x=1:10, y = 1:10, year = 1:5) # Quickway to generate all time-x-y combinations, recycling by time first
indexmatrix$tree_species <- rep(species_map$tree_species_ind,5) # Add species
indexmatrix <- as.matrix(indexmatrix)
# prepare stars object to hold the results
res <- dat[,,,,1,drop =TRUE] # copy one slice of tree_species from dat-array, and drop dimension
# extract results
res$data_val <- dat$data_val[indexmatrix]
plot(res)
It's diffcult to check wether everything works as expected with my initial example of random data. Therefore I create a second example of 'stars'-data with a fixed set of numbers varying along dimension 4. In the result, one pixel should finally hold the same value for each year. Also, I change from a square to rectangle map, to check if the rows and columns dimensions are alright.
# data with 4 fixed numbers varying along tree_species-dimension:
sample_numbers <- sample(1:10, size = 4)
dat <- st_as_stars(array(data = rep(sample_numbers,each = 5*7*3),
dim = c(5, 7, 3, 4))) |>
setNames("data_val") |>
st_set_dimensions(names= c("x","y", "year", "tree_species")) |>
st_set_dimensions("year", values = 2018:2020) |>
st_set_dimensions("tree_species",values = c("bu", "fi", "ki", "ei"))
# 2-d species map (x,y)
species_map <- st_as_stars(array(sample(c("bu", "fi", "ki", "ei"),
size = 5*7,
replace = TRUE),
dim = c(5,7))) |>
setNames("tree_species") |>
st_set_dimensions(c(1,2), names= c("x","y"))
species_map$tree_species_f <- factor(species_map$tree_species)
plot(species_map["tree_species_f"], col = c("#fdc086","#ffff99","#7fc97f","#beaed4"))
# create index grid
species_map$tree_species_ind <- matrix(match(species_map$tree_species,
st_get_dimension_values(dat, 4)),
nrow= 5, ncol = 7)
# create index matrix
indexmatrix = expand.grid(x=1:5, y = 1:7, year = 1:3) # Quickway to generate all time-x-y combinations, recycling by time first
indexmatrix$tree_species = rep(species_map$tree_species_ind,3) # Add species
indexmatrix = as.matrix(indexmatrix)
res <- dat[,,,,1,drop =T] # copy one tree_species-slice from dat-array, and drop dimension
res$data_val <- dat$data_val[indexmatrix]
res$data_val_f <- as.factor(res$data_val) # convert to factor for plotting
# Verification
plot(res["data_val_f"]) # results plot: same pattern for each year, as expected
rbind(sample_numbers, st_get_dimension_values(dat, 4)) # Check which number maps to which tree_species
plot(species_map["tree_species_f"], col = c("#fdc086","#ffff99","#7fc97f","#beaed4")) # plot species map to compare
Upvotes: 0
Reputation: 1059
You can provide a matrix as an index to a multidimensional array. Let me demonstrate with a simpler problem:
test = array(1:27,c(3,3,3))
print(test)
I'll take numbers "22" and "6" out of this array using a matrix. Their indexes are: (1,2,3) and (3,2,1).
indexmatrix = rbind(c(1,2,3),(3,2,1))
test[indexmatrix]
NOTE: there are no commas in the []
brackets. Adding commas will silently convert the matrix to a vector. Also it must be a matrix. Data.frames will not work.
Applying to your problem:
indexmatrix = expand.grid(1:5,1:10,1:10) # Quickway to generate all time-x-y combinations, recycling by time first
indexmatrix$species = species_map$tree_species # Add species
indexmatrix = indexmatrix[,c(2,3,1,4)] # Reorder to match dimensions
indexmatrix = as.matrix(indexmatrix)
dat$data_val[indexmatrix]
Upvotes: 1
Reputation: 173793
Perhaps there's a simpler way, but you could achieve this via a loop:
res <- array(0, dim = c(10, 10, 5))
for(i in seq(dim(dat)[1])) {
for(j in seq(dim(dat)[2])) {
spe <- species_map$tree_species[i,j]
spe <- match(spe, attributes(dat)$dimensions$tree_species$values)
res[i,j,] <- dat$data_val[i,j,,spe]
}
}
st_as_stars(res) |>
setNames("data_val") |>
st_set_dimensions(names= c("x","y", "year")) |>
st_set_dimensions("year", values = 2018:2022)
#> stars object with 3 dimensions and 1 attribute
#> attribute(s):
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> data_val 0.0006052661 0.2296937 0.4519045 0.4826458 0.7461618 0.9963633
#> dimension(s):
#> from to offset delta point x/y
#> x 1 10 0 1 FALSE [x]
#> y 1 10 0 1 FALSE [y]
#> year 1 5 2018 1 FALSE
Upvotes: 1