Reputation: 1823
I have a data set extracted from S2 satellite rasters (10 x 10 meters) with 12 values (ras.df.ll
), but 6 is in a tile (T21JYG
) and the second in another (T21JYG
).
I'd like to calculate the mean in the same (x,y coordinates) between tiles without success. I don't find any way to recognise 1st row in the first tile is the same coordinate of 1st row in the second tile just the end of my data set. In my example:
library(sf)
library(sfheaders)
library(dplyr)
# Raster (10x10 meters) info in data frame 1 - crs +proj=utm +zone=21 +south +datum=WGS84 +units=m +no_defs
x <-c(789385,789395,789405,789415,789425,789435)
y <-c(6626865,6626865,6626865,6626865,6626865,6626865)
tile <- rep("T21JYG",6)
values <-c(321,249,234,238,224,244)
ras.ds1<-data.frame(x,y,tile,values)
ras.ds1.sf <- st_as_sf(ras.ds1, coords = c("x", "y"), crs = 32721, agr = "constant")
ras.ds1.sf.ll <- st_transform(ras.ds1.sf, crs=4326)
ras.ds1.sf.ll
#Simple feature collection with 6 features and 2 fields
#Attribute-geometry relationship: 2 constant, 0 aggregate, 0 identity
#Geometry type: POINT
#Dimension: XY
#Bounding box: xmin: -53.98638 ymin: -30.45564 xmax: -53.98586 ymax: -30.45562
#Geodetic CRS: WGS 84
# tile values geometry
# 1 T21JYG 321 POINT (-53.98638 -30.45564)
# 2 T21JYG 249 POINT (-53.98628 -30.45563)
# 3 T21JYG 234 POINT (-53.98617 -30.45563)
# 4 T21JYG 238 POINT (-53.98607 -30.45563)
# 5 T21JYG 224 POINT (-53.98596 -30.45563)
# 6 T21JYG 244 POINT (-53.98586 -30.45562)
# Raster (10x10 meters) info in data frame - crs +proj=utm +zone=22 +south +datum=WGS84 +units=m +no_defs
x <-c(213285,213295,213305,213315,213325,213335)
y <-c(6626955,6626955,6626955,6626955,6626955,6626955)
tile <- rep("T22JBM",6)
values <-c(336,355,363,426,341,308)
ras.ds2 <-data.frame(x,y,tile,values)
ras.ds2.sf <- st_as_sf(ras.ds2, coords = c("x", "y"), crs = 32722, agr = "constant")
ras.ds2.sf.ll <- st_transform(ras.ds2.sf, crs=4326)
ras.ds2.sf.ll
# Simple feature collection with 6 features and 2 fields
# Attribute-geometry relationship: 2 constant, 0 aggregate, 0 identity
# Geometry type: POINT
# Dimension: XY
# Bounding box: xmin: -53.98638 ymin: -30.45564 xmax: -53.98586 ymax: -30.45562
# Geodetic CRS: WGS 84
# tile values geometry
# 1 T21JYG 321 POINT (-53.98638 -30.45564)
# 2 T21JYG 249 POINT (-53.98628 -30.45563)
# 3 T21JYG 234 POINT (-53.98617 -30.45563)
# 4 T21JYG 238 POINT (-53.98607 -30.45563)
# 5 T21JYG 224 POINT (-53.98596 -30.45563)
# 6 T21JYG 244 POINT (-53.98586 -30.45562)
# Join information
ras.ds.sf.ll <- rbind(ras.ds1.sf.ll, ras.ds2.sf.ll)
ras.df.ll <- sf_to_df(ras.ds.sf.ll, fill = TRUE, unlist = NULL)
# Mean values by tile
ras.df.ll %>%
group_by(x,y) %>% dplyr::summarise(values=mean(values))
# `summarise()` has grouped output by 'x'. You can override using the `.groups` argument.
# # A tibble: 12 x 3
# # Groups: x [12]
# x y values
# <dbl> <dbl> <dbl>
# 1 -54.0 -30.5 321
# 2 -54.0 -30.5 249
# 3 -54.0 -30.5 234
# 4 -54.0 -30.5 238
# 5 -54.0 -30.5 224
# 6 -54.0 -30.5 244
# 7 -54.0 -30.5 336
# 8 -54.0 -30.5 355
# 9 -54.0 -30.5 363
# 10 -54.0 -30.5 426
# 11 -54.0 -30.5 341
# 12 -54.0 -30.5 308
# Nothing happened!!
# But if I try to change x and y values using accuracy
round_any = function(x, accuracy, f=round){f(x/ accuracy) * accuracy}
ras.df.ll2 <- ras.df.ll %>%
mutate(x = round_any(x, accuracy = 0.001),
y = round_any(y, accuracy = 0.001))
ras.df.ll2 %>%
group_by(x,y) %>% dplyr::summarise(values=mean(values))
`summarise()` has grouped output by 'x'. You can override using the `.groups` argument.
# # A tibble: 3 x 3
# # Groups: x [2]
# x y values
# <dbl> <dbl> <dbl>
# 1 -54.0 -30.5 252.
# 2 -54.0 -30.5 370
# 3 -54.0 -30.5 324.
# Hapens something but is wrong!!
Please, is there any way to make this corrected mean extraction? Thanks in advance Alexandre
Upvotes: 0
Views: 120
Reputation: 4949
Let's start by noticing that what you wrote is unnecessarily complicated. I suggest you do it like this.
fst = function(data)
st_as_sf(data, coords = c("x", "y"),
crs = data$crs, agr = "constant") %>%
st_transform(crs=4326) %>%
sf_to_df(fill = TRUE, unlist = NULL)
df = tibble(
tile = rep(c("T21JYG", "T22JBM"), each = 6) %>% fct_inorder(),
values = c(321,249,234,238,224,244,336,355,363,426,341,308),
x = c(789385,789395,789405,789415,789425,789435,
213285,213295,213305,213315,213325,213335),
y = c(6626865,6626865,6626865,6626865,6626865,6626865,
6626955,6626955,6626955,6626955,6626955,6626955),
crs = rep(c(32721, 32722), each = 6)
) %>% group_by(tile) %>%
nest(data=x:crs) %>%
mutate(st = map(data, ~ fst(.x))) %>%
unnest(st) %>%
mutate(
x = x %>% plyr::round_any(accuracy = 0.001) %>% paste(),
y = y %>% plyr::round_any(accuracy = 0.001) %>% paste(),
) %>% group_by(x,y)
output
# A tibble: 12 x 8
# Groups: x, y [3]
tile values data crs sfg_id point_id x y
<fct> <dbl> <list> <dbl> <int> <int> <chr> <chr>
1 T21JYG 321 <tibble [1 x 3]> 32721 1 1 -53.986 -30.456
2 T21JYG 249 <tibble [1 x 3]> 32721 1 1 -53.986 -30.456
3 T21JYG 234 <tibble [1 x 3]> 32721 1 1 -53.986 -30.456
4 T21JYG 238 <tibble [1 x 3]> 32721 1 1 -53.986 -30.456
5 T21JYG 224 <tibble [1 x 3]> 32721 1 1 -53.986 -30.456
6 T21JYG 244 <tibble [1 x 3]> 32721 1 1 -53.986 -30.456
7 T22JBM 336 <tibble [1 x 3]> 32722 1 1 -53.986 -30.455
8 T22JBM 355 <tibble [1 x 3]> 32722 1 1 -53.986 -30.455
9 T22JBM 363 <tibble [1 x 3]> 32722 1 1 -53.986 -30.455
10 T22JBM 426 <tibble [1 x 3]> 32722 1 1 -53.986 -30.455
11 T22JBM 341 <tibble [1 x 3]> 32722 1 1 -53.985 -30.455
12 T22JBM 308 <tibble [1 x 3]> 32722 1 1 -53.985 -30.455
However, I completely do not understand what you see incorrect in it
df %>% dplyr::summarise(values=mean(values))
output
# A tibble: 3 x 3
# Groups: x [2]
x y values
<chr> <chr> <dbl>
1 -53.985 -30.455 324.
2 -53.986 -30.455 370
3 -53.986 -30.456 252.
This is exactly what we do, which is the mean
of values
in the x
, y
groups.
Make clear what you want to achieve. And don't write it in the comments, but in the body of the post!
Hmm. I don't know anything about the functions st_as_sf
, st_transform
, sf_to_df
. I don't know what they do or how to interpret their results. Note, however, that they are placed in a pipeline (in my fst
function) so they do the necessary calculations and then pass the results to each other. These results must therefore be correct.
Perhaps the problem is that you originally specified two different crs
parameters in the st_as_sf
function. I did this by putting this variable in tibble
. However, you only pass one value of crs = 4326
to the st_transform
function. Perhaps two different values should also be conveyed here?
Finally, I got six results when I set the accuracy
in the round_any
function to 0.00025
.
Take a look at these answers.
# A tibble: 6 x 3
# Groups: x [6]
x y values
<chr> <chr> <dbl>
1 -53.98525 -30.4555 308
2 -53.9855 -30.4555 377.
3 -53.98575 -30.4555 312.
4 -53.986 -30.45575 231
5 -53.98625 -30.45575 242.
6 -53.9865 -30.45575 321
In conclusion, I showed you the correct way to program it.
Upvotes: 1