Reputation: 43
I want to calculate the share (%) of pixels classified as 1 for each file in a list of files. For a single image the code works well; however, when I wrap it in a for loop, R returns named numeric(0) for all files.
How do I get the share for every file?
Single Image:
ras <- raster("path") # binary product
ras_df <- as.data.frame(ras) # creates data frame
ras_table <- table(ras_df$file) # creates table
share_suit_hab <- ras_table[names(ras_table)==1]/sum(ras_table[names(ras_table)]) # number of pixels with value 1 divided by sum of pixels with value 0 and 1 = share of suitable habitat (%)
print(share_suit_hab)
> ras
class : RasterLayer
dimensions : 1000, 1000, 1e+06 (nrow, ncol, ncell)
resolution : 2165.773, 2463.182 (x, y)
extent : -195054.2, 1970719, 2723279, 5186461 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
source : C:/Users/name/MASTERARBEIT/BASELINE/Eastern Arctic/Summer_EA_Output/ct/2006/cis_SGRDREA_20060703_pl_a.tif
names : cis_SGRDREA_20060703_pl_a
values : 0, 1 (min, max)
For Loop:
list_ct <- list.dirs("path")
i=0
for(year in list_ct){
ct_files_list <- list.files(year, recursive = FALSE, pattern = "\\.tif$", full.names = FALSE)
ct_file_df <- as.data.frame(paste0("path", i, "/", ct_files_list))
ct_file_df <- as.data.frame(matrix(unlist(ct_file_df), nrow= length(unlist(ct_file_df[1]))))
ct_table <- table(ct_file_df[, 1])
stored <- ct_table[names(ct_table)==1]/sum(ct_table[names(ct_table)])
print(stored)
}
Upvotes: 1
Views: 129
Reputation: 43
This is the final code which is running perfectly!
library(raster)
library(dplyr) # provides %>%
list_ct <- list.dirs("path", recursive = FALSE) # one folder per year
stored <- list()
for (year in seq_along(list_ct)) {
  # escape the dot so that only names ending in ".tif" match
  ct_file_list <- list.files(list_ct[year], recursive = FALSE, pattern = "\\.tif$", full.names = FALSE)
  tmp <- list()
  for (i in seq_along(ct_file_list)) {
    ct_file_df <- raster(paste0(list_ct[year], "/", ct_file_list[i])) %>% as.data.frame()
    # share of pixels with value 1 among all non-NA pixels
    tmp[[i]] <- sum(ct_file_df[, 1], na.rm = TRUE) / sum(!is.na(ct_file_df[, 1]))
    names(tmp)[i] <- paste0(list_ct[year], "/", ct_file_list[i])
    print(tmp[i])
  }
  stored[[year]] <- tmp
  names(stored)[year] <- list_ct[year]
}
Upvotes: 1
Reputation: 43
Thank you!
This is working perfectly for all files of one year!
library(raster)
s_list <- list.files("C:/Users/OneDrive - wwfgermany/MASTERARBEIT/BASELINE/Eastern Arctic/Summer_EA_Output/area_calc/ct/2006/", full.names = T)
s <- raster::stack(s_list)
f <- freq(s, useNA = 'no')
f
ct_avg <- sapply(f, function(x) x[2,2]/sum(x[,2]))
ct_avg__mean <- mean(ct_avg)
ct_avg__mean
However, when I wrap it in another loop to get one value per year as the final result, I get a "subscript out of bounds" error. This is the code I am using:
setwd("C:/Users/MASTERARBEIT/BASELINE/Eastern Arctic/Summer_EA_Output/area_calc/ct/")
list_ct <- list.dirs("C:/Users/MASTERARBEIT/BASELINE/Eastern Arctic/Summer_EA_Output/area_calc/ct/")
i=0
for (year in list_ct) {
s_list <- list.files(year, recursive = FALSE, pattern = "\\.tif$", full.names = FALSE)
s <- raster::stack(s_list)
f <- freq(s, useNA = 'no')
f
ct_avg <- sapply(f, function(x) x[2,2]/sum(x[,2]))
ct_avg__mean <- mean(ct_avg)
ct_avg__mean
}
Upvotes: 0
Reputation: 47146
Example data
library(raster)
s <- stack(system.file("external/rlogo.grd", package="raster"))
s <- s > 200
#plot(s)
If your actual data are all for the same area (and the rasters have the same extent and resolution), you want to create a RasterStack (using the filenames) and use freq as below
f <- freq(s)
f
#$red
# value count
#[1,] 0 3975
#[2,] 1 3802
#$green
# value count
#[1,] 0 3915
#[2,] 1 3862
#$blue
# value count
#[1,] 0 3406
#[2,] 1 4371
Followed by
sapply(f, function(x) x[2,2]/sum(x[,2]))
# red.count green.count blue.count
# 0.4888775 0.4965925 0.5620419
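Note that x[2,2] assumes every layer contains both 0 and 1, so that the count for value 1 sits in the second row. A minimal sketch of a lookup by value instead of by position follows; share_of_value is a hypothetical helper (not part of raster), and the matrices are hand-built stand-ins for freq() output, with the red counts taken from the example above.

```r
# Hypothetical helper (not part of raster): share of cells equal to `value`,
# given a two-column value/count matrix shaped like the freq() output above.
share_of_value <- function(x, value = 1) {
  keep <- !is.na(x[, "value"])            # ignore an NA row if one is present
  hit  <- keep & x[, "value"] == value    # rows holding the value of interest
  sum(x[hit, "count"]) / sum(x[keep, "count"])
}

# Hand-built frequency tables standing in for freq() output (assumption)
f <- list(
  red      = cbind(value = c(0, 1), count = c(3975, 3802)),
  all_zero = cbind(value = 0, count = 7777)  # a layer without any 1s
)

shares <- sapply(f, share_of_value)
print(shares)
```

With this lookup a layer that contains only 0s yields a share of 0 rather than an indexing error.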
If you cannot make a RasterStack you can make a list and lapply and continue as above, or use sapply and do this
ss <- as.list(s)
x <- sapply(ss, freq)
x[4,] / colSums(x[3:4, ])
#[1] 0.4888775 0.4965925 0.5620419
If you insist on a loop
res <- rep(NA, length(ss))
for (i in 1:length(ss)) {
# r <- raster(ss[i]) # if these were filenames
r <- ss[[i]] # here we extract from the list
x <- freq(r)[,2]
res[i] <- x[2] / sum(x)
}
res
# 0.4888775 0.4965925 0.5620419
Upvotes: 0
Reputation: 88
Could you add a reproducible example (data incl.)?
You probably need to replace numeric(0) simply by 0. numeric(0) does not mean 0; it means a numeric vector of length zero (i.e., empty). I'm guessing you're assigning something like numeric(0)+1, which is still a numeric vector of length zero.
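A minimal base R sketch of how an empty result can arise in the loop from the question, assuming (as the posted code suggests) that the table is built from the file names rather than from the pixel values. The file names below are hypothetical.

```r
# Hypothetical file names standing in for the .tif files of one year folder
ct_files <- c("a_20060703.tif", "a_20060710.tif")

# Tabulating the names counts each file once; the categories are the names
ct_table <- table(ct_files)

# No category is called "1", so the subset is empty
ones  <- ct_table[names(ct_table) == 1]
share <- ones / sum(ct_table)

length(share)  # 0: an empty vector, not the number 0
```

The raster has to be read (for example with raster(), as in the single-image code) before tabulating, so that the table holds pixel values.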
Edit:
You have a folder containing multiple folders, each of which includes one or more tif files. You want to loop through each of these folders, import the tif file(s), do a calculation, and save the result.
In the following, my path contains 5 folders named '2006', '2007', '2008', '2009' and '2010'. Each of these "year" folders contains an .xlsx file. Each .xlsx file contains 1 column (here, you just need to select the right one in your data frame). This column has the same name in all Excel files, "col1", and contains values between 0 and 1. Then this will work:
library(dplyr)
library(readxl)
#
list_ct <- list.dirs("mypath", recursive = FALSE)
stored <- list()
for (year in seq_along(list_ct)){
ct_file_list <- list.files(list_ct[year], recursive=FALSE, pattern = "\\.xlsx$", full.names = FALSE)
tmp <- list()
for (i in seq_along(ct_file_list)){
ct_file_df <- read_excel(paste0(list_ct[year], "/", ct_file_list[i])) %>% as.data.frame()
# do calculations ..
tmp[[i]] <- sum(ct_file_df$col1) / length(ct_file_df$col1)
names(tmp)[i] <- paste0(list_ct[year], "/", ct_file_list[i])
print(tmp[i])
}
stored[[year]] <- tmp
names(stored)[year] <- paste0(list_ct[year])
}
Instead of using read_excel(), you just use raster() like you did with the single file. Hope you can use the answer.
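The recursive = FALSE argument in list.dirs() matters here: with the default (recursive = TRUE) the top-level folder itself is returned along with the year folders, and it contains no files to read. A small base R sketch on a temporary folder tree; the layout and the file name are hypothetical stand-ins for the real data.

```r
# Temporary folder tree standing in for the year folders (hypothetical layout)
root <- file.path(tempdir(), "ct_demo")
dir.create(file.path(root, "2006"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "2007"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(root, "2006", "a.tif"))

all_dirs  <- list.dirs(root)                     # default: root plus subfolders
year_dirs <- list.dirs(root, recursive = FALSE)  # only the year folders

length(all_dirs)     # 3: ct_demo, ct_demo/2006, ct_demo/2007
basename(year_dirs)  # "2006" "2007"

# full.names = TRUE returns paths that can be opened from any working directory
tif_paths <- list.files(year_dirs[1], pattern = "\\.tif$", full.names = TRUE)
print(tif_paths)
```

Using full.names = TRUE is an alternative to pasting the folder and file name together inside the loop.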
Upvotes: 0