howdidienduphere

Reputation: 43

Access files from list in for loop

I want to calculate the share (%) of pixels classified as 1 for each file in a list of files. For a single image the code works well. However, when I put it in a for loop, R returns named numeric(0) for every file. How do I get the share for each file?

Single Image:

ras <- raster("path") # binary product
ras_df <- as.data.frame(ras) # creates data frame
ras_table <- table(ras_df$file) # creates table
share_suit_hab <- ras_table[names(ras_table)==1]/sum(ras_table[names(ras_table)]) # number of pixels with value 1 divided by sum of pixels with value 0 and 1 = share of suitable habitat (%)
print(share_suit_hab)

> ras
class      : RasterLayer
dimensions : 1000, 1000, 1e+06  (nrow, ncol, ncell)
resolution : 2165.773, 2463.182  (x, y)
extent     : -195054.2, 1970719, 2723279, 5186461  (xmin, xmax, ymin, ymax)
crs        : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
source     : C:/Users/name/MASTERARBEIT/BASELINE/Eastern Arctic/Summer_EA_Output/ct/2006/cis_SGRDREA_20060703_pl_a.tif
names      : cis_SGRDREA_20060703_pl_a
values     : 0, 1  (min, max)

For Loop:

list_ct <- list.dirs("path")
i=0
for(year in list_ct){
  ct_files_list <- list.files(year, recursive = FALSE, pattern = "\\.tif$", full.names = FALSE)
  ct_file_df <- as.data.frame(paste0("path", i, "/", ct_files_list))
  ct_file_df <- as.data.frame(matrix(unlist(ct_file_df), nrow= length(unlist(ct_file_df[1]))))
  ct_table <- table(ct_file_df[, 1])
  stored <- ct_table[names(ct_table)==1]/sum(ct_table[names(ct_table)])
  print(stored)
}

Upvotes: 1

Views: 129

Answers (4)

howdidienduphere

Reputation: 43

This is the final code which is running perfectly!

    library(raster)
    library(dplyr)  # provides %>%

    # year folders directly below the root (the root itself is excluded)
    list_ct <- list.dirs("path", recursive = FALSE)
    stored <- list()

    for (year in seq_along(list_ct)){

      # escape the dot so that only names ending in ".tif" match
      ct_file_list <- list.files(list_ct[year], recursive=FALSE, pattern = "\\.tif$", full.names = FALSE)
      tmp <- list()

      for (i in seq_along(ct_file_list)){
        ct_file_df   <- raster(paste0(list_ct[year], "/", ct_file_list[i])) %>% as.data.frame()

        # share of pixels equal to 1 among the non-NA pixels (values are 0 or 1)
        tmp[[i]] <- sum(ct_file_df[,1], na.rm=TRUE) / sum(!is.na(ct_file_df[,1]))
        names(tmp)[i] <- paste0(list_ct[year], "/", ct_file_list[i])
        print(tmp[i])
      }
      stored[[year]] <- tmp
      names(stored)[year] <- paste0(list_ct[year])

    }

Upvotes: 1

howdidienduphere

Reputation: 43

Thank you!

This is working perfectly for all files of one year!

library(raster)
s_list <- list.files("C:/Users/OneDrive - wwfgermany/MASTERARBEIT/BASELINE/Eastern Arctic/Summer_EA_Output/area_calc/ct/2006/", full.names = TRUE)
s <- raster::stack(s_list)
f <- freq(s, useNA = 'no')
f
ct_avg <- sapply(f, function(x) x[2,2]/sum(x[,2]))
ct_avg__mean <- mean(ct_avg)
ct_avg__mean

However, when I wrap it in another loop to get one value per year as the final result, I get an error saying subscript out of bounds. This is the code I am using:

setwd("C:/Users/MASTERARBEIT/BASELINE/Eastern Arctic/Summer_EA_Output/area_calc/ct/")
list_ct <- list.dirs("C:/Users/MASTERARBEIT/BASELINE/Eastern Arctic/Summer_EA_Output/area_calc/ct/")
i=0
for (year in list_ct) {
  s_list <- list.files(year, recursive = FALSE, pattern = "\\.tif$", full.names = FALSE)
  s <- raster::stack(s_list)
  f <- freq(s, useNA = 'no')
  f
  ct_avg <- sapply(f, function(x) x[2,2]/sum(x[,2]))
  ct_avg__mean <- mean(ct_avg)
  ct_avg__mean
}

Upvotes: 0

Robert Hijmans

Reputation: 47146

Example data

library(raster)
s <- stack(system.file("external/rlogo.grd", package="raster"))
s <- s > 200
#plot(s)

If your actual data are all for the same area (that is, the rasters have the same extent and resolution), create a RasterStack from the file names and use freq as below

f <- freq(s)
f
#$red
#     value count
#[1,]     0  3975
#[2,]     1  3802

#$green
#     value count
#[1,]     0  3915
#[2,]     1  3862

#$blue
#     value count
#[1,]     0  3406
#[2,]     1  4371

Followed by

sapply(f, function(x) x[2,2]/sum(x[,2]))
# red.count green.count  blue.count 
#  0.4888775   0.4965925   0.5620419 
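Note that the index x[2,2] assumes every layer contains both values, 0 and 1. If a layer holds only one value, its freq matrix has a single row, and asking for row 2 raises a "subscript out of bounds" error. The following is a minimal sketch of a more defensive lookup, not part of the raster package: `share_of_ones` is a hypothetical helper, the matrices are built by hand to mimic the freq output shown above (red and green counts are copied from it), and `only0` is an invented layer without any 1-pixels. It also assumes NA rows have been excluded, for example with useNA = 'no'.

```r
# Hand-built matrices that mimic the freq() output above (base R only)
f <- list(
  red   = cbind(value = c(0, 1), count = c(3975, 3802)),
  green = cbind(value = c(0, 1), count = c(3915, 3862)),
  only0 = cbind(value = 0, count = 7777)  # hypothetical layer with no 1-pixels
)

# Hypothetical helper: look up the row by its value instead of by position
share_of_ones <- function(x) {
  rows <- which(x[, "value"] == 1)   # which() also drops NA comparisons
  if (length(rows) == 0) return(0)   # no 1-pixels in this layer
  sum(x[rows, "count"]) / sum(x[, "count"])
}

shares <- sapply(f, share_of_ones)
shares
```

Looking the row up by value rather than by position means a layer without any 1-pixels yields a share of 0 instead of stopping the whole loop.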

If you cannot make a RasterStack, you can make a list, use lapply, and continue as above, or use sapply and do this

ss <- as.list(s)
x <- sapply(ss, freq)
x[4,] / colSums(x[3:4, ])
#[1] 0.4888775 0.4965925 0.5620419

If you insist on a loop

res <- rep(NA, length(ss))
for (i in seq_along(ss)) {
  # r <- raster(ss[i]) # if these were filenames
  r <- ss[[i]]  # here we extract from the list
  x <- freq(r)[,2]
  res[i] <- x[2] / sum(x)
}
res
# 0.4888775 0.4965925 0.5620419

Upvotes: 0

Kristian

Reputation: 88

Could you add a reproducible example (data incl.)?

You probably need to replace numeric(0) simply by 0. numeric(0) does not mean 0; it is a numeric vector of length zero (i.e., empty). My guess is that you are computing something like numeric(0)+1, which is still a numeric vector of length zero.
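To make this concrete, here is a small base R sketch. The file names are made up for illustration. The second half tabulates file names rather than pixel values, which is an assumption about what the loop in the question does, and filtering such a table for the name "1" leaves nothing, consistent with the empty result reported there.

```r
# An empty numeric vector stays empty under arithmetic
empty <- numeric(0)
length(empty)       # 0
length(empty + 1)   # 0

# Tabulating file names (hypothetical) instead of pixel values
files <- c("a.tif", "b.tif", "c.tif")
ct_table <- table(files)

# No entry of the table is named "1", so the selection is empty ...
hit <- ct_table[names(ct_table) == 1]

# ... and dividing an empty vector by a number is still empty
share <- hit / sum(ct_table)
length(share)
```

If the table were built from the pixel values of a raster that has been read in, its names would be "0" and "1" and the same filter would return a count.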

Edit:

You have a folder containing multiple folders, each of which includes one or more tif files. You want to loop through each of these folders, import the tif file(s), do a calculation, and save the result.

In the following, my path contains 5 folders named '2006', '2007', '2008', '2009' and '2010'. Each of these "year" folders contains an .xlsx file. Each .xlsx file contains 1 column (here, you just need to select the right one in your data frame). This column has the same name in all excel files, "col1", and contains values between 0 and 1. Then this will work:

library(dplyr)
library(readxl)

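# year folders directly below the root folder
list_ct <- list.dirs("mypath", recursive = FALSE)
stored <- list()

for (year in seq_along(list_ct)){

  # escape the dot so that only names ending in ".xlsx" match
  ct_file_list <- list.files(list_ct[year], recursive=FALSE, pattern = "\\.xlsx$", full.names = FALSE)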
  tmp <- list()

  for (i in seq_along(ct_file_list)){    
    ct_file_df   <- read_excel(paste0(list_ct[year], "/", ct_file_list[i])) %>% as.data.frame()

    # do calculations ..

    tmp[[i]] <- sum(ct_file_df$col1) / length(ct_file_df$col1)
    names(tmp)[i] <- paste0(list_ct[year], "/", ct_file_list[i])
    print(tmp[i])
  }
  stored[[year]] <- tmp
  names(stored)[year] <- paste0(list_ct[year])

}

Instead of using "read_excel", you just use raster() like you did with the single file. Hope you can use the answer.
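One detail that is easy to miss: the loop above calls list.dirs with recursive = FALSE, whereas the code in the question uses the default. With the default, list.dirs also returns the root folder itself, so the loop would visit a folder that holds no tif files. A minimal sketch with a throwaway folder tree (the folder and file names are invented, and the .tif files are empty placeholders, not real rasters):

```r
# Build <root>/2006/a.tif and <root>/2007/b.tif in a temporary location
root <- file.path(tempdir(), "ct_demo")
unlink(root, recursive = TRUE)
dir.create(file.path(root, "2006"), recursive = TRUE)
dir.create(file.path(root, "2007"))
file.create(file.path(root, "2006", "a.tif"))
file.create(file.path(root, "2007", "b.tif"))

all_dirs  <- list.dirs(root)                     # default: root itself is included
year_dirs <- list.dirs(root, recursive = FALSE)  # only the year folders

length(all_dirs)
basename(year_dirs)

# full.names = TRUE gives paths that can be passed on directly,
# e.g. to raster() or stack(), without pasting the folder back on
tif_paths <- list.files(year_dirs[1], pattern = "\\.tif$", full.names = TRUE)
file.exists(tif_paths)
```

With full.names = FALSE, list.files returns bare file names, which only resolve if the working directory happens to be the folder being listed.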

Upvotes: 0
