Carmellose

Reputation: 5088

Compute the size of directory in R

I want to compute the size of a directory in R. I tried the file.info function, but unfortunately it follows symbolic links, so my results are inflated:

# returns the wrong size: symlink targets are counted twice
sum(file.info(list.files(path = '/my/directory/', recursive = TRUE, full.names = TRUE))$size)

How do I compute the size of a directory so that it gives me the same result as on Linux, e.g. with du -s?

Thanks

Upvotes: 14

Views: 9901

Answers (5)

islem

Reputation: 246

"file.size" returns the actual size of a file, whereas the size on disk is the amount of space the file actually takes up on the disk. See this question for the difference: https://superuser.com/questions/66825/what-is-the-difference-between-size-and-size-on-disk

Try this for the total size of all files:

 files<-list.files(path_of_directory, full.names = TRUE, recursive = TRUE)
 vect_size <- sapply(files, file.size)
 size_files <- sum(vect_size)
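
To address the double counting from the question, here is a minimal sketch that leaves symbolic links out of the sum. The helper name dir_size_no_symlinks is made up for illustration. It assumes a platform where Sys.readlink works, and it reports apparent file sizes rather than disk blocks, so it will not match du -s exactly:

```r
# Hypothetical helper: total apparent size (bytes) of the files under `path`,
# ignoring symbolic links and counting each real file only once.
dir_size_no_symlinks <- function(path) {
  files <- list.files(path, full.names = TRUE, recursive = TRUE,
                      all.files = TRUE)
  # Sys.readlink() returns "" for paths that are not symlinks
  # (and NA where symlinks are unsupported).
  links <- Sys.readlink(files)
  is_link <- !is.na(links) & nzchar(links)
  # Resolve the remaining paths so that a file reached twice
  # (e.g. through a symlinked directory) is only counted once.
  real_files <- unique(normalizePath(files[!is_link]))
  sum(file.size(real_files), na.rm = TRUE)
}
```

Files reached only through a symlinked directory that points outside the tree would still be counted, so treat this as an approximation.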

Upvotes: 3

Hope

Reputation: 129

I recently dealt with this problem, and here is my code:

library(pacman)
p_load(fs,tidyfst)

sys_time_print({
  dir_info(your_directory_path) -> your_dir_info
})

your_dir_info %>% 
  summarise_dt(size = sum(size,na.rm = T))

The first time I ran the code above, it took about 3 minutes to scan 52 GB spread across 174,731 separate files. When I ran it again later, it took less than 6 seconds, which is impressive.

Upvotes: 2

polkas

Reputation: 4184

A robust solution, which might be very useful for checking the size of an installed package.

dir_size <- function(path, recursive = TRUE) {
  stopifnot(is.character(path))
  files <- list.files(path, full.names = TRUE, recursive = recursive)
  vect_size <- sapply(files, function(x) file.size(x))
  size_files <- sum(vect_size)
  size_files
}

cat(dir_size(find.package("Rcpp"))/10**6, "MB")
#> 14.81649 MB

Created on 2021-06-26 by the reprex package (v2.0.0)

Upvotes: 5

Carmellose

Reputation: 5088

I finally used this:

system('du -s')
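
Note that du -s without a path measures the current working directory and only prints the result. A minimal sketch for getting the number back into R for a given directory could look like this. The helper name du_size_kb is made up for illustration, and it assumes a POSIX du is available on the PATH:

```r
# Hypothetical helper: disk usage of `path` in kilobytes, as reported by `du -sk`.
du_size_kb <- function(path) {
  # -s: summarise the whole directory, -k: report in 1024-byte units
  out <- system2("du", c("-sk", shQuote(path.expand(path))), stdout = TRUE)
  # du prints "<size><TAB><path>"; keep only the leading number
  as.numeric(sub("\\s.*$", "", out[length(out)]))
}
```

For example, du_size_kb('/my/directory/') returns a single number that can be used in further computations.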

Upvotes: 7

Hack-R

Reputation: 23210

system('powershell -noprofile -command "ls -r|measure -s Length"')

References:

  1. https://technet.microsoft.com/en-us/library/ff730945.aspx
  2. Get Folder Size from Windows Command Line
  3. https://stat.ethz.ch/R-manual/R-devel/library/base/html/system.html
  4. https://superuser.com/questions/217773/how-can-i-check-the-actual-size-used-in-an-ntfs-directory-with-many-hardlinks

You can also use Cygwin if you have it; this lets you run Linux commands and get comparable results. In addition, there is a nice solution using Sysinternals in the last link listed above.

Upvotes: 5
