lucacerone

Reputation: 10149

R: get list of files but not of directories

In R how can I get a list of files in a folder, but not of the directories?

I have tried using dir(), list.files(), list.dirs() with different options, but none of them seems to work.

Upvotes: 43

Views: 13247

Answers (8)

mattador

Reputation: 481

Here's another solution, which uses a regular expression to keep only names that contain a "." (so it assumes every file has an extension and no directory name contains a dot):

list.files("dir_path",pattern="\\.")
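To see what this pattern keeps and drops, here is a small sketch on a throwaway directory. The file and folder names are made up for illustration; only base R is used.

```r
# Build a temporary directory with two files and two folders
# (all names here are hypothetical, chosen to show the edge cases).
tmp <- tempfile("regex_demo_")
dir.create(tmp)
invisible(file.create(file.path(tmp, c("data.csv", "Makefile"))))
dir.create(file.path(tmp, "archive.d"))
dir.create(file.path(tmp, "plots"))

# The pattern matches any name containing a literal dot.
dotted <- list.files(tmp, pattern = "\\.")
print(dotted)
```

In this setup the extensionless file "Makefile" is dropped and the directory "archive.d" is kept, so the pattern is a heuristic on names rather than a real file/directory test.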

Upvotes: 3

see24

Reputation: 1230

If you are willing to use a non-base R package, try the fs package.

To get just the files in a directory:

fs::dir_ls("dir_path", type = "file")

Upvotes: -1

Dunois

Reputation: 1843

I wrote a small wrapper function that tackles precisely this problem:

list_files <- function(path = ".", pattern = NULL, all.files = FALSE, full.names = TRUE, 
                       recursive = FALSE, ignore.case = FALSE, include.dirs = FALSE, no.. = FALSE, 
                       incl_dirs = FALSE){
  
  #Set incl_dirs = TRUE to revert to default list.files() behavior.

  if(path == ".") { path <- getwd() }
  
  #Include directories if recursive is set.
  if(incl_dirs && recursive) { include.dirs <- TRUE }
  
  #Needs to have full.names = TRUE in order to get full path to pass to dir.exists().
  files <- list.files(path = path, pattern = pattern, all.files = all.files, full.names = TRUE, 
                      recursive = recursive, ignore.case = ignore.case, include.dirs = include.dirs, 
                      no.. = no..)
  
  if(!incl_dirs){
    files <- files[!dir.exists(files)]
  }
  
  if(!full.names){
    return(basename(files))
  } else{
    return(files)
  }
  
}

With the following example directory structure:

dir_test_lvl0/
├── dir_test_lvl1
│   ├── dir_test_lvl2
│   │   ├── dir_test_lvl3
│   │   │   ├── dir_test_lvl4
│   │   │   └── file_test_lvl4
│   │   └── file_test_lvl3
│   └── file_test_lvl2
└── file_test_lvl1

The outputs would look like this, depending on whether incl_dirs and recursive are set (or not).

#No directories presented.
#Recursive.
list_files("dir_test_lvl0", incl_dirs = FALSE, recursive = TRUE, full.names = FALSE)
# [1] "file_test_lvl4" "file_test_lvl3" "file_test_lvl2" "file_test_lvl1"

#No directories presented.
#Non-Recursive.
list_files("dir_test_lvl0", incl_dirs = FALSE, recursive = FALSE, full.names = FALSE)
# [1] "file_test_lvl1"


#With directories presented (default list.files() behavior).
#Non-recursive.
list_files("dir_test_lvl0", incl_dirs = TRUE, recursive = FALSE, full.names = FALSE)
# [1] "dir_test_lvl1"  "file_test_lvl1"


#With directories presented (default list.files() behavior).
#Recursive.
list_files("dir_test_lvl0", incl_dirs = TRUE, recursive = TRUE, full.names = FALSE)
# [1] "dir_test_lvl1"  "dir_test_lvl2"  "dir_test_lvl3"  "dir_test_lvl4"  "file_test_lvl4"
# [6] "file_test_lvl3" "file_test_lvl2" "file_test_lvl1"

All other list.files() options are passed on to it faithfully by list_files(). The function has no external dependencies (base R only).

Upvotes: 1

abalter

Reputation: 10393

The fact that base R does not have a direct method to do this is somewhat appalling. The fact that BASH doesn't have a direct way is also a bit odd.

In my opinion, the best R solution is to simply appeal to the shell:

filenames = system('ls -p | grep -v /', intern=T)

Explanation:

ls -p     Append "/" to end of directory names
grep -v   Exclude strings matching "/"
intern=T  store the output in the variable rather than printing to stdout

Upvotes: 3

DzMatt

Reputation: 527

So, I know that these are all old and that there was an accepted answer, but I tried most of them and none really worked.

Here is what I got:

  1. Example of all files in a folder:

    files <- list.files("Training/Out/")
    
  2. Output of that code:

    [1] "Filtered"           "Training_Chr01.txt" "Training_Chr02.txt" "Training_Chr03.txt"
    [5] "Training_Chr04.txt" "Training_Chr05.txt" "Training_Chr06.txt" "Training_Chr07.txt"
    [9] "Training_Chr08.txt" "Training_Chr09.txt" "Training_Chr10.txt"
    

Here the first entry, [1] "Filtered", is a directory.

  3. Ran this code to get only the files:

    files <- list.files("Training/Out",recursive = TRUE)
    
  4. With this output:

    [1] "Training_Chr01.txt" "Training_Chr02.txt" "Training_Chr03.txt" "Training_Chr04.txt"
    [5] "Training_Chr05.txt" "Training_Chr06.txt" "Training_Chr07.txt" "Training_Chr08.txt"
    [9] "Training_Chr09.txt" "Training_Chr10.txt"
    

This is more or less to help someone who looks at this and was as confused as I was.
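One caveat worth checking before relying on recursive = TRUE: it drops the directory entries themselves, but it also descends into them. The sketch below assumes a made-up layout in which the "Filtered" folder contains one file ("kept.txt" is a hypothetical name).

```r
# Temporary layout: one top-level file plus a non-empty subdirectory.
tmp <- tempfile("recursive_demo_")
dir.create(file.path(tmp, "Filtered"), recursive = TRUE)
invisible(file.create(file.path(tmp, "Training_Chr01.txt")))
invisible(file.create(file.path(tmp, "Filtered", "kept.txt")))

# Default listing: the directory name appears next to the file.
top_level <- list.files(tmp)

# Recursive listing: the directory entry is gone, but its contents
# are returned as paths relative to tmp.
with_nested <- list.files(tmp, recursive = TRUE)

print(top_level)
print(with_nested)
```

So the approach gives "only the files in this folder" only when the subdirectories are empty, which appears to be the case in the output above.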

Upvotes: 1

BrodieG

Reputation: 52687

Another option:

Filter(function(x) file_test("-f", x), list.files())

And if you want to get fully functional with the functional package (library(functional)), you can save a few keystrokes:

Filter(Curry(file_test, "-f"), list.files())

This latter one transforms file_test into a function with the first argument set to "-f", which is basically what we did in the first approach, but Curry does it more cleanly because of the lamentable decision to have the function keyword be so long (why not f(x) {...}???)
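Both calls above list the current working directory. For any other folder, file_test() needs paths it can resolve, so a minimal base-R sketch (with a made-up temporary layout) would ask list.files() for full names first:

```r
# Temporary directory with two files and one subdirectory
# (names are hypothetical, for illustration only).
tmp <- tempfile("file_test_demo_")
dir.create(file.path(tmp, "subdir"), recursive = TRUE)
invisible(file.create(file.path(tmp, c("a.txt", "b"))))

# full.names = TRUE so that file_test() sees resolvable paths.
paths <- list.files(tmp, full.names = TRUE)

# "-f" is TRUE for existing paths that are not directories.
only_files <- Filter(function(x) utils::file_test("-f", x), paths)
print(basename(only_files))
```

Since file_test() is vectorised, paths[utils::file_test("-f", paths)] should give the same result without Filter().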

Upvotes: 2

tonytonov

Reputation: 25638

Here's one possibility:

all.files <- list.files(recursive = FALSE)
all.files[!file.info(all.files)$isdir]

Another option (pattern for files with extensions, not so universal, of course):

Sys.glob("*.*")
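The file.info() approach can be adapted to a folder other than the working directory. This is a sketch under the assumption of a small made-up layout; the key detail is full.names = TRUE, because file.info() returns NA for names it cannot find relative to the working directory.

```r
# Temporary directory with two files (one without an extension)
# and one subdirectory; all names are hypothetical.
tmp <- tempfile("isdir_demo_")
dir.create(file.path(tmp, "subdir"), recursive = TRUE)
invisible(file.create(file.path(tmp, c("notes.txt", "README"))))

paths <- list.files(tmp, full.names = TRUE)

# Keep the entries whose isdir flag is FALSE.
files_only <- paths[!file.info(paths)$isdir]
print(basename(files_only))
```

Unlike a "*.*" glob, this keeps the extensionless "README", since the decision is based on file metadata rather than on the name.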

Upvotes: 24

Sven Hohenstein

Reputation: 81743

setdiff(list.files(), list.dirs(recursive = FALSE, full.names = FALSE))

will do the trick.
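The one-liner works on the current working directory. For another folder, the same idea can be sketched by passing the path to both calls (the temporary layout and names below are made up for illustration):

```r
# Temporary directory with one file and one subdirectory.
tmp <- tempfile("setdiff_demo_")
dir.create(file.path(tmp, "subdir"), recursive = TRUE)
invisible(file.create(file.path(tmp, "report.txt")))

# Everything in tmp, minus the names of its immediate subdirectories.
files_only <- setdiff(
  list.files(tmp),
  list.dirs(tmp, recursive = FALSE, full.names = FALSE)
)
print(files_only)
```

Both calls need to agree on full.names (here both return bare names), otherwise setdiff() compares bare names against full paths and removes nothing.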

Upvotes: 40
