chawkins

Reputation: 1973

Using R to list all files with a specified extension

I'm very new to R and am working on updating an R script to iterate through a series of .dbf tables created using ArcGIS and produce a series of graphs.

I have a directory, C:\Scratch, that will contain all of my .dbf files. However, when ArcGIS creates these tables, it also writes a .dbf.xml file alongside each one. I want to exclude these .dbf.xml files from my file list, and thus from my iteration. I've tried searching and experimenting with regular expressions, to no avail. This is the basic expression I'm using (leaving out my various experiments):

files <- list.files(pattern = "dbf")

Can anyone give me some direction?

Upvotes: 187

Views: 212624

Answers (6)

Marek

Reputation: 50704

files <- list.files(pattern = "\\.dbf$")

The $ at the end anchors the match to the end of the string. "dbf$" will work too, but adding \\. (. is a special character in regular expressions, so you need to escape it) ensures that you match only files with the extension .dbf (in case you also have, e.g., .adbf files).

To ignore case use:

files <- list.files(pattern = "\\.dbf$", ignore.case = TRUE)

to match e.g.: FILE.DBF.
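
To see how these patterns differ, here is a small sketch that applies each of them with grepl() to a made-up vector of file names (the names are illustrative only, not taken from the question). list.files() treats its pattern argument as a regular expression in the same way.

```r
# Hypothetical file names, for illustration only
nms <- c("roads.dbf", "roads.dbf.xml", "notes.adbf", "ROADS.DBF")

# Unanchored: matches "dbf" anywhere in the name
unanchored <- nms[grepl("dbf", nms)]
unanchored    # "roads.dbf" "roads.dbf.xml" "notes.adbf"

# Anchored to the end: drops the .dbf.xml file, but still keeps .adbf
anchored <- nms[grepl("dbf$", nms)]
anchored      # "roads.dbf" "notes.adbf"

# Escaped dot plus anchor: only names ending in ".dbf"
strict <- nms[grepl("\\.dbf$", nms)]
strict        # "roads.dbf"

# Same pattern, ignoring case
strict_ci <- nms[grepl("\\.dbf$", nms, ignore.case = TRUE)]
strict_ci     # "roads.dbf" "ROADS.DBF"
```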

Upvotes: 279

dipetkov

Reputation: 3690

Another option is the fs::dir_ls function. It lets you search with either a wildcard (glob) pattern such as "*.dbf" or a regex pattern such as "dbf$".

# dir is a placeholder for the path to the directory to search
fs::dir_ls(dir, recurse = FALSE, glob = "*.dbf")
fs::dir_ls(dir, recurse = FALSE, regex = "dbf$")

Upvotes: 5

Surya Chhetri

Reputation: 11568

This gives you the list of files with the directory path included:

  Sys.glob(file.path(file_dir, "*.dbf")) ## file_dir = directory containing the files
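
As a rough, self-contained sketch of how this behaves, the following builds a throwaway directory under tempdir() with made-up file names (none of this comes from the question) and compares the glob with the regex form of list.files():

```r
# Build a throwaway directory with made-up files (illustration only)
file_dir <- file.path(tempdir(), "dbf_demo")
dir.create(file_dir, showWarnings = FALSE)
invisible(file.create(file.path(file_dir,
                                c("roads.dbf", "roads.dbf.xml", "rivers.dbf"))))

# Glob: "*.dbf" must match the whole file name, so "roads.dbf.xml" is skipped
by_glob <- Sys.glob(file.path(file_dir, "*.dbf"))

# Regex equivalent with list.files(); full.names = TRUE prepends the directory
by_regex <- list.files(file_dir, pattern = "\\.dbf$", full.names = TRUE)

sort(basename(by_glob))   # "rivers.dbf" "roads.dbf"
sort(basename(by_regex))  # "rivers.dbf" "roads.dbf"
```

Both calls return paths prefixed with file_dir, so the results can be passed straight to a reader function without changing the working directory.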

Upvotes: 12

donshikin

Reputation: 1503

I'm not very good at writing sophisticated regular expressions, so I'd do this task the following way:

files <- list.files()
dbf.files <- files[!grepl(".xml", files, fixed = TRUE)]

The first line just lists all files in the working directory. The second one drops everything containing ".xml": grepl returns a logical vector marking which strings in 'files' match, and subsetting with its negation keeps only the non-matching entries. (Unlike negative indices from grep, this also works when no file name contains ".xml".) The "fixed" argument is just my whim, as I usually want crude literal matching without fancy Perl-style regular expressions, which may surprise me.

I'm aware that such a solution simply reflects gaps in my education, but it may be useful for a novice =) At least it's easy.
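
One thing to watch when dropping entries by pattern: negative indices built from grep() give an empty result if nothing matches, while a logical mask from grepl() leaves the vector intact. A minimal sketch with made-up file names (not from the question):

```r
files <- c("roads.dbf", "rivers.dbf")   # made-up names, none contain ".xml"

# grep() finds no matches, so the index vector is integer(0);
# negating it is still integer(0), which selects nothing
dropped_by_index <- files[-grep(".xml", files, fixed = TRUE)]
length(dropped_by_index)  # 0

# grepl() returns a logical vector of all FALSE; negated, it keeps every entry
kept_by_mask <- files[!grepl(".xml", files, fixed = TRUE)]
kept_by_mask  # "roads.dbf" "rivers.dbf"
```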

Upvotes: 12

G. Grothendieck

Reputation: 269371

Try this, which uses globs rather than regular expressions, so it will only pick out the file names that end in .dbf:

filenames <- Sys.glob("*.dbf")
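
For anyone curious how the glob relates to a regular expression, utils::glob2rx() converts a wildcard pattern into an anchored regex. The sketch below uses made-up file names and is only meant to show why "*.dbf" does not pick up the .dbf.xml files:

```r
# utils::glob2rx() converts a wildcard pattern to an anchored regular expression
rx <- utils::glob2rx("*.dbf")
rx  # "^.*\\.dbf$"

nms <- c("roads.dbf", "roads.dbf.xml")  # made-up names, illustration only
matches <- grepl(rx, nms)
matches  # TRUE FALSE
```

The converted pattern could also be passed to list.files(pattern = rx) if you prefer to stay with list.files().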

Upvotes: 75

Gavin Simpson

Reputation: 174778

Anchor the pattern so that it finds "\\.dbf" at the end of the string, using the $ character:

list.files(pattern = "\\.dbf$")

Upvotes: 16
