Reputation: 1973
I'm very new to R and am working on updating an R script to iterate through a series of .dbf tables created using ArcGIS and produce a series of graphs.
I have a directory, C:\Scratch, that will contain all of my .dbf files. However, when ArcGIS creates these tables, it also writes a .dbf.xml file alongside each one. I want to exclude these .dbf.xml files from my file list and thus from my iteration. I've tried searching and experimenting with regular expressions, to no avail. This is the basic expression I'm using (excluding all of my various experiments):
files <- list.files(pattern = "dbf")
Can anyone give me some direction?
Upvotes: 187
Views: 212624
Reputation: 50704
files <- list.files(pattern = "\\.dbf$")
The $ at the end anchors the match to the end of the string. "dbf$" will work too, but adding \\. (. is a special character in regular expressions, so you need to escape it) ensures that you match only files with the extension .dbf (in case you have, e.g., .adbf files).
To ignore case use:
files <- list.files(pattern = "\\.dbf$", ignore.case = TRUE)
This also matches names such as FILE.DBF.
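To compare the patterns without touching the file system, here is a minimal sketch that applies them with grepl() to a made-up vector of file names. The names are hypothetical, chosen only for illustration, and the sketch assumes that list.files() treats its pattern argument as a regular expression in the same way grepl() does.

```r
# Hypothetical file names, invented for illustration only
candidates <- c("roads.dbf", "roads.dbf.xml", "notes.adbf", "PARCELS.DBF")

# Unanchored: "dbf" may appear anywhere, so the .dbf.xml sidecar slips through
loose <- candidates[grepl("dbf", candidates)]

# Anchored to the end, but no dot is required, so ".adbf" still matches
anchored <- candidates[grepl("dbf$", candidates)]

# Literal dot plus end-of-string anchor: only a real .dbf extension matches
strict <- candidates[grepl("\\.dbf$", candidates)]

# Same pattern, ignoring case, so the upper-case extension matches as well
strict_any_case <- candidates[grepl("\\.dbf$", candidates, ignore.case = TRUE)]

print(strict)
```

Only the last two patterns exclude both the .dbf.xml and the .adbf names.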
Upvotes: 279
Reputation: 3690
Another option is the fs::dir_ls function. It lets you search with either a wildcard (glob) pattern such as "*.dbf" or a regex pattern such as "dbf$".
# dir is the path of the directory to search
fs::dir_ls(dir, recurse = FALSE, glob = "*.dbf")
fs::dir_ls(dir, recurse = FALSE, regex = "dbf$")
Upvotes: 5
Reputation: 11568
Gives you the list of files with full path:
Sys.glob(file.path(file_dir, "*.dbf")) ## file_dir = directory containing the files
Upvotes: 12
Reputation: 1503
I am not very good at using sophisticated regular expressions, so I'd do this task in the following way:
files <- list.files()
dbf.files <- files[!grepl(".xml", files, fixed = TRUE)]
The first line just lists all files in the working directory. The second one drops everything containing ".xml" (grepl returns a logical vector marking such strings in the 'files' vector; subsetting with its negation removes the corresponding entries, and, unlike negative indices from grep, it still works when no file name contains ".xml"). The "fixed" argument is just my whim, as I usually want grepl to perform crude pattern matching without fancy Perl-style regexps, which may surprise me.
I'm aware that such a solution simply reflects gaps in my education, but for a novice it may be useful =) At least it's easy.
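One edge case to keep in mind with this kind of filtering: if no name contains ".xml", grep() returns an empty index vector, and subsetting with its negation yields an empty result rather than the full list. A minimal sketch with made-up file names (hypothetical, for illustration only) compares that with logical subsetting via grepl():

```r
# Hypothetical file names, invented for illustration only
files <- c("roads.dbf", "roads.dbf.xml", "parcels.dbf")
no_xml <- c("roads.dbf", "parcels.dbf")

# Logical subsetting keeps every name that does not contain ".xml"
dbf_files <- files[!grepl(".xml", files, fixed = TRUE)]

# It also behaves sensibly when nothing matches: all names are kept
unchanged <- no_xml[!grepl(".xml", no_xml, fixed = TRUE)]

# Negative indices from grep() do not: -integer(0) selects nothing at all
emptied <- no_xml[-grep(".xml", no_xml, fixed = TRUE)]

print(dbf_files)
print(length(emptied))
```

Note that this approach only removes the .xml names; any other non-.dbf files in the directory stay in the list.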
Upvotes: 12
Reputation: 269371
Try this, which uses globs rather than regular expressions, so it will only pick out the file names that end in .dbf:
filenames <- Sys.glob("*.dbf")
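A minimal sketch of the glob approach, run against a throwaway directory so that nothing real is touched. The directory name glob_demo and the file names are hypothetical, created only for this illustration.

```r
# Scratch directory with hypothetical file names, created under tempdir()
scratch <- file.path(tempdir(), "glob_demo")
dir.create(scratch, showWarnings = FALSE)
file.create(file.path(scratch, c("roads.dbf", "roads.dbf.xml", "parcels.dbf")))

# "*.dbf" has to match the whole file name, so "roads.dbf.xml" is excluded
matches <- Sys.glob(file.path(scratch, "*.dbf"))

# The results include the directory part; basename() strips it for display
print(basename(matches))
```

Because the glob is given with a directory prefix here, the returned paths can be passed straight to a reader function without changing the working directory.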
Upvotes: 75
Reputation: 174778
Peg the pattern to find "\\.dbf" at the end of the string using the $ character:
list.files(pattern = "\\.dbf$")
Upvotes: 16