Reputation: 347
I'm trying to create a list of files from a directory containing files with the following patterns:
Name_Surname_12345_noe_xy.xls
Name_Surname_12345_xy.xls
xy can be one or two characters.
Now I want a list of all files which do not contain "noe" in the filename. I can read in only the "noe" files using
fl = list.files(pattern = "noe.+xls$", recursive = TRUE, full.names = TRUE)
but found no way to exclude them. Any suggestions?
Many thanks
Markus
Upvotes: 1
Views: 1796
Reputation: 94267
Get all the files and then use grep to find the noe ones and subset them out:
> all
[1] "Name_Surname_123425_xy.xls" "Name_Surname_1234445_xy.xls"
[3] "Name_Surname_12345_noe_xy.xls" "Name_Surname_12345_xy.xls"
[5] "Name_Surname_13245_noe_xy.xls"
> all[grep("noe_xy.xls",all,invert=TRUE)]
[1] "Name_Surname_123425_xy.xls" "Name_Surname_1234445_xy.xls"
[3] "Name_Surname_12345_xy.xls"
Always make sure you check the edge cases where all or none of the files match:
> all[grep("xls",all,invert=TRUE)]
character(0)
> all[grep("fnord",all,invert=TRUE)]
[1] "Name_Surname_123425_xy.xls" "Name_Surname_1234445_xy.xls"
[3] "Name_Surname_12345_noe_xy.xls" "Name_Surname_12345_xy.xls"
[5] "Name_Surname_13245_noe_xy.xls"
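Tying this back to the question, here is a minimal sketch of the same idea applied directly to list.files output. The temporary directory and the sample file names are made up for illustration, and matching on basename() is my own assumption, so that a "noe" somewhere in a directory name doesn't throw files out by accident.

```r
# Hypothetical demo directory with files following the question's naming scheme
demo_dir <- file.path(tempdir(), "noe_demo")
dir.create(demo_dir, showWarnings = FALSE)
demo_names <- c("Name_Surname_12345_noe_xy.xls",
                "Name_Surname_12345_xy.xls",
                "Name_Surname_67890_x.xls")
file.create(file.path(demo_dir, demo_names))

# List every .xls file, then drop those whose file name contains "_noe_"
all_files <- list.files(demo_dir, pattern = "\\.xls$",
                        recursive = TRUE, full.names = TRUE)
fl <- all_files[grep("_noe_", basename(all_files), invert = TRUE)]

basename(fl)
```

In real use you would drop the demo setup and call list.files on your own directory.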
Using grep with a negative index works, except in the edge case where nothing matches:
> all
[1] "Name_Surname_123425_xy.xls" "Name_Surname_1234445_xy.xls"
[3] "Name_Surname_12345_noe_xy.xls" "Name_Surname_12345_xy.xls"
[5] "Name_Surname_13245_noe_xy.xls"
> all[-grep("noe_xy.xls",all)] # strip out the noe_xy.xls files
[1] "Name_Surname_123425_xy.xls" "Name_Surname_1234445_xy.xls"
[3] "Name_Surname_12345_xy.xls"
# works. Now strip out any xls files (should leave nothing)
> all[-grep("xls",all)]
character(0)
# yup, that works too. Now strip out 'fnord' files, shouldn't remove anything:
> all[-grep("fnord",all)]
character(0)
Epic fail! Reason is left as an exercise to the reader.
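For readers who would rather skip the exercise: as far as I can tell, the culprit is that grep returns integer(0) when nothing matches, and negating an empty vector still gives integer(0), so the index selects no elements instead of dropping none. A short sketch with made-up sample names, plus a logical-index alternative using grepl that should sidestep the problem:

```r
# Made-up sample names, just for illustration
all <- c("Name_Surname_123425_xy.xls", "Name_Surname_12345_noe_xy.xls")

idx <- grep("fnord", all)   # no match, so idx is integer(0)
neg <- -idx                 # negating an empty vector gives an empty vector
empty <- all[neg]           # indexing with integer(0) selects nothing

# grepl returns one TRUE/FALSE per element, so negating it keeps
# every element when nothing matches and drops every element when all match
kept <- all[!grepl("fnord", all)]
```

Both grep(..., invert=TRUE) and !grepl(...) avoid the negative-index trap; which one you use is mostly a matter of taste.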
Upvotes: 3