user1071558

Reputation: 13

Only input file names that appear in a list

I am trying to extract a list of sites from a dataframe and then load the .csv files which match these site names from a folder.

I have found multiple ways to load all the files in a given directory but haven't been able to find anything that suitably answers my question. There are no common features by which to subset the file names as I have seen in other examples. There are over 150 sites so I don't want to just load all the .csv files each time.

Is there a way to select a subset of files in a folder that match the name in a list and to only load these files? Once these files are loaded I need to perform the same analysis on each file, so I am looking for a way to load these files to make this further analysis as efficient as possible.

Any help will be greatly appreciated.

    trials<-read.csv("trial_associations.csv")
    trials
    site.name    red    blue    green    yellow
    upper.hill   yes     no      yes       no
    lower.hill   yes     yes     no        yes
    upper.lake   no      no      yes       yes
    lower.lake   no      yes     yes       no

    site<-trials[trials$red=="yes",]
    sitelist<-data.frame(site[,1])

example of sitelist

    site.name
    upper.hill
    lower.hill
    etc.

example file in sitenames folder - each file has four columns with headers and ~5800 rows

    a     b     c     d
    yes   no    no    yes
    no    yes   no    no
    yes   yes   yes   yes
    no    no    yes   no

file names in sitenames folder

    upper.hill.csv
    lower.hill.csv
    lower.lake.csv
    upper.lake.csv
    etc

Then I need to use the names in sitelist to load the .csv files from the sitenames folder within the working directory.

I have used

    list.files(dir)

to get the list of files in the directory, but I am unsure how to use the names in sitelist to pick out the matching files in the sitenames folder.

I hope that makes things a little clearer. Thanks.

Upvotes: 1

Views: 299

Answers (1)

John

Reputation: 23758

The code you're probably using to get all files is very similar to what you need to get some. Typically, to get all files in a directory you use list.files('myDir'), or some such. Just run that part of the code and see what the result is. What you'll see is that it's just a character vector containing the names of all of the files.

Once you understand that, it's easy. You either acquire your character vector another way, or you just subset this character vector. For example, if the list of files you want is in a file called "file list" then you can get the names with scan.

    fList <- scan('file list', what = character())  # scan() expects numbers unless told otherwise

Now you can just read all of those files in...

    dList <- lapply(fList, read.table)

... or something like that. You already have code like that you can adapt. If you just want a random subset of all of the files then something like this would suffice.

    fList <- list.files('myDir')  # or leave out 'myDir' for the working directory - this gets all file names
    subfList <- sample(fList, 4)  # just get a random 4 files

Perhaps that will get you started. It's hard to recommend something more precise.
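
For the setup described in the question, subsetting the character vector by name might look something like the sketch below. It is only an illustration under some assumptions: the temporary folder, the tiny made-up files and the variable names (site_dir, wanted, to_load, site_data) are all hypothetical stand-ins so the example runs on its own, and it assumes every file is named exactly as the site name plus ".csv".

```r
# Hypothetical setup: write a few tiny CSV files to a temporary "sitenames"
# folder so the sketch is self-contained. In practice, point site_dir at the
# real folder and skip this part.
site_dir <- file.path(tempdir(), "sitenames")
dir.create(site_dir, showWarnings = FALSE)
for (s in c("upper.hill", "lower.hill", "upper.lake", "lower.lake")) {
  write.csv(data.frame(a = c("yes", "no"), b = c("no", "yes")),
            file.path(site_dir, paste0(s, ".csv")), row.names = FALSE)
}

# Cut-down stand-in for trial_associations.csv
trials <- data.frame(
  site.name = c("upper.hill", "lower.hill", "upper.lake", "lower.lake"),
  red       = c("yes", "yes", "no", "no"),
  stringsAsFactors = FALSE
)

# Site names of interest, kept as a plain character vector
wanted <- as.character(trials$site.name[trials$red == "yes"])

# Build the expected file names and keep only those present in the folder
wanted_files <- paste0(wanted, ".csv")
available    <- list.files(site_dir, pattern = "\\.csv$")
to_load      <- intersect(wanted_files, available)

# Read each selected file into a list of data frames, named by site
site_data <- lapply(file.path(site_dir, to_load), read.csv)
names(site_data) <- sub("\\.csv$", "", to_load)
```

Because site_data is a named list, the same analysis can then be run on every site with another lapply call over it, and any site whose file is missing simply drops out at the intersect step.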

Upvotes: 2
