Reputation: 33
I am trying to pass a directory as input to a function and use it to build the file paths passed to read.csv. However, at runtime read.csv appears to receive a different file name string from the one I constructed.
Directory: "C:/SAT/Self Courses/R/data/specdata". Inside this directory there are a number of CSV files that I want to read and process with the following functions:
complete<-function(directory,id=1:332)
{
gFull<-c()
ids<-str_pad(id,3,pad="0")
idExt<-paste(ids,".csv",sep="")
dir<-paste(directory,idExt,sep="/")
for(i in dir)
{
tableTemp<- read.csv(i,header=T)
tableTemp<- na.omit(tableTemp)
gFull<-c(gFull,nrow(tableTemp))
}
output<-data.frame(id,gFull,stringsAsFactors = F)
return(output)
}
cor_sub<-function(data,directory)
{
#print(directory)
id<-data[1]
id<-str_pad(id,3,pad="0")
id<-paste(id,".csv",sep="")
#print(id)
dir_temp<-paste(directory,id,sep="/")
print(dir_temp)
#read table
input<-read.csv(dir_temp,header=T)
input<-na.omit(input)
#correlation
return (cor(input$sulfate,input$nitrate))
}
cor<-function(directory,threshold=0)
{
#find the thresholds of each file
qorum<-complete(directory,1:12)
print(threshold)
qorum$gFull[qorum$gFull<threshold]<-NA
qorum<-na.omit(qorum)
v_cor<-apply(qorum,1,cor_sub,directory)
#(v_cor)
}
I execute this code with a call
cor("C:/SAT/Self Courses/R/data/specdata",0)
The error output I get is:
> cor("C:/SAT/Self Courses/R/data/specdata",0)
[1] 0
[1] "C:/SAT/Self Courses/R/data/specdata/001.csv"
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
cannot open file '7.21/001.csv': No such file or directory
The problem is dir_temp: it holds "C:/SAT/Self Courses/R/data/specdata/001.csv", but the warning shows an attempt to open '7.21/001.csv' right after the read.csv line.
Please bear with me if the question seems trivial; I am still a novice :)
Upvotes: 1
Views: 665
Reputation: 460
See if this works for you (I'm ignoring most of the code that you have tried thus far because it seems unnecessarily complicated and not runnable anyways):
results <- list()
threshold <- 0 # file must have this many lines to be processed
filepath <- "C:/SAT/Self Courses/R/data/specdata"
filenames <- list.files(filepath) # assumes you want all files in directory
suppressWarnings(
for(filename in filenames) {
# construct the path for this particular file, and read it
fpath <- paste(filepath, filename, sep="/")
input <- read.csv(fpath, header=TRUE)
# check if threshold is met, skip if not
if(nrow(input) <= threshold) next
input <- na.omit(input) # do you want this before the threshold check?
# store our correlation in our results list
# stats::cor() to avoid confusion with your defined function
results[[filename]] <- stats::cor(input$sulfate, input$nitrate)
})
print(results)
Let me know if you have any questions about how this works below (I haven't actually run it, tbh). You should be able to take it from here and generalize it to your needs.
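On the '7.21/001.csv' path itself: as far as I can tell, the question's code defines its own function named cor, so the call cor(input$sulfate, input$nitrate) inside cor_sub reaches that function instead of stats::cor(). The sulfate column is then treated as the directory argument and complete() pastes its values onto "001.csv". Here is a minimal sketch of a generalized version that avoids the name clash. The function name corr_by_file and the demo files are hypothetical, the "at least threshold complete rows" rule follows the question's code, and the minimum of 2 rows is my own assumption so that a correlation can be computed.

```r
# Hypothetical helper; named so that it does not mask stats::cor()
corr_by_file <- function(directory, threshold = 0, id = 1:332) {
  # Build paths such as "<directory>/001.csv" with base R only
  paths <- file.path(directory, sprintf("%03d.csv", id))
  paths <- paths[file.exists(paths)]  # silently skip ids with no file

  results <- numeric(0)
  for (path in paths) {
    # Drop incomplete rows before counting, as the question's complete() does
    input <- na.omit(read.csv(path, header = TRUE))
    # Assumption: keep files with at least `threshold` complete rows,
    # and at least 2 rows so a correlation is defined
    if (nrow(input) < max(threshold, 2)) next
    # Call stats::cor() explicitly so a user-defined cor() cannot interfere
    results[basename(path)] <- stats::cor(input$sulfate, input$nitrate)
  }
  results
}

# Small self-contained demo with made-up data in a temporary directory
demo_dir <- file.path(tempdir(), "specdata_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(sulfate = c(1, 2, 3, NA), nitrate = c(2, 4, 6, 1)),
          file.path(demo_dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(sulfate = c(1, NA), nitrate = c(NA, 2)),
          file.path(demo_dir, "002.csv"), row.names = FALSE)

res <- corr_by_file(demo_dir, threshold = 1, id = 1:3)
print(res)
```

In the demo, 002.csv has no complete rows and 003.csv does not exist, so only 001.csv contributes a value. For the real data you would call something like corr_by_file("C:/SAT/Self Courses/R/data/specdata", 0).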
Upvotes: 1