Reputation: 355
I have thousands of txt files (1.txt, 2.txt, 3.txt, ...) to be used as input (predictions), and another file called "labels". I need to run a few commands on each input file to create its respective output (an AUC value). I am following the suggestion in a previous post (Looping through all files in directory in R, applying multiple commands),
but I am having trouble writing the function to be used in this loop.
My original code (for one predictions file):
library(ROCR)
labels <- read.table(file="/data/labels/labels", header=F, sep="\t")
predictions <- read.table(file="/data/input/3.txt", header=F)
pred <- prediction(predictions, labels)
perf <- performance(pred,"tpr","fpr")
auc <- attr(performance(pred ,"auc"), "y.values")
auc
write.table(auc, "/data/out/AUC3.txt",sep="\t")
My code so far (not working):
library(ROCR)
labels <- read.table(file="/data/labels/labels", header=F, sep="\t")
files <- list.files(path="/data/input/", pattern="*.txt", full.names=TRUE, recursive=FALSE)
auc <- function(r) {
  pred <- prediction(files, labels)
  perf <- performance(pred,"tpr","fpr")
  auc <- attr(performance(pred ,"auc"), "y.values")
}
lapply(files, function(x) {
  t <- read.table(x, header=F) # load file
  out <- auc(t)
  write.table(out, "/data/out/", sep="\t")
})
Error message:
Error in prediction(files, labels) :
Number of predictions in each run must be equal to the number of labels for each run.
Calls: lapply -> FUN -> auc -> prediction
Execution halted
Upvotes: 0
Views: 99
Reputation: 5254
The problem is this statement: files$V1. files is created by list.files, and that function returns an atomic vector (see ?list.files). You cannot use $ with atomic vectors. You have to address the element using a numeric index, files[###], with ### being the correct index.
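To illustrate the point about atomic vectors, here is a minimal sketch, not a tested solution for your data. It assumes that ROCR is installed and that every predictions file has as many rows as the labels file. The helper names out_path, auc_for_file and run_all are made up for this example and are not part of ROCR. The sketch addresses elements of the list.files result with [ ] instead of $, and it passes the data read from one file to prediction() rather than the vector of file names.

```r
# list.files() returns a plain character vector, for example:
files <- c("/data/input/1.txt", "/data/input/2.txt", "/data/input/3.txt")

is.atomic(files)   # TRUE
files[2]           # second path, addressed by numeric index

# `$` is not defined for atomic vectors, so this raises an error
dollar_result <- tryCatch(files$V1, error = function(e) e)
inherits(dollar_result, "error")   # TRUE

# Hypothetical helper: maps .../3.txt to <out_dir>/AUC3.txt
out_path <- function(infile, out_dir) {
  id <- sub("\\.txt$", "", basename(infile))
  file.path(out_dir, paste0("AUC", id, ".txt"))
}

# Hypothetical helper: AUC for one predictions file (needs the ROCR package)
auc_for_file <- function(infile, labels) {
  predictions <- read.table(infile, header = FALSE)
  pred <- ROCR::prediction(predictions, labels)
  # y.values is a list with one entry per run; take the first
  ROCR::performance(pred, "auc")@y.values[[1]]
}

# Hypothetical driver: writes one AUC file per input file
run_all <- function(in_dir, labels_file, out_dir) {
  labels <- read.table(labels_file, header = FALSE, sep = "\t")
  files <- list.files(in_dir, pattern = "\\.txt$", full.names = TRUE)
  for (i in seq_along(files)) {
    auc <- auc_for_file(files[i], labels)
    write.table(auc, out_path(files[i], out_dir), sep = "\t")
  }
  invisible(files)
}
```

With the paths from the question, this would be called as run_all("/data/input", "/data/labels/labels", "/data/out"). Each output file gets its own name, because write.table needs a file path and not just a directory.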
Upvotes: 2