Alex

Reputation: 355

Create an R function to be used in a loop with lapply

I have thousands of txt files (1.txt; 2.txt; 3.txt...) to be used as input (predictions), and another file called "labels". I need to run a few commands on each input file to create its respective output (an AUC value). I am following the suggestion in a previous post (Looping through all files in directory in R, applying multiple commands).

But I am having trouble writing the function to be called in this loop.

My original code (for a single predictions file):

library(ROCR)
labels <- read.table(file="/data/labels/labels", header=F, sep="\t")
predictions <- read.table(file="/data/input/3.txt", header=F)
pred <- prediction(predictions, labels)
perf <- performance(pred,"tpr","fpr")
auc <- attr(performance(pred ,"auc"), "y.values")
auc
write.table(auc, "/data/out/AUC3.txt",sep="\t")

My code so far (not working):

library(ROCR)
labels <- read.table(file="/data/labels/labels", header=F, sep="\t")

files <- list.files(path="/data/input/", pattern="*.txt", full.names=TRUE, recursive=FALSE)

auc <- function(r) {
    pred <- prediction(files, labels)
    perf <- performance(pred,"tpr","fpr")
    auc <- attr(performance(pred ,"auc"), "y.values")
}

lapply(files, function(x) {
    t <- read.table(x, header=F) # load file
    out <- auc(t)
    write.table(out, "/data/out/", sep="\t")
})

Error message:

Error in prediction(files, labels) :
Number of predictions in each run must be equal to the number of labels for each run.
Calls: lapply -> FUN -> auc -> prediction
Execution halted

Upvotes: 0

Views: 99

Answers (1)

Jan

Reputation: 5254

The problem is the statement files$V1. files is created by list.files, and that function returns an atomic (character) vector of file names (see ?list.files). You cannot use $ with atomic vectors. You have to address an element by numeric index instead, files[i], where i is the position of the file you want.
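
To illustrate the point, here is a minimal sketch in base R. It assumes small made-up input files in a temporary directory, and score_data is a hypothetical stand-in for the ROCR steps in the question, not the actual AUC computation. It shows that list.files gives a character vector that is indexed by position, and that the function inside lapply should work on the data read from each file rather than on the vector of file names.

```r
# Create a few small input files in a temporary directory (illustrative data only).
in_dir  <- file.path(tempdir(), "input")
out_dir <- file.path(tempdir(), "out")
dir.create(in_dir, showWarnings = FALSE)
dir.create(out_dir, showWarnings = FALSE)
for (i in 1:3) {
  write.table(data.frame(V1 = c(0.1, 0.4, 0.8) * i),
              file.path(in_dir, paste0(i, ".txt")),
              row.names = FALSE, col.names = FALSE)
}

# list.files() returns a plain character vector of paths, not a data frame.
files <- list.files(path = in_dir, pattern = "\\.txt$", full.names = TRUE)
is.character(files)  # a character vector
files[1]             # one path, selected by numeric index
# files$V1           # would fail: $ is not valid for atomic vectors

# Hypothetical stand-in for the AUC step. It uses the data passed in as
# its argument, not the vector of file names.
score_data <- function(d) mean(d$V1)

results <- lapply(files, function(x) {
  d   <- read.table(x, header = FALSE)  # load one file
  out <- score_data(d)
  # Build one output path per input file, e.g. AUC1.txt for 1.txt.
  out_file <- file.path(out_dir, paste0("AUC", basename(x)))
  write.table(out, out_file, sep = "\t")
  out
})
```

In the real script, the body of score_data would presumably be replaced by the prediction and performance calls from the question, applied to the data frame that was read in.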

Upvotes: 2
