Reputation: 5696
How can I extract the identifiers that have no corresponding generated file?
Identifiers given as input for generating the files:
fileIden <- c('a-1','a-2','a-3','b-1','b-2','c-1','d-1','d-2','d-3','d-4')
Checking the files generated:
files <- list.files(".")
files
# [1] "a-2.csv" "a-3.csv" "b-1.csv" "c-1.csv" "d-3.csv"
# Generated here for reproducibility.
# files <- c("a-2.csv", "a-3.csv", "b-1.csv", "c-1.csv", "d-3.csv")
Expected files if the whole process succeeds:
fileExp <- paste(fileIden, ".csv", sep = "")
# [1] "a-1.csv" "a-2.csv" "a-3.csv" "b-1.csv" "b-2.csv" "c-1.csv" "d-1.csv" "d-2.csv" "d-3.csv" "d-4.csv"
Are any expected files missing?
fileMiss <- fileExp[!fileExp %in% files]
# [1] "a-1.csv" "b-2.csv" "d-1.csv" "d-2.csv" "d-4.csv"
Expected output
# "a-1" "b-2" "d-1" "d-2" "d-4"
I am sure there is a simpler, more direct way to get the above output without creating the intermediate vectors fileExp and fileMiss. Could you please point me to it?
Upvotes: 0
Views: 52
Reputation: 886
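A less elegant approach:
# `files` is the vector from the question.
# substr(files, 1, 3) assumes every identifier is exactly 3 characters long.
result <- ifelse(fileIden %in% substr(files, 1, 3), "", fileIden)
result[result != ""]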
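The same idea can also be written as a direct logical subset, which avoids the empty-string placeholder. This is only a minimal sketch: it assumes the `files` vector from the question, and it swaps the fixed-width `substr()` for an anchored `sub()` so that identifiers of other lengths would still work. The names `generated` and `missing_ids` are illustrative.

```r
fileIden <- c('a-1','a-2','a-3','b-1','b-2','c-1','d-1','d-2','d-3','d-4')
files <- c("a-2.csv", "a-3.csv", "b-1.csv", "c-1.csv", "d-3.csv")

# Strip a trailing ".csv" (anchored with $) rather than taking the first 3 characters
generated <- sub("\\.csv$", "", files)

# Keep only the identifiers that have no matching generated file
missing_ids <- fileIden[!fileIden %in% generated]
missing_ids
# [1] "a-1" "b-2" "d-1" "d-2" "d-4"
```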
Upvotes: 0
Reputation: 11128
You can do this:
fileIden <- c('a-1','a-2','a-3','b-1','b-2','c-1','d-1','d-2','d-3','d-4')
file <- c("a-2.csv", "a-3.csv" ,"b-1.csv", "c-1.csv", "d-3.csv")
# Anchor the pattern with $ so only a trailing ".csv" is removed
setdiff(fileIden, trimws(gsub("\\.csv$", "", file)))
Another approach:
setdiff(fileIden, stringr::str_extract(file,"(.*)(?=\\.csv)"))
Logic: setdiff returns the elements of the first vector that are not in the second, and gsub replaces ".csv" with an empty string. Combining the two gives the identifiers that have no matching file.
Output:
#[1] "a-1" "b-2" "d-1" "d-2" "d-4"
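If you would rather not write the regular expression yourself, `tools::file_path_sans_ext()` (from the tools package shipped with R) may be a convenient alternative. A short sketch, assuming every generated file has a single extension such as ".csv"; `missing_ids` is just an illustrative name:

```r
fileIden <- c('a-1','a-2','a-3','b-1','b-2','c-1','d-1','d-2','d-3','d-4')
file <- c("a-2.csv", "a-3.csv", "b-1.csv", "c-1.csv", "d-3.csv")

# Drop the extension from each file name, then keep the identifiers
# that do not appear among the generated names
missing_ids <- setdiff(fileIden, tools::file_path_sans_ext(file))
missing_ids
# [1] "a-1" "b-2" "d-1" "d-2" "d-4"
```

When reading from a real directory, `list.files(".", pattern = "\\.csv$")` would restrict the listing to CSV files before the comparison.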
Upvotes: 1