Reputation: 157
I'm building a list of FASTA files read from a folder. Each list element should be named after its file, without the .fa extension.
I'm using list.files to access the files in the directory "Folder":
filenames <- list.files("Folder",pattern = ".fa",full.names = T)
and then read the FASTA files in:
list <- lapply(filenames, FUN=readDNAStringSet, use.names=T, format="fasta")
I found this code, which uses setNames to define the list element names:
list<- setNames(list, substr(list.files("Folder", pattern=".fa"), 1,15 ))
But my file names have different lengths, which makes fixed start and stop positions (1, 15) hard to use, and for further processing I would like to get rid of the .fa extension.
The files would look like:
Gene1.fa
Gene12.fa
Gene22a.fa
Gene123abc.fa
I'm using DECIPHER, but I guess this is more of a base R question?
Upvotes: 3
Views: 1298
Reputation: 887098
In order to remove the substring at the end, we could use substring (or substr) as well, but compute the last position from the end of each name rather than from the beginning, since the lengths vary:
v1 <- list.files("Folder", pattern=".fa")
substring(v1, first = 1, last = nchar(v1) -3)
#[1] "Gene1" "Gene12" "Gene22a" "Gene123abc"
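Note that the nchar arithmetic above assumes the extension is always exactly three characters (".fa"). As a hedged sketch of an alternative that is not part of the original answer, file_path_sans_ext from the tools package (shipped with R) strips whatever extension is present. The vector below reuses the example names from the question; "Gene7.fasta" is a hypothetical name added only for illustration.

```r
library(tools)

# Example file names from the question (stand-in for list.files("Folder"))
v1 <- c("Gene1.fa", "Gene12.fa", "Gene22a.fa", "Gene123abc.fa")

# Fixed-width approach: drop the last three characters of each name
by_width <- substring(v1, first = 1, last = nchar(v1) - 3)

# Extension-agnostic approach: drop everything from the final dot onward
by_ext <- file_path_sans_ext(v1)

# Hypothetical longer extension, where subtracting 3 would leave ".f" behind
long_ext <- file_path_sans_ext("Gene7.fasta")
```

Both approaches give the same names for the .fa files; they only differ once extensions of other lengths appear in the folder.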
Another option is sub. The dot (.) is a regex metacharacter that matches any character, so escape it (\\) to get its literal meaning. The pattern then matches a literal dot followed by 'fa' at the end ($) of the string, and the match is replaced with an empty string (""):
sub("\\.fa$", "", v1)
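To connect this back to the question, here is a minimal sketch of how the sub call could be combined with setNames. It is an assumption about how the pieces fit together, not code from the original posts: the paths are hypothetical stand-ins for list.files("Folder", full.names = TRUE), and a dummy reader replaces readDNAStringSet so the example runs without Biostrings installed.

```r
# Hypothetical paths, standing in for list.files("Folder", full.names = TRUE)
filenames <- c("Folder/Gene1.fa", "Folder/Gene12.fa",
               "Folder/Gene22a.fa", "Folder/Gene123abc.fa")

# Dummy reader; in real use this would be something like
# lapply(filenames, readDNAStringSet, format = "fasta")
seqs <- lapply(filenames, function(f) paste("contents of", f))

# basename() drops the directory part, sub() drops the trailing ".fa"
seqs <- setNames(seqs, sub("\\.fa$", "", basename(filenames)))

names(seqs)
```

Deriving the names from the same filenames vector that was read, rather than calling list.files a second time, keeps the names and the list elements in the same order.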
Upvotes: 2