Reputation: 101
I'm trying to create a loop in my script to produce an output file. The script already has two other loops for reading input files. At the end of the script I have a df in my environment that I'd like to save as a .csv named after the input files. Here is a simplified version of my code:
filePdbs <- Sys.glob("*.pdb")
fileTurns <- Sys.glob("*.txt")
for (filePdb in filePdbs) {
  "pdb" <- print(read.pdb(filePdb))
  ##other stuff here
  "coord" <- print(read.pdb(filePdb))
  ##other stuff here
  for (fileTurn in fileTurns) {
    "turn" <- read.delim(fileTurn, header = T, sep = "")
    ## here I have lines for merging info from pdb and txt and I obtain my df that I'd like to save as csv
  }
}
In addition to these two input-file loops, I need a third loop that writes the csv output under the same name as the input file. Input files are named like "1abc_A.pdb" and "1abc_A.txt", and I'd like the output to be "1abc_A.csv". How can I do it?
Upvotes: 0
Views: 572
Reputation: 96
I see several issues with your code, as already pointed out in the comments, but as far as I understand it, the actual question is how to change the file extension in the string.
The easiest way would be to use sub(pattern="pdb$", replacement="csv", x=filePdb). This finds the letters 'pdb' at the end of the string ($ anchors the match to the end of the string) and replaces them with 'csv'. I would put this line in your second loop, immediately after modifying your variables. Alternatively, since sub() is vectorized, you could pass filePdbs instead of filePdb and compute all the modified file names up front before using them.
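As a quick illustration of the renaming step on its own, here is a minimal sketch using made-up file names (not read from disk). It uses a slightly stricter pattern, "\\.pdb$", so that only a literal ".pdb" extension matches; tools::file_path_sans_ext() is shown as an alternative way to strip the extension.

```r
# Hypothetical file names, standing in for the result of Sys.glob("*.pdb").
filePdbs <- c("1abc_A.pdb", "2xyz_B.pdb")

# sub() is vectorized, so all names can be converted in one call.
# "\\.pdb$" matches a literal ".pdb" only at the end of each string.
csvNames <- sub("\\.pdb$", ".csv", filePdbs)
print(csvNames)

# Alternative: strip whatever extension is there, then append ".csv".
csvNames2 <- paste0(tools::file_path_sans_ext(filePdbs), ".csv")
print(csvNames2)
```

Both approaches give the same names here; the sub() version leaves a name untouched if it does not end in ".pdb".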
So here is what I would change in your code example:
When you read in the files you don't need print, which in most cases prints the content to the console; we only want to store the file content in a variable.
Variable names should not be quoted. R does accept a quoted name on the left-hand side of <-, so "pdb" <- ... still assigns to pdb, but it reads like a character string and is unidiomatic.
You seem to read the filePdb file twice within the loop. This is inefficient. If you want the matrix of coordinates in a variable called 'coord', you can get it with pdb$xyz (we are talking about the bio3d package here, right?).
So then you loop through every .txt file for each .pdb file ([no of txt] * [no of pdb] iterations). I think you might have file pairs, and if so you need to open only one txt file per pdb file. Assuming both vectors have the same length and the same sort order, you could achieve this with
for (i in seq_along(filePdbs)) {
  pdb <- read.pdb(filePdbs[i])
  # index the vector fileTurns, not the loop variable fileTurn
  turn <- read.delim(fileTurns[i], header = TRUE, sep = "")
  # ...
}
(Be sure the sep argument fits your purpose.)
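Pairing by index only works if every .pdb file has exactly one matching .txt file. A minimal sketch of pairing by shared base name instead, using made-up file names (the vectors and the pairs data frame are illustrative, not part of the original code):

```r
# Hypothetical inputs; in the real script these come from Sys.glob().
filePdbs  <- c("1abc_A.pdb", "2xyz_B.pdb", "3def_C.pdb")
fileTurns <- c("2xyz_B.txt", "1abc_A.txt")   # note: 3def_C has no .txt partner

# Base names without extension, e.g. "1abc_A".
pdbStems  <- tools::file_path_sans_ext(filePdbs)
turnStems <- tools::file_path_sans_ext(fileTurns)

# Keep only the base names that occur in both sets.
stems <- intersect(pdbStems, turnStems)

# One row per matched pair, plus the output name derived from the same stem.
pairs <- data.frame(
  pdb = paste0(stems, ".pdb"),
  txt = paste0(stems, ".txt"),
  csv = paste0(stems, ".csv"),
  stringsAsFactors = FALSE
)
print(pairs)
```

A loop over seq_len(nrow(pairs)) can then read pairs$pdb[i] and pairs$txt[i] and write to pairs$csv[i], so unmatched files are skipped rather than silently mispaired.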
Finally, as mentioned above, put the write.csv() call (or write.table() for more control over output settings) inside your inner loop and modify the name with sub().
filePdbs <- Sys.glob("*.pdb")
fileTurns <- Sys.glob("*.txt")
for (filePdb in filePdbs) {
  pdb <- read.pdb(filePdb)
  ##other stuff here
  coord <- pdb$xyz
  ##other stuff here
  for (fileTurn in fileTurns) {
    turn <- read.delim(fileTurn, header = TRUE, sep = "")
    ## here I have lines for merging info from pdb and txt and I obtain my df that I'd like to save as csv
    ## 'result' stands for that merged df
    write.csv(result,
              file = sub("pdb$", "csv", x = filePdb),
              row.names = FALSE)
  }
}
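To try the read, rename and write cycle without bio3d or real data, here is a self-contained sketch that works in a scratch directory. The two small .txt files, the workDir name, and the line result <- turn are stand-ins I made up; in the real script result would be the merged data frame.

```r
# Work in a scratch directory so no real files are touched.
workDir <- file.path(tempdir(), "turn_demo")
dir.create(workDir, showWarnings = FALSE)

# Hypothetical stand-ins for the real "*.txt" inputs.
writeLines(c("res turn", "10 I", "11 II"), file.path(workDir, "1abc_A.txt"))
writeLines(c("res turn", "20 I"),          file.path(workDir, "2xyz_B.txt"))

fileTurns <- Sys.glob(file.path(workDir, "*.txt"))
for (fileTurn in fileTurns) {
  turn <- read.delim(fileTurn, header = TRUE, sep = "")

  # Placeholder for the real merge of pdb and txt information.
  result <- turn

  # Same base name as the input, with the extension swapped to .csv.
  outFile <- sub("\\.txt$", ".csv", fileTurn)
  write.csv(result, file = outFile, row.names = FALSE)
}

# List the output files that were produced.
csvFiles <- basename(Sys.glob(file.path(workDir, "*.csv")))
print(csvFiles)
```

Each input yields exactly one output here because the output name is derived from the file being processed in the same iteration.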
Upvotes: 1