Reputation: 2482
Maybe it's something trivial and I have simply been staring at the same code for too long... When I source the R module getFLOSSmoleDataXML.R via RStudio, the code correctly detects the .Rdata files in the cache directory and skips the download and parsing phases. When, on the other hand, the same module is run by R via GNU make (sudo -u ruser make), the result is, well, strange:
Rscript --no-save --no-restore --verbose getFLOSSmoleDataXML.R
running
'/usr/lib/R/bin/R --slave --no-restore --no-save --no-restore --file=getFLOSSmoleDataXML.R'
Loading required package: RCurl
Loading required package: methods
Loading required package: bitops
Loading required package: XML
Loading required package: digest
Verifying repository: FreeCode
Checking file "http://flossdata.syr.edu/data/fc/2013/2013-Dec/fcProjectAuthors2013-Dec.txt.bz2"...
rdataFile = "./cache/5802dbd08ebefadf70fbb826776f9f0f.Rdata"...
trying URL 'http://flossdata.syr.edu/data/fc/2013/2013-Dec/fcProjectAuthors2013-Dec.txt.bz2'
Content type 'application/x-bzip2' length 514960 bytes (502 Kb)
opened URL
==================================================
downloaded 502 Kb
Error in gzfile(file, "wb") : cannot open the connection
Calls: print ... FUN -> importRepoFiles -> lapply -> FUN -> save -> gzfile
In addition: Warning message:
In gzfile(file, "wb") :
cannot open compressed file './cache/5802dbd08ebefadf70fbb826776f9f0f.Rdata', probable reason 'No such file or directory'
Timing stopped at: 0.74 0.068 1.134
Execution halted
make[1]: *** [importFLOSSmole] Error 1
make[1]: Leaving directory `/home/ruser/diss-floss/import'
make: *** [collection] Error 2
ubuntu@ip-10-164-108-61:/home/ruser/diss-floss$ ls -l cache/5802*
-rw-r--r-- 1 ruser ruser 1968939 Feb 19 05:47 cache/5802dbd08ebefadf70fbb826776f9f0f.Rdata
As you can see from the last two lines, I have verified that the file does indeed exist. What is going on here? Any ideas or advice? Thank you!
Upvotes: 0
Views: 171
Reputation: 2482
After a brief investigation, I found the source of this problem myself. As I expected, it is a small and simple mistake, which I will describe so that other people can avoid running into the same thing.
When I use file.exists() in my code, I pass it the relative path to the file in question. I construct that path by concatenating the hard-coded "cache" directory and the dynamically determined file name:
# compute the URL's MD5 digest and build the corresponding .Rdata file name
fileDigest <- digest(url, algo = "md5", serialize = FALSE)
rdataFile <- paste(RDATA_DIR, "/", fileDigest, RDATA_EXT, sep = "")
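However, I forgot that make leaves the top-level project directory and enters the sub-directory (import, as the "Leaving directory" line in the log shows) to build the code. As a result, the hard-coded relative path to the "cache" directory (RDATA_DIR="./cache") becomes incorrect: it now points to import/cache, which does not exist. file.exists() therefore returns FALSE, the download is repeated, and save() fails because it cannot create a file in a missing directory. A simple change (RDATA_DIR="../cache") fixed the problem.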
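To make the effect visible, here is a minimal sketch that reproduces it in a throwaway directory. The layout (a project folder with cache and import sub-directories) mirrors the one above, but the demo folder name and example.Rdata are made up for illustration only:

```r
# Hypothetical layout mirroring the post: <project>/cache and <project>/import
project <- file.path(tempdir(), "diss-floss-demo")
dir.create(file.path(project, "cache"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(project, "import"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(project, "cache", "example.Rdata"))

# Working directory = project root (the RStudio case)
oldwd <- setwd(project)
fromTop <- file.exists("./cache/example.Rdata")

# Working directory = import/ (the case when make enters the sub-directory)
setwd(file.path(project, "import"))
fromImportDot <- file.exists("./cache/example.Rdata")
fromImportDotDot <- file.exists("../cache/example.Rdata")

setwd(oldwd)  # restore the original working directory
print(c(fromTop, fromImportDot, fromImportDotDot))
```

The same relative path is found from the project root but not from import/, while "../cache" works from import/. Printing getwd() at the top of the script is a quick way to check which case applies.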
That explains the "magic" :-) of the same code running successfully when sourced manually (R or RStudio) but failing when run via make. That said, I recognize that relying on a predetermined directory structure may not be best practice, but due to time constraints I had to compromise (and add items to my TODO list of potential improvements). I will gladly listen to your advice on best practices in this area.
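One possible improvement, sketched here rather than tested in the actual project, is to derive the cache directory from the script's own location instead of the working directory. Rscript passes the script path as a --file= argument (visible in the log above), and commandArgs() exposes it. The helper name resolveCacheDir and the assumption that cache/ sits next to the script's directory are mine; when no --file= argument is present (for example, when the file is sourced in RStudio), the sketch falls back to the original "./cache":

```r
# Hypothetical helper: locate cache/ relative to the running script,
# not relative to the current working directory.
resolveCacheDir <- function(args = commandArgs(trailingOnly = FALSE),
                            fallback = "./cache") {
  fileArg <- grep("^--file=", args, value = TRUE)
  if (length(fileArg) == 0) {
    # No script path available, e.g. when sourced interactively
    return(fallback)
  }
  scriptPath <- normalizePath(sub("^--file=", "", fileArg[1]),
                              winslash = "/", mustWork = TRUE)
  # Assumption: cache/ is a sibling of the directory containing the script
  file.path(dirname(dirname(scriptPath)), "cache")
}

# Call this at the top of the script, before any setwd()
RDATA_DIR <- resolveCacheDir()
```

With this in place the result no longer depends on whether make has changed into the sub-directory, as long as the script itself stays where it is relative to cache/.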
Upvotes: 1