Diogo

Reputation: 871

Problems downloading a PDF file using R

I would like to download a PDF file from the internet and save it to my local hard drive. After the download, the resulting PDF file has lots of empty pages. What can I do to fix it?

Example:

require(XML)
url <- ('http://cran.r-project.org/doc/manuals/R-intro.pdf')
download.file(url, 'introductionToR.pdf')

Thanks in advance.

Upvotes: 18

Views: 15507

Answers (2)

Selcuk Akbas

Reputation: 711

You can download PDFs and extract their tables as data frames using the tabulizer package:

https://ropensci.org/tutorials/tabulizer_tutorial.html

# the install_github() calls below come from the ghit package, so install that
install.packages("ghit")
# on 64-bit Windows
ghit::install_github(c("ropenscilabs/tabulizerjars", "ropenscilabs/tabulizer"), INSTALL_opts = "--no-multiarch")
# elsewhere
ghit::install_github(c("ropenscilabs/tabulizerjars", "ropenscilabs/tabulizer"))

library(tabulizer)

f2 <- "https://github.com/leeper/tabulizer/raw/master/inst/examples/data.pdf"
extract_tables(f2, pages = 1, method = "data.frame")

Upvotes: -1

Sophia

Reputation: 1951

Try binary mode (mode = "wb"), like this:

download.file(url, 'introductionToR.pdf', mode = "wb")

That works for me.
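
As a rough sketch of why this helps: a PDF is a binary file, and on Windows the default text mode of download.file can rewrite line endings and corrupt it, while mode = "wb" writes the bytes unchanged. The helper names below (looks_like_pdf, download_pdf) are hypothetical and not part of base R. They only wrap download.file and check that the saved file starts with the "%PDF" signature, which is a quick sanity check rather than a full validation.

```r
# Hypothetical helper: TRUE if the file begins with the PDF signature "%PDF".
looks_like_pdf <- function(path) {
  header <- readBin(path, what = "raw", n = 4L)
  identical(header, charToRaw("%PDF"))
}

# Hypothetical wrapper: download in binary mode, then sanity-check the result.
download_pdf <- function(url, destfile) {
  # mode = "wb" writes the bytes exactly as received; the default text
  # mode ("w") can alter line endings on Windows and damage binary files.
  status <- download.file(url, destfile, mode = "wb")
  if (status != 0 || !looks_like_pdf(destfile)) {
    stop("Download failed or the file is not a PDF: ", destfile)
  }
  invisible(destfile)
}

# Usage, with the URL from the question (needs network access):
# download_pdf('http://cran.r-project.org/doc/manuals/R-intro.pdf',
#              'introductionToR.pdf')
```

If the check fails on a file that was downloaded earlier without mode = "wb", downloading it again in binary mode is usually the simplest fix.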

Upvotes: 50
