MYaseen208

Reputation: 23948

Converting or printing all sections of rendered quarto document into html in one go

I want to convert the Shiny for Python documentation into PDF. Jumping to each section and printing it to PDF is possible. However, I am wondering whether there is a more compact way to print all sections in one go.

Upvotes: 0

Views: 517

Answers (1)

Shafee

Reputation: 20107

I can propose a solution based on wkhtmltopdf and Python: scrape the links to the HTML files for the different sections of the docs, then pass them to pdfkit, a Python library that wraps the wkhtmltopdf utility to convert HTML to PDF.

So, first download wkhtmltopdf and install it on your system (you may read this for help with the installation process; if you are a Windows user, remember to add wkhtmltopdf to PATH).

Then check that it is available from cmd/shell:

$ wkhtmltopdf --version

# wkhtmltopdf 0.12.6 (with patched qt)
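If you prefer to do the same check from Python before running the conversion, something like the following may help. It is only a minimal sketch: the function name `wkhtmltopdf_version` is my own, and it assumes the executable is called `wkhtmltopdf` and is reachable through PATH.

```python
import shutil
import subprocess


def wkhtmltopdf_version(executable="wkhtmltopdf"):
    """Return the tool's version string, or None if it is not on PATH.

    `executable` defaults to the usual binary name (an assumption).
    """
    path = shutil.which(executable)
    if path is None:
        return None
    # Same as running `wkhtmltopdf --version` in the shell.
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip() or None


if __name__ == "__main__":
    version = wkhtmltopdf_version()
    print(version if version else "wkhtmltopdf not found on PATH")
```

If this reports that the tool was not found, pdfkit will most likely fail too, so it is worth fixing PATH first.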

Now install these Python libraries (assuming you have Python installed),

pip install requests beautifulsoup4 pdfkit

and then run this python script,

$ python html2pdf.py

html2pdf.py


import re
import pdfkit
import requests
from bs4 import BeautifulSoup

# Making a GET request
r = requests.get('https://shiny.rstudio.com/py/docs/get-started.html')

# print(r.status_code)
  
# Parsing the HTML
soup = BeautifulSoup(r.content, 'html.parser')
a = soup.find_all('a', class_='sidebar-link')

# get the links
links = [link.get('href') for link in a if link.get('href') is not None]
site_link = 'https://shiny.rstudio.com/py'
full_links = [site_link + link[2:] for link in links]

# for file names
# raw string avoids the invalid "\/" escape warning; "\." matches a literal dot
names = [re.findall(r"(?:.+/)(.+)(?:\.html)", link)[0] for link in full_links]

# convert the link of htmls to pdf
for i, link in enumerate(full_links):
    pdfkit.from_url(link, f"{names[i]}.pdf")

It converts every HTML page linked from the sidebar of https://shiny.rstudio.com/py/docs/ into its own PDF file in one go.

$ ls

get-started.pdf            reactive-programming.pdf  ui-navigation.pdf
html2pdf.py                reactive-values.pdf       ui-page-layouts.pdf
overview.pdf               running-debugging.pdf     ui-static.pdf
putting-it-together.pdf    server.pdf                user-interface.pdf
reactive-calculations.pdf  ui-dynamic.pdf            workflow-modules.pdf
reactive-events.pdf        ui-feedback.pdf           workflow-server.pdf
reactive-mutable.pdf       ui-html.pdf
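The script builds the full links by slicing off the first two characters of each `href`, which only works while the sidebar links keep that exact relative form. A slightly more defensive variant could resolve the links with `urllib.parse.urljoin` instead. This is just a sketch: the helper names `resolve_links`, `pdf_name` and `convert_all` are mine, and the single-file option relies on `pdfkit.from_url` accepting a list of URLs, which is how I understand its README.

```python
import posixpath
from urllib.parse import urljoin, urlparse


def resolve_links(page_url, hrefs):
    """Turn sidebar hrefs (relative or absolute) into absolute URLs.

    Empty hrefs are skipped and duplicates are dropped, keeping order.
    """
    seen = []
    for href in hrefs:
        if not href:
            continue
        url = urljoin(page_url, href)
        if url not in seen:
            seen.append(url)
    return seen


def pdf_name(url):
    """Derive e.g. 'get-started.pdf' from '.../get-started.html'."""
    stem = posixpath.splitext(posixpath.basename(urlparse(url).path))[0]
    return f"{stem or 'index'}.pdf"


def convert_all(urls, single_file=None):
    """Write one PDF per URL, or one combined PDF if single_file is given."""
    import pdfkit  # imported lazily so the helpers above work without it

    if single_file:
        # Assumption: from_url accepts a list and merges the pages.
        pdfkit.from_url(urls, single_file)
    else:
        for url in urls:
            pdfkit.from_url(url, pdf_name(url))
```

With the `links` list from the script above, this would be used roughly as `convert_all(resolve_links('https://shiny.rstudio.com/py/docs/get-started.html', links), single_file='shiny-docs.pdf')`, where the output file name is only an example. Leaving out `single_file` gives one PDF per page, as in the listing above.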

Upvotes: 1
