Merging Tableau-generated PDFs without saving to disk first

Question

I have a problem exporting PDFs from Tableau Online with the Tableau Server Client (TSC) library in Python.

I have a workbook with 8 dashboards that I want to export as a filtered PDF. I loop through the 8 dashboards and apply a customerID filter to each one. I want to combine the resulting PDFs into a single PDF per customer covering the whole workbook, and this is where I run into problems.

The code I'm using is:

import PyPDF2
import tableauserverclient as TSC

tableau_token_name = 'TOKENNAME'
tableau_token_value = 'TOKENSECRET'

tableau_site_name = 'SITE'
tableau_server_name = 'SERVERNAME'

##Specify the workbook id that is the base for the pdf
tableau_workbook_id = 'WORKBOOKID'
##Specify the filename start for the pdf
pdf_file_name_prefix = 'PREFIX_FOR_PDF_NAME'

##create the login credentials
tableau_auth = TSC.PersonalAccessTokenAuth(tableau_token_name, tableau_token_value, tableau_site_name)
server = TSC.Server(tableau_server_name, use_server_version=True, http_options={"verify": False})

##Create the export options for the pdf
pdf_req_option = TSC.PDFRequestOptions(page_type=TSC.PDFRequestOptions.PageType.A5,orientation=TSC.PDFRequestOptions.Orientation.Landscape)
    
##create a pdf merger object 
PDFMerger = PyPDF2.PdfMerger()

with server.auth.sign_in(tableau_auth):
    ##Get the workbook that needs exporting
    workbook = server.workbooks.get_by_id(tableau_workbook_id)
    
    ##get the views from the workbook
    server.workbooks.populate_views(workbook)

    ##create a list of views from the workbook to iterate on
    view_list = [view for view in workbook.views]

    
    customerIDs = ["1","2","3"]

    for customerID in customerIDs:

        for view in view_list:
            
            ##set export filter
            pdf_req_option.vf("customerID", customerID)

            ##server.views.get_by_id(view_id)
            server.views.populate_pdf(view, pdf_req_option)
            

            PDFMerger.append(view.pdf)

    ##writing the file as an example. The actual script will write to blob storage in azure
    PDFMerger.write("workbook" + str(customerID) + ".pdf")

          
    server.auth.sign_out()

The problem I'm encountering is that the PDF merger does not accept the PDF object I get from Tableau. It fails with:

AttributeError: 'bytes' object has no attribute 'seek'

If I save each view.pdf as its own file and then loop through the saved files on the hard drive, it works fine. Does anyone have any suggestions on how to do this without saving the individual files first? The script is going to run in a serverless environment and I don't want to save the individual PDFs.

I tried to merge with the PyPDF2.PdfMerger.append() function, but the PDF object that I get from Tableau is not recognized as a PDF and seems to be missing an attribute called seek:

AttributeError: 'bytes' object has no attribute 'seek'

Martin Thoma · Accepted Answer

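The error message says that you are passing a bytes object, which doesn't have a seek attribute. PdfMerger.append() expects a file path or a file-like object, while view.pdf holds the raw bytes of the PDF.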
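To see the difference without involving Tableau or PyPDF2, here is a minimal standard-library sketch. The `pdf_bytes` value is only a placeholder standing in for view.pdf, not a real PDF.

```python
from io import BytesIO

# Placeholder bytes standing in for view.pdf (not a real PDF).
pdf_bytes = b"%PDF-1.4 placeholder"

# Raw bytes are plain data with no file cursor, so there is no seek method.
bytes_has_seek = hasattr(pdf_bytes, "seek")

# BytesIO wraps the same data in an in-memory file-like object.
stream = BytesIO(pdf_bytes)
stream_has_seek = hasattr(stream, "seek")

print(bytes_has_seek, stream_has_seek)  # prints: False True
```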

Use io.BytesIO to wrap the bytes in an in-memory file-like object:

from io import BytesIO

# ... your code
stream = BytesIO(view.pdf)
PDFMerger.append(stream)
# ... your code
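
A slightly fuller sketch of the same idea, assuming the goal is one merged PDF per customer that never touches the disk. The helper name `merge_into_memory` is hypothetical and not part of PyPDF2 or TSC. It expects a merger object with append, write and close methods, such as PyPDF2.PdfMerger(), and it writes the result into a BytesIO instead of a file, which may be convenient when the output goes to blob storage.

```python
from io import BytesIO


def merge_into_memory(merger, pdf_blobs):
    """Merge PDFs given as bytes and return the merged PDF as bytes.

    `merger` is assumed to behave like PyPDF2.PdfMerger
    (it must provide append, write and close).
    """
    for blob in pdf_blobs:
        # Wrap the raw bytes so the merger receives a seekable file-like object.
        merger.append(BytesIO(blob))

    # Write the merged document to memory rather than to a file on disk.
    output = BytesIO()
    merger.write(output)
    merger.close()
    return output.getvalue()
```

In the loop from the question, this would probably mean collecting view.pdf for every view of one customer into a list and then calling merge_into_memory(PyPDF2.PdfMerger(), blobs) once per customer. Passing a fresh merger each time keeps pages from earlier customers out of later files, which a single merger shared across the whole loop would not do.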
