eegloo

Reputation: 1499

Converting .xlsx and .xls (latest versions) to PDF using Python

With the help of the linked ".doc to pdf using python" question, I am trying to do the same for Excel files (.xlsx and .xls formats).

The following is the modified code for Excel:

import os
from win32com import client

folder = "C:\\Oprance\\Excel\\XlsxWriter-0.5.1"
file_type = 'xlsx'
out_folder = folder + "\\PDF_excel"

os.chdir(folder)

if not os.path.exists(out_folder):
    print 'Creating output folder...'
    os.makedirs(out_folder)
    print out_folder, 'created.'
else:
    print out_folder, 'already exists.\n'

for files in os.listdir("."):
    if files.endswith(".xlsx"):
        print files

print '\n\n'

word = client.DispatchEx("Excel.Application")
for files in os.listdir("."):
    if files.endswith(".xlsx") or files.endswith('xls'):
        out_name = files.replace(file_type, r"pdf")
        in_file = os.path.abspath(folder + "\\" + files)
        out_file = os.path.abspath(out_folder + "\\" + out_name)
        doc = word.Workbooks.Open(in_file)
        print 'Exporting', out_file
        doc.SaveAs(out_file, FileFormat=56)
        doc.Close()

It shows the following error:

>>> execfile('excel_to_pdf.py')
Creating output folder...
C:\Excel\XlsxWriter-0.5.1\PDF_excel created.
apms_trial.xlsx
~$apms_trial.xlsx

Exporting C:\Excel\XlsxWriter-0.5.1\PDF_excel\apms_trial.pdf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "excel_to_pdf.py", line 30, in <module>
    doc = word.Workbooks.Open(in_file)
  File "<COMObject <unknown>>", line 8, in Open
pywintypes.com_error: (-2147352567, 'Exception occurred.', (0, u'Microsoft Excel
', u"Excel cannot open the file '~$apms_trial.xlsx' because the file format or f
ile extension is not valid. Verify that the file has not been corrupted and that
 the file extension matches the format of the file.", u'xlmain11.chm', 0, -21468
27284), None)
>>>

I think the problem is in

doc.SaveAs(out_file, FileFormat=56)

What should the FileFormat value be? Please help.

Upvotes: 6

Views: 47984

Answers (5)

Ryabchenko Alexander

Reputation: 12330

Another solution is to start a Gotenberg Docker container locally:

https://github.com/gotenberg/gotenberg

Then pass the file (any format supported by LibreOffice) from Python via HTTP to the container and get the result back as a PDF:

import logging
from time import sleep

import requests

LIBREOFFICE_URL = 'http://localhost:3000/forms/libreoffice/convert'
LIBREOFFICE_LANDSCAPE_URL = 'http://localhost:3000/forms/libreoffice/convert?landscape=1'


def _retry_gotenberg(url, io_bytes, post_file_name='index.html'):
    # Try up to 5 times, sleeping 3 seconds after each failed attempt.
    response = None
    for _ in range(5):
        response = requests.post(url, files={post_file_name: io_bytes})
        if response.status_code == 200:
            break
        logging.info('Will sleep and retry: %s %s', response.status_code, response.content)
        sleep(3)
    if response.status_code != 200:
        raise RuntimeError(f'Bad response from doc-to-pdf: {response.status_code} {response.content}')
    return response


def process_libreoffice(io_bytes, ext: str):
    # ext includes the leading dot, e.g. '.xlsx'.
    # Word documents are converted in portrait, everything else in landscape.
    if ext in ('.doc', '.docx'):
        url = LIBREOFFICE_URL
    else:
        url = LIBREOFFICE_LANDSCAPE_URL
    response = _retry_gotenberg(url, io_bytes, post_file_name=f'file{ext}')
    return response.content

Upvotes: 1

Tilal Ahmad

Reputation: 939

The GroupDocs.Conversion Cloud SDK for Python is another option for converting Excel to PDF. It is a paid API; however, it provides 150 free API calls per month.

P.S: I'm a developer evangelist at GroupDocs.

# Import module
import groupdocs_conversion_cloud
from shutil import copyfile

# Get your client_id and client_key at https://dashboard.groupdocs.cloud (free registration is required).
client_id = "xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx"
client_key = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

# Create instance of the API
convert_api = groupdocs_conversion_cloud.ConvertApi.from_keys(client_id, client_key)

try:
    # Convert XLSX to PDF
    # Prepare request
    request = groupdocs_conversion_cloud.ConvertDocumentDirectRequest("pdf", "C:/Temp/Book1.xlsx")

    # Convert
    result = convert_api.convert_document_direct(request)
    copyfile(result, 'C:/Temp/Book1_output.pdf')
    print("Result {}".format(result))

except groupdocs_conversion_cloud.ApiException as e:
    print("Exception when calling convert_document_direct: {0}".format(e.message))

Upvotes: 0

daansteraan

Reputation: 103

I had the same problem and got the same error. The answer is FileFormat=57; see below:

from win32com import client

def exceltopdf(doc):
    # Start a separate, hidden Excel instance.
    excel = client.DispatchEx("Excel.Application")
    excel.Visible = 0

    wb = excel.Workbooks.Open(doc)

    try:
        # FileFormat=57 is the value that made SaveAs write a PDF here.
        wb.SaveAs('c:\\targetfolder\\result.pdf', FileFormat=57)
    except Exception as e:
        print("Failed to convert")
        print(str(e))
    finally:
        # Always close the workbook and quit Excel, even on failure.
        wb.Close()
        excel.Quit()

This works as an alternative to the fragile ExportAsFixedFormat.
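The traceback in the question names '~$apms_trial.xlsx', which looks like the temporary lock file Excel creates while a workbook is open, so a batch version probably also needs to skip those files. Below is a minimal sketch of that idea, not a tested drop-in: `find_workbooks` and `convert_all` are hypothetical helper names, and the conversion step assumes pywin32 and an installed Excel on Windows.

```python
import os


def find_workbooks(folder, out_folder):
    """Return (in_file, out_file) pairs for the workbooks in `folder`.

    Skips Excel's "~$" lock files and builds the PDF name with
    os.path.splitext, so both .xls and .xlsx map to .pdf.
    """
    pairs = []
    for name in sorted(os.listdir(folder)):
        stem, ext = os.path.splitext(name)
        if name.startswith("~$") or ext.lower() not in (".xls", ".xlsx"):
            continue
        pairs.append((os.path.join(folder, name),
                      os.path.join(out_folder, stem + ".pdf")))
    return pairs


def convert_all(folder, out_folder):
    """Convert every workbook in `folder` (Windows with Excel only)."""
    # Imported lazily so find_workbooks can be used on any platform.
    from win32com import client

    if not os.path.isdir(out_folder):
        os.makedirs(out_folder)
    excel = client.DispatchEx("Excel.Application")
    try:
        for in_file, out_file in find_workbooks(folder, out_folder):
            wb = excel.Workbooks.Open(os.path.abspath(in_file))
            try:
                # Same SaveAs call as in the answer above.
                wb.SaveAs(os.path.abspath(out_file), FileFormat=57)
            finally:
                wb.Close(SaveChanges=False)
    finally:
        excel.Quit()
```

Note that two workbooks with the same base name (say, report.xls and report.xlsx) would map to the same PDF name in this sketch.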

Upvotes: 5

eegloo

Reputation: 1499

Link to XlsxWriter:

https://xlsxwriter.readthedocs.org/en/latest/contents.html

With its help you can generate an Excel file.

For example, say the generated Excel file is named trial.xls.

To generate a PDF from that Excel file, do the following:

from win32com import client
xlApp = client.Dispatch("Excel.Application")
books = xlApp.Workbooks.Open('C:\\excel\\trial.xls')
ws = books.Worksheets[0]
ws.Visible = 1
ws.ExportAsFixedFormat(0, 'C:\\excel\\trial.pdf')

Upvotes: 23

lxx

Reputation: 1346

You can print an Excel sheet to PDF on Linux using Python. You need to run OpenOffice as a headless server and use unoconv. It takes a bit of configuring, but it is doable.

You run OpenOffice as a service (daemon) and use it to convert xls, xlsx, doc and docx files.

http://dag.wiee.rs/home-made/unoconv/
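As a rough sketch of how this could be driven from Python, the snippet below shells out to the unoconv command line. It assumes unoconv is installed and on the PATH and that OpenOffice/LibreOffice is available to it; `unoconv_command` and `convert_to_pdf` are hypothetical helper names.

```python
import subprocess


def unoconv_command(in_file, out_dir):
    """Build the unoconv command that converts in_file to PDF in out_dir."""
    # -f selects the output format, -o the output directory.
    return ["unoconv", "-f", "pdf", "-o", out_dir, in_file]


def convert_to_pdf(in_file, out_dir):
    """Run unoconv; raises CalledProcessError if the conversion fails."""
    subprocess.check_call(unoconv_command(in_file, out_dir))
```

For example, `convert_to_pdf("trial.xls", "pdf_out")` would run `unoconv -f pdf -o pdf_out trial.xls`.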

Upvotes: 1
