Saeed isa
Saeed isa

Reputation: 418

python: get full formula from excel, using xlrd

I'm trying to get the full formula from Excel file I tried many ways, but all get for me the value

I need the full formula that is in the cell, not the value itself

I'm using python with xlrd

is there any function I can use ? or is there anyway to ?

Thanks alot

Upvotes: 1

Views: 2517

Answers (1)

mkultra
mkultra

Reputation: 321

So I know this is a very old post, but I found a decent way of getting the formulas from all the sheets in a workbook as well as having the newly created workbook retain all the formatting.

First step is to save a copy of your .xlsx file as .xls -- Use the .xls as the filename in the code below

Using Python 2.7

from lxml import etree
from StringIO import StringIO
import xlsxwriter
import subprocess
from xlrd import open_workbook
from xlutils.copy import copy
from xlsxwriter.utility import xl_cell_to_rowcol
import os



file_name = '<YOUR-FILE-HERE>'
dir_path = os.path.dirname(os.path.realpath(file_name))

subprocess.call(["unzip",str(file_name+"x"),"-d","file_xml"])


xml_sheet_names = dict()

with open_workbook(file_name,formatting_info=True) as rb:
    wb = copy(rb)
    workbook_names_list = rb.sheet_names()
    for i,name in enumerate(workbook_names_list):
        xml_sheet_names[name] = "sheet"+str(i+1)

sheet_formulas = dict()
for i, k in enumerate(workbook_names_list):
    xmlFile = os.path.join(dir_path,"file_xml/xl/worksheets/{}.xml".format(xml_sheet_names[k]))
    with open(xmlFile) as f:
        xml = f.read()

    tree = etree.parse(StringIO(xml))
    context = etree.iterparse(StringIO(xml))

    sheet_formulas[k] = dict()
    for _, elem in context:
        if elem.tag.split("}")[1]=='f':
            cell_key = elem.getparent().get(key="r")
            cell_formula = elem.text
            sheet_formulas[k][cell_key] = str("="+cell_formula)

sheet_formulas

Structure of Dictionary 'sheet_formulas'

{'Worksheet_Name': {'A1_cell_reference':'cell_formula'}}

Example results:

{u'CY16': {'A1': '=Data!B5',
  'B1': '=Data!B1',
  'B10': '=IFERROR(Data!B12,"")',
  'B11': '=IFERROR(SUM(B9:B10),"")',

Upvotes: 1

Related Questions