NBur
NBur

Reputation: 159

Openpyxl, regexreplace

In a project, I write xlsx files using python openpyxl.

Unfortunately, I can't find a correct way to use a regex function.

Here is my MWE:

from openpyxl import Workbook

classeur = Workbook()
sheet = classeur.active
_cell = sheet.cell(row=1, column=1)
_cell.value = "1;2;3;456;5;4;3;2;1"

_cell2 = sheet.cell(row=2, column=1)
_cell2.value = """=REGEXREPLACE(A1,"(?<![^;])[0-9]{1}(?![^;])";"@",0,0)"""

classeur.save("MWE.xlsx")

With libreoffice calc, correct formula should be =REGEX(A1;"(?<![^;])[0-9]{1}(?![^;])";"@";"g").

When opening the file (with calc), I get a Err:508

Context

The initial aim is to count exact occurences of substring. In the MWE below, you can see The "A" is counted 3 times, while I only want 1… Thus I think of regex to count only within ; delimiters.

from openpyxl import Workbook

classeur = Workbook()
sheet = classeur.active
_cA1 = sheet.cell(row=1, column=1)
_cA1.value = "1;2;3;456;5;4;3;2;1"

_cA2 = sheet.cell(row=2, column=1)
_cA2.value = """=REGEXREPLACE(A1,"(?<![^;])[0-9]{1}(?![^;])";"",0,0)"""


_cA4 = sheet.cell(row=4, column=1)
_cA4.value = "Text to count within"
_cB4 = sheet.cell(row=4, column=2)
_cB4.value = "A;Aze1;1A;Qsd;C"


_cA6 = sheet.cell(row=6, column=1)
_cA6.value = "substring to find"
_cB6 = sheet.cell(row=6, column=2)
_cB6.value = "Count"
for _i, _v in enumerate(["A","A1","1A","B","C"]):
    _c_i1 = sheet.cell(row=7+_i, column=1)
    _c_i1.value = _v
    
    _c_i2 = sheet.cell(row=7+_i, column=2)
    _c_i2.value = f"""=(LEN($B4)-LEN(SUBSTITUTE($B4,$A{7+_i},"")))/LEN($A{7+_i})"""

classeur.save("MWE.xlsx")

Upvotes: -1

Views: 30

Answers (1)

Tim Biegeleisen
Tim Biegeleisen

Reputation: 522506

This is just a general answer and approach which can work here, and it doesn't require regular expressions:

value = "1;2;3;456;5;4;3;2;1"
value = ';' + value + ';'
len_orig = len(value)
len_new = len(value.replace(';1;', ''))
num_occur = (len_orig - len_new) / (2 + len('1'))
print('The number 1 occured ' + str(num_occur) + ' times.')
# The number 1 occured 2 times.

The basic approach here is to:

  • Surround the input string with ; delimiters on both sides. This ensures that every ssv value has ; on both sides.
  • Capture the length of this string.
  • Then replace ;1; with empty string.
  • Finally capture the new length, then calculate how many 1 occurred.

Upvotes: 0

Related Questions