PJTraill
PJTraill

Reputation: 1420

LibreOfiice Basic function to perform regular expression replacement

Requirement

I wish to manipulate string values in a LibreOffice Basic macro using regular expression replacement? I should like to use something like y = regex (x, "^¿(.*)\?$", "$1") to operate on string variables (rather than cells of a document). The macro is to be used from LibreOffice Calc.

This is not for the above (trivial) problem, but a possibility that I frequently miss.

Apparent status
Only search exposed to scripts?

Of course the LibreOffice engine performs regular expression replacement (both in interactive Find/Replace and in the REGEX function). However, it seems that only regular expression search is exposed to Basic – and that is not what I am looking for. (It is provided by the somewhat cumbersome TextSearch service, which (if I recall correctly) can also search in string values in Basic variables.)

Spreadsheet functions: do not work

I thought one was meant to be able to call Calc functions in LO Basic, but calls like

z   = regex (x, "^¿(.*)\?$", "$1")
z   = regex (x, "^¿(.*)\?$", "$1", "g")

just said “Basic runtime error. Sub-procedure or function procedure not defined”.

Hack

As a horrible hack, one could manipulate some cells containing appropriate formulae in a scratch part of my sheet, but that has so many evident disadvantages that I would only use it in extremis!

Various sources

The question Using regex in a libreoffice calc macro to extract text from parentheses in a cell is about using REs to find text in cells.

Searching the API documentation for “regular” only threw up the TextSearch Service mentioned above.

Several forums have posts like this that only address search; I have found nothing about replacement.

Upvotes: 2

Views: 1251

Answers (2)

PJTraill
PJTraill

Reputation: 1420

Thanks to JohnSUN for his comment giving the solution using Calc.REGEX!

I know of 2 approaches to RE substitutions in LibreOffice Basic:

  • Call the Calc function REGEX. (Uses ICU REs.)
  • Use the function ReplaceRegex in the ScriptForge.String service. (Seems to use Python REs.)

Using the Calc function REGEX

One can call spreadsheet functions from macros via the callFunction method of the service com.sun.star.sheet.FunctionAccess. The following function RE_replace uses this to perform an arbitrary RE replacement (where report is some error-handling):

Function RE_replace (y_Within as String, y_Regex as String, y_By as String, Optional y_Flags as String) as String

    Dim z_A as  Variant
    If IsMissing (y_Flags) then
        z_A = Array (y_Within, y_Regex, y_By)
    Else
        z_A = Array (y_Within, y_Regex, y_By, y_Flags)
    EndIf

    Dim z_Function_Access   as  Object  :   z_Function_Access   = createUnoService ("com.sun.star.sheet.FunctionAccess")

    Dim z_Error             as  Boolean :   z_Error             = True
    On Local Error Goto ErrHandler
        Dim z   as  String  :   z   = z_Function_Access.callFunction("REGEX", z_A)
        z_Error = False
    ErrHandler:
    On Local Error Goto 0

    If z_Error Then
        Dim z_Flags_S   as  String  :   z_Flags_S   = ""
        If not IsMissing (y_Flags) Then
            z_Flags_S   = ", '" + y_Flags + "'"
        EndIf
        report ("Error from Calc.REGEX ('" + y_Within + "', '" + y_Regex + "', '" + y_By + "'" + z_Flags_S + ")")
    EndIf
    RE_replace  = z
End Function

Notes

  • I assign the result of functions to a local variable z to make debugging easier.
  • My original version can optionally repeatedly substitute up to a given limit.
  • One can cache the FunctionAccess object, but I do not suppose that saves much (I have not measured it).
  • One could simplify this by stripping out the error handling — but that may make debugging mistakes in REs harder.
  • My coding conventions work well for me! z_ indicates a local variable and y_ an argument; where possible I initialise a variable in the line where it is declared.
  • The RE syntax supported is that of ICU regular expressions.

(I have slightly simplified my code, but I have briefly tested this version too)

Using ReplaceRegex in ScriptForge.String

This is documented (for LO 7.2) at https://help.libreoffice.org/7.2/en-US/text/sbasic/shared/03/sf_string.html?&DbPAR=WRITER&System=UNIX , but I have not tried it. It appears to use Python RE syntax, as it refers us to the documentation of the Python module re.

Upvotes: 1

Bilbo
Bilbo

Reputation: 380

A spreadsheet demonstrating this function is at RegEx Demo.ODS

Function Digits( strVal As String ) As String ' return only the digits from the input
  Const cnsSvcNam    = "com.sun.star.sheet.FunctionAccess" ' UnoService for calling      spreadsheet functions
  Const cnsShtFNm    = "REGEX"                             ' RegExp Search+Replace: what spreadsheet function we will call
' Const cnsPtnDgtYes = "[:digit:]"                         ' RegExp Search+Replace: named regular expression for yes-digits
  Const cnsPtnDgtNon = "[^0-9]"                            ' RegExp Search+Replace: named regular expression for not-digits
  Const cnsStrMpt    = ""                                  ' empty string
  Const cnsFlgGlb    = "g"                                 ' RegExp Search+Replace: flag for Global operation

  Dim strRtn   As String  ' function result
  ' variables for calling REGEX( Text ; Expression [ ; [ Replacement ] [ ; Flags|Occurrence ] ] ) ' spreadsheet function signature
  Dim strSch   As String  ' RegExp Search+Replace: text       to search in
  Dim strExp   As String  ' RegExp Search+Replace: expression to match  on
  Dim strRpl   As String  ' RegExp Search+Replace: string     to insert as replacements
  Dim strFlg   As String  ' RegExp Search+Replace: flags / occurrence
  Dim arrFnc() As String  ' paramter-values to pass as arguments to spreadsheet function (array of strings)

  ' Drop all non-digit characters from input
' strRtn = RegEx( strVal, "[:digit:]", "", "g" ) ' RegEx is not defined in scope of Basic code, but it *is* defined as a spreadsheet function
  strSch = strVal                                                          ' text    to search in
  strExp = cnsPtnDgtNon                                                    ' pattern to match  on: match all non-digits
  strRpl = cnsStrMpt                                                       ' replacement data    : replace with nothing
  strFlg = cnsFlgGlb                                                       ' operation flags     : do this for all instances found
  arrFnc = Array( strSch, strExp, strRpl, strFlg )                         ' populate parameter values / arguments to   spreadsheet function
  strRtn = createUnoService( cnsSvcNam ).callFunction( cnsShtFNm, arrFnc ) ' execute                                    spreadsheet function

  Digits = strRtn                                                          ' pass result back to caller
End Function ' Digits

Upvotes: 1

Related Questions