user4187476

Work with the result of re.findall()

I am using Pandoc to convert HTML into LaTeX. It works pretty well, but I would like to post-process the output to fit my needs. Consider the following output:

string = r'foo\r\nbar\r\n\begin{longtable}[c]{@{}ll@{}}\r\nbar & bar\tabularnewline\r\nbar & bar\r\n\bottomrule\r\n\end{longtable}'

What I need to do is capture the alignment of the tabular (the c option), the column configuration, and the content of the tabular. Here is what I have done so far:

import re

tabular_setup = re.findall(r'\\begin{longtable}\[(.*)\]{(.*)}(.*)\\end{longtable}', string, re.DOTALL)

if tabular_setup:
    tabular_align = tabular_setup[0][0]
    column_setup  = tabular_setup[0][1]
    tab_content   = tabular_setup[0][2]

So now I can change those values to whatever I want, but how do I write the updated values back into the original string?

Upvotes: 2

Views: 75

Answers (1)

vks

Reputation: 67968

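import re

def repl(matchobj):
    # The return value replaces the whole match, so rebuild the environment.
    align, columns, content = matchobj.groups()
    # Modify align, columns or content here before reassembling.
    return "\\begin{longtable}[" + align + "]{" + columns + "}" + content + "\\end{longtable}"

# re.DOTALL lets (.*) span line breaks inside the table body.
new = re.sub(r"\\begin{longtable}\[(.*)\]{(.*)}(.*)\\end{longtable}", repl, string, flags=re.DOTALL)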

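You can update the groups with re.sub and your own replacement function. The function receives the match object, and the string it returns replaces the entire match, so it has to return the whole rebuilt \begin{longtable} ... \end{longtable} block, not just the group you changed.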
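For illustration, here is a minimal self-contained sketch of that approach. The names source, LONGTABLE and rebuild are made up for this example, and so are the new values ("l" for the alignment, "lr" for the columns). It also assumes the column specification sits on the same line as \begin{longtable}, and the sample uses real newlines rather than the literal \r\n of the raw string in the question. Named groups are used only to make the replacement function easier to read.

```python
import re

# Sample Pandoc-style output (hypothetical; real newlines for illustration).
source = (
    "foo\nbar\n"
    "\\begin{longtable}[c]{@{}ll@{}}\n"
    "bar & bar\\tabularnewline\n"
    "bar & bar\n"
    "\\bottomrule\n"
    "\\end{longtable}"
)

# align:   everything between [ and ]
# columns: greedy up to the last } on the \begin line (assumption: the
#          column spec does not continue on the next line)
# content: lazy, so it stops at the first \end{longtable}
LONGTABLE = re.compile(
    r"\\begin\{longtable\}"
    r"\[(?P<align>[^\]]*)\]"
    r"\{(?P<columns>[^\n]*)\}"
    r"(?P<content>.*?)"
    r"\\end\{longtable\}",
    re.DOTALL,
)

def rebuild(match):
    """Return the full environment with updated pieces."""
    align = "l"                                           # hypothetical new alignment
    columns = match.group("columns").replace("ll", "lr")  # hypothetical column tweak
    content = match.group("content")                      # left unchanged here
    # A string returned from a replacement function is inserted literally,
    # so the backslashes need no extra escaping beyond normal Python rules.
    return (
        "\\begin{longtable}[" + align + "]{" + columns + "}"
        + content
        + "\\end{longtable}"
    )

result = LONGTABLE.sub(rebuild, source)
print(result)
```

Text outside the match (foo and bar here) is left untouched, and if the document contains several longtable environments, sub calls rebuild once per match.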

Upvotes: 1
