Mark Galeck

Reputation: 6395

Apply formatting control characters (backspace and carriage return) to string, without needing recursion

What is the easiest way to "interpret" formatting control characters in a string, so that the result is what would be shown if the string were printed? For simplicity, I will assume there are no newlines in the string.

So for example,

>>> sys.stdout.write('foo\br')

shows for, therefore

interpret('foo\br') should be 'for'

>>> sys.stdout.write('foo\rbar')

shows bar, therefore

interpret('foo\rbar') should be 'bar'


I can write a regular expression substitution here, but, in the case of '\b' replacement, it would have to be applied recursively until there are no more occurrences. It would be quite complex if done without recursion.

Is there an easier way?

Upvotes: 1

Views: 976

Answers (3)

Bakuriu

Reputation: 101959

Python does not have any built-in or standard library module for doing this. However, if you only care about simple control characters like \r, \b and \n, you can write a simple function to handle them:

def interpret(text):
    lines = []
    current_line = []
    for char in text:
        if char == '\n':
            lines.append(''.join(current_line))
            current_line = []
        elif char == '\r':
            current_line.clear()
            # del current_line[:]  # in old python versions
        elif char == '\b':
            del current_line[-1:]
        else:
            current_line.append(char)
    if current_line:
        lines.append(''.join(current_line))
    return '\n'.join(lines)

You can extend the function to handle any control character you want. For example, you might want to ignore control characters that are not actually displayed in a terminal (e.g. the bell, \a).
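As a minimal sketch of that kind of extension, the variant below skips any character listed in a set of ignored control characters. The names `IGNORED` and `interpret_ignoring` are hypothetical, chosen only for illustration, and the set holds just the bell character as an assumption about what you want to drop.

```python
# Hypothetical set of control characters that produce no visible output.
IGNORED = {'\a'}

def interpret_ignoring(text, ignored=IGNORED):
    lines = []
    current_line = []
    for char in text:
        if char == '\n':
            lines.append(''.join(current_line))
            current_line = []
        elif char == '\r':
            current_line.clear()
        elif char == '\b':
            del current_line[-1:]  # no-op on an empty line
        elif char in ignored:
            continue  # drop the character without displaying it
        else:
            current_line.append(char)
    if current_line:
        lines.append(''.join(current_line))
    return '\n'.join(lines)

print(interpret_ignoring('fo\ao\br'))  # for
```

Passing a different `ignored` set lets you adjust the behaviour without touching the loop.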

Upvotes: 1

smci

Reputation: 33950

UPDATE: after 30 minutes of asking for clarifications and an example string, we find the question is actually quite different: "How to repeatedly apply formatting control characters (backspace) to a Python string?" In that case, yes, you apparently need to apply the regex/function repeatedly until you stop getting matches. SOLUTION:

import re

def repeated_re_sub(pattern, sub, s, flags=re.U):
    """Match-and-replace repeatedly until we run out of matches..."""
    patc = re.compile(pattern, flags)

    sold = ''
    while sold != s:
        sold = s
        # Uncomment to trace each pass:
        # print("patc=>%s<    sold=>%s<   s=>%s<" % (patc, sold, s))
        s = patc.sub(sub, sold)

    return s

# '\b' and '\x08' are the same character (backspace).
print(repeated_re_sub('[^\b]\b', '', 'abc\b\x08de\b\bfg'))  # prints: afg
#print(repeated_re_sub('.\b', '', 'abcd\b\x08e\b\bfg'))

[multiple previous answers, asking for clarifications and pointing out that either re.sub(...) or string.replace(...) could be used to solve the problem, non-recursively.]
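For reference, here is a sketch of how the same repeat-until-no-match idea could cover both examples from the question. It first keeps only the text after the last `\r`, then removes "character plus backspace" pairs until `re.subn` reports zero replacements. The function name `interpret_re` is hypothetical, and the sketch assumes, as the question does, that the string contains no newlines.

```python
import re

def interpret_re(s):
    # Keep only what follows the last carriage return (whole string if none).
    s = s.rpartition('\r')[2]
    # Remove "char + backspace" pairs until a pass makes no replacement.
    pair = re.compile('[^\b]\b')
    count = 1
    while count:
        s, count = pair.subn('', s)
    # Any backspace left over has nothing to erase, so drop it.
    return s.replace('\b', '')

print(interpret_re('foo\br'))    # for
print(interpret_re('foo\rbar'))  # bar
```

Using `subn` avoids comparing the old and new strings on each pass, since it returns the number of substitutions directly.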

Upvotes: 0

Veedrac

Reputation: 60147

If efficiency doesn't matter, a simple stack would work fine:

string = "foo\rbar\rbash\rboo\b\bba\br"

res = []
for char in string:
    if char == "\r":
        res.clear()
    elif char == "\b":
        if res: del res[-1]
    else:
        res.append(char)

"".join(res)
#>>> 'bbr'

Otherwise, I think this is about as fast as you can hope for in complex cases:

string = "foo\rbar\rbash\rboo\b\bba\br"

try:
    string = string[string.rindex("\r")+1:]
except ValueError:
    pass

split_iter = iter(string.split("\b"))
res = list(next(split_iter, ''))
for part in split_iter:
    if res: del res[-1]
    res.extend(part)

"".join(res)
#>>> 'bbr'

Note that I haven't timed this.
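If you want to check the speed claim yourself, something like the following `timeit` sketch could be a starting point. The function names `with_stack` and `with_split` are hypothetical wrappers around the two snippets above, the sample input and repetition counts are arbitrary, and `rfind` is swapped in for the `try`/`rindex` pair purely for brevity.

```python
import timeit

def with_stack(string):
    res = []
    for char in string:
        if char == "\r":
            res.clear()
        elif char == "\b":
            if res: del res[-1]
        else:
            res.append(char)
    return "".join(res)

def with_split(string):
    # rfind returns -1 when there is no "\r", so the slice keeps everything.
    string = string[string.rfind("\r") + 1:]
    split_iter = iter(string.split("\b"))
    res = list(next(split_iter, ""))
    for part in split_iter:
        if res: del res[-1]
        res.extend(part)
    return "".join(res)

# Arbitrary test input: the example string repeated many times.
sample = "foo\rbar\rbash\rboo\b\bba\br" * 1000

for fn in (with_stack, with_split):
    seconds = timeit.timeit(lambda: fn(sample), number=20)
    print(fn.__name__, seconds)
```

The timings will depend heavily on the input; a string with many carriage returns favours the second version, since it discards everything before the last one up front.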

Upvotes: 1
