Reputation: 4572
a = "a"
sample_string = "asdf {{a}} {{ { {a} { {a} }"
## need to find these brackets ^     ^     ^
print(sample_string.format(a=a))
The above string will raise
ValueError: unexpected '{' in field name
I would like to be able to escape the curly braces that _string.formatter_parser is choking on. I started down the road of finding all unmatched pairs, but realized that wouldn't work for double-escaped curly braces. I don't know how to solve this.
## this does not solve the problem.
def find_unmatched(s):
    indices = []
    stack = []
    indexstack = []
    for i, e in enumerate(s):
        if e == "{":
            stack.append(e)
            indexstack.append(i)
        elif e == "}":
            if len(stack) < 1:
                indices.append(i)
            else:
                stack.pop()
                indexstack.pop()
    while len(indexstack) > 0:
        indices.append(indexstack.pop())
    return indices
I know I can't simply look for single braces without checking whether they are also paired, and I can't look for pairs before checking whether they are escaped. But some cases still throw me off, like this:
s1 = f"asdf {{{a}}} {{ {{ {{{a}}} { {a} }"
s2 = "asdf {{{a}}} {{ {{ {{{a}}} { {a} }"
print(s1)
print(s2.format(a=a))
s1 prints while s2 doesn't.
asdf {a} { { {a} {'a'}
ValueError: unexpected '{' in field name
How do you find the index positions of unescaped curly braces in a string?
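One way to get those indices, as a minimal sketch: scan left to right, treat doubled braces and simple {name} fields as units, and report every brace left over. The TOKEN pattern and the unescaped_brace_indices name are hypothetical, and the sketch assumes a field never contains whitespace or nested braces, which is a heuristic and not the rule str.format itself applies.

```python
import re

# Heuristic (an assumption, not str.format's own rule): a replacement field
# is a brace pair with no whitespace or nested braces inside it.
TOKEN = re.compile(r"\{\{|\}\}|\{[^{}\s]*\}|[{}]")

def unescaped_brace_indices(s):
    """Return indices of braces that are neither doubled nor part of a field."""
    # Alternatives are tried in order, so a lone brace (a match of length 1)
    # is only produced when no escape or field starts at that position.
    return [m.start() for m in TOKEN.finditer(s) if len(m.group()) == 1]

sample_string = "asdf {{a}} {{ { {a} { {a} }"
print(unescaped_brace_indices(sample_string))  # [14, 20, 26]
```

Because the scan commits to a left-to-right reading, a string such as "{b}}" is reported as having its last brace unmatched; other readings of the same text are possible.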
Additional info:
I was asked what I am actually doing with this. The real-world case is a little awkward. Strings being logged are wrapped in ANSI color codes to colorize the on-screen logs, which helps differentiate the source of each log line.
The same line is also written to a log file, which must not contain the ANSI codes. To accomplish this, a curly-brace format field is added to the line, and the log formatters call format() to replace the field with either an ANSI color code or an empty string.
Example:
"{color.grey}Log entry which {might contain curly} braces in the string {color.reset}"
The color entries are replaced by a partial formatter, which itemizes all the fields in the string and replaces only those that exist in the dictionary passed in. It does the job, with the exception of singleton curly braces.
import string
from _string import formatter_field_name_split

def partialformat(s: str, recursionlimit: int = 10, **kwargs):
    """
    vformat does the actual work of formatting strings. _vformat is the
    internal call to vformat and has the ability to alter the recursion
    limit of how many embedded curly braces to handle. But for some reason
    vformat does not. vformat also sets the limit to 2!
    The 2nd argument of _vformat 'args' allows us to pass in a string which
    contains an empty curly brace set and ignore them.
    """
    class FormatPlaceholder(object):
        # Stands in for a missing key and formats back to the original field text.
        def __init__(self, key):
            self.key = key

        def __format__(self, spec):
            result = self.key
            if spec:
                result += ":" + spec
            return "{" + result + "}"

        def __getitem__(self, item):
            return

    class FormatDict(dict):
        def __missing__(self, key):
            return FormatPlaceholder(key)

    class PartialFormatter(string.Formatter):
        def get_field(self, field_name, args, kwargs):
            try:
                obj, first = super(PartialFormatter, self).get_field(field_name, args, kwargs)
            except (IndexError, KeyError, AttributeError):
                first, rest = formatter_field_name_split(field_name)
                obj = '{' + field_name + '}'
                # loop through the rest of the field_name, doing
                # getattr or getitem as needed
                for is_attr, i in rest:
                    if is_attr:
                        try:
                            obj = getattr(obj, i)
                        except AttributeError as exc:
                            pass
                    else:
                        obj = obj[i]
            return obj, first

    fmttr = PartialFormatter()
    try:
        fs, _ = fmttr._vformat(s, ("{}",), FormatDict(**kwargs), set(), recursionlimit)
    except ValueError as exc:
        #if we are ever to auto escape unmatched curly braces, it shall go here.
        raise exc
    except Exception as exc:
        raise exc
    return fs
Usage:
class Color:
    grey = '\033[90m'
    reset = '\033[0m'

colorobj = Color()

try:
    s = partialformat(s, **{"color" : colorobj})
except ValueError as exc:
    pass
outputs:
"Log entry which {might contain curly} braces in the string"
or
"\033[90mLog entry which {might contain curly} braces in the string \033[0m"
Additional Edit:
The problem I'm facing is that when a string contains a single curly brace, I cannot call partialformat on it, because it raises ValueError: "Single '{' encountered in format string". As a result, colorizing the log line fails.
s = "{trco.grey}FAILED{trco.r} message {blah blah blah"
I figured I might be able to automatically escape the singleton curly braces if I can detect where they are in the string. It's just proving to be more difficult than I had expected.
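A rough sketch of what such auto-escaping might look like, under a stated assumption: a brace pair counts as a field only when it has no whitespace or nested braces inside, and every other single brace is doubled. The TOKEN pattern, the escape_lone_braces name, and the Trco stand-in class are hypothetical and not part of the code above.

```python
import re

# Hypothetical heuristic: leave doubled braces and whitespace-free {field}
# tokens untouched, and double any other single brace.
TOKEN = re.compile(r"\{\{|\}\}|\{[^{}\s]*\}|[{}]")

def escape_lone_braces(s):
    """Double every brace that is not an escape or part of a simple field."""
    return TOKEN.sub(
        lambda m: m.group() * 2 if len(m.group()) == 1 else m.group(), s
    )

class Trco:  # stand-in for the real color object
    grey = "\033[90m"
    r = "\033[0m"

s = "{trco.grey}FAILED{trco.r} message {blah blah blah"
escaped = escape_lone_braces(s)
print(escaped)  # {trco.grey}FAILED{trco.r} message {{blah blah blah
print(repr(escaped.format(trco=Trco())))
```

The escaped string could equally be passed to partialformat. Text that happens to look like a field, such as "{word}", is still treated as one, so this only covers the singleton case.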
Yet another edit:
I believe this is a problem with the order of events.
s = "text with a { single curly brace"
"{color.red}text with a { single curly brace{color.reset}"
logging.Formatter.doFormat() then does a replace on {color.red} with the ANSI color code.
Upvotes: 5
Views: 701
Reputation: 4572
The original question was how can you identify curly braces that aren't matched pairs. The problem is I was trying to identify them at a point where it is impossible to do so.
Example:
Some would say this middle brace is out of place.
"{{a}}{b}}"
^
While others might think the last one is out of place
"{{a}}{b}}"
^
It's impossible to know from the text snippet alone which brace shouldn't be there. Thus my original question is not definitively solvable. At the time I wrote this post I didn't realize I was asking the wrong question.
My original problem: how do you add a marker to logged text that can be formatted later (e.g. during the .doFormat() method of logging) and either replaced with the ANSI color code or stripped out, depending on which formatter is handling the text?
A string logged to the screen should contain ANSI color codes, but when it is written to the log file those codes should be stripped out.
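One possible approach to that reformulated problem, sketched under assumptions: instead of running the text through str.format, substitute only the color markers with a regular expression, so stray braces in the message never reach the format parser. The MARKER pattern, the ANSI table, and the render function are hypothetical names for illustration; each log formatter (screen or file) would call something like render with its own use_color value.

```python
import re

# Hypothetical marker syntax: only {color.<name>} is special; every other
# brace in the message is left alone.
MARKER = re.compile(r"\{color\.(\w+)\}")

ANSI = {"grey": "\033[90m", "red": "\033[31m", "reset": "\033[0m"}

def render(message, use_color):
    """Replace color markers with ANSI codes, or strip them if use_color is False."""
    def repl(match):
        code = ANSI.get(match.group(1))
        if code is None:
            return match.group(0)  # unknown color name: leave the text untouched
        return code if use_color else ""
    return MARKER.sub(repl, message)

line = "{color.red}text with a { single curly brace{color.reset}"
print(repr(render(line, use_color=True)))   # for the screen handler
print(render(line, use_color=False))        # text with a { single curly brace
```

With this design the question of which brace is unmatched never comes up, because nothing except the known markers is interpreted.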
As far as proper StackOverflow etiquette goes, I'm not sure if I should completely rework my question, close it, or just answer it here.
Upvotes: 0
Reputation: 2946
Regex would work for this job.
>>> import re
>>> t = re.finditer(r"\s{\s", "asdf {{a}} {{ { {a} { {a} }")
>>> for a in t:
...     print(a.start())
...
13
19
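Note that the pattern above reports where the leading whitespace starts, so the brace itself sits one position later. A variant with lookarounds, sketched here as an assumption about what is wanted, reports the brace index directly. The LONE_OPEN name is hypothetical, and the pattern still only covers an opening brace with whitespace on both sides.

```python
import re

# Lookbehind/lookahead assert the surrounding whitespace without consuming it,
# so m.start() is the index of the brace itself.
LONE_OPEN = re.compile(r"(?<=\s)\{(?=\s)")

text = "asdf {{a}} {{ { {a} { {a} }"
brace_indices = [m.start() for m in LONE_OPEN.finditer(text)]
print(brace_indices)  # [14, 20]
```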
Upvotes: 0
Reputation: 1181
Try this:
string = "abcd {{a}} {{{{a}{{a}}"
indices = []
for i, e in enumerate(string):
    if e == '{':
        indices.append(i)
    elif e == '}':
        indices.pop()
print(indices)
This prints [11, 12, 13], which are the indices of the opening braces that are never closed.
I iterate over the characters and keep track of the opening braces only. Since the innermost braces close first, each closing brace removes the most recent opening brace, and whatever remains at the end is the list of unclosed opening braces.
Upvotes: 2