tsaebeht

Reputation: 1680

Unable to format Python string while decoding from dict to str

I have a dict which I am encoding as a string like this:

import json

template = json.dumps({
    '_index': '{0}',
    '_type': '{1}',
    '_id': '{2}',
    '_source': {
        'doc': '{3}',
        'doc_as_upsert': True
    }
})

Next, I try to format it using the new-style Python string formatting described here: https://pyformat.info/

print template.format('one','two','three','four')

However, I get this error:

Traceback (most recent call last):
  File "python", line 1, in <module>
KeyError: '"_type"'

What am I doing wrong here?

Upvotes: 0

Views: 125

Answers (1)

zwer

Reputation: 25799

The problem stems from the curly braces in your JSON: str.format() treats every { and } as the start or end of a replacement field, so literal braces have to be escaped by doubling them ({{ and }}), e.g.:

import json

template = json.dumps({
    '_index': '{0}',
    '_type': '{1}',
    '_id': '{2}',
    '_source': {
        'doc': '{3}',
        'doc_as_upsert': True
    }
})

template = template.replace("{", "{{").replace("}", "}}")

print(template.format('one','two','three','four'))

This no longer raises an error, but it also escapes the braces around your placeholders, so str.format() won't replace them. You therefore need your own placeholder delimiters as well (make sure they are not characters that JSON itself uses as markup, as curly braces are), for example < and >:

import json

template = json.dumps({
    '_index': '<0>',
    '_type': '<1>',
    '_id': '<2>',
    '_source': {
        'doc': '<3>',
        'doc_as_upsert': True
    }
})

template = template.replace("{", "{{").replace("}", "}}").replace("<", "{").replace(">", "}")

print(template.format('one', 'two', 'three', 'four'))
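To make the escaping rule concrete, here is a minimal sketch with a hand-written one-key template (the template and the sample values are made up for illustration). It also shows a pitfall of formatting the already-serialized string: the substituted value is pasted in as-is, so a value containing a double quote yields invalid JSON.

```python
import json

# Hand-written template: the literal JSON braces are doubled,
# the placeholder braces are not.
template = '{{"_id": "{0}"}}'

ok = template.format('three')
print(ok)              # {"_id": "three"}
print(json.loads(ok))  # parses back into a dict

# The value is inserted without JSON escaping, so an embedded
# double quote terminates the JSON string too early.
broken = template.format('say "hi"')
try:
    json.loads(broken)
    is_valid = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    is_valid = False
print(is_valid)
```

The second half is the main reason to prefer substituting values before serialization rather than after it.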

But it's a much better idea to substitute your values before turning the data into JSON. You can call str.format() on each string value in your dict individually, passing it a dict of all the parameters expanded as keyword arguments and using named fields (e.g. {one}) to pick the argument each value needs.
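A minimal sketch of that idea, assuming Python 3 (on Python 2 the str check would need to be basestring). The helper name format_values and the sample parameters are hypothetical, not part of the json module:

```python
import json

def format_values(obj, params):
    """Return a copy of obj with str.format(**params) applied to every string value."""
    if isinstance(obj, dict):
        # Keys are left untouched; only values are formatted.
        return {key: format_values(value, params) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [format_values(item, params) for item in obj]
    if isinstance(obj, str):
        return obj.format(**params)
    return obj  # numbers, booleans and None pass through unchanged

data = {
    '_index': '{one}',
    '_type': '{two}',
    '_id': '{three}',
    '_source': {
        'doc': '{four}',
        'doc_as_upsert': True,
    },
}
params = {'one': 'value 1', 'two': 'value 2', 'three': 'value 3', 'four': 'value 4'}

result = json.dumps(format_values(data, params), sort_keys=True)
print(result)
```

Because the substitution happens on the Python objects, json.dumps() takes care of escaping quotes and other special characters in the substituted values.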

UPDATE: You don't even need to recurse through your data yourself for that last approach, as the json serializer is going to recurse through it anyway. Unfortunately, the json module doesn't make it easy to swap out its default string serialization, so you'll have to do some monkey-patching:

from json import dumps, encoder

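def template_json(data, args, **kwargs):
    # Keep references to the original string encoders so they can be restored.
    json_s1, json_s2 = encoder.encode_basestring, encoder.encode_basestring_ascii
    # Wrap both encoders so each string is formatted before it is serialized.
    encoder.encode_basestring = lambda s: json_s1(s.format(**args))
    encoder.encode_basestring_ascii = lambda s: json_s2(s.format(**args))
    try:
        return dumps(data, **kwargs)
    finally:
        # Always restore the originals, even if dumps() raises.
        encoder.encode_basestring, encoder.encode_basestring_ascii = json_s1, json_s2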

It temporarily wraps the json module's internal string-encoding functions with versions that apply formatting first, then restores the originals so that other code relying on the json module doesn't see unexpected behavior. (There is still some danger here: this is not thread-safe.) Because the serializer reads the elements one by one, positional formatting can't really be used, so this relies on named formatting as suggested above. You can test it with:

data = {
    '_index': '{one}',
    '_type': '{two}',
    '_id': '{three}',
    '_source': {
        'doc': '{four}',
        'doc_as_upsert': True,
    }
}

template_vars = {"one": "value 1", "two": "value 2", "three": "value 3", "four": "value 4"}

print(template_json(data, template_vars, indent=4))

Resulting in:

{
    "_source": {
        "doc": "value 4",
        "doc_as_upsert": true
    },
    "_index": "value 1",
    "_id": "value 3",
    "_type": "value 2"
}

But, generally, if you have to hack around your system to achieve what you want, you might want to reconsider whether that's the right approach in the first place and whether your objective can be achieved in a simpler manner.
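For instance, if the values are already known at the point where the JSON is produced, one simpler route (a sketch only; build_action is a hypothetical helper, and whether it fits depends on where your values come from) is to skip templating altogether and build the dict from the real values before serializing it once:

```python
import json

def build_action(index, doc_type, doc_id, doc):
    # Hypothetical helper: fill the structure with real values first,
    # then let json.dumps() handle all quoting and escaping.
    return json.dumps({
        '_index': index,
        '_type': doc_type,
        '_id': doc_id,
        '_source': {
            'doc': doc,
            'doc_as_upsert': True,
        },
    })

action = build_action('one', 'two', 'three', 'four')
print(action)
```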

Upvotes: 1
