Ben Sandler

Reputation: 2393

Pretty print json but keep inner arrays on one line python

I am pretty printing JSON in Python using this code:

json.dumps(json_output, indent=2, separators=(',', ': '))

This prints my json like:

{    
    "rows_parsed": [
        [
          "a",
          "b",
          "c",
          "d"
        ],
        [
          "e",
          "f",
          "g",
          "i"
        ],
    ]
}

However, I want it to print like:

{    
    "rows_parsed": [
        ["a","b","c","d"],
        ["e","f","g","i"],
    ]
}

How can I keep the arrays that are in arrays all on one line like above?

Upvotes: 48

Views: 12839

Answers (4)

Nice Zombies

Reputation: 1107

If you're fine with applying this to objects too, you'll be able to do this with jsonyx 2.0:

import jsonyx as json

obj = {"rows_parsed": [["a", "b", "c", "d"], ["e", "f", "g", "i"]]}
json.dump(obj, indent=2, indent_leaves=False, separators=(",", ": "))
{
  "rows_parsed": [
    ["a","b","c","d"],
    ["e","f","g","i"]
  ]
}

Upvotes: 0

Ryota Mitarai

Reputation: 23

I modified @Martin Gergov's answer a bit to make things simpler and more JSON-friendly.

import json

def transform(json_obj, indent=4):
    def inner_transform(o):
        if isinstance(o, list) or isinstance(o, tuple):
            for v in o:
                if isinstance(v, dict):
                    return [inner_transform(v) for v in o]
                # elif isinstance(v, list): # check note on the bottom
                #     ...
            return "##<{}>##".format(json.dumps(o))
        elif isinstance(o, dict):
            return {k: inner_transform(v) for k, v in o.items()}
        return o

    if isinstance(json_obj, dict):
        transformed = {k: inner_transform(v) for k, v in json_obj.items()}
    elif isinstance(json_obj, list) or isinstance(json_obj, tuple):
        transformed = [inner_transform(v) for v in json_obj]

    transformed_json = json.dumps(transformed, separators=(', ', ': '), indent=indent)
    transformed_json = transformed_json.replace('"##<', "").replace('>##"', "").replace('\\"', "\"")

    return transformed_json

Test it with this:

data = [
    [
        [1,2,3],
        {
            "a": ["a", 'b', "c", "d"],
            "b": {
                "x": [1, 2, 3, None],
                "y": "value"
            },
            "c": [1, 2, 3]
        }
    ]
]

pretty_json = transform(data)
print(pretty_json)

Result:

[
    [
        [1, 2, 3],
        {
            "a": ["a", "b", "c", "d"],
            "b": {
                "x": [1, 2, 3, null],
                "y": "value"
            },
            "c": [1, 2, 3]
        }
    ]
]

Note that this does not cover a list which contains a list which contains a dict, like [[1,2,[2, {"a": 0}]]] (see the commented-out elif above); you'd have to modify that yourself.
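For the deeply nested case mentioned above, one option is to skip the marker strings and build the text recursively. This is only a sketch, not part of the original answer: the name dumps_compact_leaves is made up, it assumes all dict keys are strings, and it keeps any list or tuple on one line as long as none of its direct elements is a dict, list, or tuple.

```python
import json

def dumps_compact_leaves(obj, indent=4, _level=0):
    """Pretty-print obj as JSON, keeping lists of scalars on one line."""
    pad = " " * (indent * (_level + 1))   # indentation for child items
    end_pad = " " * (indent * _level)     # indentation for the closing bracket
    if isinstance(obj, dict):
        if not obj:
            return "{}"
        # Assumption: keys are strings, so json.dumps(k) yields a quoted key.
        items = [
            pad + json.dumps(k) + ": " + dumps_compact_leaves(v, indent, _level + 1)
            for k, v in obj.items()
        ]
        return "{\n" + ",\n".join(items) + "\n" + end_pad + "}"
    if isinstance(obj, (list, tuple)):
        # A list with no nested containers is emitted on a single line.
        if not any(isinstance(v, (dict, list, tuple)) for v in obj):
            return json.dumps(obj)
        items = [pad + dumps_compact_leaves(v, indent, _level + 1) for v in obj]
        return "[\n" + ",\n".join(items) + "\n" + end_pad + "]"
    # Scalars (str, int, float, bool, None) are delegated to json.dumps.
    return json.dumps(obj)

print(dumps_compact_leaves([[1, 2, [2, {"a": 0}]]]))
```

Since no string replacement happens afterwards, string values that contain quotes or marker-like text pass through unchanged.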

Upvotes: 0

Oliwia

Reputation: 19

I don't see how you could do it with json.dumps alone. After a bit of searching I came across a few options. One is to post-process the output with a custom function:

def fix_json_indent(text, indent=3):
    space_indent = indent * 4
    initial = " " * space_indent
    json_output = []
    current_level_elems = []
    all_entries_at_level = None  # holder for consecutive entries at exact space_indent level
    for line in text.splitlines():
        if line.startswith(initial):
            if line[space_indent] == " ":
                # line indented further than the level
                if all_entries_at_level:
                    current_level_elems.append(all_entries_at_level)
                    all_entries_at_level = None
                item = line.strip()
                current_level_elems.append(item)
                if item.endswith(","):
                    current_level_elems.append(" ")
            elif current_level_elems:
                # line on the same space_indent level
                # no more sublevel_entries 
                current_level_elems.append(line.strip())
                json_output.append("".join(current_level_elems))
                current_level_elems = []
            else:
                # line at the exact space_indent level but no items indented further
                if all_entries_at_level:
                    # last pending item was not the start of a new sublevel_entries.
                    json_output.append(all_entries_at_level)
                all_entries_at_level = line.rstrip()
        else:
            if all_entries_at_level:
                json_output.append(all_entries_at_level)
                all_entries_at_level = None
            if current_level_elems:
                json_output.append("".join(current_level_elems))
            json_output.append(line)
    return "\n".join(json_output)

Another possibility is a regex, but it is quite ugly and depends on the structure of the JSON you posted:

def fix_json_indent(text):
    import re
    # raw strings avoid invalid-escape warnings for \[ and \]
    return re.sub(r'{"', '{\n"', re.sub(r'\[\[', '[\n[', re.sub(r'\]\]', ']\n]', re.sub(r'}', '\n}', text))))
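If the input is the indented output of json.dumps rather than compact text, a regex can go the other way and collapse the innermost arrays. A rough sketch, where the name collapse_innermost_arrays is made up: it looks for bracketed spans with no brackets or braces inside them and removes the line breaks within each span. Only whitespace next to line breaks is touched, so string contents are left alone, although an array whose strings contain brackets may end up only partly collapsed.

```python
import json
import re

# A bracketed span with no nested brackets or braces inside it.
_INNERMOST_ARRAY = re.compile(r"\[[^\[\]{}]*\]")

def collapse_innermost_arrays(text):
    """Join each innermost JSON array in indented text onto one line."""
    def _collapse(match):
        body = match.group(0)
        # Turn "comma + line break + indentation" into ", ".
        body = re.sub(r",\s*\n\s*", ", ", body)
        # Drop the line breaks right after "[" and right before "]".
        return re.sub(r"\s*\n\s*", "", body)
    return _INNERMOST_ARRAY.sub(_collapse, text)

data = {"rows_parsed": [["a", "b", "c", "d"], ["e", "f", "g", "i"]]}
print(collapse_innermost_arrays(json.dumps(data, indent=2)))
```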

Upvotes: 1

Martin Gergov

Reputation: 1658

Here is a way to do it with as few modifications as possible:

import json
from json import JSONEncoder
import re

class MarkedList:
    _list = None
    def __init__(self, l):
        self._list = l

z = {    
    "rows_parsed": [
        MarkedList([
          "a",
          "b",
          "c",
          "d"
        ]),
        MarkedList([
          "e",
          "f",
          "g",
          "i"
        ]),
    ]
}

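class CustomJSONEncoder(JSONEncoder):
    def default(self, o):
        if isinstance(o, MarkedList):
            # json.dumps (not str/repr) so the inner list is valid JSON with double quotes
            return "##<{}>##".format(json.dumps(o._list))
        return super().default(o)  # raises TypeError for other unsupported types

b = json.dumps(z, indent=2, separators=(',', ':'), cls=CustomJSONEncoder)
# strip the markers, then undo the escaping that dumps applied to the inner quotes
b = b.replace('"##<', "").replace('>##"', "").replace('\\"', '"')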

print(b)

Basically, you wrap the lists you want formatted this way in MarkedList instances. They get serialized as strings wrapped in a hopefully unique enough marker sequence, which is later stripped from the output of dumps. This is done to eliminate the quotes that json puts around a string.
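A variant of the same marker idea, sketched under a few assumptions: each marked list gets its own placeholder string, and the quoted placeholder is swapped for that list's compact JSON afterwards, so no global replace ever touches the contents of the list. The function name dumps_with_inline_lists, the items attribute, and the @@...@@ placeholder format are illustrative choices, not part of the code above.

```python
import json
import uuid

class MarkedList:
    """Wrapper for a list that should be printed on one line."""
    def __init__(self, items):
        self.items = items

def dumps_with_inline_lists(obj, **kwargs):
    token = uuid.uuid4().hex   # random token, very unlikely to occur in the data
    inline = {}                # placeholder string -> compact JSON text

    def default(o):
        if isinstance(o, MarkedList):
            placeholder = "@@{}_{}@@".format(token, len(inline))
            inline[placeholder] = json.dumps(o.items)
            return placeholder
        raise TypeError(
            "Object of type {} is not JSON serializable".format(type(o).__name__)
        )

    text = json.dumps(obj, default=default, **kwargs)
    for placeholder, compact in inline.items():
        # json.dumps(placeholder) is the placeholder including its quotes.
        text = text.replace(json.dumps(placeholder), compact)
    return text

z = {"rows_parsed": [MarkedList(["a", "b", "c", "d"]), MarkedList(["e", "f", "g", "i"])]}
print(dumps_with_inline_lists(z, indent=2))
```

The result can be checked by passing it to json.loads, which should give back the plain nested lists.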

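Another way, much more efficient but also much uglier, is to monkey patch the _iterencode function defined inside json.encoder._make_iterencode (it is a nested function, so in practice you swap in a modified copy of _make_iterencode) with something like: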

def _iterencode(o, _current_indent_level):
    if isinstance(o, str):
        yield _encoder(o)
    elif o is None:
        yield 'null'
    elif o is True:
        yield 'true'
    elif o is False:
        yield 'false'
    elif isinstance(o, int):
        # see comment for int/float in _make_iterencode
        yield _intstr(o)
    elif isinstance(o, float):
        # see comment for int/float in _make_iterencode
        yield _floatstr(o)
    elif isinstance(o, MarkedList):
        yield _my_custom_parsing(o)
    elif isinstance(o, (list, tuple)):
        yield from _iterencode_list(o, _current_indent_level)
    elif isinstance(o, dict):
        yield from _iterencode_dict(o, _current_indent_level)
    else:
        if markers is not None:
            markerid = id(o)
            if markerid in markers:
                raise ValueError("Circular reference detected")
            markers[markerid] = o
        o = _default(o)
        yield from _iterencode(o, _current_indent_level)
        if markers is not None:
            del markers[markerid]

Upvotes: 5
