Demetroid

Reputation: 78

Is there a library for parsing such serialized objects in Python?

For my Python program, I have an input that represents a serialized object, which can contain primitive types, arrays, and structures.

Sample input can look like this:

Struct(1.5, false, Struct2("text"), [1, 2, 3])

Sample output would be:

{
    "type": "Struct",
    "args": [
        1.5,
        False,
        {
            "type": "Struct2",
            "args": [ "text" ]
        },
        [ 1, 2, 3 ]
    ]
}

So, the input string can contain numbers, booleans, strings, arrays, and nested named structures, as in the sample above.

The input format is quite logical, but I couldn't find any readily available libraries or code snippets to parse it.

Upvotes: 0

Views: 44

Answers (1)

snailor

Reputation: 269

This isn't a very clean implementation, and I'm not 100% sure if it does exactly what you're looking for, but I would recommend the Lark library for doing this.

Instead of using a ready-made parser for the job, just make your own small one. To save time, Lark has its "save" and "load" features, so you can save a serialized version of the parser and load that on each run instead of rebuilding the entire parser every time. Hope this helps :)

from lark import Lark, Transformer

grammar = """
%import common.WS
%import common.ESCAPED_STRING
%import common.SIGNED_NUMBER

%ignore WS

start : struct

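// "(...)?" is used instead of "[...]": in Lark 1.x, "[...]" inserts None
// when it matches nothing, so Struct() would end up with args == [None].
struct  : NAME "(" (element ("," element)*)? ")"
element : struct | array | primitive

array : "[" (element ("," element)*)? "]"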
primitive : number
          | string
          | boolean

string : ESCAPED_STRING
number : SIGNED_NUMBER

boolean : TRUE | FALSE

NAME : /[a-zA-Z][a-zA-Z0-9]*/

TRUE : "true"
FALSE : "false"
"""

class T(Transformer):
    def start(self, s):
        return s[0]

    def string(self, s):
        return s[0][1:-1].replace('\\"', '"')

    def primitive(self, s):
        return s[0]

    def struct(self, s):
        return { "type": s[0].value, "args": s[1:] }

    def boolean(self, s):
        return s[0].value == "true"

    def element(self, s):
        return s[0]
    
    array = list

    def number(self, s):
        # Prefer int; fall back to float for values such as "1.5".
        try:
            return int(s[0].value)
        except ValueError:
            return float(s[0].value)

parser = Lark(grammar, parser="lalr", transformer=T())

test = """
Struct(1.5, false, Struct2("text"), [1, 2, 3])
"""

print(parser.parse(test))
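
If adding a dependency isn't an option, the grammar is small enough that "make your own small one" can also be done by hand. Below is a rough recursive-descent sketch using only the standard library `re` module. It is not Lark and not a drop-in replacement for the code above; the names `TOKEN_RE`, `tokenize`, `parse`, `_expect`, `_parse_items` and `_parse_element` are made up for illustration, and it assumes the same format as the sample input (numbers, double-quoted strings, `true`/`false`, arrays, and named structs).

```python
import re

# One token per match; the group names are illustrative only.
TOKEN_RE = re.compile(r"""
    \s*(?:
        (?P<number>[+-]?\d+(?:\.\d+)?)
      | (?P<string>"(?:\\.|[^"\\])*")
      | (?P<name>[A-Za-z][A-Za-z0-9]*)
      | (?P<punct>[()\[\],])
    )
""", re.VERBOSE)


def tokenize(text):
    """Split the input into (kind, value) pairs, skipping whitespace."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


def _expect(tokens, pos, symbol):
    """Require the punctuation `symbol` at `pos`; return the next position."""
    if pos >= len(tokens) or tokens[pos] != ("punct", symbol):
        raise ValueError(f"expected {symbol!r} at token {pos}")
    return pos + 1


def _parse_items(tokens, pos, closer):
    """Parse comma-separated elements up to and including `closer`."""
    items = []
    if pos < len(tokens) and tokens[pos] == ("punct", closer):
        return items, pos + 1  # empty list, e.g. "Struct()" or "[]"
    while True:
        item, pos = _parse_element(tokens, pos)
        items.append(item)
        if pos < len(tokens) and tokens[pos] == ("punct", ","):
            pos += 1
            continue
        return items, _expect(tokens, pos, closer)


def _parse_element(tokens, pos):
    """Parse one struct, array, or primitive starting at `pos`."""
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    kind, value = tokens[pos]
    if kind == "number":
        try:
            return int(value), pos + 1
        except ValueError:
            return float(value), pos + 1
    if kind == "string":
        # Strip the quotes and unescape \" only, like the Lark version.
        return value[1:-1].replace('\\"', '"'), pos + 1
    if kind == "punct" and value == "[":
        return _parse_items(tokens, pos + 1, "]")
    if kind == "name":
        if value in ("true", "false"):
            return value == "true", pos + 1
        pos = _expect(tokens, pos + 1, "(")
        args, pos = _parse_items(tokens, pos, ")")
        return {"type": value, "args": args}, pos
    raise ValueError(f"unexpected token {value!r}")


def parse(text):
    """Parse a whole input string; the top-level value must be a struct."""
    tokens = tokenize(text)
    value, pos = _parse_element(tokens, 0)
    if pos != len(tokens):
        raise ValueError(f"trailing input at token {pos}")
    if not isinstance(value, dict):
        raise ValueError("top-level value must be a struct")
    return value


print(parse('Struct(1.5, false, Struct2("text"), [1, 2, 3])'))
# {'type': 'Struct', 'args': [1.5, False, {'type': 'Struct2', 'args': ['text']}, [1, 2, 3]]}
```

The Lark version is still easier to extend (new literal types are one grammar rule away), so I'd only reach for something like this when the format is fixed and you want zero dependencies.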

Upvotes: 2
