Shamil Yakupov

Reputation: 5469

JSON object regex & merge

I have a JSON string, converted from VDF (Valve Data Format) with a regex, that looks like this:

{"items_game": {
    "prefabs": {
        ...
        "coupon_crate_prefab": {
            "prefab": "weapon_case_base",
            "item_type": "coupon_crate",
            "attributes": {
                "cannot trade": "1"
            },
            "capabilities": {
                "can_delete": "0"
            },
            "attributes": {
                "expiration date": {
                    "attribute_class": "expiration_date",
                    "force_gc_to_generate": "1",
                    "use_custom_logic": "expiration_period_days_from_now",
                    "value": "2"
                }
            }
        },
        "coupon_key_prefab": {
            "prefab": "csgo_tool",
            "item_type": "coupon_key",
            "attributes": {
                "cannot trade": "1"
            },
            "capabilities": {
                "can_delete": "0"
            },
            "attributes": {
                "expiration date": {
                    "attribute_class": "expiration_date",
                    "force_gc_to_generate": "1",
                    "use_custom_logic": "expiration_period_days_from_now",
                    "value": "2"
                }
            }
        }
        ...
    }
}

Wanted result:

        "coupon_key_prefab": {
            "prefab": "csgo_tool",
            "item_type": "coupon_key",
            "attributes": {
                "cannot trade": "1",
                "expiration date": {
                    "attribute_class": "expiration_date",
                    "force_gc_to_generate": "1",
                    "use_custom_logic": "expiration_period_days_from_now",
                    "value": "2"
                }
            },
            "capabilities": {
                "can_delete": "0"
            }
        }

As you can see, the attributes key is duplicated, and I need to merge the duplicates, because duplicate keys are invalid in JSON.
How can I do this? (Probably with preg_replace.)

Upvotes: 0

Views: 339

Answers (2)

Sobrique

Reputation: 53478

Doing this with a regex is a very bad idea: JSON is a nested data structure, and the same data can be formatted in several different ways.

If you parse it with regular expressions, you'll end up with brittle code at best.

I'm also not sure that what you have is valid in the first place. If you run your JSON through a parser, the duplicate keys overwrite each other:

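use strict;
use warnings;

use JSON;

# Undefine the record separator so <DATA> is read in one go, not line by line.
local $/;
# Decode, then re-encode pretty-printed; the later "attributes" replaces the earlier one.
print to_json( from_json(<DATA>), { pretty => 1 } );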

__DATA__
{
    "items_game": {
        "prefabs": {
            "coupon_crate_prefab": {
                "prefab": "weapon_case_base",
                "item_type": "coupon_crate",
                "attributes": {
                    "cannot trade": "1"
                },
                "capabilities": {
                    "can_delete": "0"
                },
                "attributes": {
                    "expiration date": {
                        "attribute_class": "expiration_date",
                        "force_gc_to_generate": "1",
                        "use_custom_logic": "expiration_period_days_from_now",
                        "value": "2"
                    }
                }
            }
        }
    }
}

This parses your JSON (which I hope I have fixed up to match your source). Note that it has 'clobbered' part of your data: the first attributes block is gone from the output. I think this is common behaviour in most parsing libraries, so whatever consumes your converted output may well be handling it the same way.

http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf

So it's hard to give a firm answer on what is best to do here. Ideally you would use a JSON parser, but the JSON spec doesn't define how duplicate keys should be handled, so results will vary from parser to parser.
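
If keeping both blocks matters, some parsers let you hook in before the duplicates are collapsed. Here is a minimal sketch in Python (not Perl, purely for illustration) using the object_pairs_hook parameter of the standard json module. The names merge_pairs, deep_merge and loads_merging_duplicates are made up for this example, and the merge rule (dicts are merged, anything else is overwritten by the later value) is an assumption about what you want, not something the JSON spec defines.

```python
import json


def deep_merge(first, second):
    """Return a new dict combining two dicts, recursing into nested dicts."""
    result = dict(first)
    for key, value in second.items():
        if isinstance(result.get(key), dict) and isinstance(value, dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = value
    return result


def merge_pairs(pairs):
    """Build a dict from (key, value) pairs, merging repeated dict-valued keys.

    For any other repeated key the later value wins, as with plain json.loads.
    """
    merged = {}
    for key, value in pairs:
        if isinstance(merged.get(key), dict) and isinstance(value, dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def loads_merging_duplicates(text):
    # object_pairs_hook receives each JSON object as an ordered list of
    # (key, value) pairs, so duplicate keys are still visible at this point.
    return json.loads(text, object_pairs_hook=merge_pairs)


# Cut-down version of the data from the question.
SAMPLE = """
{
    "coupon_key_prefab": {
        "prefab": "csgo_tool",
        "attributes": {"cannot trade": "1"},
        "capabilities": {"can_delete": "0"},
        "attributes": {"expiration date": {"value": "2"}}
    }
}
"""

if __name__ == "__main__":
    print(json.dumps(loads_merging_duplicates(SAMPLE), indent=4))
```

The merged attributes block stays where the first one was, so the output has the same shape as the wanted result in the question. This only helps if the text is otherwise valid JSON.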

Edit: Following from comments - seems VDF is like JSON, but not quite the same.

I still wouldn't use a regex; I'd try a recursive parse instead. Key it off { and 'hand down' the JSON-like content one level at a time, until you reach a bottom branch of plain key-value pairs that you can then hashify.
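
To make the idea concrete, here is a rough sketch of such a recursive parse, in Python rather than Perl. It assumes the VDF input consists only of double-quoted keys and values, { } blocks and // comments; unquoted tokens, escape sequences and other directives are not handled. tokenize, parse_block and parse_vdf are names invented for this example. A regex is still used, but only to split the text into tokens; the nesting is handled by recursion.

```python
import json
import re

# One token per match: a quoted string, a brace, or a // comment.
TOKEN = re.compile(r'"((?:\\.|[^"\\])*)"|([{}])|//[^\n]*')


def tokenize(text):
    """Yield ("str", value) for quoted strings and ("brace", "{" or "}")."""
    for match in TOKEN.finditer(text):
        if match.group(1) is not None:
            yield ("str", match.group(1))
        elif match.group(2) is not None:
            yield ("brace", match.group(2))
        # Comments match the third alternative and are simply skipped.


def parse_block(tokens, top_level=False):
    """Read key/value pairs until the closing brace (or end of input at top level)."""
    block = {}
    for kind, key in tokens:
        if kind == "brace":
            if key == "}" and not top_level:
                return block
            raise ValueError(f"unexpected {key!r}")
        value_kind, value = next(tokens, (None, None))
        if value_kind == "brace" and value == "{":
            value = parse_block(tokens)  # hand the rest down one level
        elif value_kind != "str":
            raise ValueError(f"missing value for key {key!r}")
        if isinstance(block.get(key), dict) and isinstance(value, dict):
            block[key].update(value)  # repeated block: shallow merge
        else:
            block[key] = value
    if not top_level:
        raise ValueError("missing closing brace")
    return block


def parse_vdf(text):
    # tokenize() returns a generator, so every recursion level shares one cursor.
    return parse_block(tokenize(text), top_level=True)


SAMPLE = '''
"coupon_key_prefab"
{
    "prefab"        "csgo_tool"
    // the "attributes" block appears twice, as in the question
    "attributes"
    {
        "cannot trade"      "1"
    }
    "capabilities"
    {
        "can_delete"        "0"
    }
    "attributes"
    {
        "expiration date"
        {
            "value"     "2"
        }
    }
}
'''

if __name__ == "__main__":
    print(json.dumps(parse_vdf(SAMPLE), indent=4))
```

Because the duplicates are merged while the structure is being built, the JSON that comes out never contains a repeated key.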

If there's still no better answer, I may hack together a Perl example later (sorry, I don't have time at the moment).

You might find something you can use here: http://www.perlmonks.org/?node_id=995856

But that might also be a good example of why NOT to regex this :)

Upvotes: 3

Andris Leduskrasts

Reputation: 1230

Well, you asked for regex. Is it possible? Probably, if you have a limited number of nested elements inside your attribute of interest. Is it a good idea? No.

(?<=\"attributes\":) (\{(?:(?:[^{]*?\{(?:[^{]|\n)*?\}[^{]*?)+|(?:[^{]|\n)*?)}) extracts all the attributes data and handles one level of nesting inside the attribute, as seen at https://regex101.com/r/rC3eK4/6 .

Since your example only has one level of nesting, it works very well. For two levels you'd have to extend the pattern with a two-level alternative, and so on for every extra level, to keep all the {} pairs balanced. There might be a better way to match balanced brackets with a regex, but regex is definitely not the best tool for this.
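
For comparison, the usual non-regex way to keep the {} balanced at any depth is to count braces while skipping over quoted strings. Below is a small Python sketch of that idea. match_brace and extract_blocks are names made up for this example, and it takes one shortcut: it finds "attributes" with a plain text search, so it would be fooled by a string value that happens to be "attributes" followed by a colon.

```python
def match_brace(text, open_index):
    """Return the index of the } that closes the { at open_index."""
    depth = 0
    in_string = False
    escaped = False
    for i in range(open_index, len(text)):
        ch = text[i]
        if in_string:
            # Braces inside quoted strings must not change the depth.
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced braces")


def extract_blocks(text, key="attributes"):
    """Return the text of every {...} block that directly follows "key":."""
    marker = f'"{key}"'
    blocks = []
    pos = 0
    while True:
        start = text.find(marker, pos)
        if start == -1:
            return blocks
        after_key = start + len(marker)
        open_brace = text.find("{", after_key)
        if open_brace == -1:
            return blocks
        # Only a colon and whitespace may sit between the key and its brace.
        if text[after_key:open_brace].strip() != ":":
            pos = after_key
            continue
        end = match_brace(text, open_brace)
        blocks.append(text[open_brace:end + 1])
        pos = end + 1


SAMPLE = (
    '{"a": {"attributes": {"x": {"y": {"z": "}"}}}, '
    '"capabilities": {}, "attributes": {"k": "1"}}}'
)

if __name__ == "__main__":
    for block in extract_blocks(SAMPLE):
        print(block)
```

This only extracts the blocks; merging them is still easier once the data has been parsed into a real structure.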

Upvotes: 0
