Reputation: 5469
I have a JSON string, converted from VDF (Valve Data Format) with a regex, that looks like this:
{"items_game": {
"prefabs": {
...
"coupon_crate_prefab": {
"prefab": "weapon_case_base",
"item_type": "coupon_crate",
"attributes": {
"cannot trade": "1"
},
"capabilities": {
"can_delete": "0"
},
"attributes": {
"expiration date": {
"attribute_class": "expiration_date",
"force_gc_to_generate": "1",
"use_custom_logic": "expiration_period_days_from_now",
"value": "2"
}
}
},
"coupon_key_prefab": {
"prefab": "csgo_tool",
"item_type": "coupon_key",
"attributes": {
"cannot trade": "1"
},
"capabilities": {
"can_delete": "0"
},
"attributes": {
"expiration date": {
"attribute_class": "expiration_date",
"force_gc_to_generate": "1",
"use_custom_logic": "expiration_period_days_from_now",
"value": "2"
}
}
}
...
}
}
Wanted result:
"coupon_key_prefab": {
"prefab": "csgo_tool",
"item_type": "coupon_key",
"attributes": {
"cannot trade": "1",
"expiration date": {
"attribute_class": "expiration_date",
"force_gc_to_generate": "1",
"use_custom_logic": "expiration_period_days_from_now",
"value": "2"
}
},
"capabilities": {
"can_delete": "0"
}
}
As you can see, the "attributes" key is duplicated, and I need to merge the duplicates because that is invalid in JSON.
How can I do this? (Probably with preg_replace)
Upvotes: 0
Views: 339
Reputation: 53478
It is a very bad idea to do this with a regex, because JSON is a nested data structure that can be formatted in several ways. Parsing it with regular expressions gives you brittle code at best.
But I'm also not sure of the validity of what this is doing: if you run your JSON through a parser, the duplicate keys overwrite each other.
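use strict;
use warnings;
use JSON;
# Slurp mode: read the whole DATA section in one go.
local $/;
# Parse the text, then serialise it again; duplicate keys collapse on the way through.
print to_json ( from_json ( <DATA>) , { pretty => 1 } );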
__DATA__
{
"items_game": {
"prefabs": {
"coupon_crate_prefab": {
"prefab": "weapon_case_base",
"item_type": "coupon_crate",
"attributes": {
"cannot trade": "1"
},
"capabilities": {
"can_delete": "0"
},
"attributes": {
"expiration date": {
"attribute_class": "expiration_date",
"force_gc_to_generate": "1",
"use_custom_logic": "expiration_period_days_from_now",
"value": "2"
}
}
}
}
}
}
This parses your JSON (which I hope I have fixed up to match your source). Note that it has 'clobbered' part of your data: only one of the "attributes" blocks survives. I think this is common behaviour in most parsing libraries, so whatever consumes your data may well handle it in the same way.
The JSON spec is here: http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf
So it's hard to give you a firm answer on what is best to do with this. Ideally you would use a JSON parser, but how duplicate keys are handled is not defined within the JSON spec, so you will get variable results.
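To make the point about duplicate keys concrete, here is a minimal Python sketch (not the Perl used above). It relies on the standard library's `json.loads` and its `object_pairs_hook` parameter; the helper name `merge_pairs` and the shortened sample data are assumptions made for illustration. It shows the default "last key wins" behaviour next to a hook that merges duplicate objects instead.

```python
import json

def merge_pairs(pairs):
    """Build a dict from (key, value) pairs, merging duplicate object keys."""
    merged = {}
    for key, value in pairs:
        if key in merged and isinstance(merged[key], dict) and isinstance(value, dict):
            # Both occurrences are objects: combine their members (shallow merge).
            merged[key].update(value)
        else:
            # First occurrence, or a non-object value: the last one wins.
            merged[key] = value
    return merged

# Shortened, hypothetical version of the data from the question.
text = '''
{"coupon_key_prefab": {
    "prefab": "csgo_tool",
    "attributes": {"cannot trade": "1"},
    "capabilities": {"can_delete": "0"},
    "attributes": {"expiration date": {"value": "2"}}
}}
'''

plain = json.loads(text)                                  # later key replaces the earlier one
merged = json.loads(text, object_pairs_hook=merge_pairs)  # duplicates are merged

print(json.dumps(merged, indent=2))
```

The hook is called once per JSON object, innermost first, so the merge applies at every nesting level. If the same name can appear with conflicting scalar values, the merge policy would need to be decided explicitly.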
Edit: Following on from the comments, it seems VDF is like JSON, but not quite the same.
I still wouldn't use a regex, but might instead try a recursive parse. Key it off { and 'hand down' your JSON-like content so you get a bottom branch of named key-value pairs that you can then hashify.
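A rough Python sketch of that recursive approach, under several assumptions: the input is a simplified VDF in which every key and value is double-quoted, blocks are delimited by { and }, and there are no comments, conditionals or includes. The names `tokenize`, `parse_block` and `parse_vdf` are made up for this example, and the sample text is a guess at what the original VDF looked like. A regex is used only to split the text into tokens; the nesting itself is handled by recursion.

```python
import json
import re

# A token is either a double-quoted string or a lone brace (simplified VDF only).
TOKEN = re.compile(r'"((?:[^"\\]|\\.)*)"|([{}])')

def tokenize(text):
    """Yield ('str', value) for quoted strings and ('brace', ch) for { and }."""
    for match in TOKEN.finditer(text):
        string, brace = match.groups()
        if brace is not None:
            yield ("brace", brace)
        else:
            yield ("str", string)  # escape sequences are left as-is

def parse_block(tokens):
    """Read key/value pairs until a closing brace or the end of input."""
    result = {}
    for kind, key in tokens:
        if (kind, key) == ("brace", "}"):
            return result
        if kind != "str":
            raise ValueError("expected a quoted key, got %r" % key)
        vkind, value = next(tokens, (None, None))
        if (vkind, value) == ("brace", "{"):
            value = parse_block(tokens)   # hand the nested block down
        elif vkind != "str":
            raise ValueError("missing value for key %r" % key)
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key].update(value)     # duplicate block: merge into the first
        else:
            result[key] = value
    return result

def parse_vdf(text):
    # tokenize() returns a generator, so the recursion shares one token stream.
    return parse_block(tokenize(text))

# Hypothetical VDF input, reconstructed from the JSON in the question.
sample = '''
"coupon_key_prefab"
{
    "prefab"    "csgo_tool"
    "attributes"
    {
        "cannot trade"  "1"
    }
    "capabilities"
    {
        "can_delete"    "0"
    }
    "attributes"
    {
        "expiration date"
        {
            "value" "2"
        }
    }
}
'''

data = parse_vdf(sample)
print(json.dumps(data, indent=2))
```

Because the duplicates are merged while parsing, the resulting structure can be serialised to JSON without any repeated keys. The sketch is lenient about a missing closing brace at the end of input, which real data may or may not warrant.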
If there's still not a better answer, I may hack together a perl example later (sorry, don't have time at the moment).
You might find something you can use here: http://www.perlmonks.org/?node_id=995856
But that might also be a good example of why NOT to regex this :)
Upvotes: 3
Reputation: 1230
Well, you asked for regex. Is it possible? Probably, if you have a limited number of nested elements inside your attribute of interest. Is it a good idea? No.
(?<=\"attributes\":) (\{(?:(?:[^{]*?\{(?:[^{]|\n)*?\}[^{]*?)+|(?:[^{]|\n)*?)})
will extract all the attributes data and handle one level of nested objects within each attribute, as seen at https://regex101.com/r/rC3eK4/6 .
Since you only had 1 level in your example, it works very well. If you wanted to have 2 levels, you'd have to modify it by adding an alternative for 2 levels, and so on, in order to keep all the {} pairs balanced. There might be a better way to solve bracket-matching problems with regex, but regex is definitely not the best tool for the job.
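For comparison, here is a small Python sketch of the non-regex alternative: a depth counter that finds the closing brace for any nesting depth. The function names `matching_brace` and `attribute_blocks` are invented for this example, and it assumes the first { after each "attributes": key opens that key's value. It also skips braces that appear inside double-quoted strings, and it does not look for "attributes" keys nested inside a block it has already extracted.

```python
def matching_brace(text, start):
    """Return the index of the '}' that closes the '{' at text[start]."""
    if text[start] != "{":
        raise ValueError("text[start] must be '{'")
    depth = 0
    in_string = False
    i = start
    while i < len(text):
        ch = text[i]
        if in_string:
            if ch == "\\":
                i += 1               # skip the escaped character
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    raise ValueError("unbalanced braces")

def attribute_blocks(text, key='"attributes":'):
    """Yield the text of each {...} block that follows the given key."""
    pos = text.find(key)
    while pos != -1:
        start = text.find("{", pos + len(key))
        if start == -1:
            break
        end = matching_brace(text, start)
        yield text[start:end + 1]
        pos = text.find(key, end)    # continue after the block just extracted

# Three levels of nesting, plus a brace inside a string value.
sample = '{"a": {"attributes": {"x": {"y": {"z": "}"}}}, "attributes": {"w": "1"}}}'
blocks = list(attribute_blocks(sample))
print(blocks)
```

This only locates the blocks; merging them still needs a real parse of each extracted piece.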
Upvotes: 0