Reputation: 3223
First, I understand that comments aren't valid JSON. That said, the .json file I have to process has comments both at the start of lines and at the end of lines.
How can I handle this in Python, i.e. load the .json file while ignoring the comments so that I can process it? I am currently doing the following:
import json

with open('/home/sam/Lean/Launcher/bin/Debug/config.json', 'r') as f:
    config_data = json.load(f)
But this crashes at the json.load(f) call because the file has comments in it.
I thought this would be a common problem, but I can't find much online about how to handle it in Python. Someone suggested commentjson, but importing it makes my script crash with:
ImportError: cannot import name 'dump'
Thoughts?
Edit: Here is a snippet of the JSON file I must process.
{
// this configuration file works by first loading all top-level
// configuration items and then will load the specified environment
// on top, this provides a layering affect. environment names can be
// anything, and just require definition in this file. There's
// two predefined environments, 'backtesting' and 'live', feel free
// to add more!
"environment": "backtesting",// "live-paper", "backtesting", "live-interactive", "live-interactive-iqfeed"
// algorithm class selector
"algorithm-type-name": "BasicTemplateAlgorithm",
// Algorithm language selector - options CSharp, FSharp, VisualBasic, Python, Java
"algorithm-language": "CSharp"
}
Upvotes: 13
Views: 10191
Reputation: 1
We use a powerful JSON preprocessor to solve this problem. Besides comments, it also supports
Download: JsonPreprocessor (PyPI)
This allows common definitions and hierarchical structures for huge projects.
We also use a VSCode plugin for JSONP syntax: test-fullautomation/vscode-jsonp (github.com)
Upvotes: 0
Reputation: 21778
Switch to json5. JSON5 is a small superset of JSON that supports comments and a few other features that you can simply ignore.
import json5 as json
# and the rest is the same
It is in beta and it is slower, but if you just need to read a short configuration file once when the program starts, it is probably a reasonable option. It is better to switch to another standard than to follow none.
Upvotes: 12
Reputation: 7886
You can take out the comments with the following:
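import re
data = re.sub(r"//.*", "", data)  # line comments, up to the end of each line
data = re.sub(r"/\*.*?\*/", "", data, flags=re.DOTALL)  # block comments, which may span several lines
This should remove all comments from the data. It could cause problems if // or /* appears inside your strings, for example in a URL such as "http://example.com".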
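One way to sidestep the problem with // or /* inside strings is to let the regular expression match whole JSON strings as well as comments, and to put back only the strings. The following is a minimal sketch of that idea, not a full JSON tokenizer; the names strip_comments and _TOKEN and the sample keys are made up for illustration.

```python
import json
import re

# Match a complete JSON string first, so that comment markers inside it are
# never seen by the comment alternatives that follow.
_TOKEN = re.compile(
    r'"(?:\\.|[^"\\])*"'   # a double-quoted string, including escaped characters
    r'|//[^\n]*'           # a // comment up to the end of the line
    r'|/\*.*?\*/',         # a /* ... */ comment, possibly spanning lines
    re.DOTALL,
)


def strip_comments(text):
    """Return text with // and /* */ comments removed, leaving strings intact."""
    def keep_strings(match):
        token = match.group(0)
        # Strings are put back unchanged; comments are replaced by nothing.
        return token if token.startswith('"') else ""
    return _TOKEN.sub(keep_strings, text)


# Hypothetical sample data, not taken from the question's file.
sample = '''{
    // line comment
    "url": "http://example.com/a", /* block
    comment */
    "pattern": "/* not a comment */"
}'''
config = json.loads(strip_comments(sample))
print(config)
```

Because the string alternative comes first, a // or /* that sits inside quotes is consumed as part of the string and never treated as the start of a comment.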
Upvotes: 0
Reputation: 394
I haven't used it personally, but you can have a look at the JSONComment Python package, which supports parsing a JSON file with comments. Use it in place of JsonParser:
import json
from jsoncomment import JsonComment
parser = JsonComment(json)
parsed_object = parser.loads(jsonString)
Upvotes: 2
Reputation: 140266
Kind of a hack (it will fail if there are // sequences within the JSON data itself), but simple enough for most cases:
import json,re
s = """{
// this configuration file works by first loading all top-level
// configuration items and then will load the specified environment
// on top, this provides a layering affect. environment names can be
// anything, and just require definition in this file. There's
// two predefined environments, 'backtesting' and 'live', feel free
// to add more!
"environment": "backtesting",// "live-paper", "backtesting", "live-interactive", "live-interactive-iqfeed"
// algorithm class selector
"algorithm-type-name": "BasicTemplateAlgorithm",
// Algorithm language selector - options CSharp, FSharp, VisualBasic, Python, Java
"algorithm-language": "CSharp"
}
"""
# Drop everything from // to the end of each line, then parse what is left.
result = json.loads(re.sub(r"//.*", "", s))
print(result)
gives:
{'environment': 'backtesting', 'algorithm-type-name': 'BasicTemplateAlgorithm', 'algorithm-language': 'CSharp'}
This applies the regular expression to every line, removing the double slash and everything that follows it.
A state machine that parses each line would be better, to make sure the // isn't inside quotes, but that's slightly more complex (though doable).
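For what it's worth, a state machine along those lines could look roughly like the sketch below. It only handles // comments and double-quoted strings, the function name strip_line_comments is made up, and the "data-url" key in the sample is a hypothetical addition used to show a // inside a string.

```python
import json


def strip_line_comments(text):
    """Remove // comments that start outside double-quoted strings."""
    out = []
    in_string = False   # currently inside a "..." string?
    escaped = False     # was the previous character a backslash inside a string?
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line, keeping the newline itself.
            newline = text.find("\n", i)
            i = len(text) if newline == -1 else newline
        else:
            out.append(ch)
            i += 1
    return "".join(out)


sample = '''{
    // algorithm class selector
    "environment": "backtesting", // "live-paper", "backtesting"
    "data-url": "https://example.com/data"
}'''
print(json.loads(strip_line_comments(sample)))
```

The two flags are the whole state: a // only starts a comment while in_string is False, and the escaped flag keeps a \" inside a string from being mistaken for the closing quote.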
Upvotes: 7