Reputation: 94
I have a JSON file that contains some regular expressions that I want to use in my Python code. The problem arises when I try to escape reserved regex characters in the JSON file. When I run the Python code, it can't process the JSON file and throws an exception.
I have already debugged the code and come to the conclusion that it fails when calling json.loads(ruleFile.read()). Apparently only some characters can be escaped in JSON, and the dot is not one of them, which causes a syntax error.
try:
    with open(args.rules, "r") as ruleFile:
        rules = json.loads(ruleFile.read())
        for rule in rules:
            rules[rule] = re.compile(rules[rule])
except (IOError, ValueError) as e:
    raise Exception("Error reading rules file")
{
    "Rule 1": "www\.[a-z]{3,10}\.com"
}
Traceback (most recent call last):
  File "foo.py", line 375, in <module>
    main()
  File "foo.py", line 67, in main
    raise Exception("Error reading rules file")
Exception: Error reading rules file
How do I work around this JSON syntax problem?
Upvotes: 4
Views: 3136
Reputation: 148965
The rule is to first have a correct string in a correct dictionary. Backslashes have to be escaped in Python string literals, or the string has to be written as a raw string.
So you should initially write:
rules = {"Rule 1": r"www\.[a-z]{3,10}\.com"}
You can then easily convert that to a JSON string:
print(json.dumps(rules, indent=4))
{
    "Rule 1": "www\\.[a-z]{3,10}\\.com"
}
You now know how the json file containing the regexes should be formatted.
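As a quick check, a round trip along these lines should show that the doubled backslashes in the JSON text come back as single ones after loading, so the pattern compiles as intended. This is only a minimal sketch: the test strings are made up for illustration and are not from the question.

```python
import json
import re

rules = {"Rule 1": r"www\.[a-z]{3,10}\.com"}

# Serialize: every backslash is doubled in the JSON text.
text = json.dumps(rules, indent=4)

# Parse it back: the doubled backslashes become single ones again.
loaded = json.loads(text)
compiled = {name: re.compile(pattern) for name, pattern in loaded.items()}

# Hypothetical test strings, chosen only to exercise the escaped dots.
matched = compiled["Rule 1"].fullmatch("www.example.com") is not None
not_matched = compiled["Rule 1"].fullmatch("wwwXexample.com") is None
```

Writing `text` to the rules file instead of editing the JSON by hand avoids having to think about the escaping at all.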
Upvotes: 1
Reputation: 97152
The backslash needs to be escaped in JSON.
{
    "Rule 1": "www\\.[a-z]{3,10}\\.com"
}
From here:
The following characters are reserved in JSON and must be properly escaped to be used in strings:
- Backspace is replaced with \b
- Form feed is replaced with \f
- Newline is replaced with \n
- Carriage return is replaced with \r
- Tab is replaced with \t
- Double quote is replaced with \"
- Backslash is replaced with \\
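To see the difference concretely, the sketch below parses the rule from the question once with single backslashes and once with doubled ones. The variable names are just for illustration; the raw string literals are used so that the text is exactly what would sit in the JSON file.

```python
import json

# Raw strings: the content is byte-for-byte what the .json file would hold.
bad = r'{"Rule 1": "www\.[a-z]{3,10}\.com"}'
good = r'{"Rule 1": "www\\.[a-z]{3,10}\\.com"}'

try:
    json.loads(bad)
    bad_error = None
except ValueError as e:  # json's decode errors are ValueError subclasses
    bad_error = e        # "\." is not a valid JSON escape sequence

# With doubled backslashes the file parses, and the loaded string
# contains single backslashes, ready to pass to re.compile().
pattern = json.loads(good)["Rule 1"]
```

Printing `bad_error` (or re-raising the original exception instead of a generic one) should point at the offending escape in the file.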
Upvotes: 1