Reputation: 105
I'm trying to write a regex for the following situation. I have a file containing hundreds of dictionaries stored as strings.
EG:
{'a':1'}
{{'a':1, 'b':2}{'c':3}}
{'a':4, 'b':6}
I read the file and removed the newlines. Now I'm trying to split the resulting string based on a regex:
{'a':1'}{{'a':1, 'b':2}{'c':3}}{'a':4, 'b':6}
re.split("({.*?})", str)
This wouldn't work because the whole second dict wouldn't match. How can I write a regex that matches every line and returns a list of dictionaries?
Upvotes: 3
Views: 3263
Reputation: 43199
You could simply do:
(\{[^{}]+\})
# look for an opening {
# and anything that is not { or }
# as well as an ending }
In Python this would be:
import re
rx = r'(\{[^{}]+\})'
string = "{'a':1'}{{'a':1, 'b':2}{'c':3}}{'a':4, 'b':6}"
matches = re.findall(rx, string)
print(matches)
# ["{'a':1'}", "{'a':1, 'b':2}", "{'c':3}", "{'a':4, 'b':6}"]
Upvotes: 3
Reputation: 4418
Python's built-in re module cannot match arbitrarily nested structures by itself. You would have to do some looping or recursion separately.
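As a rough illustration of the "looping" approach, here is a minimal sketch that tracks brace depth and cuts the string wherever the depth returns to zero. The function name split_top_level is hypothetical, and the sketch assumes that braces never appear inside the string values themselves.

```python
def split_top_level(text):
    """Return each top-level {...} group in text, keeping nested braces intact."""
    groups = []
    depth = 0      # how many '{' are currently open
    start = None   # index where the current top-level group began
    for i, ch in enumerate(text):
        if ch == '{':
            if depth == 0:
                start = i
            depth += 1
        elif ch == '}' and depth > 0:
            depth -= 1
            if depth == 0:
                # The outermost brace just closed, so the group is complete.
                groups.append(text[start:i + 1])
    return groups


sample = "{'a':1'}{{'a':1, 'b':2}{'c':3}}{'a':4, 'b':6}"
groups = split_top_level(sample)
print(groups)
# ["{'a':1'}", "{{'a':1, 'b':2}{'c':3}}", "{'a':4, 'b':6}"]
```

Unlike a pattern that excludes braces, this keeps the second entry as one nested group instead of splitting it into its inner dictionaries.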
However, you commented above that each line is a JSON response. Why not use json.loads() on each line?
import json
with open('path_to_file', 'r') as f:
    # Parse each line of the file as its own JSON document
    data = [json.loads(line) for line in f]
data is now a list of dictionaries.
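A small self-contained sketch of the same idea, assuming each line really is valid JSON. Note that the sample in the question uses single-quoted keys, which json.loads() would reject, so the contents below are hypothetical stand-ins with double quotes. io.StringIO is used only to simulate a file.

```python
import io
import json

# Hypothetical file contents: one valid JSON object per line.
fake_file = io.StringIO('{"a": 1}\n{"a": 1, "b": 2}\n\n{"a": 4, "b": 6}\n')

# Skip blank lines, which json.loads() would raise an error on.
data = [json.loads(line) for line in fake_file if line.strip()]
print(data)
# [{'a': 1}, {'a': 1, 'b': 2}, {'a': 4, 'b': 6}]
```

Parsing line by line also handles nested objects correctly, since the JSON parser tracks nesting itself.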
Upvotes: 0