Reputation: 605
I have been trying to come up with a regex for the following string:
[1,null,"7. Mai 2017"],[2,"test","8. Mai 2018"],[3,"test","9. Mai 2019"]
I want each bracketed group, including its brackets, returned as a separate match, like this:
[1,null,"7. Mai 2017"]
[2,"test","8. Mai 2018"]
[3,"test","9. Mai 2019"]
My initial naive approach was something like this:
(\[[^d],.+\])+
However, the .+ part is greedy, so it ends up matching the whole line. Any hints?
Upvotes: 3
Views: 77
Reputation: 104092
You might consider the wonderful module pyparsing to do this:
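import pyparsing

exp = '[1,null,"7. Mai 2017"],[2,"test","8. Mai 2018"],[3,"test","9. Mai 2019"]'
# nestedExpr matches balanced brackets; originalTextFor returns the raw matched text
for match in pyparsing.originalTextFor(pyparsing.nestedExpr('[', ']')).searchString(exp):
    print(match[0])
Output: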
[1,null,"7. Mai 2017"]
[2,"test","8. Mai 2018"]
[3,"test","9. Mai 2019"]
(Unless it is actually JSON, in which case use the json module instead.)
Upvotes: 1
Reputation: 474201
I am not sure about the data format you are trying to parse or where it comes from, but it looks JSON-like. For this particular string, wrapping it in square brackets makes it loadable as JSON:
In [1]: data = '[1,null,"7. Mai 2017"],[2,"test","8. Mai 2018"],[3,"test","9. Mai 2019"]'
In [2]: import json
In [3]: json.loads("[" + data + "]")
Out[3]:
[[1, None, u'7. Mai 2017'],
[2, u'test', u'8. Mai 2018'],
[3, u'test', u'9. Mai 2019']]
Note how null becomes Python's None.
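If you still need each group as a string, as in the question, one option is to serialize each parsed row again. This is only a minimal sketch: the names rows and chunks are made up for illustration, and it assumes the input has no whitespace between elements, so that compact separators reproduce the original text.

```python
import json

data = '[1,null,"7. Mai 2017"],[2,"test","8. Mai 2018"],[3,"test","9. Mai 2019"]'

# Wrap in brackets so the whole string is one JSON array of arrays.
rows = json.loads("[" + data + "]")

# Serialize each row again; compact separators match the input's spacing,
# and ensure_ascii=False keeps any non-ASCII characters readable.
chunks = [json.dumps(row, separators=(",", ":"), ensure_ascii=False) for row in rows]

for chunk in chunks:
    print(chunk)
```

Unlike a regex, this also copes with brackets inside quoted strings, because the JSON parser understands the quoting.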
Upvotes: 1
Reputation: 1856
The following code outputs what you requested, using the pattern \[[^]]*] (an opening bracket, any run of characters other than ], then a closing bracket).
import re
regex = r'\[[^]]*]'
line = '[1,null,"7. Mai 2017"],[2,"test","8. Mai 2018"],[3,"test","9. Mai 2019"]'
row = re.findall(regex, line)
print(row)
Output:
['[1,null,"7. Mai 2017"]', '[2,"test","8. Mai 2018"]', '[3,"test","9. Mai 2019"]']
Consider changing null to None, as that matches Python's representation.
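For comparison, here is a rough sketch of the same pattern next to a lazy-quantifier variant, plus one case where both fall short. The tricky string is a made-up example, not from the question's data.

```python
import re

line = '[1,null,"7. Mai 2017"],[2,"test","8. Mai 2018"],[3,"test","9. Mai 2019"]'

# Negated class: "[", then any run of characters that are not "]", then "]".
negated = re.findall(r'\[[^]]*]', line)

# Lazy quantifier: "[", then as few characters as possible, then "]".
lazy = re.findall(r'\[.*?\]', line)

# Limitation: a "]" inside a quoted string ends the match too early.
tricky = '[1,"a]b"]'
truncated = re.findall(r'\[[^]]*]', tricky)

print(negated == lazy)  # both patterns give the same three groups here
print(truncated)        # the match stops at the "]" inside the string
```

If the real data can contain ] inside strings or nested brackets, a parser (json or pyparsing, as in the other answers) is probably the safer choice.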
Upvotes: 1