Reputation: 624
Though I have found quite a few SO posts talking about using regex to extract key/value pairs, I was not able to find a solution for my particular use case.
I have key-value pairs like this:
{date=2020-07-22, labelId=100000004}
The number of key/value pairs will vary.
I would like a regular expression that extracts these as key and value "groups", like groups[1:] = "date", "2020-07-22", "labelId", "100000004".
This sort of matches correctly for the first match,
([a-zA-Z0-9]+)=((?:[^\\\\\"]|\\\\.)*+)
...but I need a way to "split" on the comma
Are any regex gurus able to help me out with this one?
Thanks in advance.
Upvotes: 0
Views: 2176
Reputation: 18631
You can do without a regex here:
text = '{date=2020-07-22, labelId=100000004}'
print(dict([x.split('=') for x in text.strip('{}').split(', ')]))
See Python proof.
That is, remove the braces on both ends of the string, split on comma-space, and then split each piece on =.
Results: {'date': '2020-07-22', 'labelId': '100000004'}
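As a slightly more defensive sketch of the same idea, the version below splits each pair on the first = only and strips surrounding whitespace, so a value that itself contains = or a separator written without a trailing space does not break the parse. The function name parse_pairs is hypothetical, and it still assumes that commas never appear inside keys or values.

```python
def parse_pairs(text):
    """Parse '{k1=v1, k2=v2}' into a dict without using a regex.

    Hypothetical helper; assumes keys and values contain no commas.
    """
    body = text.strip().strip('{}')  # drop the surrounding braces
    result = {}
    for item in body.split(','):
        if not item.strip():
            continue  # skip empty pieces, e.g. from '{}'
        key, _, value = item.partition('=')  # split on the first '=' only
        result[key.strip()] = value.strip()
    return result


print(parse_pairs('{date=2020-07-22, labelId=100000004}'))
# => {'date': '2020-07-22', 'labelId': '100000004'}
```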
Upvotes: 1
Reputation: 781761
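Use a pattern that excludes {, =, the comma, and whitespace from the key, and } and , from the value of each match.
import re
data = '{date=2020-07-22, labelId=100000004}'
# Excluding ',' and whitespace from the key keeps the ", " separator out of later keys.
regex = r'([^{=,\s]+)=([^,}]+)'
print(re.findall(regex, data))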
Note that this regex doesn't allow quoting the key and/or value so that they can include the delimiter characters. Supporting that makes the regular expression much more complicated.
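To give a rough idea of the extra complexity, here is a minimal sketch that accepts either a bare value or a double-quoted value with backslash escapes. The quoting syntax is an assumption made purely for illustration (the question does not define one), and PAIR and parse_quoted are hypothetical names.

```python
import re

# Assumed syntax: key=value or key="quoted value", with \" and \\ escapes.
PAIR = re.compile(r'''
    \s*([^{}=,\s]+)\s*=\s*       # key: no braces, '=', ',' or whitespace
    (?: "((?:[^"\\]|\\.)*)"      # group 2: double-quoted value, escapes allowed
      | ([^,}]*)                 # group 3: bare value up to ',' or '}'
    )
''', re.VERBOSE)


def parse_quoted(text):
    """Return a dict of key/value pairs, unescaping quoted values."""
    result = {}
    for match in PAIR.finditer(text):
        key, quoted, bare = match.groups()
        if quoted is not None:
            # Turn \x into x inside a quoted value.
            value = re.sub(r'\\(.)', r'\1', quoted)
        else:
            value = bare.strip()
        result[key] = value
    return result


print(parse_quoted('{date=2020-07-22, note="a, b = c", labelId=100000004}'))
```

With quoting in play, a dedicated parser may be easier to maintain than a single pattern.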
Upvotes: 2
Reputation: 627103
You can use
import re
text = "{date=2020-07-22, labelId=100000004}"
print(dict(re.findall(r'([a-zA-Z0-9]+)=([\d-]+)', text)))
# => {'date': '2020-07-22', 'labelId': '100000004'}
See the Python demo and the regex demo.
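Note that [\d-]+ only matches values made of digits and hyphens, which fits the sample input. If values can contain other characters, one possible variation is to match everything up to the next comma, closing brace, or whitespace. The status=open pair below is a hypothetical addition for illustration, not part of the question's data.

```python
import re

# Hypothetical input with a non-numeric value added for illustration.
text = "{date=2020-07-22, labelId=100000004, status=open}"

# Key: letters/digits; value: anything up to the next ',', '}' or whitespace.
pairs = re.findall(r'([a-zA-Z0-9]+)=([^,}\s]+)', text)
print(dict(pairs))
# => {'date': '2020-07-22', 'labelId': '100000004', 'status': 'open'}
```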
Regex details:
- ([a-zA-Z0-9]+) - Group 1: one or more letters or digits
- = - a = char
- ([\d-]+) - Group 2: one or more digits or - characters
Upvotes: 1