Reputation: 265
I have the below regex (from this link: get python dictionary from string containing key value pairs)
r"\b(\w+)\s*:\s*([^:]*)(?=\s+\w+\s*:|$)"
Here is the explanation:
\b # Start at a word boundary
(\w+) # Match and capture a single word (1+ alnum characters)
\s*:\s* # Match a colon, optionally surrounded by whitespace
([^:]*) # Match any number of non-colon characters
(?= # Make sure that we stop when the following can be matched:
\s+\w+\s*: # the next dictionary key
| # or
$ # the end of the string
) # End of lookahead
My problem is that when a key in my string contains a "-", for example movie-night, the above regex does not work, and I think this is due to the \b(\w+). How can I change this regex so that it also works with words that include a "-"? I have tried \b(\w+-), but it does not work. Thanks in advance for your help.
Upvotes: 1
Views: 593
Reputation: 1376
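You could try something such as this:
r"\b([\w\-]+)\s*:\s*([^:]*)(?=\s+[\w\-]+\s*:|$)"
Note the [\w\-]+, which matches both word characters and dashes. It is used for the key and also inside the lookahead; if the lookahead kept the plain \w+, a value followed by a hyphenated key (such as movie-night) would not be terminated correctly and that pair would be skipped.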
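As a quick check, here is a minimal sketch that applies a pattern of this shape with re.findall, using [\w\-]+ both for the key and in the lookahead. The helper name parse_pairs and the sample string are made up for illustration; they are not from the question.

```python
import re

# [\w\-]+ is used for the key and again inside the lookahead, so a
# hyphenated key is also recognized as the start of the next pair.
PATTERN = re.compile(r"\b([\w\-]+)\s*:\s*([^:]*)(?=\s+[\w\-]+\s*:|$)")


def parse_pairs(text):
    """Return a dict built from the key: value pairs found in text."""
    return dict(PATTERN.findall(text))


# Hypothetical sample input, only for demonstration.
sample = "name: John Smith movie-night: every friday age: 30"
print(parse_pairs(sample))
```

Values still cannot contain a colon, because of the [^:]* part carried over from the original pattern.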
For readability, you may also want to look into re.X/re.VERBOSE, which lets you spread a pattern over several lines and annotate each part with comments.
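A sketch of how the same kind of pattern could be written with re.VERBOSE, reusing the comments from the question's explanation. The name PAIR_RE and the test string are assumptions chosen for this example, and the key class [\w\-]+ is applied in the lookahead as well as for the key.

```python
import re

# In verbose mode, whitespace in the pattern is ignored and "#" starts a comment.
PAIR_RE = re.compile(
    r"""
    \b                    # Start at a word boundary
    ([\w\-]+)             # Key: word characters and dashes
    \s*:\s*               # A colon, optionally surrounded by whitespace
    ([^:]*)               # Value: any number of non-colon characters
    (?=                   # Stop when the following can be matched:
        \s+[\w\-]+\s*:    #   the next dictionary key
      |                   #   or
        $                 #   the end of the string
    )                     # End of lookahead
    """,
    re.VERBOSE,
)

# Hypothetical input, only for demonstration.
pairs = PAIR_RE.findall("title: movie-night host-name: Anna")
print(pairs)
```

The compiled pattern behaves the same as the one-line version; only the layout differs.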
Upvotes: 1