Leo

Reputation: 265

Regex find word including "-"

I have the regex below (taken from the question "get python dictionary from string containing key value pairs"):

r"\b(\w+)\s*:\s*([^:]*)(?=\s+\w+\s*:|$)"

Here is the explanation:

\b           # Start at a word boundary
(\w+)        # Match and capture a single word (1+ alnum or underscore characters)
\s*:\s*      # Match a colon, optionally surrounded by whitespace
([^:]*)      # Match any number of non-colon characters
(?=          # Make sure that we stop when the following can be matched:
 \s+\w+\s*:  #  the next dictionary key
|            # or
 $           #  the end of the string
)            # End of lookahead

My question: when my string contains a word with a "-" in it, for example movie-night, the regex above does not work, and I think this is due to the \b(\w+). How can I change the regex so that it also works with words that include "-"? I have tried \b(\w+-), but it does not work. Thanks in advance for your help.

Upvotes: 1

Views: 593

Answers (1)

Elizafox

Reputation: 1376

You could try something such as this:

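r"\b([\w\-]+)\s*:\s*([^:]*)(?=\s+[\w\-]+\s*:|$)"

Note the [\w\-]+, which matches a run of word characters and dashes. It replaces \w+ both in the key group and in the lookahead; if the lookahead kept \w+, a hyphenated key that follows another key-value pair would not be recognized as "the next dictionary key".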
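As a quick check, here is a minimal sketch that compares the question's pattern with a hyphen-aware one. It assumes the hyphen is part of a key and that [\w\-]+ is used in the lookahead as well as in the key group; the sample string and the helper name parse_pairs are made up for illustration.

```python
import re

# Pattern from the question: keys are limited to \w+ (letters, digits, underscore).
ORIGINAL = re.compile(r"\b(\w+)\s*:\s*([^:]*)(?=\s+\w+\s*:|$)")

# Hyphen-aware variant. Assumption: [\w\-]+ is used for the key AND inside
# the lookahead, so a hyphenated key may also follow another pair.
HYPHEN_AWARE = re.compile(r"\b([\w\-]+)\s*:\s*([^:]*)(?=\s+[\w\-]+\s*:|$)")


def parse_pairs(pattern, text):
    """Hypothetical helper: collect (key, value) matches into a dict."""
    return dict(pattern.findall(text))


sample = "host: Leo movie-night: Friday at 8"  # made-up input

print(parse_pairs(ORIGINAL, sample))
# {'night': 'Friday at 8'}
print(parse_pairs(HYPHEN_AWARE, sample))
# {'host': 'Leo', 'movie-night': 'Friday at 8'}
```

With the original pattern the key is cut down to "night" and the "host" pair is lost entirely, because the lookahead cannot see "movie-night" as the next key.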

For future readability, you may also want to look into re.X/re.VERBOSE, which lets you spread a pattern over several lines with whitespace and inline comments.
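For instance, the hyphen-aware pattern could be written in verbose mode roughly like this. It is only a sketch: the name PAIR and the sample string are made up, and it assumes the lookahead should accept hyphenated keys too.

```python
import re

# In re.VERBOSE mode, whitespace in the pattern is ignored and "#" starts a
# comment (outside character classes), so the explanation lives in the regex.
PAIR = re.compile(
    r"""
    \b              # start at a word boundary
    ([\w\-]+)       # key: word characters or hyphens
    \s*:\s*         # a colon, optionally surrounded by whitespace
    ([^:]*)         # value: any number of non-colon characters
    (?=             # stop where one of the following can be matched:
      \s+[\w\-]+\s*:    #   the next key (assumed to allow hyphens as well)
      |                 #   or
      $                 #   the end of the string
    )
    """,
    re.VERBOSE,
)

result = dict(PAIR.findall("movie-night: Friday at 8 host: Leo"))
print(result)
# {'movie-night': 'Friday at 8', 'host': 'Leo'}
```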

Upvotes: 1
