Reputation: 117
Say I have templates to fill with values from a dict. The templates look like this:
templates = [
    "I have four {fruit} in {place}",
    "I have four {fruit} and {grain} in {place}",
    ...
]
And a dictionary like this:
my_dict = {'fruit': ['apple', 'banana', 'mango'],
           'place': ['kitchen', 'living room'],
           'grain': ['wheat', 'rice']
           }
Say I have a sentence like this:
sentence = "I have four apple in kitchen"
Given this sentence, the templates, and the dictionary, I would like to find out whether the sentence matches one of the templates and, if so, return the values it matched, like this:
{'fruit': 'apple', 'place': 'kitchen'}
Similarly:
Input: "I have four apple and wheat in kitchen"
Output: {'fruit': 'apple', 'grain': 'wheat', 'place': 'kitchen'}
It would also be great if it could handle this:
Input: "I have four apple in bedroom"
Output: {'fruit': 'apple'}
Note that it returns only fruit and not place, since bedroom is not among the values of place.
Upvotes: 3
Views: 469
Reputation: 1122222
Turn your formatted strings into regular expressions:
import re
words = {k: '(?P<{}>{})'.format(k, '|'.join(map(re.escape, v))) for k, v in my_dict.items()}
patterns = [re.compile(template.format(**words)) for template in templates]
This produces patterns of the form I have four (?P<fruit>apple|banana|mango) in (?P<place>kitchen|living\ room). Matching these then provides you with your expected output:
for pattern in patterns:
    match = pattern.match(sentence)
    if match:
        matched_words = match.groupdict()
This is a very fast, O(N) approach to matching sentences exactly:
>>> import re
>>> templates = [
...     "I have four {fruit} in {place}",
...     "I have four {fruit} and {grain} in {place}",
... ]
>>> my_dict = {'fruit': ['apple', 'banana', 'mango'],
...            'place': ['kitchen', 'living room'],
...            'grain': ['wheat', 'rice']
...            }
>>> words = {k: '(?P<{}>{})'.format(k, '|'.join(map(re.escape, v))) for k, v in my_dict.items()}
>>> patterns = [re.compile(template.format(**words)) for template in templates]
>>> def find_matches(sentence):
...     for pattern in patterns:
...         match = pattern.match(sentence)
...         if match:
...             return match.groupdict()
...
>>> find_matches("I have four apple in kitchen")
{'fruit': 'apple', 'place': 'kitchen'}
>>> find_matches("I have four apple and wheat in kitchen")
{'fruit': 'apple', 'grain': 'wheat', 'place': 'kitchen'}
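One caveat about "exactly": pattern.match() anchors only at the start of the string, so a sentence with extra trailing words still matches. If the whole sentence has to fit the template, pattern.fullmatch() is one option. The following is a minimal sketch under that assumption; find_exact_match is a hypothetical name, not part of the answer above.

```python
import re

templates = [
    "I have four {fruit} in {place}",
    "I have four {fruit} and {grain} in {place}",
]
my_dict = {
    'fruit': ['apple', 'banana', 'mango'],
    'place': ['kitchen', 'living room'],
    'grain': ['wheat', 'rice'],
}

# Same construction as above: one named group per placeholder.
words = {
    k: '(?P<{}>{})'.format(k, '|'.join(map(re.escape, v)))
    for k, v in my_dict.items()
}
patterns = [re.compile(template.format(**words)) for template in templates]


def find_exact_match(sentence):
    """Return the matched words only if the entire sentence fits a template."""
    for pattern in patterns:
        # fullmatch() requires the pattern to consume the whole string.
        match = pattern.fullmatch(sentence)
        if match:
            return match.groupdict()
    return None


print(find_exact_match("I have four apple in kitchen"))
print(find_exact_match("I have four apple in kitchen today"))
```

With match(), the second sentence would still be accepted, because the pattern is satisfied before "today" is reached; with fullmatch() it is rejected.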
If you need your templates to match partial sentences, wrap the optional parts in (?:...)? groups:
"I have four {fruit} in (?:{place})?"
or add \w+ to the words list (in addition to the valid words), then validate the groupdict() result against my_dict after matching. For the in bedroom case, for example, \w+ will match the bedroom part, but bedroom won't be found in the my_dict list for place.
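Both options for partial sentences can be sketched roughly as follows. The helper names (group, match_optional, match_and_validate) are hypothetical, and the use of fullmatch() in the second option is an assumption made here so that a known word cannot match as a prefix of a longer unknown one (kitchen inside kitchenette, say).

```python
import re

templates = [
    "I have four {fruit} in {place}",
    "I have four {fruit} and {grain} in {place}",
]
my_dict = {
    'fruit': ['apple', 'banana', 'mango'],
    'place': ['kitchen', 'living room'],
    'grain': ['wheat', 'rice'],
}


def group(name, alternatives):
    """Build a named group that matches any one of the given alternatives."""
    return '(?P<{}>{})'.format(name, '|'.join(alternatives))


# Option 1: make part of the template optional with a (?:...)? group.
strict_words = {k: group(k, map(re.escape, v)) for k, v in my_dict.items()}
optional_pattern = re.compile(
    "I have four {fruit} in (?:{place})?".format(**strict_words)
)


def match_optional(sentence):
    match = optional_pattern.match(sentence)
    if match is None:
        return None
    # Groups that did not take part in the match are reported as None.
    return {k: v for k, v in match.groupdict().items() if v is not None}


# Option 2: accept any single word via \w+, then validate against my_dict.
loose_words = {
    k: group(k, list(map(re.escape, v)) + [r'\w+'])
    for k, v in my_dict.items()
}
loose_patterns = [re.compile(t.format(**loose_words)) for t in templates]


def match_and_validate(sentence):
    for pattern in loose_patterns:
        match = pattern.fullmatch(sentence)
        if match:
            # Keep only the values that are valid for their placeholder.
            return {
                k: v for k, v in match.groupdict().items()
                if v in my_dict[k]
            }
    return None


print(match_optional("I have four apple in bedroom"))      # {'fruit': 'apple'}
print(match_and_validate("I have four apple in bedroom"))  # {'fruit': 'apple'}
```

Note that \w+ matches a single word only, so an unknown multi-word value such as "dining room" would make the second option fail to match at all rather than return a partial result.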
Upvotes: 6