Reputation: 60
I have a predefined list of objects such as
["pop", "pizza", "orange juice", "apple juice", "pasta", "taco", ...]
I'm given a raw text asking for these objects such as
Buy a pizza for me and a pasta for my friend. Also buy me a pop, an orange juice, or an apple juice.
I would like to extract the objects mentioned in the text, as well as the 'and'/'or' relationships between them. E.g., for the example above, I need the output to be something like:
[["pizza"], ["pasta"], ["pop", "orange juice", "apple juice"]]
This shows that the text is asking for pizza, pasta, and at least one object from (pop, orange juice, apple juice), i.e., the text is looking for (pizza AND pasta AND (pop OR orange juice OR apple juice)). There can be several variations in the raw text.
I was looking at parsing and NLP techniques, but could not find anything helpful. I would appreciate any help or pointers.
Upvotes: 1
Views: 38
Reputation: 6039
I would use a combination of verb SRL (semantic role labeling) and other annotations.
Here are the output annotations for your input sentences:
As you can see, the things that you want often appear with "A1" label:
It does miss "pasta", though. Within an "A1" span it's often easy to split into different items, say by splitting on commas.
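To make the splitting step concrete, here is a rough sketch in Python. It assumes you have already pulled the "A1" spans out of the SRL output as plain strings (the `spans` list below is made up by hand, not produced by the demo), and the helper name `split_span` is just something I chose for illustration. It treats a span containing "or" as one group of alternatives, and otherwise treats every item as required on its own; real text will need more care than this.

```python
import re

# The predefined list of objects from the question.
ITEMS = ["pop", "pizza", "orange juice", "apple juice", "pasta", "taco"]

def split_span(span, items=ITEMS):
    """Split one argument span into known items and return a list of groups."""
    text = span.lower()
    # Split on commas and on the conjunctions "and" / "or".
    parts = re.split(r",|\band\b|\bor\b", text)
    found = []
    for part in parts:
        # Drop a leading article and surrounding whitespace.
        cleaned = re.sub(r"^\s*(a|an|the)\s+", "", part).strip()
        if cleaned in items:
            found.append(cleaned)
    if not found:
        return []
    # Heuristic (an assumption): an "or" anywhere in the span means the
    # items are alternatives, so they form a single group.
    if re.search(r"\bor\b", text):
        return [found]
    # Otherwise each item is required on its own.
    return [[item] for item in found]

# Hand-written stand-ins for the "A1" spans an SRL system might return.
spans = ["a pizza", "a pop, an orange juice, or an apple juice"]
groups = [group for span in spans for group in split_span(span)]
print(groups)
```

The outer list is read as AND and each inner list as OR, matching the format asked for in the question. Since "pasta" was not picked up as an "A1" span, it does not appear in the result here either.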
Note that this also tells you who wants it; for example: A0.neader: I
Btw you can play with the demo yourself here: http://deagol.cs.illinois.edu:8080/
If you want to connect "I" to "me", etc., you can of course use coreference resolution.
Upvotes: 1