Sap

Reputation: 5291

New to NLP, Question about annotation

I am new to NLP and am looking for a starting point: tutorials, documentation, or example code. I have been asked to research ways of processing natural-language text to extract structured data from it. For example, I want to extract (annotate) height and weight from statements such as "He is 6 feet tall and weighs 200 pounds" or "His height is 6 feet and weight is 200". I have looked into UIMA, but it seems to be a hand-built regex dictionary with no training capabilities. In a nutshell: what Java framework can I use to create an annotation engine that can also be trained? Any help or pointers would be greatly appreciated. Thanks

Upvotes: 2

Views: 1199

Answers (3)

Daniel

Reputation: 6039

I'd use named entity recognition (NER). Here is the output I see for your input text: [image not available]

You can try it here: http://deagol.cs.illinois.edu:8080
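To make the target concrete, here is a minimal sketch in plain Java of the typed spans an annotator would need to produce for the two example sentences. Note that this is a hand-written regex baseline, not a trained NER model: the class `MeasurementAnnotator`, the `Annotation` type, and both patterns are hypothetical names made up for illustration and do not come from any NLP library. A baseline like this is mainly useful for checking what a trained tagger should output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical rule-based baseline; NOT a trained model and not part of any NLP framework.
class MeasurementAnnotator {

    // A typed span over the input text (illustrative type, assumed for this sketch).
    static final class Annotation {
        final String type;
        final int start;
        final int end;
        final String text;

        Annotation(String type, int start, int end, String text) {
            this.type = type;
            this.start = start;
            this.end = end;
            this.text = text;
        }

        @Override
        public String toString() {
            return type + "[" + start + "," + end + ")=\"" + text + "\"";
        }
    }

    // A number followed by a length unit, e.g. "6 feet".
    private static final Pattern HEIGHT = Pattern.compile(
            "(?i)\\b(\\d+(?:\\.\\d+)?\\s*(?:feet|foot|ft)\\b)");

    // A number after "weighs" or "weight is", with an optional unit, e.g. "200 pounds" or "200".
    private static final Pattern WEIGHT = Pattern.compile(
            "(?i)\\b(?:weighs|weight\\s+is)\\s+(\\d+(?:\\.\\d+)?(?:\\s*(?:pounds?|lbs?)\\b)?)");

    // Returns all HEIGHT spans followed by all WEIGHT spans found in the text.
    static List<Annotation> annotate(String text) {
        List<Annotation> result = new ArrayList<>();
        collect("HEIGHT", HEIGHT, text, result);
        collect("WEIGHT", WEIGHT, text, result);
        return result;
    }

    private static void collect(String type, Pattern pattern, String text, List<Annotation> out) {
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            // Group 1 is the measurement itself, without the trigger word.
            out.add(new Annotation(type, m.start(1), m.end(1), m.group(1)));
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "He is 6 feet tall and weighs 200 pounds",
            "His height is 6 feet and weight is 200"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + annotate(s));
        }
    }
}
```

The obvious limitation is that every new phrasing needs a new rule, which is exactly why a trainable NER model is the better fit once you have some labeled sentences.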

Upvotes: 0

Stompchicken

Reputation: 15931

If you really want to use machine learning to train your annotator, then GATE is probably your best bet. Take a look at the chapter on machine learning in their user guide.

Upvotes: 3

Sujith Surendranathan

Reputation: 2579

Since you asked for pointers: LingPipe (already mentioned above), OpenNLP, and Stanford NLP distributions.

Note: if Python is an option, you can use the Natural Language Toolkit (NLTK).

Upvotes: 5
