java nlp information-retrieval stanford-nlp opennlp

Ever Think

Reputation: 813

How to tag field specific nouns using Parts-of-Speech Taggers?

I want to tag field-specific nouns, such as technical and scientific terms, in a sentence using a part-of-speech tagging technique.

Example

Consider the sentences:

1) Computers need keyboard, monitor, CPU to work.
2) Automobile uses gears and clutch.

My objective is to have the example sentences tagged as follows:

1st sentence

Computers/technical
need/verb
keyboard/technical
CPU/technical
to/preposition
work/verb

2nd sentence

Automobile/mechanical
uses/verb
gears/mechanical
and/conjunction
clutch/mechanical

My need
I want to implement the above objective in Java, that is, to tag nouns by their related field, such as technical, mechanical, electrical, etc.

My previous work
I have already tried Stanford NLP and OpenNLP, but they only assign POS tags, which does not satisfy what I need.

How can I do this?

Upvotes: 0

Answers (2)

Chthonic Project

Reputation: 8366

Named entity recognition (NER) locates entities in text and classifies them into predefined categories (e.g. motherboard --> technical, RAM --> technical, random access memory --> technical). NER systems typically use linguistic grammar-based methods and statistical methods. I doubt you will need to get into the details of these methods for your task. If you do get interested, feel free to read up on conditional random fields.

As far as I can see, all you need is to be able to train your own NER with your categories (i.e. technical, mechanical, etc.). The Stanford NER FAQ page provides adequate information on how to do this.
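As a rough illustration of what such training data might look like: the Stanford NER trainer reads a tab-separated file with one token per line followed by its label, where `O` marks tokens outside any category. The Java sketch below only formats hand-labelled sentences into that layout. The class name `NerTrainingData`, the labels `TECHNICAL` and `MECHANICAL`, and the blank line used as a sentence separator are assumptions made for this example.

```java
class NerTrainingData {

    /**
     * Formats one hand-labelled sentence as "token<TAB>label" lines.
     * tokens and labels are parallel arrays; "O" means "no category".
     */
    public static String toTsv(String[] tokens, String[] labels) {
        if (tokens.length != labels.length) {
            throw new IllegalArgumentException("tokens and labels must have the same length");
        }
        StringBuilder tsv = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            tsv.append(tokens[i]).append('\t').append(labels[i]).append('\n');
        }
        // Assumption: a blank line is used here to separate sentences.
        tsv.append('\n');
        return tsv.toString();
    }

    public static void main(String[] args) {
        // Hypothetical labels chosen for the question's two example sentences.
        String first = toTsv(
                new String[] {"Computers", "need", "keyboard", ",", "monitor", ",", "CPU", "to", "work", "."},
                new String[] {"TECHNICAL", "O", "TECHNICAL", "O", "TECHNICAL", "O", "TECHNICAL", "O", "O", "O"});
        String second = toTsv(
                new String[] {"Automobile", "uses", "gears", "and", "clutch", "."},
                new String[] {"MECHANICAL", "O", "MECHANICAL", "O", "MECHANICAL", "O"});
        System.out.print(first + second);
    }
}
```

The resulting text could be saved to a file and referenced from the `trainFile` property described in the FAQ. A usable model would need far more labelled sentences than these two.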

For an intuitive understanding of how the final system will work, you can take a look at the online demo of the Stanford NER. They provide English, Chinese and German classifiers. There are three English classifiers, trained on 3, 4 and 7 categories respectively; try them out and see for yourself.

I've tried to be as succinct as possible. A detailed introduction to NER is not possible on SO. I hope my answer, together with the links provided, helps your task.

Upvotes: 1

Mark Giaconia

Reputation: 3953

Interesting problem; here are a few thoughts. Since you need the parts of speech, use a part-of-speech tagger such as the one in OpenNLP, which will give you the POS tags you need. The second part, classifying certain words, is a bit trickier. If the set of words that map to each category is limited, you could simply use a lookup list. This is sometimes the simplest and most accurate approach, since an NER model will give you some noise. If not, you can do what was already suggested, which is to train an NER model.
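The lookup-list idea could be sketched as follows. This is a minimal sketch, assuming the tokens and Penn Treebank style POS tags have already been produced by a tagger such as OpenNLP's. The class `DomainLookupTagger`, its lexicon entries, and the rule of relabelling only tokens whose tag starts with `NN` are hypothetical choices for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class DomainLookupTagger {

    // Hand-built lexicon: lowercased word -> field label (hypothetical entries).
    private final Map<String, String> lexicon = new HashMap<>();

    public DomainLookupTagger(Map<String, String> entries) {
        for (Map.Entry<String, String> entry : entries.entrySet()) {
            lexicon.put(entry.getKey().toLowerCase(Locale.ROOT), entry.getValue());
        }
    }

    /**
     * tokens and posTags are parallel arrays. Nouns found in the lexicon
     * get their field label; every other token keeps its POS tag.
     */
    public List<String> tag(String[] tokens, String[] posTags) {
        if (tokens.length != posTags.length) {
            throw new IllegalArgumentException("tokens and posTags must have the same length");
        }
        List<String> tagged = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            String field = lexicon.get(tokens[i].toLowerCase(Locale.ROOT));
            // Penn Treebank noun tags (NN, NNS, NNP, NNPS) all start with "NN".
            boolean isNoun = posTags[i].startsWith("NN");
            tagged.add(tokens[i] + "/" + (isNoun && field != null ? field : posTags[i]));
        }
        return tagged;
    }

    public static void main(String[] args) {
        DomainLookupTagger tagger = new DomainLookupTagger(Map.of(
                "automobile", "mechanical",
                "gears", "mechanical",
                "clutch", "mechanical"));
        // POS tags are written by hand here; in practice they would come from a tagger.
        String[] tokens = {"Automobile", "uses", "gears", "and", "clutch"};
        String[] posTags = {"NN", "VBZ", "NNS", "CC", "NN"};
        System.out.println(tagger.tag(tokens, posTags));
    }
}
```

For the second example sentence this prints `[Automobile/mechanical, uses/VBZ, gears/mechanical, and/CC, clutch/mechanical]`. The lookup is an exact match on the lowercased token, so inflected forms such as plurals must either be listed explicitly or normalised first.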

Upvotes: 1
