Vignesh

Reputation: 952

NLP: creating a model out of POS tags

I am trying to create a knowledge base using text mining. I am using the GENIA corpus to tag words with their parts of speech. Given two terms from the text, how do I create a model that finds the relation between them?

Example text:

HIF1A gene is involved in Hypoxic regulation. Hypoxia also up regulates BRCA1 gene expression which is mainly associated in breast cancer.

I already have the POS tags for this text:

Word        Base Form   Part-Of-Speech
HIF1A       HIF1A       NN
gene        gene        NN
is          be          VBZ
involved    involve     VBN
in          in          IN
Hypoxic     Hypoxic     JJ
regulation  regulation  NN
.           .           .
Hypoxia     Hypoxia     NN
also        also        RB
regulates   regulate    VBZ
BRCA1       BRCA1       NN
gene        gene        NN
which       which       WDT
is          be          VBZ
mainly      mainly      RB
associated  associate   VBN
in          in          IN
breast      breast      NN
cancer      cancer      NN

I am writing a web interface that, when queried with BRCA1 and Hypoxia, should report that there is positive regulation between them. When queried with HIF1A and Hypoxia, it should likewise report a positive regulation, based on these sentences.

Now that I have the POS tags, I don't know how to proceed in creating a model that identifies the relation between two terms. This is just an example; I want to do this for general biomedical terms and texts.

Does anyone have any suggestions?

Upvotes: 1

Views: 100

Answers (1)

Pierre

Reputation: 1246

If you rely solely on the output of a POS tagger, you'll have to define local grammar rules (patterns) over the tag sequences.
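To give an idea of what such a local rule could look like, here is a minimal sketch in Python. It assumes the tokens are available as (word, base form, POS) triples, as in the question, and applies a single hand-written pattern: a noun run, optional adverbs, a finite verb other than "be", then another noun run. The function names `noun_phrase` and `extract_relations` and the pattern itself are only illustrative assumptions, not part of any library.

```python
# Tokens as (word, base form, POS) triples, copied from the question's tag table
# for the second sentence.
TAGGED = [
    ("Hypoxia", "Hypoxia", "NN"),
    ("also", "also", "RB"),
    ("regulates", "regulate", "VBZ"),
    ("BRCA1", "BRCA1", "NN"),
    ("gene", "gene", "NN"),
    ("which", "which", "WDT"),
    ("is", "be", "VBZ"),
    ("mainly", "mainly", "RB"),
    ("associated", "associate", "VBN"),
    ("in", "in", "IN"),
    ("breast", "breast", "NN"),
    ("cancer", "cancer", "NN"),
]

FINITE_VERB_TAGS = {"VBZ", "VBP", "VBD"}


def noun_phrase(tokens, start):
    """Return (text, end) for the run of NN* tokens beginning at `start`.

    The text is empty if the token at `start` is not a noun.
    """
    end = start
    while end < len(tokens) and tokens[end][2].startswith("NN"):
        end += 1
    return " ".join(t[0] for t in tokens[start:end]), end


def extract_relations(tokens):
    """Apply one hypothetical rule: NN+ RB* VERB NN+ -> (verb lemma, subject, object)."""
    relations = []
    i = 0
    while i < len(tokens):
        subj, j = noun_phrase(tokens, i)
        if not subj:
            i += 1
            continue
        # Skip adverbs between the subject and the verb.
        k = j
        while k < len(tokens) and tokens[k][2] == "RB":
            k += 1
        # Require a finite verb that is not the copula/auxiliary "be".
        if k < len(tokens) and tokens[k][2] in FINITE_VERB_TAGS and tokens[k][1] != "be":
            obj, _ = noun_phrase(tokens, k + 1)
            if obj:
                relations.append((tokens[k][1], subj, obj))
        i = j
    return relations


print(extract_relations(TAGGED))
```

A single rule like this only covers simple active sentences. Passives such as "is involved in" or relative clauses would each need their own pattern, which is the main drawback of working from POS tags alone.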

Personally, I would suggest using a (syntactic) parser to get argument structures like regulate(Hypoxia,BRCA1)...
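As a rough sketch of how argument structures can be read off a parse, the Python snippet below walks a dependency analysis and pairs each verb's subject with its direct object. The parse here is written by hand for illustration and is not the output of any particular parser; the `Token` fields, the labels `nsubj`, `dobj` and `compound`, and the helper names are all assumptions that you would adapt to whatever your parser actually produces.

```python
from typing import NamedTuple


class Token(NamedTuple):
    idx: int      # position in the sentence
    word: str
    lemma: str
    head: int     # index of the syntactic head, -1 for the root
    deprel: str   # dependency label (assumed label set)


# Hand-written, hypothetical parse of "Hypoxia also regulates BRCA1 gene".
PARSE = [
    Token(0, "Hypoxia", "Hypoxia", 2, "nsubj"),
    Token(1, "also", "also", 2, "advmod"),
    Token(2, "regulates", "regulate", -1, "root"),
    Token(3, "BRCA1", "BRCA1", 4, "compound"),
    Token(4, "gene", "gene", 2, "dobj"),
]


def phrase(parse, head):
    """Join a head token with its 'compound' modifiers, in sentence order."""
    parts = [
        t for t in parse
        if t.idx == head.idx or (t.head == head.idx and t.deprel == "compound")
    ]
    return " ".join(t.word for t in sorted(parts, key=lambda t: t.idx))


def argument_structures(parse):
    """Return strings like 'regulate(subject,object)' for each subject/object pair."""
    structures = []
    for verb in parse:
        dependents = [t for t in parse if t.head == verb.idx]
        subjects = [t for t in dependents if t.deprel == "nsubj"]
        objects = [t for t in dependents if t.deprel == "dobj"]
        for subj in subjects:
            for obj in objects:
                structures.append(
                    f"{verb.lemma}({phrase(parse, subj)},{phrase(parse, obj)})"
                )
    return structures


print(argument_structures(PARSE))
```

Because the object's head noun here is "gene", the sketch yields "BRCA1 gene" as the second argument; mapping that to the bare term BRCA1 would need a separate term normalization step.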

Upvotes: 2
