Reputation: 61
I am using the GATE tool for natural language processing. I use Java code to read lines from a sentence and extract the keywords. What modification is needed in creole.xml to read a complete paragraph?
Upvotes: 1
Views: 1498
Reputation: 822
The following code worked for me:
import gate.Annotation;
import gate.AnnotationSet;
import gate.FeatureMap;
import gate.GateConstants;
import gate.corpora.TextualDocumentFormat;
import gate.util.DocumentFormatException;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// gateDoc is an already-loaded gate.Document.
FeatureMap features = gateDoc.getFeatures();
// Note: this feature may be null if the original content was not preserved on load.
String originalContent = (String) features.get(GateConstants.ORIGINAL_DOCUMENT_CONTENT_FEATURE_NAME);
int length = originalContent.length();

// Create "paragraph" annotations over the whole document.
TextualDocumentFormat tdf = new TextualDocumentFormat();
try {
    tdf.annotateParagraphs(gateDoc, 0, length, null);
} catch (DocumentFormatException e) {
    e.printStackTrace();
}

AnnotationSet paragraphs = gateDoc.getAnnotations().get("paragraph");
// An AnnotationSet is unordered, so sort by start offset before printing.
// (A standard-library sort is used here in place of the SortedAnnotationList helper.)
List<Annotation> sortedParagraphs = new ArrayList<>(paragraphs);
sortedParagraphs.sort(Comparator.comparing((Annotation a) -> a.getStartNode().getOffset()));

System.out.println("Number of Paragraphs - " + sortedParagraphs.size());
for (Annotation paragraph : sortedParagraphs) {
    int start = paragraph.getStartNode().getOffset().intValue();
    int end = paragraph.getEndNode().getOffset().intValue();
    System.out.println(originalContent.substring(start, end));
}
Upvotes: 0
Reputation: 169
You can use
doc.getNamedAnnotationSets().get("Original markups")
If it doesn't give any results, you can use the method annotateParagraphs() of the class gate.corpora.TextualDocumentFormat.
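To make the idea of paragraph annotations concrete, here is a minimal plain-Java sketch of splitting a text into paragraph spans with start and end offsets, similar in spirit to the offsets carried by "paragraph" annotations. This is not GATE's implementation: the class name ParagraphOffsets, the Span type, and the rule that a paragraph break is one or more blank lines are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ParagraphOffsets {

    /** Hypothetical span type: [start, end) character offsets into the text. */
    public static final class Span {
        public final int start;
        public final int end;

        public Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // Assumption: a paragraph break is two or more consecutive line breaks,
    // optionally with spaces or tabs on the otherwise empty lines.
    private static final Pattern BREAK = Pattern.compile("(\\r?\\n[ \\t]*){2,}");

    /** Returns the paragraph spans of the text in document order. */
    public static List<Span> paragraphSpans(String text) {
        List<Span> spans = new ArrayList<>();
        Matcher m = BREAK.matcher(text);
        int start = 0;
        while (m.find()) {
            // Skip empty spans, e.g. when the text starts with blank lines.
            if (m.start() > start) {
                spans.add(new Span(start, m.start()));
            }
            start = m.end();
        }
        // Whatever follows the last break is the final paragraph.
        if (start < text.length()) {
            spans.add(new Span(start, text.length()));
        }
        return spans;
    }

    public static void main(String[] args) {
        String text = "What is your name?\n\nSome text here.\nStill the same paragraph.";
        for (Span s : paragraphSpans(text)) {
            System.out.println("[" + s.start + "," + s.end + ") "
                    + text.substring(s.start, s.end));
        }
    }
}
```

Each span can then be passed to substring() on the original text, in the same way the offsets of a GATE annotation's start and end nodes are used.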
Upvotes: 2
Reputation: 6480
I am not sure what you mean, but if you use ANNIE you can put each paragraph in a separate tag. I used StandAloneAnnie.java:
http://gate.ac.uk/wiki/code-repository/src/sheffield/examples/StandAloneAnnie.java
If the user enters
What is your name, ,some text sometext Sometext sometext sometext
The result will be
<paragraph>What is your name, ,some text sometext</paragraph>
<paragraph>Sometext sometext sometext</paragraph>
You can get more tags, such as Person, Location, Sentence, or Token for each word.
For example, if the user enters
Where To Dine In Kuala Lumpur. Helton Hotel
The result will be an XML file that contains
<paragraph>
<Sentence>
<Token>Where</Token>
<Token>To</Token>
<Token>
<Unknown>Dine</Unknown>
</Token>
<Token>In</Token>
<Lookup>
<Location>
<Token>Kuala</Token>
<Token>
<Lookup>Lumpur</Lookup>
</Token>
</Location>
</Lookup>
<Token>
<Split>.</Split>
</Token>
</Sentence>
<Sentence>
<Organization>
<Token>Helton</Token>
<Token>
<Lookup>
<Lookup>Hotel</Lookup>
</Lookup>
</Token>
</Organization>
</Sentence>
</paragraph>
I am currently trying to get synonyms but have not managed to. I want the result to include related terms; for the sentence above, I want it to have Dine -> Dinner, Food, Eat, Restaurant.
Upvotes: 0