Reputation: 19
I have an upcoming requirement: we need to process a free-text description that arrives in a transaction and break it down into predefined categories.
The description is the text of a doctor's prescription.
For example: "Take 1 pill every morning for 30 days" or "take 1 capsule twice a day for two weeks".
These descriptions have to be broken down into categories such as days, duration, repetition, type of drug, and way of taking.
I am trying to use Apache OpenNLP.
Please suggest how to approach this problem, since the solution has to be as accurate as possible.
Upvotes: 1
Views: 149
Reputation: 821
What you want to do is called Information Extraction in computational linguistics. You can consult this page for starters.
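To make the idea concrete, here is a minimal rule-based sketch of information extraction in Python, using only the standard `re` module. The pattern, the slot names (`dose`, `form`, `frequency`, `duration_value`, `duration_unit`) and the `extract` helper are my own assumptions for illustration; they cover just the two example sentences from the question, and real prescriptions will need many more rules or a trained model.

```python
import re

# Hypothetical pattern written for the two example sentences only;
# the slot names are illustrative, not taken from any library.
PATTERN = re.compile(
    r"take\s+(?P<dose>\d+)\s+(?P<form>pill|capsule|tablet)s?\s+"
    r"(?P<frequency>every\s+\w+|once\s+a\s+day|twice\s+a\s+day)\s+"
    r"for\s+(?P<duration_value>\w+)\s+(?P<duration_unit>day|week|month)s?",
    re.IGNORECASE,
)


def extract(text):
    """Return a dict of slots, or None if the sentence does not match."""
    match = PATTERN.search(text)
    return match.groupdict() if match else None


print(extract("Take 1 pill every morning for 30 days"))
print(extract("take 1 capsule twice a day for two weeks"))
```

Returning `None` for anything the pattern does not recognise lets you route unmatched prescriptions to manual review instead of guessing, which matters if accuracy is the priority.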
Upvotes: 2
Reputation: 5017
Please check out cTAKES, an open source project. It does the same thing you want.
You can use a finite state machine for this purpose.
Refer to this guide to set up the cTAKES project.
Also refer to this Javadoc for the frequency unit of a drug.
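As a rough illustration of the finite state machine idea, here is a small hand-written sketch in Python. It is not how cTAKES implements it; the state names (`START`, `DOSE`, `FORM`, `FREQUENCY`, `DURATION`), the slot names and the `parse_prescription` function are all assumptions of mine, and it only handles sentences shaped like "take <dose> <form> <frequency> for <duration>".

```python
def parse_prescription(text):
    """Walk the tokens through a tiny hand-written state machine.

    Hypothetical states and slots, for illustration only.
    Returns a dict of slots, or None if no duration was reached.
    """
    state = "START"
    slots = {"dose": None, "form": None, "frequency": [], "duration": []}

    for token in text.lower().split():
        if state == "START":
            # Skip everything until the trigger word "take".
            if token == "take":
                state = "DOSE"
        elif state == "DOSE":
            slots["dose"] = token
            state = "FORM"
        elif state == "FORM":
            slots["form"] = token
            state = "FREQUENCY"
        elif state == "FREQUENCY":
            # "for" marks the transition from frequency to duration.
            if token == "for":
                state = "DURATION"
            else:
                slots["frequency"].append(token)
        elif state == "DURATION":
            slots["duration"].append(token)

    if state != "DURATION" or not slots["duration"]:
        return None

    slots["frequency"] = " ".join(slots["frequency"])
    slots["duration"] = " ".join(slots["duration"])
    return slots


print(parse_prescription("take 1 capsule twice a day for two weeks"))
```

The advantage over one big regular expression is that each transition is explicit, so adding a new state (for example, a route of administration) does not require rewriting the rest.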
Upvotes: 1
Reputation: 6039
Use the Illinois quantities package for standardizing numerical values: http://cogcomp.cs.illinois.edu/demo/quantities/index.php
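To show what standardizing means here, this is a small stand-alone Python sketch that turns "30 days" and "two weeks" into a number of days. It does not use the Illinois package or mimic its API; the lookup tables and the `duration_in_days` function are my own assumptions, and treating a month as 30 days is an approximation.

```python
# Hypothetical lookup tables for illustration; extend as needed.
NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}
# Assumption: a month is approximated as 30 days.
DAYS_PER_UNIT = {"day": 1, "week": 7, "month": 30}


def duration_in_days(value, unit):
    """Convert a value ("30" or "two") and a unit ("days", "week") to days.

    Raises KeyError for number words or units missing from the tables.
    """
    value = value.lower()
    number = int(value) if value.isdigit() else NUMBER_WORDS[value]
    unit = unit.lower().rstrip("s")  # "weeks" -> "week"
    return number * DAYS_PER_UNIT[unit]


print(duration_in_days("30", "days"))
print(duration_in_days("two", "weeks"))
```

Once both example prescriptions are expressed in the same unit, they can be compared or validated directly.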
Upvotes: 2