Reputation: 4720
At first this problem seems trivial: given two ontologies, which term in ontology A best refers to a term in ontology B. But its simplicity is deceptive: this problem is extremely hard and has currently lead to thousands of academic publications, without any consensus on how to solve this problem.
Naively, one would expect that simply looking at the term "Heart Attack" in both ontologies would suffice. However, ontologies almost never encode the same phrase. In simple cases "Heart Attack" might be coded as "Heart Attacks", or "Heart attack (non-fatal)", but in more complicated cases it might only be coded as "Myocardial infarction". In other cases it is even more complicated, for example dealing with compound (composed) terms.
More importantly, simply matching the term (or string) ignores the "ontological structure".
What if "Heart Attack" in ontology A is coded as caused-by
high blood pressure, whereas in ontology B it might be coded as withdrawl-from-trial-non-fatal
.
In this case it might be valid to match the two terms, but not trivially so.
And this assumes the equivalent term exists at all.
It's a classical problem called Semantic/Ontology Matching, Alignment, or Harmonization. The research out there involves lexical similarity, term usage in free text, graph homomorphisms, curated mappings (like MeSH/WordNet), topic modeling, and logical inference (first- or higher-order logic). But which is the most user friendly and production ready solution, that can be integrated into a Java(/Clojure) or Python app? I've looked at Ontology matching: A literature review but they don't seem to recommend anything ... any suggestions or experiences?
Upvotes: 0
Views: 80
Reputation: 311
Have a look at http://oaei.ontologymatching.org/2014/results/ . There were several tracks open for matchers to be sent in and be evaluated. Not every matcher participates in every track. So you might want to read the track descriptions and pick one that seems to be the most similar to your problem. For example if you don't have to deal with multiple languages you probably don't have to check the MultiFarm track. After that check the results by having a look at Recall
, Precision
and F-Measure
and decide for yourself. You also might want to check out some earlier years.
Upvotes: 1