callow

Reputation: 517

Searching semantically tagged documents in MarkLogic

Can anyone please point me to some simple examples of semantic tagging and of querying semantically tagged documents in MarkLogic?

I am fairly new to this area, so some beginner-level examples will do.

Upvotes: 1

Views: 311

Answers (2)

grtjn

Reputation: 20414

In case you are talking about enriching your content using semantic technology, that is not directly provided by MarkLogic.

You can enrich your content externally, for instance by calling a public service like the one provided by OpenCalais, and then add the enrichments to the content before inserting it.

You can also build lists of lookup values, and then use cts:highlight to mark such terms within your content. That could be as simple as:

let $labels := ("MarkLogic", "StackOverflow")
return
  cts:highlight($doc, cts:word-query($labels), <b>{$cts:text}</b>)

Or with a more dynamic replacement using SPARQL:

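xquery version "1.0-ml";
import module namespace sem = "http://marklogic.com/semantics"
  at "/MarkLogic/semantics.xqy";
(: Assumption: the demo: element prefix is bound to the ontology namespace. :)
declare namespace demo = "http://www.marklogic.com/ontologies/demo#";

(: $doc is assumed to hold the node to enrich. :)
(: Map every known label of a person to its type. :)
let $labels := map:new()
let $_ :=
  for $result in sem:sparql('
    PREFIX demo: <http://www.marklogic.com/ontologies/demo#>

    SELECT DISTINCT ?label
    WHERE {
      ?s a demo:person.
      {
        ?s demo:fullName ?label
      } UNION {
        ?s demo:initialsName ?label
      } UNION {
        ?s demo:email ?label
      }
    }
  ')
  return
    map:put($labels, map:get($result, 'label'), 'person')
return
  cts:highlight($doc, cts:word-query(map:keys($labels)),
    (: Only run the lookup when the matched text is a known label. :)
    if (map:contains($labels, $cts:text))
    then
      let $type := map:get($labels, $cts:text)
      (: Find the subject and predicate that carry this label; keep the first hit. :)
      let $result := sem:sparql(concat('
        PREFIX demo: <http://www.marklogic.com/ontologies/demo#>

        SELECT DISTINCT ?s ?p
        {
          ?s a demo:', $type, ' .
          ?s ?p "', $cts:text, '" .
        }
      '))[1]
      return
        element { xs:QName(fn:concat("demo:", $type)) } {
          attribute subject { map:get($result, 's') },
          attribute predicate { map:get($result, 'p') },
          $cts:text
        }
    (: Leave any other matched text as it was. :)
    else $cts:text
  )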

HTH!

Upvotes: 3

mblakele

Reputation: 7840

When you say "semantically tagged" do you mean regular XML documents that happen to have some triples in them? The discussion and examples at http://docs.marklogic.com/guide/semantics/embedded are pretty good for that.

Start by enabling the triple index in your database. Then insert a test doc. This is just XML, but the sem:triple element represents a semantic fact.

xdmp:document-insert(
  'test.xml',
  <test>
    <source>AP Newswire</source>
    <sem:triple date="1972-02-21" confidence="100">
      <sem:subject>http://example.org/news/Nixon</sem:subject>
      <sem:predicate>http://example.org/wentTo</sem:predicate>
      <sem:object>China</sem:object>
    </sem:triple>
  </test>)
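
The first step above, enabling the triple index, can also be scripted instead of done through the Admin UI. This is only a minimal sketch using the Admin API; the database name "Documents" is an assumption, so substitute your own, and expect the change to trigger reindexing of existing content.

```xquery
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: "Documents" is an assumed database name; replace it with yours. :)
let $db := xdmp:database("Documents")
(: Load the current configuration and switch the triple index on. :)
let $config := admin:get-configuration()
let $config := admin:database-set-triple-index($config, $db, fn:true())
(: Persist the modified configuration. :)
return admin:save-configuration($config)
```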

Then query it. The example query is fairly complicated, and its cts:path-range-query terms only work once path range indexes exist for //sem:triple/@confidence and //sem:triple/@date. To understand what's going on, I'd insert variations on that sample document, using different URIs instead of just test.xml, and see how the various query terms match up. Try using just the SPARQL component, without the extra cts query. Try cts:search with no SPARQL, just the cts:query.

xquery version "1.0-ml";
import module namespace sem = "http://marklogic.com/semantics" 
  at "/MarkLogic/semantics.xqy";
sem:sparql('
  SELECT ?country
  WHERE {
    <http://example.org/news/Nixon> <http://example.org/wentTo> ?country
  }
  ',
  (),
  (),
  cts:and-query((
    cts:path-range-query("//sem:triple/@confidence", ">", 80),
    cts:path-range-query("//sem:triple/@date", "<", xs:date("1974-01-01")),
    cts:or-query((
      cts:element-value-query(xs:QName("source"), "AP Newswire"),
      cts:element-value-query(xs:QName("source"), "BBC"))))))
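
The two simplified experiments suggested above might look roughly like this. It is a sketch rather than anything from the documentation: it returns the results of both variations as one sequence, and the cts:search part uses only the element-value query on source (an assumption made so that it needs no path range indexes).

```xquery
xquery version "1.0-ml";
import module namespace sem = "http://marklogic.com/semantics"
  at "/MarkLogic/semantics.xqy";

(
  (: 1. SPARQL only: no cts:query restricting which documents contribute triples. :)
  sem:sparql('
    SELECT ?country
    WHERE {
      <http://example.org/news/Nixon> <http://example.org/wentTo> ?country
    }
  '),

  (: 2. cts:search only: returns the matching documents, ignoring the triples. :)
  cts:search(
    fn:doc(),
    cts:element-value-query(xs:QName("source"), "AP Newswire"))
)
```

Comparing the two results against the combined query shows which part of it excludes or includes a given test document.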

Upvotes: 3
