Reputation: 2852
<Docs>
<Doc>
<Title>Electromagnetic Fields</Title>
<Info>
<Vol name="Physics"/>
<Year>2006</Year>
</Info>
<SD>
<Info>
<Para>blah blah blah.<P>blh blah blah.</P></Para>
</Info>
</SD>
<LD>
<Info>
<Para>blah blah blah.<P>blah blah blah.</P></Para>
<Para>blah blah blah.<P>blah blah blah.</P></Para>
<Para>blah blah blah.<P>emf waves blah.</P></Para>
<Para>blah blah blah.<B>emf waves</B> blah.</Para>
<Para>blah blah blah.<P>emf waves blah.</P></Para>
<Para>blah waves blah.<B>emf</B> waves blah.</Para>
<Para>emf blah blah.<I>waves blah.</I></Para>
<Para>blah blah blah.<B>emf waves</B> blah.</Para>
<Para>blah blah blah.<P><I>emf</I> waves blah.</P></Para>
</Info>
</LD>
</Doc>
</Docs>
Query 1 -
for $x in ft:search("Article", ("emf","waves"), map{'mode':='all words'})/ancestor::*:Doc
return $x/Title
I am getting 62 Hits
Query 2 -
for $x in ft:search("Article", ("emf","waves"), map{'mode':='all words'})
return $x/ancestor::*:Doc/Title
I am getting 159 Hits
Query 3 -
for $x in doc("Article")/Doc[Info[Vol/@name="Physics" and Year ge "2006" and Year le "2010"]]
[SD/Info/Para/text() contains text {"emf","waves"} all words or
SD/Info/Para/P/text() contains text {"emf","waves"} all words or
LD/Info/Para/text() contains text {"emf","waves"} all words or
SD/Info/Para/P/text() contains text {"emf","waves"} all words or
SD/Info/Para/P/B/text() contains text {"emf","waves"} all words or
SD/Info/Para/P/I/text() contains text {"emf","waves"} all words or
SD/Info/Para/P/U/text() contains text {"emf","waves"} all words]
return $x/Title
This results in 224 hits. In the third query, I am listing all the nodes which are actually present. The I, B and U elements mark italic, bold and underlined text.
Why this difference?
Upvotes: 0
Views: 133
Reputation: 25034
Your first query searches for Doc elements which have a certain property, and returns one result for each such Doc element.
Your second query searches for nodes of any kind which have a (related) property, and returns one result for each such node.
Your third query searches for text nodes which have another (related) property.
Whenever there are Doc elements containing more than one node matching the full-text search criterion, the first and second queries will return different numbers of hits. And similarly for the third query, vis-a-vis the others.
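To illustrate how much the choice of tested node matters, here is a minimal sketch in standard XQuery Full Text. The sample Para element is made up for illustration and is not taken from the asker's database. It compares a test on the individual text nodes (as in Query 3) with a test on the whole element.

```xquery
(: Hypothetical sample: "emf" sits in a B child, "waves" in the surrounding text. :)
let $para := <Para>blah waves blah. <B>emf</B> waves blah.</Para>
return (
  (: Each text node of Para is tested on its own; none of them holds both words. :)
  $para/text() contains text {"emf", "waves"} all words,
  (: The string value of the element spans the B child, so both words are found. :)
  $para contains text {"emf", "waves"} all words
)
```

The first expression should evaluate to false and the second to true, so the same paragraph counts as a hit or not depending on which nodes the predicate addresses.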
Upvotes: 1
Reputation: 5256
Queries 1 and 2 look much the same, but the path expression in Q1 yields Doc elements. So if there are multiple matching nodes below a single Doc, that Doc counts just once in Q1, whereas each node is counted individually in Q2. This is because the node sequence resulting from a path expression is, by definition, duplicate-free.
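The counting difference can be reproduced without an index. The following is only a sketch: the inline Doc element is invented sample data, and the $hits variable is a stand-in for whatever nodes ft:search would return, built here with a plain contains text predicate.

```xquery
(: Hypothetical sample document with two matching text nodes. :)
let $doc :=
  <Doc>
    <Title>Electromagnetic Fields</Title>
    <LD>
      <Para>emf waves blah.</Para>
      <Para>blah <B>emf waves</B> blah.</Para>
    </LD>
  </Doc>
(: Stand-in for the index lookup: every text node containing both words. :)
let $hits := $doc//text()[. contains text {"emf", "waves"} all words]
return (
  (: One item per matching text node. :)
  count($hits),
  (: Q1 style: the path step merges both hits into a single Doc node. :)
  count($hits/ancestor::Doc),
  (: Q2 style: the FLWOR returns one Title per hit, duplicates included. :)
  count(for $h in $hits return $h/ancestor::Doc/Title)
)
```

With this sample the three counts should be 2, 1 and 2. Wrapping the Q2-style result in a path expression, or counting distinct Doc nodes, would bring it back in line with Q1.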
Q3 is different: while Q1 and Q2 depend on the properties of the full-text index, Q3 doesn't. If, for example, the index is case-sensitive, you'll get fewer results from it than from a contains text predicate.
So from the quoted counts, I'd assume that the text index comes up with 159 matching nodes in 62 documents, and that it is configured to be more restrictive than a plain contains text expression.
Upvotes: 1