atif
atif

Reputation: 1147

full text search and xml content in sql server 2005

I have a table with xml data type field in sql server 2005, i am storing xml content in that and I have enabled the full text search on it.

The problem i am facing is that the search is not picking the attribute values stored in field. It searches for the text enteries on the node.

Here is my query:

  select * from document_content where FREETEXT (doc_content_xml,'"2010 SKCA 136 para 46"')

In data I have following xml node which contains this text as an attribute but search result showing 0 result.

  <p class="para">
  <number value="2010 SKCA 136 para 46" class="num">
  [46]
 </number>
 </p>

It is searching if the markup is like

  <p class="para">
  <number value="2010 SKCA 136 para 46" class="num">
  2010 SKCA 136 para 46
 </number>
 </p>

Any help will be appreciated.

Upvotes: 2

Views: 744

Answers (1)

Wolf5
Wolf5

Reputation: 17140

As far as I know, those attributes will never be indexed. I am in the same situation as you and nothing I have read has told me otherwise. It is the way SQL fulltext indexing works on the XML datatype. There is something named iFilter that defines how it is indexed, and it can not be overriden. The same goes for other datatypes.

MS SQL server will only index XML values and only attributes that is "top level and is not a number".

You can store the XML as a "FILE" datatype in the SQL and implement your own iFilter dll that you then register in Windows on the SQL server and register the iFilter filetype to match that of the filetype of your "FILE" datatype. This is advanced stuff you have to do in C++.

Unless there is XPath queries being run on the XML, you could always change the type to varchar or any other TEXT types that would be indexed as text.

Maybe adding a metadata column that exists only for searching could be done. A copy of the XML in a varchar(max) column that would be indexed as text.

My XML contains your typical key, value info where the key is stored as an attribute and I can not access it with Contains() as only the value is indexed.

UPDATE: The solution I ended up with is fulltext indexing the XML, then do a 2 pass on the searching. Pass 1 narrows down the result of XMLs by a fulltext search. Pass 2 does the XPath query to end up with a result. This is way faster then only doing XPath.

Upvotes: 1

Related Questions