Reputation: 34188
I went through a short article on how to index data using Lucene.Net, but the meaning of some of the code was not clear to me. The code is:
Document doc = new Document();
doc.Add(new Field("ID", oData.ID.ToString() + "_" + oData.Type, Field.Store.YES, Field.Index.UN_TOKENIZED));
doc.Add(new Field("Title", oData.Title, Field.Store.YES, Field.Index.TOKENIZED));
doc.Add(new Field("Description", oData.Description, Field.Store.YES, Field.Index.TOKENIZED));
doc.Add(new Field("Url", oData.Url, Field.Store.YES, Field.Index.TOKENIZED));
writer.AddDocument(doc);
What does this line mean: doc.Add(new Field("ID", oData.ID.ToString() + "_" + oData.Type, Field.Store.YES, Field.Index.UN_TOKENIZED));
What do Field.Index.UN_TOKENIZED and Field.Index.TOKENIZED mean?
If possible, please explain in detail why Field.Index.UN_TOKENIZED and Field.Index.TOKENIZED matter.
Upvotes: 1
Views: 652
Reputation: 6269
Lucene has deprecated TOKENIZED and UN_TOKENIZED; they are now named ANALYZED and NOT_ANALYZED.
NOT_ANALYZED means that the field's contents will not be run through an analyzer. In effect, the whole value is treated as a single 'term' when searched. As an example of where this is useful, the documentation names unique product IDs (e.g. EANs or UPCs). The ID field in your code is the same kind of value: it is indexed as one exact term.
ANALYZED means that the field's contents will be analyzed and (possibly) broken down into more than one 'term'. The Lucene documentation mentions that this is useful for ordinary text, such as the Title and Description fields in your code. The accepted answer to this question explains some commonly used analyzers very well.
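To make the difference concrete, here is a minimal sketch in plain Java (no Lucene dependency, so it runs on its own). It is only a toy model of the idea, not Lucene's implementation: the analyzed method is a stand-in "analyzer" that lowercases and splits on non-alphanumeric characters, which is an assumption on my part, and real analyzers such as StandardAnalyzer apply their own rules. The class name, method names and the sample value "42_Article" are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class AnalyzedVsNotAnalyzed {

    // NOT_ANALYZED (formerly UN_TOKENIZED): the whole value is kept
    // unchanged and becomes exactly one term.
    static List<String> notAnalyzed(String value) {
        List<String> terms = new ArrayList<>();
        terms.add(value);
        return terms;
    }

    // ANALYZED (formerly TOKENIZED): a stand-in analyzer that lowercases the
    // value and splits it on anything that is not a letter or a digit.
    // This is an illustrative assumption, not how any Lucene analyzer is written.
    static List<String> analyzed(String value) {
        List<String> terms = new ArrayList<>();
        String lower = value.toLowerCase(Locale.ROOT);
        for (String token : lower.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // An ID-like value stays one term when it is not analyzed...
        System.out.println(notAnalyzed("42_Article"));
        // ...but this toy analyzer would break it into two terms.
        System.out.println(analyzed("42_Article"));
        // Ordinary text is broken into several searchable terms.
        System.out.println(analyzed("Indexing data with Lucene.Net"));
    }
}
```

In this sketch, the ID-like value survives as the single term 42_Article only when it is not analyzed; the toy analyzer turns it into 42 and article, so an exact lookup on the original ID would no longer line up with what was indexed. That is the reason identifier fields are usually indexed as NOT_ANALYZED while free text such as a title is indexed as ANALYZED.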
For further reference, see the Lucene.Net and Lucene documentation.
Upvotes: 3