Reputation: 14645
For example, I have several tags per document. I can either index them all as one concatenated value of a single field, or add each tag as a separate value of a multi-valued field:
doc.addField("tags", "tag1");
doc.addField("tags", "tag2");
doc.addField("tags", "tag3");
Both approaches will work. The question is: how will scoring differ between these two ways of indexing? (i.e. field normalization factor, tf/idf count, field length calculation, slope factor, etc.)
Upvotes: 2
Views: 708
Reputation: 12402
Lucene concatenates all the values of a multi-valued field behind the scenes anyway, so it shouldn't be much different from your first case, if at all. If you use tags only as filters (give me all docs tagged with tag2), then you definitely won't see any difference.
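To make the field-length part concrete, here is a rough sketch in plain Java (no Lucene on the classpath). It assumes whitespace tokenization and the classic TF-IDF style length norm of 1 / sqrt(number of terms); the class and method names are made up for illustration, and real Lucene also stores the norm in a lossy encoding, which is ignored here. Under those assumptions both ways of indexing produce the same number of terms, and therefore the same norm.

```java
import java.util.Arrays;
import java.util.List;

public class FieldLengthSketch {

    // Counts whitespace-separated tokens across all values of one field.
    // Assumption: the field length used for normalization is the total
    // token count, regardless of how many values the tokens came from.
    public static int tokenCount(List<String> values) {
        int count = 0;
        for (String value : values) {
            String trimmed = value.trim();
            if (!trimmed.isEmpty()) {
                count += trimmed.split("\\s+").length;
            }
        }
        return count;
    }

    // Classic TF-IDF style length norm: 1 / sqrt(number of terms).
    public static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    public static void main(String[] args) {
        int single = tokenCount(Arrays.asList("tag1 tag2 tag3"));
        int multi = tokenCount(Arrays.asList("tag1", "tag2", "tag3"));
        System.out.println(lengthNorm(single) == lengthNorm(multi)); // prints "true"
    }
}
```

If your similarity or analyzer is configured differently, the numbers will differ, but the point that the values are counted together should still hold.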
Upvotes: 1
Reputation: 804
I would think the multi-valued field would be more accurate.
Imagine a tokenized string "spider web developer"
vs.
a multi-valued field with the values "spider" and "web developer".
A search for "web developer" would match both fields, but the match against the multi-valued field could be seen as more accurate.
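One place where the two really can behave differently is phrase matching across value boundaries. Solr field types can declare a positionIncrementGap, which leaves a gap in token positions between consecutive values of a multi-valued field. The sketch below is plain Java that only models token positions; the gap of 100, whitespace tokenization, exact (zero-slop) phrase matching, and all class and method names are assumptions for illustration, not Lucene's actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PhraseGapSketch {

    // Builds a term -> positions map for one field. Tokens within a value get
    // consecutive positions; each additional value starts `gap` positions later.
    public static Map<String, List<Integer>> index(List<String> values, int gap) {
        Map<String, List<Integer>> positions = new HashMap<>();
        int pos = -1;
        boolean first = true;
        for (String value : values) {
            String trimmed = value.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            if (!first) {
                pos += gap; // gap between values of a multi-valued field
            }
            for (String term : trimmed.split("\\s+")) {
                pos++;
                positions.computeIfAbsent(term, k -> new ArrayList<>()).add(pos);
            }
            first = false;
        }
        return positions;
    }

    // Exact phrase match: every term must sit at consecutive positions.
    public static boolean phraseMatches(Map<String, List<Integer>> index, String... phrase) {
        List<Integer> starts = index.get(phrase[0]);
        if (starts == null) {
            return false;
        }
        for (int start : starts) {
            boolean ok = true;
            for (int i = 1; i < phrase.length && ok; i++) {
                List<Integer> p = index.get(phrase[i]);
                ok = p != null && p.contains(start + i);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> single = index(Arrays.asList("spider web developer"), 100);
        Map<String, List<Integer>> multi = index(Arrays.asList("spider", "web developer"), 100);
        System.out.println(phraseMatches(single, "web", "developer")); // prints "true"
        System.out.println(phraseMatches(multi, "web", "developer"));  // prints "true"
        System.out.println(phraseMatches(single, "spider", "web"));    // prints "true"
        System.out.println(phraseMatches(multi, "spider", "web"));     // prints "false"
    }
}
```

So with a gap configured, "web developer" matches either way, but a phrase like "spider web" that straddles two tags only matches the single concatenated string.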
Upvotes: 0