Reputation: 11
I have a Solr index generated from a catalog of PDF files and corresponding metadata fields describing those files. I would like to give my users the option to exclude from a query any text indexed from within a PDF, so that the results are based on the metadata fields alone and are not biased by the large volume of text inside the PDF files.
I have thought of maybe having two indexes (cores): one with the PDF text indexed and one without.
Is there another way?
Upvotes: 0
Views: 202
Reputation: 81
You can look at field aliases
If you have 3 index fields
Then you can create two field aliases
One advantage of using a field alias over qf: if your users have bookmarks like q=quicksearch:value, you can change which fields the quicksearch alias points to without affecting those bookmarks.
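As a rough sketch of what passing such aliases at query time might look like, assuming the eDisMax parser and its per-field `f.<alias>.qf` override. The field names (`title`, `author`, `pdf_content`) and alias names (`metadata`, `everything`) are hypothetical and not taken from the original answer:

```python
from urllib.parse import urlencode

# Hypothetical index fields; substitute the names from your own schema.
METADATA_FIELDS = ["title", "author"]
PDF_FIELD = "pdf_content"


def alias_params(user_query):
    """Build eDisMax parameters that define two field aliases."""
    return {
        "defType": "edismax",
        "q": user_query,
        # Alias 'metadata' expands to the metadata fields only.
        "f.metadata.qf": " ".join(METADATA_FIELDS),
        # Alias 'everything' also includes the text extracted from the PDFs.
        "f.everything.qf": " ".join(METADATA_FIELDS + [PDF_FIELD]),
    }


def alias_query_string(user_query):
    """URL-encode the parameters so they can be appended to a handler URL."""
    return urlencode(alias_params(user_query))
```

With this in place, a user query such as `metadata:invoice` would only be matched against the metadata fields, while `everything:invoice` would also search the PDF text.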
Upvotes: 0
Reputation: 9789
It sounds like you are doing a general search against a default field, which means you have a lot of copyField instructions (or just one copyField * -> text) that include the PDF content field.
You can create a second destination field and copyField everything except the PDF content field into it as well. This way, users can search against either the original catch-all field or the second combined field.
However, remember that copyField content is analyzed according to the analysis chain of the destination field, not that of the source fields. So eDisMax with a list of source fields may be a better approach here. Also remember that you can define several request handlers (like 'select'), each with different default parameters. That usually makes the client code a bit easier.
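A minimal client-side sketch of the two-destination idea, assuming the standard `df` (default field) parameter. The catch-all `text` comes from the answer above; the second destination `text_meta` and the core URL in the usage note are hypothetical:

```python
from urllib.parse import urlencode


def search_url(base_url, user_query, include_pdf_text=True):
    """Build a select URL that targets one of two catch-all fields."""
    # 'text' receives everything; 'text_meta' (hypothetical name) is
    # assumed to receive every field except the PDF content.
    default_field = "text" if include_pdf_text else "text_meta"
    params = {"q": user_query, "df": default_field}
    return f"{base_url}/select?{urlencode(params)}"
```

For example, `search_url("http://localhost:8983/solr/catalog", "annual report", include_pdf_text=False)` would query the metadata-only field. If the defaults are instead set in separate request handlers, the client would only need to switch the handler path rather than pass `df`.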
Upvotes: 1
Reputation: 896
You do not need to use 2 separate indexes. You can use the edismax parser and specify the qf parameter at query time. That will help determine what fields are searched.
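A small sketch of how a client might toggle `qf` per request, assuming hypothetical field names (`title`, `author`, `pdf_content`) that should be replaced with the ones in your schema:

```python
# Hypothetical field names; adjust to match the actual schema.
METADATA_FIELDS = ["title", "author"]
PDF_FIELD = "pdf_content"


def edismax_params(user_query, include_pdf_text):
    """Return eDisMax parameters whose qf reflects the user's choice."""
    fields = list(METADATA_FIELDS)
    if include_pdf_text:
        # Only search the extracted PDF text when the user opts in.
        fields.append(PDF_FIELD)
    return {
        "defType": "edismax",
        "q": user_query,
        "qf": " ".join(fields),
    }
```

The returned dictionary can be passed as query parameters by whatever HTTP client is in use; the same index serves both kinds of search.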
Upvotes: 0