GoodJuJu

Reputation: 1570

Apache Solr: return only the fields in which the query string was found

I am just getting started with Apache Solr.

I have successfully run through the Apache tutorials and have now created my own collection and indexed my files.

Whilst the documentation is extensive, I cannot find out whether there is a way to query all fields but return only the fields in which the search string/query was found.

For example, suppose I have a file named Weekly Report For Company X.pdf

Associated / indexed meta-data:

"id":"S:\\Weekly Reports\\JAN\\Weekly Report For Company X.PDF",
"date":["2017-11-02T19:14:07Z"],
"pdf_pdfversion":[1.6],
"company":["Microsoft"],
"access_permission_can_print_degraded":[true],
"subject":["weekly report; reports; weekly"],
"contenttypeid":["0x010100F29081EC69D67544A17D8172A093E42E"],
"dc_format":["application/pdf; version=1.6"],

If I query for "Weekly Report" I only want to return the 'id' and 'subject' fields as these are the only fields that contain the actual queried values. If other fields contained the string, I would want them returned too.

I'm leaning towards 'it cannot be done' (but hope I am wrong), as I liken it to a SQL query: the statement has to name the fields to return, and it does not drop fields just because they contain no matching string.

Since I don't know the matched fields before running the query, I cannot use the field list (fl) option at the point of executing the query.

Is this possible?

Upvotes: 0

Views: 1114

Answers (1)

Mysterion

Reputation: 9320

This may not be precisely what you want, but you could mimic similar behaviour with highlighting.

All you need to do is create a dismax query (defType=dismax) with qf listing all the fields that you have. Note that qf is whitespace-separated, e.g. qf=id subject company.

Then you need to request highlighting (hl=true), ask for it on all of those fields (hl.fl=id,subject,company) and set hl.requireFieldMatch=true, which forces Solr to highlight a field only if the query actually matched in that field.

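In this case the response will have a highlighting section, keyed by the ids of the matched documents, that contains only the highlighted contents of the fields that matched. The regular document list is still returned with its usual fields, so the matched field names have to be read from the highlighting section.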
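A minimal Python sketch of the approach above, in case it helps. It only builds the request parameters and reads the matched field names back out of a highlighting section; it does not contact Solr. The host, the collection name (mycollection), the helper names and the sample response are all assumptions made for illustration, not output from a real server.

```python
from urllib.parse import urlencode


def build_highlight_params(query, fields):
    """Build dismax + highlighting parameters for a Solr /select request."""
    return {
        "q": query,
        "defType": "dismax",
        "qf": " ".join(fields),        # qf is whitespace-separated
        "hl": "true",
        "hl.fl": ",".join(fields),     # hl.fl accepts a comma-separated list
        "hl.requireFieldMatch": "true",
        "wt": "json",
    }


def matched_fields(response):
    """Map each document id to the fields that produced at least one snippet."""
    highlighting = response.get("highlighting", {})
    return {
        doc_id: sorted(name for name, snippets in per_field.items() if snippets)
        for doc_id, per_field in highlighting.items()
    }


# Hypothetical host and collection name; adjust to your own setup.
url = "http://localhost:8983/solr/mycollection/select?" + urlencode(
    build_highlight_params("weekly report", ["id", "subject", "company"])
)

# Hand-written sample shaped like the "highlighting" part of a JSON response.
doc_id = r"S:\Weekly Reports\JAN\Weekly Report For Company X.PDF"
sample = {
    "highlighting": {
        doc_id: {
            "subject": ["<em>weekly</em> <em>report</em>; reports; weekly"],
        }
    }
}

print(url)
print(matched_fields(sample))
```

Whether a given field shows up in the highlighting section also depends on how that field is defined in your schema (it has to be stored, and its analysis decides what counts as a match), so treat the list you get back as "fields Solr could highlight" rather than a guaranteed complete list.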

Upvotes: 3
