Reputation: 3184
My understanding is that indexing a PDF, Word, Excel, etc. document through Solr will allow searching but not highlighting. I have this code to perform the indexing:
String urlString = "http://localhost:8983/solr";
SolrServer solr = new HttpSolrServer(urlString);
try {
    for (MultipartFile file : files) {
        // Skip empty upload slots
        if (file.getOriginalFilename().equals("")) {
            continue;
        }
        File destFile = new File(destPath, file.getOriginalFilename());
        file.transferTo(destFile);
        // Build one extract request per file so each document gets its own literal.id
        ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
        up.addFile(destFile);
        up.setParam("literal.id", destFile.getAbsolutePath());
        up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
        try {
            solr.request(up);
        } catch (SolrServerException sse) {
            sse.printStackTrace();
        }
    }
} catch (IOException ioe) {
    ioe.printStackTrace();
}
I have read that in order to enable highlighting I will need to store (or parse) the content. How can this be done? Thanks for your help.
Upvotes: 0
Views: 946
Reputation: 22555
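You will need to modify the schema file for your Solr instance and set stored="true" for the content field. I am assuming that you are using the default field settings for the ExtractingRequestHandler and want to return highlight results against that field. Documents that were indexed before the change will need to be re-indexed, since the stored value is only written at index time.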
Please refer to Field Options By Use Case for a matrix and notes on which field options must be enabled for highlighting and other features to work correctly.
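Once the field is stored, highlighting still has to be requested at query time. Here is a minimal sketch of what such a request could look like, using only the JDK to build the URL. The class name HighlightQueryUrl and the build method are hypothetical, and it assumes the extracted text lands in a field called content and that the default /select handler is in use; adjust both to match your schema and solrconfig. The hl and hl.fl parameters are Solr's standard highlighting switches.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of SolrJ: builds a select URL with highlighting enabled.
class HighlightQueryUrl {

    // solrBase: e.g. http://localhost:8983/solr (assumed, taken from the question)
    // userQuery: the search terms, sent to the default search field
    // field: the stored field to highlight (assumed to be "content")
    static String build(String solrBase, String userQuery, String field) {
        try {
            return solrBase + "/select"
                    + "?q=" + URLEncoder.encode(userQuery, "UTF-8")
                    + "&hl=true"                                      // turn highlighting on
                    + "&hl.fl=" + URLEncoder.encode(field, "UTF-8")   // field(s) to highlight
                    + "&wt=json";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("http://localhost:8983/solr", "invoice total", "content"));
    }
}
```

If the field is stored, the response should carry a separate highlighting section keyed by document id, alongside the normal results. If that section comes back empty, the usual suspects are a field that is still stored="false" or documents that have not been re-indexed since the schema change.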
Upvotes: 2