C. Güzelhan

Reputation: 171

Update the configuration of a field in Solr 6.6.0

I am using Apache Solr 6.6.0 in order to build a search engine by recursively indexing all files in a folder.

Here is how I do it: 1) I create an index based on the cloud example. 2) I index all files in the given folder.

bin\solr start -e cloud -noprompt
java -Dc=gettingstarted -Dauto=yes -Ddata=files -Drecursive=yes -jar example\exampledocs\post.jar <path_to_folder>

Later, when I run a query in the user interface, the response lists the top matches but does not include the document content. After some research, I found a field named "_text_" and its configuration in the managed-schema file:

<field name="_text_" type="text_general" multiValued="true" indexed="true" stored="false"/>

As you can see, the field is not stored, which I think is the reason why the response does not include the content.

Am I on the right track? If so, how can I edit the configuration of this field? Should I delete it and create a new one with the same name and with stored=true?

Thank you.

Upvotes: 0

Views: 95

Answers (1)

Andrea

Reputation: 2764

The _text_ field is not meant to be stored because it is used as a "catch-all" field: other fields are copied into it so that a single field can be searched. So first, check the Solr configuration (the copyField rules in managed-schema) to make sure that _text_ contains only the file content. If it does, you could mark that field as stored.
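
If you do decide to store the field, one way to change it without deleting and re-creating it by hand is the Schema API's replace-field command. The sketch below is only an illustration under stated assumptions: Solr is reachable at localhost:8983 (the default for the first node of the cloud example), the collection is gettingstarted as in the question, and the helper names are made up for this example. A schema change only affects documents indexed afterwards, so the files have to be posted again.

```python
import json
import urllib.request

# Assumed values: default port of the first node in the cloud example and the
# collection name used in the question.
SOLR_BASE = "http://localhost:8983/solr"
COLLECTION = "gettingstarted"


def build_replace_field_command(name, field_type, **attributes):
    """Build the body of a Schema API "replace-field" command."""
    definition = {"name": name, "type": field_type}
    definition.update(attributes)
    return {"replace-field": definition}


def post_schema_command(command, base=SOLR_BASE, collection=COLLECTION):
    """POST a Schema API command to Solr and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{base}/{collection}/schema",
        data=json.dumps(command).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Same definition as in managed-schema, with stored flipped to true.
command = build_replace_field_command(
    "_text_", "text_general", multiValued=True, indexed=True, stored=True
)
print(json.dumps(command, indent=2))

# Uncomment with Solr running, then re-index the files:
# print(post_schema_command(command))
```

The command is built separately from the HTTP call so that the payload can be inspected before anything is sent to Solr.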

Generally speaking, though, file contents are only indexed, not stored, because

  • a GUI that lets the end user see the content usually has some other way to access the file (e.g. static resources like txt files are often published by a separate Apache instance, so from the client's perspective it is just a matter of building an HTTP URL)
  • storing the content increases your index size a lot

In other words: use Solr for search and, once you have the metadata of a given item, use its identifier to look up and "view" the corresponding content in some other system. This is the usual* scenario, especially when dealing with unstructured data like txt files.

* "Usual" doesn't mean it's always valid. There could be cases where you want Solr to return the content itself, or there could be some other good reason to mark a field as stored (e.g. highlighting).
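
The search-then-fetch pattern described above could look roughly like the following sketch. It assumes that each document's id is the absolute path of the indexed file (check a query response to confirm this for your index) and that the same folder is published by a separate web server. The folder C:\data\docs and the host files.example.com are hypothetical placeholders, as are the function names.

```python
import json
import urllib.parse
import urllib.request
from pathlib import PureWindowsPath

# All three values are assumptions made for this sketch.
SOLR_SELECT = "http://localhost:8983/solr/gettingstarted/select"
INDEXED_ROOT = PureWindowsPath(r"C:\data\docs")   # hypothetical indexed folder
CONTENT_BASE = "http://files.example.com/docs/"   # hypothetical file server


def build_select_url(query, rows=10):
    """Build a select URL that asks Solr for document ids only."""
    params = {"q": query, "fl": "id", "rows": rows, "wt": "json"}
    return SOLR_SELECT + "?" + urllib.parse.urlencode(params)


def content_url(doc_id):
    """Map a document id (assumed to be a path under INDEXED_ROOT) to its URL."""
    relative = PureWindowsPath(doc_id).relative_to(INDEXED_ROOT)
    return CONTENT_BASE + urllib.parse.quote(relative.as_posix())


def search_content_urls(query):
    """Query Solr and return one content URL per match (needs a running Solr)."""
    with urllib.request.urlopen(build_select_url(query)) as response:
        docs = json.load(response)["response"]["docs"]
    return [content_url(doc["id"]) for doc in docs]


print(content_url(r"C:\data\docs\notes\my file.txt"))
# prints http://files.example.com/docs/notes/my%20file.txt
```

With this split, Solr only returns identifiers and the GUI fetches the content from the file server, so the index does not have to store the file bodies.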

Upvotes: 1
