chingu

Reputation: 157

Solr "Content" field vs "_text_" field

I am wondering what the difference is between the content field and the _text_ field. I indexed all of my documents/PDFs, but for some reason I could not access the actual text in them. I noticed I had no "content" field, so I created one and am currently reindexing. However, I also noticed that I have a _text_ field with stored=false. Do both of these fields take all the text from the documents/PDFs?

Upvotes: 4

Views: 1829

Answers (1)

Hector Correa

Reputation: 26690

The _text_ field is defined by default on a new Solr core (see https://lucene.apache.org/solr/guide/7_5/schemaless-mode.html).

The default managed-schema file in a new Solr core does not show anything that populates this field, so I suspect it's up to you to populate it.

The _text_ field can be used to hold a copy of all the text in the document, but this is something that you have to set up yourself, either by populating _text_ manually or by using copyFields.
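As a rough sketch of the copyField route, the snippet below builds an "add-copy-field" command for Solr's Schema API and, optionally, POSTs it to a core. The host, the core name "mycore", and the choice of "content" as the source field are assumptions for illustration; it also assumes the "content" field already exists in your schema, as in the question.

```python
import json
import urllib.request

# Assumed values: adjust to your own Solr host and core name.
SOLR_URL = "http://localhost:8983/solr"
CORE = "mycore"  # hypothetical core name


def build_copy_field_command(source="content", dest="_text_"):
    """Build a Schema API payload that copies `source` into `dest` at index time."""
    return {"add-copy-field": {"source": source, "dest": dest}}


def apply_schema_command(command, solr_url=SOLR_URL, core=CORE):
    """POST the command to the core's schema endpoint (needs a running Solr)."""
    request = urllib.request.Request(
        f"{solr_url}/{core}/schema",
        data=json.dumps(command).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the payload; call apply_schema_command() to send it.
    print(json.dumps(build_copy_field_command(), indent=2))
```

A copyField only takes effect for documents indexed after it is added, so existing documents would need to be reindexed.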

The fact that _text_ is indexed but not stored means that you can search for text inside of it (because it's indexed) but you cannot fetch and display its value to the user (because it is not stored).
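To illustrate the indexed-versus-stored distinction, here is a minimal sketch that builds a select URL which searches inside _text_ but asks Solr to return a separate stored field. The host, the core name "mycore", and the stored "content" field are assumptions, and the search term is not escaped for Solr's query syntax.

```python
from urllib.parse import urlencode


def build_select_url(term, solr_url="http://localhost:8983/solr", core="mycore"):
    """Search the indexed-only _text_ field, but return stored fields via fl."""
    params = {
        "q": f"_text_:{term}",   # _text_ is indexed, so it can be searched
        "fl": "id,content",      # only stored fields can be returned
    }
    return f"{solr_url}/{core}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(build_select_url("invoice"))
```

Listing _text_ in fl would return nothing for that field, since its value is not stored.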

Upvotes: 4
