The Bndr

Reputation: 13394

DataImportHandler: docs fetched but not imported

I need to recursively crawl a filesystem to find all XML files and index them. I am running Solr 6.4.

For the first run I used the Solr cloud example with 2 nodes and added a DataImportHandler with the following config:

<dataConfig>
  <dataSource type="FileDataSource"
              encoding="ISO-8859-1" />
  <document>
    <entity
      name="document"
      processor="FileListEntityProcessor"
      baseDir="/path/to/xmldata"
      fileName=".*\.xml$"
      recursive="true"
      rootEntity="false"
      dataSource="null">
      <entity
        name="xpathE"
        processor="XPathEntityProcessor"
        url="${document.fileAbsolutePath}"
        useSolrAddSchema="true"
        stream="true"
        onError="continue">
      </entity>
    </entity>
  </document>
</dataConfig>

After I start the data import, Solr seems to access the filesystem and finishes with the message that 148 documents were fetched. However, not a single document is added to the index.

Here is the importHandler feedback:

{
  "responseHeader": {
    "status": 0,
    "QTime": 0
  },
  "initArgs": [
    "defaults",
    [
      "config",
      "DIHconfigfile.xml"
    ]
  ],
  "command": "status",
  "status": "idle",
  "importResponse": "",
  "statusMessages": {
    "Total Requests made to DataSource": "0",
    "Total Rows Fetched": "148",
    "Total Documents Processed": "0",
    "Total Documents Skipped": "0",
    "Full Dump Started": "2017-02-09 10:53:03",
    "": "Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.",
    "Committed": "2017-02-09 10:53:03",
    "Time taken": "0:0:0.140"
  }
}

Why did Solr not add a single document to the index?

Upvotes: 2

Views: 427

Answers (1)

The Bndr

Reputation: 13394

To answer my own question, in case someone has the same problem:

The problem above occurs if the DIH does not find any matching field in the fetched files. In my case there was no working dynamic schema and also no XPath definition that maps an XML tag to a Solr field, like: <field column="name" xpath="/document/head/Person"/>
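
For illustration, an inner entity with explicit XPath mappings might look like the sketch below. It assumes the files have a <document> root element (inferred from the example path above), that useSolrAddSchema is dropped because the files are not in Solr's <add><doc> format, and that a field called name exists in the schema. The forEach value is a guess at the record element, not something taken from the original files.

```xml
<entity
  name="xpathE"
  processor="XPathEntityProcessor"
  url="${document.fileAbsolutePath}"
  forEach="/document"
  stream="true"
  onError="continue">
  <!-- Assumption: each <document> element yields one Solr document. -->
  <!-- "name" must be a field (or match a dynamic field) in the Solr schema. -->
  <field column="name" xpath="/document/head/Person"/>
</entity>
```

Each additional XML tag that should be indexed needs its own <field> line, with column set to the target Solr field name.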

As long as no Solr field is mandatory in schema.xml, Solr does not log any error. Everything is optional for Solr if no Solr field has the required="true" attribute set.
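
A minimal sketch of a mandatory field definition, assuming a field named name of type string (both are illustrative and not taken from the original schema):

```xml
<!-- Hypothetical field; required="true" makes Solr reject documents that lack it. -->
<field name="name" type="string" indexed="true" stored="true" required="true"/>
```

With at least one required field in place, a mapping problem like this one should show up as a missing-mandatory-field error rather than as a silent import of zero documents.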

Upvotes: 1
