Singam

Reputation: 503

Solr Data Import Handler (DIH) fails to Index all the records from MySQL View

I have a view in my MySQL database and am building a proof of concept of Solr indexing using DIH. A direct SELECT on the view returns 6 records, but a Solr query returns only 4, even though the import status reports that 6 records were fetched.

MySQL view

CREATE VIEW FORUMS_SURVEYS AS
SELECT F.TITLE, F.DESCRIPTION, F.CREATED, FC.TYPE, FC.SUBTYPE
FROM FORUM F
JOIN FORUM_CATEGORY FC ON F.FORUM_CATEGORY_ID = FC.ID
UNION ALL
SELECT S.TITLE, S.DESCRIPTION, S.DATED AS CREATED, "" AS TYPE, "" AS SUBTYPE
FROM SURVEY S;

Select from the View

select * from FORUMS_SURVEYS;

Result - Fetched Rows: 6 (as expected)

Run DIH on Solr with the following

db-data-config.xml

<dataConfig>
    <dataSource driver="com.mysql.cj.jdbc.Driver" url="jdbc:mysql://localhost:3306/bjm" user="<user>" password="<password>" />
    <document>
       <entity name="forums_surveys" query="SELECT * FROM forums_surveys" transformer="HTMLStripTransformer">
            <field column="TITLE" name="title" indexed="true" type="text" />
            <field column="DESCRIPTION" name="description" indexed="true" type="text" stripHTML="true"/>
            <field column="CREATED" name="created" indexed="true" type="text" />
            <field column="TYPE" name="type" indexed="true" type="text" />
            <field column="SUBTYPE" name="subtype" indexed="true" type="text" />
        </entity>
    </document>
</dataConfig>

Result of the DataImport in the Admin UI

Last Update: 16:13:12

(Duration: 01s)
Requests: 1 1/s, Fetched: 6 6/s, Skipped: 0 , Processed: 0 
Started: 42 minutes ago

Again, the status shows "Fetched: 6", as expected.

However, when I query the index from the Admin UI with q=title:*, only 4 documents come back (note numFound in the JSON response below).

"responseHeader":{
    "status":0,
    "QTime":8,
    "params":{
      "q":"title:*",
      "_":"1587565922553"}},
  "response":{"numFound":4,"start":0,"docs":[
//removed the 4 records for brevity
]

Upvotes: 0

Views: 425

Answers (1)

Abhijit Bashetti

Reputation: 8668

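You need a unique field, for example id. Solr uses the unique field to tell documents apart: when two documents carry the same value in that field, the later one overwrites the earlier one. If several rows of your view end up with the same key value, that could explain why 6 rows are fetched but only 4 documents are found.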

The uniqueKey element specifies which field is a unique identifier for documents. Although uniqueKey is not required, it is nearly always warranted by your application design.

For example, uniqueKey should be used if you will ever update a document in the index.

You define the unique key field by naming it in the schema:

<uniqueKey>id</uniqueKey>
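
As a minimal sketch of how this could look for the setup in the question, assume the view is extended with an ID column whose values are distinct across both halves of the UNION ALL (for example, a source-table key with a per-table prefix). That column is hypothetical and not part of the original view. The field name id and the type string are also assumptions, based on what common Solr configsets ship with.

```xml
<!-- Schema (managed-schema or schema.xml): declare the key field and name it as the uniqueKey.
     The name "id" and type "string" are assumptions; adjust to your configset. -->
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<uniqueKey>id</uniqueKey>

<!-- db-data-config.xml: map the hypothetical ID column of the view to the key field. -->
<entity name="forums_surveys" query="SELECT * FROM forums_surveys" transformer="HTMLStripTransformer">
    <field column="ID" name="id" />
    <!-- the TITLE, DESCRIPTION, CREATED, TYPE and SUBTYPE mappings stay as in the question -->
</entity>
```

After a schema change like this, the core would need to be reloaded and a full import run again. If every row then has a distinct ID, numFound for q=*:* should match the number of rows in the view.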

Upvotes: 0
