Adam

Reputation: 388

Solr Cloud - Querying on unique field returns different results

I am running into an issue where the same query to our Solr search returns different values, even though I am querying on the id, which is set as the Unique Key Field.


In the Solr Admin UI I run a query for a single document by its id.

The relevant response info is below.

 "response": {
    "numFound": 1,
    "start": 0,
    "maxScore": 7.4537606,
    "docs": [
      {
        "title": [
          "ICARDA forced to move"
        ],
        "moduleid_s": "58",
        "id": "client1.com.58.1673",
        "enddate_dt": "2015-09-25T23:59:00Z",
        "url": "mysite.com/item.aspx?id=1673",
        "startdate_dt": "2015-09-25T00:00:00Z",

Running that same query a few more times eventually produces a different response.

 "response": {
    "numFound": 1,
    "start": 0,
    "maxScore": 7.453251,
    "docs": [
      {
        "title": [
          "ICARDA forced to move"
        ],
        "moduleid_s": "58",
        "id": "client1.com.58.1673",
        "enddate_dt": "2015-09-25T23:59:00Z",
        "url": "mysiteNewUrl.com/item.aspx?id=1673",
        "startdate_dt": "2015-09-25T00:00:00Z",

Notice that the url is different (and maxScore differs slightly as well).

With Debug Query checked, you can see that the different urls show up in the GET_FIELDS section.

Why/how can I get different information? I'm querying on the id, which is marked as the unique field. From my understanding there should never be more than one document with a given id. Could this be a synchronization issue? I'm using the Solr Admin UI query with a single core selected.

Is there a way to check whether only one document with that id is in the index?

UPDATE:

I ran a facet query on the id field, and the problematic id returns a count of 2

<lst name="facet_fields">
 <lst name="id">
<int name="client1.com.58.1673">2</int>

versus an id that isn't having the issue.

<lst name="facet_fields">
 <lst name="id">
<int name="client1.com.58.163">1</int>

Is this right? Does this explain my issue, in that there are duplicate documents? If that's the case, why is only one document returned (with different data each time) instead of two?

Upvotes: 0

Views: 622

Answers (1)

Alexandre Rafalovitch

Reputation: 9789

Is this a SolrCloud setup or a single-collection one? If it is cloud, you most likely ended up with the same record in two different cores, possibly due to a router or an upgrade bug.
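One way to check for that, as a rough sketch: query each core directly with distrib=false, so the request is answered only by that core instead of being fanned out across shards. The base URL and core names below are hypothetical placeholders, not taken from the question; only the id value comes from the post.

```python
from urllib.parse import urlencode


def per_core_query_urls(base_url, core_names, doc_id):
    """Build one /select URL per core, each restricted to that core only."""
    params = urlencode({
        "q": f'id:"{doc_id}"',
        "distrib": "false",  # answer from this core only, no shard fan-out
        "fl": "id,url",      # just enough fields to tell the copies apart
        "wt": "json",
    })
    return {core: f"{base_url}/{core}/select?{params}" for core in core_names}


# Hypothetical base URL and core names; substitute the ones from your cluster.
urls = per_core_query_urls(
    "http://localhost:8983/solr",
    ["collection1_shard1_replica1", "collection1_shard2_replica1"],
    "client1.com.58.1673",
)
for core, url in urls.items():
    print(core, url)
```

If more than one core returns a document for the same id, that would be consistent with the record living in two shards.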

The good news is that you should be able to find all the records that have this problem by faceting with facet.field=id and facet.mincount=2. Then you could delete and reinsert them for consistency.
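A minimal sketch of that facet check, assuming the id field is an untokenized string field and the response is requested with wt=json in Solr's default flat facet layout (a list alternating term, count). The select URL is a hypothetical placeholder, and facet.limit=-1 (no limit) can be expensive on a large index.

```python
from urllib.parse import urlencode


def duplicate_id_facet_url(select_url):
    """Build a facet query that lists only ids occurring at least twice."""
    params = urlencode({
        "q": "*:*",
        "rows": 0,             # only the facet counts are needed
        "facet": "true",
        "facet.field": "id",
        "facet.mincount": 2,   # hide ids that occur once
        "facet.limit": -1,     # return every matching id
        "wt": "json",
    })
    return f"{select_url}?{params}"


def duplicated_ids(response):
    """Extract {id: count} from a parsed JSON response with flat facet lists."""
    flat = response["facet_counts"]["facet_fields"]["id"]
    pairs = zip(flat[0::2], flat[1::2])  # [term, count, term, count, ...]
    return {term: count for term, count in pairs if count >= 2}


# Hand-written sample mirroring the counts shown in the question's update.
sample = {
    "facet_counts": {
        "facet_fields": {
            "id": ["client1.com.58.1673", 2, "client1.com.58.163", 1]
        }
    }
}
print(duplicate_id_facet_url("http://localhost:8983/solr/collection1/select"))
print(duplicated_ids(sample))
```

The ids returned by duplicated_ids are the candidates to delete and reindex.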

And no, you should not be able to end up in this state, so the cause is either a misconfiguration, an upgrade failure, or some forced command that bypassed the uniqueness requirement.

Upvotes: 1
