Reputation: 388
I'm running into an issue where the same query against our Solr index returns different values from one run to the next, even though I am querying on the id, which is configured as the unique key field.
In the Solr Admin UI I run a query on that id.
The relevant part of the response is below.
"response": {
"numFound": 1,
"start": 0,
"maxScore": 7.4537606,
"docs": [
{
"title": [
"ICARDA forced to move"
],
"moduleid_s": "58",
"id": "client1.com.58.1673",
"enddate_dt": "2015-09-25T23:59:00Z",
"url": "mysite.com/item.aspx?id=1673",
"startdate_dt": "2015-09-25T00:00:00Z",
Running the same query a few more times eventually produces a different response.
"response": {
"numFound": 1,
"start": 0,
"maxScore": 7.453251,
"docs": [
{
"title": [
"ICARDA forced to move"
],
"moduleid_s": "58",
"id": "client1.com.58.1673",
"enddate_dt": "2015-09-25T23:59:00Z",
"url": "mysiteNewUrl.com/item.aspx?id=1673",
"startdate_dt": "2015-09-25T00:00:00Z",
Notice that the url is different (the maxScore also differs slightly).
With Debug Query checked, you can see that the different urls show up in the GET_FIELDS section.
Why/how can I get different information? I'm querying on the id, which is marked as the unique field. From my understanding there should never be more than one document with a given id. Could this be a synchronization issue? I'm using the Solr Admin UI query screen with a single core selected.
Is there a way to check whether only one document with that id is in the index?
UPDATE:
I ran a facet query on the id field, and that supposedly unique id returns a count of 2
<lst name="facet_fields">
<lst name="id">
<int name="client1.com.58.1673">2</int>
versus an id that isn't having the issue.
<lst name="facet_fields">
<lst name="id">
<int name="client1.com.58.163">1</int>
Is this right? Does this explain my issue, in that there are duplicate documents? If that's the case, why is only one document returned with varying data, instead of two documents?
Upvotes: 0
Views: 622
Reputation: 9789
Is this a SolrCloud setup or a single-collection one? If it is cloud, you most likely ended up with the same record in two different cores, possibly due to a router or an upgrade bug.
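One way to check this is to ask each core directly, with distributed search turned off, so that each core answers only from its own index. Below is a minimal sketch in Python that just builds the URLs. The base URL and the core names are assumptions for illustration, not taken from the question; `distrib=false` is the standard Solr request parameter that disables the distributed fan-out.

```python
from urllib.parse import urlencode


def per_core_query_url(base_url, core, doc_id):
    """Build a select URL that queries one core only (no distributed fan-out)."""
    params = {
        "q": 'id:"{}"'.format(doc_id),  # exact match on the unique key
        "distrib": "false",             # answer from this core's own index only
        "fl": "id,url",                 # the fields that differ in the question
        "wt": "json",
    }
    return "{}/{}/select?{}".format(base_url.rstrip("/"), core, urlencode(params))


if __name__ == "__main__":
    # Hypothetical host and core names; substitute the ones from your cluster.
    for core in ("core_a", "core_b"):
        print(per_core_query_url("http://localhost:8983/solr", core,
                                 "client1.com.58.1673"))
```

If two different cores each return one document for the same id, with different url values, that would match the behaviour described in the question.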
The good news is that you should be able to find all the records that have this problem by running a facet query with facet.field=id and facet.mincount=2. Then you could delete and reinsert them for consistency.
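As a rough sketch of that facet check, the Python below builds the query string and pulls the duplicated ids out of a JSON response. It assumes the default JSON facet layout, where each field is a flat list alternating value and count; the function names are made up for this example. `facet.limit=-1` is included because Solr otherwise returns only the first 100 facet values.

```python
from urllib.parse import urlencode


def duplicate_id_query():
    """Query string for a facet on id that keeps only ids seen at least twice."""
    return urlencode({
        "q": "*:*",
        "rows": 0,              # only the facet counts are needed, not the docs
        "facet": "true",
        "facet.field": "id",
        "facet.mincount": 2,    # drop ids that occur exactly once
        "facet.limit": -1,      # no cap on the number of facet values returned
        "wt": "json",
    })


def duplicate_ids(response):
    """Return {id: count} for every id whose facet count is 2 or more.

    Assumes the flat layout: ["id1", count1, "id2", count2, ...].
    """
    flat = response["facet_counts"]["facet_fields"]["id"]
    pairs = zip(flat[0::2], flat[1::2])
    return {value: count for value, count in pairs if count >= 2}


if __name__ == "__main__":
    # Hand-written sample mirroring the counts shown in the question.
    sample = {"facet_counts": {"facet_fields": {"id": [
        "client1.com.58.1673", 2,
        "client1.com.58.163", 1,
    ]}}}
    print(duplicate_id_query())
    print(duplicate_ids(sample))
```

The ids it returns are the ones to delete and reindex.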
And no, you should not be able to end up in this state, so there is either a misconfiguration, an upgrade failure, or some forced commands that bypassed the uniqueness requirement.
Upvotes: 1