Reputation: 63
(Newbie question) I've inherited a SOLR4 installation and I'm learning as I go.
I need to update some documents by setting or adding the value of a field (field name="optout"). I'm using the atomic document updates available in SOLR4, but I'm doing something wrong: my documents are getting updated, but some fields now have multiple values. (Some documents already have a value for "optout".)
But my real concern is that I can no longer find these records by searching for them. They seem to have dropped out of the index.
This is a typical document returned from search:
<doc><str name="id">myColl-myId</str>
<str name="recordSystem_rid">622103814</str>
<long name="_version_">1464135593682272256</long>
<bool name="optout">false</bool>
</doc>
Some documents do not have the optout flag set.
My update URL looks like:
http://prodsolr01.cco:8983/solr/records/update?stream.body=<add><doc><field name="id">myColl-myId</field><field name="recordSystem_rid">622103814</field><field name="_version_">1462876089586024448</field><field name="optout" update="add">true</field></doc></add>&commit=true
After the update I cannot find this record using the query that returned it previously.
Do modified documents need to be re-indexed?
If I search by id, I can find the record, but it has been modified and now looks like this:
<doc>
<str name="id">myColl-myId</str>
<arr name="recordSystem_rid">
<long>622103814</long>
<long>622103814</long>
</arr>
<long name="_version_">1470576227169337344</long>
<bool name="optout">true</bool>
</doc>
Notice that there are two "recordSystem_rid" values.
Why are there two values for this field?
Any insights into this would be helpful.
UPDATE: Adding an excerpt from the schema.xml, based on the schema available here: http://svn.apache.org/viewvc/lucene/dev/trunk/solr/example/solr/collection1/conf/schema.xml?view=markup
<schema name="example" version="1.5">
<fields>
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="recordsystem_rid" type="long" index="true" stored="true" multiValued="false" omitNorms="true" />
<field name="_version_" type="long" index="true" stored="true" />
... dynamicFields are defined next....
</fields>
<uniqueKey>id</uniqueKey>
<solrQueryParser defaultOperator="AND" />
... copy fields are defined.....
... types are defined ....
</schema>
Upvotes: 0
Views: 1559
Reputation: 11
_version_ is an internal field, updated automatically when doing an atomic update. You don't need to update it manually.
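As a rough illustration of leaving _version_ out, the sketch below builds an atomic update URL that carries only the id and the one field being changed. The helper name build_update_url is hypothetical, and using update="set" for optout is an assumption about what is wanted for a single-valued flag; the host and core are copied from the URL in the question.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Base update URL taken from the question; adjust for your own installation.
SOLR_UPDATE_URL = "http://prodsolr01.cco:8983/solr/records/update"


def build_update_url(doc_id, field, value, modifier="set"):
    """Hypothetical helper: build an atomic-update URL for a single field.

    Only the unique key and the modified field are sent. _version_ is
    deliberately left out so that Solr assigns a new one itself.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    ET.SubElement(doc, "field", name="id").text = doc_id
    ET.SubElement(doc, "field", name=field, update=modifier).text = value
    message = ET.tostring(add, encoding="unicode")
    # urlencode escapes the XML so it survives as a query parameter.
    query = urlencode({"stream.body": message, "commit": "true"})
    return SOLR_UPDATE_URL + "?" + query


if __name__ == "__main__":
    print(build_update_url("myColl-myId", "optout", "true"))
```

Escaping the message this way also avoids pasting raw angle brackets and quotes into the URL by hand.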
Nico
Upvotes: 0
Reputation: 27487
I believe the problem lies with your update URL. With Solr atomic updates, if you want to change an existing field you are supposed to use "set". It's not clear to me what happens when you don't specify an action on a field. Personally, I'd avoid relying on Solr's default behavior when that behavior is undocumented.
So from your original update URL:
http://prodsolr01.cco:8983/solr/records/update?stream.body=<add><doc><field name="id">myColl-myId</field><field name="recordSystem_rid">622103814</field><field name="_version_">1462876089586024448</field><field name="optout" update="add">true</field></doc></add>&commit=true
I'd change it to this:
http://prodsolr01.cco:8983/solr/records/update?stream.body=<add><doc><field name="id">myColl-myId</field><field name="recordSystem_rid" update="set">622103814</field><field name="_version_">1462876089586024448</field><field name="optout" update="add">true</field></doc></add>&commit=true
Note the addition of the update="set" to the recordSystem_rid field update.
I'd also be careful when mixing the atomic update approach and the optimistic concurrency approach using _version_. It is definitely supported, but I'd want to test it out carefully.
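To make that combination concrete, here is a minimal sketch of an atomic update message that optionally carries a _version_ for optimistic concurrency. The helper name atomic_update_message and the shape of the changes mapping are hypothetical. As I understand the Solr documentation, a _version_ greater than 1 must match the stored document's version exactly or the update is rejected with a version conflict, so the value passed in should be the one you just read from the document, not an older one.

```python
import xml.etree.ElementTree as ET


def atomic_update_message(doc_id, changes, expected_version=None):
    """Hypothetical helper: build the XML body for an atomic update.

    changes maps a field name to a (modifier, value) pair,
    for example {"optout": ("set", "true")}.
    expected_version is only included when given; leaving it as None
    skips the optimistic concurrency check entirely.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    ET.SubElement(doc, "field", name="id").text = doc_id
    if expected_version is not None:
        # Plain field, no update modifier: used by Solr as a precondition.
        ET.SubElement(doc, "field", name="_version_").text = str(expected_version)
    for name, (modifier, value) in changes.items():
        ET.SubElement(doc, "field", name=name, update=modifier).text = str(value)
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    # Version value copied from the search result shown in the question.
    print(atomic_update_message(
        "myColl-myId",
        {"optout": ("set", "true")},
        expected_version=1464135593682272256,
    ))
```

Trying this against a test core first, once with a current version and once with a stale one, should show quickly whether the two features interact the way you expect.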
Upvotes: 1