Avner Shahar-Kashtan

Reputation: 14700

Single-value field returned as multi-value from query

I'm using Solr.NET to index data and later query it on a local Solr server (Solr.NET v0.4, Solr v5.3.1), and I'm getting strange exceptions.

My indexed record is a simple class (simplified here):

public class Record
{
    [SolrUniqueKey]
    public long Id {get;set;}

    [SolrField]
    public string Data {get;set;}
}

I'm adding these to the index by calling ISolrOperations.Add(). I didn't define the schema beforehand; it was autogenerated from the data I put in.

Elsewhere, I'm querying this index using ISolrReadOnlyOperations.Query(), asking only for the Id field. This query apparently returns results, but crashes with an ArgumentException:

"Could not convert value 'System.Collections.ArrayList' to property 'Id' of document type My.Namespace.Record"

Meaning that while I stored the Id property as a long, it's being retrieved as an ArrayList of longs. I get the same error if I try to retrieve other fields - I store one string, but retrieve a collection of them. This crashes, because it's trying to create an instance of Record, where the Id property is a single long.

Browsing the index via the web interface shows that the property really is multi-valued - the JSON I see contains an array for all properties. Likewise, in the schema browser, I can see that my fields are defined as multivalued (for Properties and Schema, not Indexing). In the index's managed-schema file I can see my fields are defined as strings (for string fields) or tlongs for the numeric field.

  1. Why is Solr (or Solr.Net) indexing my single-value fields as multi-valued?
  2. Can I prevent this from happening without manually editing the schema? Using a field attribute, perhaps?
  3. Can I retrieve only a single value for a multi-valued property, so in case I can't fix the schema, I can simply retrieve the data into my single-valued Record object?

Upvotes: 2

Views: 2357

Answers (1)

Avner Shahar-Kashtan

Reputation: 14700

I've found both a solution and a workaround.

  1. New indexes/cores in Solr 5.3.1, if not given a solrconfig.xml file explicitly, copy the default file found in <solr dir>\server\solr\configsets\data_driven_schema_configs\conf. This file defines an updateRequestProcessorChain that determines what happens when documents are added with fields the schema doesn't contain yet. By default, the field types it assigns are multi-valued:

    <processor class="solr.AddSchemaFieldsUpdateProcessorFactory">
      <str name="defaultFieldType">strings</str>
      <lst name="typeMapping">
        <str name="valueClass">java.lang.Long</str>
        <str name="valueClass">java.lang.Integer</str>
        <str name="fieldType">tlongs</str>
      </lst>
    </processor>

Note the strings and tlongs data types. To prevent this, you can change the solrconfig.xml in your core's conf folder to use the single-valued data types (string, tlong, etc), or change the default value for newly created cores.
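As a rough sketch, the edited processor might look like the fragment below. It assumes your core's schema defines the single-valued `string` and `tlong` field types alongside the multi-valued ones, and it only shows the part quoted above; any other typeMapping entries in your file would need the same treatment. As far as I can tell, this only affects fields that Solr has not yet added to the schema, so fields already created as multi-valued would likely stay that way until the core is recreated and the data reindexed.

```xml
<processor class="solr.AddSchemaFieldsUpdateProcessorFactory">
  <!-- single-valued default instead of "strings" -->
  <str name="defaultFieldType">string</str>
  <lst name="typeMapping">
    <str name="valueClass">java.lang.Long</str>
    <str name="valueClass">java.lang.Integer</str>
    <!-- single-valued numeric type instead of "tlongs" -->
    <str name="fieldType">tlong</str>
  </lst>
</processor>
```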

  2. The workaround is to read the results as a Dictionary<string,object>, instead of having Solr.NET deserialize the results into a document object automatically.

This means initializing a second Solr operations object for this type:

   Startup.Init<Record>(indexUrl); // Typed
   Startup.Init<Dictionary<string, object>>(indexUrl); // Untyped.

and later getting an instance of ISolrOperations<Dictionary<string,object>> and manually reading my Id and Data fields from each result, casting the object payload to an ArrayList and extracting the value.
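The manual extraction step could be sketched roughly as follows. The `MultiValuedReader` class and its method names are hypothetical helpers, not part of Solr.NET, and the field names "Id" and "Data" are an assumption that the Solr fields are named after the properties. The helper itself has no Solr.NET dependency, so it can be tried out on a hand-built dictionary; the commented lines at the end show how it might be wired to the untyped operations object.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public class Record
{
    public long Id { get; set; }
    public string Data { get; set; }
}

// Hypothetical helper: not part of Solr.NET.
public static class MultiValuedReader
{
    // Returns the first element when the field came back as a collection,
    // or the value itself when it was already single-valued.
    public static object FirstValue(IDictionary<string, object> doc, string field)
    {
        object raw;
        if (!doc.TryGetValue(field, out raw) || raw == null)
            return null;

        // A string is IEnumerable too, so treat it as a plain value.
        var list = raw as IEnumerable;
        if (list == null || raw is string)
            return raw;

        foreach (var item in list)
            return item;   // first element only
        return null;       // empty collection
    }

    // Field names "Id" and "Data" are assumed to match the property names.
    public static Record ToRecord(IDictionary<string, object> doc)
    {
        return new Record
        {
            Id = Convert.ToInt64(FirstValue(doc, "Id")),   // 0 if the field is missing
            Data = FirstValue(doc, "Data") as string       // null if the field is missing
        };
    }
}

// Possible usage with Solr.NET (assumes the Startup.Init calls shown above, plus using System.Linq):
//   var solr = ServiceLocator.Current
//       .GetInstance<ISolrOperations<Dictionary<string, object>>>();
//   var records = solr.Query(SolrQuery.All)
//       .Select(d => MultiValuedReader.ToRecord(d))
//       .ToList();
```

Taking only the first element silently drops any extra values, which seems acceptable here because each field was indexed from a single value in the first place.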

Upvotes: 3
