Reputation: 1
I have the following problem. I am trying to build an indexer in Azure AI Search. I have a skillset with a “Custom.WebApiSkill” skill. This provides me with the following response body:
{
  "values": [
    {
      "recordId": "1",
      "data": {
        "embedding": [
          -0.013657977,
          0.004854262,
          -0.015335504,
          -0.010732211,
          ...
        ]
      }
    }
  ]
}
As part of the indexer, I am now trying to map the “embedding” value of the response body to a field in my index:
"outputFieldMappings": [
{
"sourceFieldName": "/document/pages/*/embedding",
"targetFieldName": "content_vector",
"mappingFunction": null
}
]
My index field "content_vector" looks like this:
{
  "name": "content_vector",
  "type": "Collection(Edm.Single)",
  "key": false,
  "retrievable": true,
  "stored": true,
  "searchable": true,
  "filterable": false,
  "sortable": false,
  "facetable": false,
  "synonymMaps": [],
  "dimensions": 1536,
  "vectorSearchProfile": "myHnswProfile"
}
However, I receive the following error when executing:
The data field 'content_vector/0' in the document with key 'aHR0cHM6Ly9zdHJhZ3Byb3RvdHlwZGV2My5ibG9iLmNvcmUud2luZG93cy5uZXQvdGVzdGRhdGEvS29tbXVuaWthdGlvbnN0ZWNobmlrLUZpYmVsLnBkZg2' has an invalid value of type 'Collection(Edm.Double)' ('JSON arrays with element type 'Float' map to Collection(Edm.Double)'). The expected type was 'Collection(Edm.Single)'.
How can I make sure that my custom WebAPI returns the embedding array with float32 values, or that my indexer interprets the values as float32 (Edm.Single) instead of float64 (Edm.Double)?
Thank you very much!
I tried to use numpy in my custom WebAPI (Python) to convert the values of "embedding" to float32, but that didn't work.
Something like this:
embedding_float32 = np.array(embedding, dtype=np.float32).tolist()
UPDATE:
I tried using “numpy” to convert the array to “float32”, just like you showed in your first code snippet. Nevertheless, the indexer interprets it as float64 (Edm.Double):
The data field 'content_vector/0' in the document with key 'xyz' has an invalid value of type 'Collection(Edm.Double)' ('JSON arrays with element type 'Float' map to Collection(Edm.Double)'). The expected type was 'Collection(Edm.Single)
Is there a way to make the indexer interpret the values as float32 (Edm.Single), or to force the data type in my custom WebAPI? The problem is that Python does not natively differentiate between float32 and float64 and therefore treats and returns the values as float64 by default.
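For example, this small check (just an illustration of the point above, not part of my WebAPI) shows that numpy keeps float32 only while the data stays in a numpy array; as soon as it is converted for JSON, the values become plain Python floats again:

import numpy as np

embedding = [-0.013657977, 0.004854262]

# Inside numpy the dtype really is float32 ...
arr = np.array(embedding, dtype=np.float32)
print(arr.dtype)              # float32

# ... but .tolist() returns plain Python floats (64-bit doubles),
# and the json module has no float32 type at all.
as_list = arr.tolist()
print(type(as_list[0]))       # <class 'float'>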
Here is the link to my WebAPI in GitHub: https://github.com/Alexkanns/CustomWebAPI/blob/main/init.py
Upvotes: 0
Views: 143
Reputation: 3639
I have tried your code and got the same error. To solve the issue, you have the following options:
One way is to modify the Web API response to return float32 values, using numpy to convert the array to float32. Here is a Python snippet to ensure your response body has float32 values before returning them (I referred to this link to learn about integrated vectorization using Python):
import numpy as np

# Convert the embedding values to float32 before building the response body
embedding_float32 = np.array(embedding, dtype=np.float32).tolist()

response = {
    "values": [
        {
            "recordId": "1",
            "data": {
                "embedding": embedding_float32
            }
        }
    ]
}
return response
Alternatively, you can use a custom skill that generates vector embeddings for the provided content with the HuggingFace all-MiniLM-L6-v2 model. The code is taken from GitHub.
import numpy as np
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

class TextEmbedder():
    def __init__(self):
        model_path = 'sentence-transformers/all-MiniLM-L6-v2'
        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
        self.model = AutoModel.from_pretrained(model_path)

    def _mean_pooling(self, model_output, attention_mask):
        # Average the token embeddings, weighted by the attention mask
        token_embeddings = model_output[0]
        input_mask_expanded = (attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float())
        return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)

    def generate_embeddings(self, sentences):
        encoded_input = self.tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
        with torch.no_grad():
            model_output = self.model(**encoded_input)
        sentence_embeddings = self._mean_pooling(model_output, encoded_input['attention_mask'])
        # L2-normalize the vectors and return them as float32
        sentence_embeddings = F.normalize(sentence_embeddings, p=2, dim=1)
        return sentence_embeddings.cpu().numpy().astype(np.float32)
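For completeness, a minimal sketch of how the TextEmbedder class above could be wired into a custom skill response; the record handling here is illustrative and not taken from the repository:

embedder = TextEmbedder()

def build_record(record_id, text):
    # generate_embeddings returns a float32 numpy array; take the first (and only) row
    vector = embedder.generate_embeddings([text])[0]
    return {
        "recordId": record_id,
        "data": {"embedding": vector.tolist()},  # JSON-serializable list of floats
        "errors": [],
        "warnings": []
    }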
I also referred to this link to understand the Azure OpenAI Embedding skill in Azure AI Search.
Update:
import logging
import json
import http.client
import numpy as np
from azure.functions import HttpRequest, HttpResponse


class Float32Encoder(json.JSONEncoder):
    # Custom encoder to ensure np.float32 values are serialized correctly
    def default(self, obj):
        if isinstance(obj, np.float32):
            return float(obj)
        return super(Float32Encoder, self).default(obj)


def main(req: HttpRequest) -> HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')

    # Retrieve API key from the headers
    api_key = req.headers.get('api-key')
    if not api_key:
        logging.error("Missing 'api-key' header.")
        return HttpResponse("Missing 'api-key' header.", status_code=400)

    try:
        req_body = req.get_json()
    except ValueError as e:
        logging.error(f"Invalid JSON: {e}")
        return HttpResponse("Bad Request. Invalid JSON.", status_code=400)

    values = req_body.get('values')
    if not values:
        logging.error("The 'values' field is required.")
        return HttpResponse("Bad Request. The 'values' field is required.", status_code=400)

    # Validate each value has the 'text' parameter
    for value in values:
        data = value.get('data')
        if not data or 'text' not in data:
            logging.error("The 'text' parameter is required in each record.")
            return HttpResponse("Bad Request. The 'text' parameter is required in each record.", status_code=400)

    response_values = []
    for value in values:
        record_id = value.get('recordId')
        input_text = value['data']['text']
        logging.info(f"Processing recordId: {record_id}, input: {input_text}")

        # Setup API connection for OpenAI embeddings
        host = "xxxyyyzzz"  # Replace with your Azure API endpoint
        path = "/openai/deployments/text-embedding-ada-002/embeddings?api-version=2023-05-15"
        conn = http.client.HTTPSConnection(host)
        payload = json.dumps({"input": input_text})
        headers = {
            'Content-Type': 'application/json',
            'api-key': api_key
        }
        conn.request("POST", path, body=payload, headers=headers)
        response = conn.getresponse()
        data = response.read()
        conn.close()

        response_json = json.loads(data)
        embedding = response_json['data'][0]['embedding']

        # Convert embedding to np.float32 to ensure it matches the Edm.Single requirement
        embedding_float32 = np.array(embedding, dtype=np.float32)

        # Explicitly serialize numpy array as list to ensure all elements are float32
        response_values.append({
            "recordId": record_id,
            "data": {
                "embedding": embedding_float32.astype(np.float32).tolist()  # Force conversion to float32
            },
            "errors": [],
            "warnings": []
        })

    response_body = {
        "values": response_values
    }

    # Return response with Float32Encoder to ensure proper float conversion
    return HttpResponse(
        json.dumps(response_body, cls=Float32Encoder),
        status_code=200,
        mimetype="application/json"
    )
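For a quick local test of the function above, a request along these lines should work; the URL, route, and key are placeholders and assume the function is running in the local Azure Functions host:

import json
import urllib.request

payload = {
    "values": [
        {"recordId": "1", "data": {"text": "Hello world"}}
    ]
}

req = urllib.request.Request(
    "http://localhost:7071/api/your-function-name",  # placeholder local endpoint
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json", "api-key": "<your-api-key>"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    result = json.loads(resp.read())
    print(result["values"][0]["data"]["embedding"][:5])  # first few vector components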
Upvotes: 0