Reputation: 2048
I have some very large blobs, so I set AzureSearch_SkipContent on the blob with the following code:
if (b.Properties.Length >= 134217728)
{
b.Metadata["AzureSearch_SkipContent"] = "true";
await b.SetMetadataAsync();
}
But when I review the warnings and errors, I can see that the indexer attempted to index the content even though I asked it to skip it. The error I see is below (it is listed under errors, so I guess nothing is going to be indexed for this blob):
{
"key": null,
"errorMessage": "The blob '113443f46d1b184650bf4b0d5b0b3806055c43558a676b778de13f1b7ef4da93' has the size of 218285352 bytes, which exceeds the maximum size for document extraction for your current service tier."
},
If I look at this blob in storage explorer I see
Upvotes: 1
Views: 263
Reputation: 4671
UPDATE Jan 3, 2018
To make this scenario work gracefully, we are adding the indexStorageMetadataOnlyForOversizedDocuments indexer configuration setting. It takes a bool value and is false by default, so set it to true in the indexer configuration to enable it. This is fresh off the presses and will be deployed in production worldwide by January 19.
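As a rough sketch of where the setting would go, assuming the indexer is defined through the REST API, it sits under parameters.configuration. The indexer, data source, and index names below are placeholders, not values from the question:

```json
{
  "name": "my-blob-indexer",
  "dataSourceName": "my-blob-datasource",
  "targetIndexName": "my-index",
  "parameters": {
    "configuration": {
      "indexStorageMetadataOnlyForOversizedDocuments": true
    }
  }
}
```

With this enabled, oversized blobs should be indexed with their storage metadata only, rather than being reported as errors.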
ORIGINAL RESPONSE
Both "true" and "True" are valid values of AzureSearch_SkipContent. The problem is that AzureSearch_SkipContent does not mean that the blob content is ignored.
Blob content contributes in two ways:
AzureSearch_SkipContent means that Azure Search only performs #1 and not #2, but the blob still needs to be downloaded, so the blob size quota comes into play.
Currently, the only other per-blob processing option is AzureSearch_Skip, which completely skips the blob. You can also use MaxFailedItems / MaxFailedItemsPerBatch to tolerate a specific number of errors, as described in Dealing with errors.
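For illustration, assuming the indexer is defined through the REST API (where these settings are spelled in camelCase), the error tolerance could be set roughly like this. The indexer name and the numbers are arbitrary placeholders:

```json
{
  "name": "my-blob-indexer",
  "parameters": {
    "maxFailedItems": 10,
    "maxFailedItemsPerBatch": 5
  }
}
```

This only lets the indexer keep going past the oversized blobs; those blobs themselves are still reported as errors and are not indexed.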
I think what would be really useful for this situation is the ability for Azure Search to automatically extract only the storage metadata for large blobs, without you having to process all of your blobs individually. Please feel free to add a suggestion for this on our User Voice site.
Upvotes: 1
Reputation: 2045
It needs a capital T in True:
if (b.Properties.Length >= 134217728)
{
b.Metadata["AzureSearch_SkipContent"] = "True";
await b.SetMetadataAsync();
}
When in doubt, use the bool literal and convert it to a string:
b.Metadata["AzureSearch_SkipContent"] = true.ToString();
or
bool skipIndex = true;
b.Metadata["AzureSearch_SkipContent"] = skipIndex.ToString();
Upvotes: 1