Reputation: 1
I have a blob storage that has a number of folders, each folder has a number of pdf documents. I now want to create an azure search index which indexes the data by folder level, but includes a complex type structure (Collection(edm.ComplexType) that allows me to include all the documents. So the index looks like this:
{"name": "index",
"fields":
[
{"name": "id", "type": "Edm.String", "filterable": true, "key": true, "searchable": true, "sortable": true, "facetable": false},
{"name": "folderName", "type": "Collection(Edm.String)", "searchable": true, "retrievable": true, "filterable": true, "sortable": false, "facetable": true, "analyzer": "en.microsoft"},
{"name": "documents", "type": "Collection(Edm.ComplexType)",
"fields": [
{"name": "documentName", "type": "Edm.String", "searchable": true, "retrievable": true, "filterable": true, "sortable": false, "facetable": true, "analyzer": "en.microsoft"},
{"name": "content", "type": "Edm.String", "searchable": true, "retrievable": true, "filterable": false, "sortable": false, "facetable": false, "analyzer": "en.microsoft", "synonymMaps": ["synonymsmap"]},
{"name": "documentType", "type": "Collection(Edm.String)", "searchable": true, "retrievable": true, "filterable": true, "sortable": false, "facetable": true, "analyzer": "en.microsoft"},
{"name": "language", "type": "Collection(Edm.String)", "searchable": true, "retrievable": true, "filterable": true, "sortable": false, "facetable": true, "analyzer": "en.microsoft"}
]
}
]
}
Does anyone know how I should approach this? I have been creating and populating indexes using rest api.
I am thinking maybe I need to create a folder-level index structure and populate the folder-level details from some sql-table before populating the sub-fields with the blobs through skillset and indexer etc?
EDITS: Maybe my ideas above are completely off-track. What I want to do is to search a term and return folder names based on the aggregate relevancy of documents within folders. Not sure if this is achievable in search or have to be processed afterwards. Any pointers?
Upvotes: 0
Views: 696
Reputation: 18387
Does anyone know how I should approach this? I have been creating and populating indexes using rest api.
A: If you want this structure, then you're right. You'll need to create your index and push data by yourself (rest api is one of the options)
I am thinking maybe I need to create a folder-level index structure and populate the folder-level details from some sql-table before populating the sub-fields with the blobs through skillset and indexer etc?
A: This is not a good idea, when searching using a particular term, you'll need to query all the possible indexes and do the ordering by yourself.
I personally would create a simple structure, which no complex types:
{
"name": "index",
"fields":
[
{"name": "id", "type": "Edm.String", "filterable": true, "key": true, "searchable": true, "sortable": true, "facetable": false},
{"name": "folderName", "type": "Collection(Edm.String)", "searchable": true, "retrievable": true, "filterable": true, "sortable": false, "facetable": true, "analyzer": "en.microsoft"},
{"name": "documentName", "type": "Edm.String", "searchable": true, "retrievable": true, "filterable": true, "sortable": false, "facetable": true, "analyzer": "en.microsoft"},
{"name": "content", "type": "Edm.String", "searchable": true, "retrievable": true, "filterable": false, "sortable": false, "facetable": false, "analyzer": "en.microsoft", "synonymMaps": ["synonymsmap"]},
{"name": "documentType", "type": "Collection(Edm.String)", "searchable": true, "retrievable": true, "filterable": true, "sortable": false, "facetable": true, "analyzer": "en.microsoft"},
{"name": "language", "type": "Collection(Edm.String)", "searchable": true, "retrievable": true, "filterable": true, "sortable": false, "facetable": true, "analyzer": "en.microsoft"}
]
}
want to retrieve all documents by a particular folder?
search=*&$filter=folderName eq 'abc'
want to retrieve a particular documents in a particular folder?
search=*&$filter=folderName eq 'abc' and documentName eq 'x.docx'
want to all documents that contain a particular term?
search=mickey mouse&$orderBy=folderName
simple and effective
Upvotes: 0