Reputation: 17388
I have items and files. There is a 1:m relationship between items and files. Items are stored in a relational database and files in folders. The association between items and files is stored in the relational database. Files can be pdfs, word docs, email etc. I intend to POC cognitive search to be able to search items and associated documents.
My current understanding is, that a pull approach might be cheaper in comparison to the push approach when using cognitive search (the latency requirements are not stringent and eventual consistency is OK). Hence, I intend to move the data into a cosmos database, which can then be indexed via the pull approach. Curious, how does this work with the documents? Would I need to crack them on prem?
There is also the option of attachments and blob storage of documents. The latter is most likely more future proofed. I would think that if I put documents into blob storage, cognitive search indexing would still need to crack the documents and apply skills?
Upvotes: 0
Views: 100
Reputation: 5225
This sounds like a good approach. In terms of data sources, Cognitive Search supports CosmosDB and blob storage and some relationship databases. I would probably:
If you use indexers, they will take care of the document cracking. If you push data directly into the index, you will need to crack the documents yourself.
This gives a simple walkthrough of creating an indexer with the portal (skillset is optional, and change the data source to your own data): https://learn.microsoft.com/en-us/azure/search/cognitive-search-quickstart-blob
Upvotes: 2