Searching through related indices on Algolia

Question

I'm trying to find out whether there's a performant way to search through my current data structures, or whether I have to restructure them.

I have the following structure for my indices:

Publication (attributes: id, title, keywords)
PublicationFile (attributes: id, publication_id, text, page_number)

A publication has many publication files. Each publication file contains the contents of a file and the page it was found on (text and page_number).

title, keywords, and text are the searchable attributes, so if someone searches for 'economy' I want to search through both my indices.

I would like to perform a search that searches through both indices and returns the results in a manner that allows me to do something like this:

Publication1
keyword1 keyword2
Found results in Publication1's file contents in: [file a (pages: 1, 2, 3), file b (pages: 5)]

In other words, I want the search to return results grouped by publication ID. The only way I can think of right now is to search both indices, then loop through the results and link the file/page matches to a publication.

In summary my questions are:

1) Is there a way I can structure my data to avoid the nested loops needed to process it?
2) Is there a way I can do this through Algolia without having to modify my structure? Ideally I would like to re-use Algolia's frontend searching code and avoid processing this data on my backend.

Maxime · Accepted Answer

To answer your questions:

1) Yes. I'll go into more detail below.

2) Unfortunately not. You'll have to modify your data structure.

Here is how I'd recommend you structure your data to achieve what you're trying to do.

{
  objectID: "publicationFileId",
  publicationId: "",
  title: "",
  keywords: ["", ""],
  text: "",
  page_number: 1,
  published_at: 1485892992 // timestamp
}

Essentially, you need to flatten your two indices into a single one. Modifying the data structure will be less of a headache down the road than maintaining that client-side code, and it will perform better too.
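As a rough illustration of the flattening step, here is a minimal sketch that joins each PublicationFile row with its parent Publication to produce one record per file page. The function name `flattenRecords` and the in-memory input arrays are assumptions for illustration only; how you load the rows and push the records to Algolia depends on your backend and client version.

```javascript
// Hypothetical helper (not part of any Algolia library): builds one flat
// record per PublicationFile row by copying in the parent Publication's fields.
function flattenRecords(publications, publicationFiles) {
  // Index publications by id so each file is resolved in a single lookup.
  const publicationsById = new Map(publications.map((p) => [p.id, p]));

  return publicationFiles
    // Skip files whose parent publication is missing.
    .filter((file) => publicationsById.has(file.publication_id))
    .map((file) => {
      const publication = publicationsById.get(file.publication_id);
      return {
        objectID: String(file.id), // one record per publication file row
        publicationId: publication.id,
        title: publication.title,
        keywords: publication.keywords,
        text: file.text,
        page_number: file.page_number,
      };
    });
}
```

Because title and keywords are copied onto every record, a single query against the one index can match on any of the three searchable attributes.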

A few articles and documentation links that explain why:

https://blog.algolia.com/inside-the-engine-part-7-better-relevance-via-dedup-at-query-time/

https://www.algolia.com/doc/guides/search/distinct/
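To make the links above a bit more concrete, here is a sketch of how the flattened index might be configured and how hits could be turned into the per-publication display from the question. `searchableAttributes`, `attributeForDistinct` and `distinct` are Algolia index settings; the variable `indexSettings`, the function `groupHitsByPublication`, and the shape of the `hits` array are assumptions for illustration, and the call that applies the settings is left out because it depends on your client version.

```javascript
// Settings for the single flattened index. With `distinct` enabled, Algolia
// de-duplicates hits that share the same `publicationId` at query time.
const indexSettings = {
  searchableAttributes: ["title", "keywords", "text"],
  attributeForDistinct: "publicationId",
  distinct: true,
};

// Hypothetical helper: if you query without de-duplication to list every
// matching page, one pass over the hits groups them by publication.
function groupHitsByPublication(hits) {
  const groups = new Map();
  for (const hit of hits) {
    if (!groups.has(hit.publicationId)) {
      groups.set(hit.publicationId, {
        publicationId: hit.publicationId,
        title: hit.title,
        keywords: hit.keywords,
        pages: [],
      });
    }
    groups.get(hit.publicationId).pages.push(hit.page_number);
  }
  // Map preserves insertion order, so groups keep the ranking of their first hit.
  return [...groups.values()];
}
```

The grouping is a single loop rather than the nested loops needed to join two result sets, since every hit already carries its publication's fields.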

Hope this helps!

Maxime
