Bruno

Reputation: 103

Autocomplete and text search memory issues in apostrophe-cms: need ideas

I'm having trouble using text search and autocomplete because I have a piece with more than 87k documents, some of them large (~3.4 MB of text).

I already:

  1. Removed every field from the text index except title, searchBoost and seoDescription; these are the only fields copied to highSearchText, and the field lowSearchText is always set to an empty string.
  2. Modified the standard text index to include the fields type, published and trash at the beginning of it. I also modified the queries to have equality conditions on these fields. The result returned by the command db.aposDocs.stats() shows: type_1_published_1_trash_1_highSearchText_text_lowSearchText_text_title_text_searchBoost_text: 12201984 (~11 MB, fits nicely in memory)
  3. Verified that this index is being used, both in the 'toDistinct' query and in the final 'toArray' query.
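
To make step 2 concrete, here is a minimal sketch of the index keys and a matching query filter. The key order is taken from the index name reported by db.aposDocs.stats(); the helper names, the example type 'article', and the assumption that every document stores explicit booleans for published and trash are mine, and Apostrophe's own index also carries weights that are omitted here.

```javascript
// Sketch only: key order mirrors the index name shown by db.aposDocs.stats().
// The equality-matched fields come first, then the text-indexed fields.
function buildTextIndexKeys() {
  return {
    type: 1,
    published: 1,
    trash: 1,
    highSearchText: 'text',
    lowSearchText: 'text',
    title: 'text',
    searchBoost: 'text'
  };
}

// Hypothetical helper: every prefix field gets an equality condition.
// MongoDB requires equality matches on the keys that precede the text
// key before it can use a compound text index for a $text query.
// Assumes published/trash are stored as explicit booleans.
function buildSearchFilter(type, term) {
  return {
    type: type,
    published: true,
    trash: false,
    $text: { $search: term }
  };
}

// Usage with the official Node.js driver (not executed here):
//   await db.collection('aposDocs').createIndex(buildTextIndexKeys());
//   await db.collection('aposDocs')
//     .find(buildSearchFilter('article', 'budget'))
//     .toArray();
```

Running the query with explain() is how to confirm that the winning plan really uses this index.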

What I think is the biggest problem

The documents have many repeated words in the title, so if the user types a word present in 5k document titles, the server suffers.

Idea I'm testing

The MongoDB docs say that, to improve performance, the entire collection should fit in RAM (https://docs.mongodb.com/manual/core/index-text/#storage-requirements-and-performance-costs, last bullet).

So, I created a separate collection named "search" with just the fields highSearchText (string, indexed as text) and highSearchWords (array, also indexed), which results in a total size of ~19 MB.
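
A rough sketch of how a document in the "search" collection could be derived from a piece, plus a prefix filter for autocomplete on highSearchWords. The function names are hypothetical, and the lowercasing and word splitting are a simplified stand-in, not Apostrophe's own normalization.

```javascript
// Hypothetical helper: derive the slim "search" document from a full piece.
// Only title, searchBoost and seoDescription feed the text, as in step 1.
function buildSearchDoc(piece) {
  const text = [piece.title, piece.searchBoost, piece.seoDescription]
    .filter((value) => typeof value === 'string' && value.length > 0)
    .join(' ')
    .toLowerCase();
  // Split on anything that is not a letter or digit, then de-duplicate.
  const words = Array.from(
    new Set(text.split(/[^\p{L}\p{N}]+/u).filter(Boolean))
  );
  return { _id: piece._id, highSearchText: text, highSearchWords: words };
}

// Escape regex metacharacters so user input is matched literally.
function escapeRegExp(input) {
  return input.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: an anchored, case-sensitive prefix match, which
// lets MongoDB walk the index on highSearchWords instead of scanning.
function buildPrefixFilter(prefix) {
  return {
    highSearchWords: { $regex: '^' + escapeRegExp(prefix.toLowerCase()) }
  };
}

// Usage with the official Node.js driver (not executed here):
//   await db.collection('search')
//     .distinct('highSearchWords', buildPrefixFilter('bud'));
```

Note that distinct() returns every distinct word of the matching documents, so the caller still has to keep only the words that start with the prefix.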

By running the same operations as the standard apostrophe autocomplete against this collection, I got similar results, but much faster.

I had to write event handlers to automatically update the search collection when a piece changes, but it seems to work so far.
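
The synchronization could look roughly like this: each change to a piece is turned into one bulkWrite operation against the "search" collection. The function name is hypothetical, and dropping trashed or unpublished pieces from the search collection is my assumption, not necessarily what the handlers described above do.

```javascript
// Hypothetical helper: map one changed piece to a bulkWrite operation.
// `searchDoc` is the slim document ({ highSearchText, highSearchWords })
// already derived from the piece by the caller.
function toSearchSyncOp(piece, searchDoc) {
  // Assumption: trashed or unpublished pieces should not be searchable.
  if (piece.trash === true || piece.published !== true) {
    return { deleteOne: { filter: { _id: piece._id } } };
  }
  return {
    updateOne: {
      filter: { _id: piece._id },
      update: {
        $set: {
          highSearchText: searchDoc.highSearchText,
          highSearchWords: searchDoc.highSearchWords
        }
      },
      // Insert the search document if this piece has never been synced.
      upsert: true
    }
  };
}

// Usage with the official Node.js driver (not executed here):
//   const ops = changes.map(({ piece, searchDoc }) =>
//     toSearchSyncOp(piece, searchDoc));
//   if (ops.length > 0) {
//     await db.collection('search').bulkWrite(ops, { ordered: false });
//   }
```

Because the operations are idempotent, the same function can also drive a one-off task that rebuilds the whole search collection if the two ever drift apart.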

Issues

I'm testing this search collection with the autocomplete. For the simple text search, I'm just limiting the sorted response to 50 results. Maybe I'll have to use the search collection there as well, because the search could still break.
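
For reference, the capped text search amounts to something like the sketch below. The helper name is hypothetical; the filter and options follow the Node.js driver's find(filter, options) form.

```javascript
// Hypothetical helper: a $text query sorted by relevance and capped.
function buildCappedTextSearch(term, limit = 50) {
  return {
    filter: { $text: { $search: term } },
    options: {
      // Expose the relevance score and sort by it, best match first.
      projection: { score: { $meta: 'textScore' } },
      sort: { score: { $meta: 'textScore' } },
      limit: limit
    }
  };
}

// Usage with the official Node.js driver (not executed here):
//   const { filter, options } = buildCappedTextSearch('budget');
//   const docs = await db.collection('aposDocs')
//     .find(filter, options)
//     .toArray();
```

As far as I understand, the limit only bounds what is returned and kept during the sort; every document matching the term still has to be scored, which is why a word present in thousands of titles can remain expensive.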

Is there some easier approach I'm missing? Please, any ideas are welcome.

Upvotes: 0

Views: 162

Answers (0)
