Keyne Viana
Keyne Viana

Reputation: 6202

Simple but heavy application consuming a lot of resources. How to Optimize?

Currently I have one monitor application in production. The job of this monitor is to collect specific entries on social networking like facebook, twitter, youtube and so on.

Here are one simple example of an API call from Twitter:

http://search.twitter.com/search?q=Stackoverflow&format=json

Basically, this is what the system does:

  1. Select the search term from database given an specific order
  2. Call the API
  3. Collect all tweets statuses IDs and users IDs from the current search
  4. Check on the database if it exists
  5. Run the tweets insertion eliminating existing tweets and users and preventing duplicated entry errors.

We finished with two tables, one for users and another for tweets.

THE PROBLEM

After the MySql database reached 200.000 entries on the tweets table (on the first months), the application that visualize that data started to consume too much resources when performing the select query on the existing tweets.

Why?

The system has separated accounts, each one has certain search terms related to their specific business. When we perform a select, we need to select only the ones that are associated with the terms of our account. We cannot see tweets the aren't related to us. But one tweet can be on many accounts.

The actual query (Hurting my eyes)

SELECT * 
   FROM tweets 
 WHERE 
   content LIKE '%searchterm1%' 
     OR LIKE '%searchterm2%' 
     OR LIKE '%searchterm3%' 
     OR LIKE '%searchterm4%' 
     OR LIKE '%searchterm5%' 
     OR LIKE '%searchterm6%' 
     OR LIKE '%searchterm7%' 
     OR LIKE '%searchterm8%' 
   (and son on...)

The two possible solutions

a) Create a tweets_searches table which will have two foreign keys: tweet_id and search_id. So that each tweet on the tweets table can be related to one specific search term.

So instead of search for a specific string, we will join these tables.

b) Continue searching, but instead, with fulltext searches using MATCH () AGAISNT ().

THE MAIN QUESTION

Is that enough to reduce the resources consumption like CPU and RAM? Is there anything better I can to do?

Upvotes: 3

Views: 415

Answers (1)

Nemoden
Nemoden

Reputation: 9056

Disclaimer: this is one of my comments on this question which might be the answer:


I think match ... against is appropriate here. It is so-called "fulltext search". For more complex searches, I'd use Sphinx - it indexes your database on its own (has own mechanism for it) and perform searches way faster than MySQL does

Upvotes: 4

Related Questions