Reputation: 4377
I am developing a web app that lets users search for items in a database. The items are categorized into several categories, and each search result is displayed differently depending on its field values. Currently I have built one class to handle display and another to handle search. The search class builds the SQL query from several user inputs, queries the database for the IDs of items that match, and sends the IDs in an array to the display class.
An excerpt of the code that does this:
//the sql query is actually a little more complex than this
$query = "SELECT items.id FROM items, subcategories WHERE {$name} AND items.`base_school` = '{$_SESSION['base_school']}' AND items.subcategory = subcategories.id AND subcategories.parent_category = {$search_category} ORDER BY `time_added` DESC {$limit}";
$result = $DB_CONNECTION->query($query);
$newly_added = array();
while (list($id) = $result->fetch_row()) $newly_added[] = $id;
searchDisplay::print_result($newly_added);
The display class queries the database for the full details of each item one after the other and displays it as it should be displayed.
My question is: would it be better/faster to query the database for the full details of each item (about 23 fields from 3 different tables) the first time the database is queried, store the data in an array, and pass that array to the display class, rather than have the display class query the database for each item using the item's unique ID? My current solution runs fine for now, but I need to know if there would be any problem with my approach when the database starts to grow (about 500,000 rows).
Secondly, data from the database is filtered using several search criteria supplied by the user. Would it be better to build one complex query with a few joins that accommodates all the user's criteria, or to write a simpler query that handles the major filters and then use PHP to weed out the few remaining results that don't match?
Upvotes: 0
Views: 123
Reputation: 323
You need to keep latency in mind. Often with networked apps that run slowly, latency is the culprit.
Even if each individual query is tiny and can be executed quickly, they all have latency. You say worst case, 100 queries.
Even if there is only 10 milliseconds of latency per query (keep in mind each query incurs overhead from the network drivers, the actual round-trip time on the wire, etc.), you needlessly add 100 × 10 ms = 1 second (incredibly long in computer terms).
Most likely it would take much less than one second to simply execute 1 query that gets all the info in one shot. Then you only incur the latency penalty once.
So I suggest rewriting your approach to use one query, and pass around arrays, as you suggest.
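A minimal sketch of that approach, assuming mysqli (with the mysqlnd driver for `get_result()`); the table and column names are taken from the excerpt in the question, but the exact join conditions and selected fields are illustrative:

```php
<?php
// One round trip: fetch the full details for every matching item at once,
// joining the tables instead of issuing one query per item ID.
$query = "SELECT items.*, subcategories.parent_category
          FROM items
          JOIN subcategories ON items.subcategory = subcategories.id
          WHERE items.base_school = ? AND subcategories.parent_category = ?
          ORDER BY items.time_added DESC";

$stmt = $DB_CONNECTION->prepare($query);  // prepared statement: also avoids
$stmt->bind_param('ii', $_SESSION['base_school'], $search_category); // SQL injection
$stmt->execute();

// One array holding every row with all its fields.
$rows = $stmt->get_result()->fetch_all(MYSQLI_ASSOC);

// The display class now only formats data; it never touches the database.
searchDisplay::print_result($rows);
```

Passing `$rows` around keeps the latency cost to a single round trip regardless of how many items match.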
Things like this slip through testing all the time because people test in an environment where the latency between client and server is very low (for instance, on the same machine without much activity). Then the app goes into the real world, and the client and server are both busy and hundreds of miles apart...
Upvotes: 1
Reputation: 12721
You should always try to avoid putting a query in a loop. A single query, even if it is complex, is usually faster and scales better. Like all "rules", this one has exceptions: if the loop is much faster than the complex query, you can stick with the loop, provided you know it won't grow to tens or hundreds of iterations (right?).
As far as filtering in the DB or in PHP goes, it's typically better to do it in the DB and avoid transferring useless data over the network. The HAVING clause in MySQL is usually equivalent to what you would do in PHP to filter things.
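For example (the schema here is made up for illustration; the point is that the filter runs inside MySQL instead of in a PHP loop over transferred rows):

```sql
-- WHERE filters individual rows before grouping;
-- HAVING filters after aggregation, which is what a PHP
-- post-filter over the fetched results would otherwise do.
SELECT subcategory, COUNT(*) AS item_count
FROM items
WHERE base_school = 42
GROUP BY subcategory
HAVING item_count > 5;
```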
Upvotes: 1
Reputation: 11
In my opinion, filtering of results needs to be done at each step; otherwise the queries will become slow as the data grows bigger and bigger. Hence the strategy mentioned in the last paragraph is the optimal one.
Upvotes: 1