Reputation: 3973
I have a large database (MS SQL Server) of longitudinal data that I am querying to get information to display in my web application (PHP-driven). This database contains records for lots of higher education programs over lots of years... but sometimes there's no displayable data for a given year/program combination. When somebody hits the web application without specifically choosing a year, I want the application to automatically choose the year with the most available data to display.
So, here's my question... should I query the database once to get all available information for all years, and then loop through the returned arrays in PHP to figure out which years have the best data available? Or would it be better to execute something like the following, then pick the best year of data, and then run a second SQL query?
SELECT ProgYear, SUM(CASE WHEN WageDisp>0 THEN 1 ELSE 0 END) as WageCount,
SUM(CASE WHEN UNCDisp>0 THEN 1 ELSE 0 END) as UNCCount,
SUM(CASE WHEN CurrDisp>0 THEN 1 ELSE 0 END) as CurrCount,
SUM(CASE WHEN ConEdDisp>0 THEN 1 ELSE 0 END) as ConEdCount
FROM (SELECT ProgID, ProgYear, SUM(CASE WHEN MAX(NumWage0, NumWage1, NumWage2, NumWage3, NumWage4, NumWage5, NumWage6, NumWage7, NumWage8)>0 THEN 1 ELSE 0 END) as WageDisp,
SUM(CASE WHEN MAX(UNC0, UNC1, UNC2, UNC3, UNC4, UNC5, UNC6, UNC7, UNC8)>0 THEN 1 ELSE 0 END) as UNCDisp,
SUM(CASE WHEN MAX(Curr0, Curr1, Curr2, Curr3, Curr4, Curr5, Curr6, Curr7, Curr8)>0 THEN 1 ELSE 0 END) as CurrDisp,
SUM(CASE WHEN MAX(ConEd0, ConEd1, ConEd2, ConEd3, ConEd4, ConEd5, ConEd6, ConEd7, ConEd8)>0 THEN 1 ELSE 0 END) as ConEdDisp)
WHERE ProgID in (SELECT item from fnSplit(?, '|'))
The PHP application will be hitting a SQL Server on the local host, so transmission times for the data should be pretty minimal. Just wondering about tips/best practices.
Thanks!
Upvotes: 0
Views: 418
Reputation: 1229
A well-thought out SQL statement is basically always better than sorting through a large array to find the best data. I wonder whether your query is effective (If it is, then I wonder whether your database is optimal)
Ideally however, you would have a table with a cache of summarized data from which you can quickly retrieve such summary information. Depending on how often the raw data changes this would require an update of summary data every once in a while (e.g., when you update the raw data).
Upvotes: 1