Reputation: 49
I am attempting to improve the performance of a program that tracks the most popular paths taken through a website. a given URL is registered in table sessionpage with a unique_id:
+-----------+--------------------------+---------------------+
| unique_id | page_url | mod_date_stamp |
+-----------+--------------------------+---------------------+
| 2378 | /resources/series75.html | 2008-10-03 22:49:00 |
+-----------+--------------------------+---------------------+
Then another table, session, counts the number of hits for a set of up to 5 urls using the unique_id from sessionpage.
session
+-----------+---------+-----------------+-----------+---------------------+-------
| unique_id | counter | sequence_length | yearmonth | mod_date_stamp | page1 |
+-----------+---------+-----------------+-----------+---------------------+-------
| 1 | 2 | 2 | 201203 | 2012-03-28 15:42:38 | 5298 |
+-----------+---------+-----------------+-----------+---------------------+-------
+-------+-------+-------+-------+
page2 | page3 | page4 | page5 |
+-------+-------+-------+-------+
6075 | 0 | 0 | 0 |
+-------+-------+-------+-------+
Right now, the stats program is fetching the unique_id from session page for each URL, which is taking way too long. I want to join sessionpage back to session multiple times so that the page1 - page5 fields show the url path (or 0/null) rather than the unique_id from sessionpage to speed up the program.
The end result should look like:
+-----------+---------+-----------------+-----------+---------------------+-------
| unique_id | counter | sequence_length | yearmonth | mod_date_stamp | page1 |
+-----------+---------+-----------------+-----------+---------------------+-------
| 1 | 2 | 2 | 201203 | 2012-03-28 15:42:38 | /path/index.html |
+-----------+---------+-----------------+-----------+---------------------+-------
+-------+-------+-------+-------+
page2 | page3 | page4 | page5 |
+-------+-------+-------+-------+
/path3/disk.html | 0 | 0 | 0 |
+-------+-------+-------+-------+
Thanks in advance.
Upvotes: 1
Views: 1286
Reputation: 2118
Based on your comment that page1, page2.. are foreign keys columns, I think it's simple.
Something like:
SELECT
s.unique_id, s.counter, s.sequence_length, s.yearmonth,
s.mod_date_stamp,
pg1.page_url as page1, pg2.page_url as page2,
pg3.page_url as page3, pg4.page_url as page4,
pg5.page_url as page5
FROM
session s
LEFT JOIN
sessionpage pg1 ON pg1.unique_id = s.page1
LEFT JOIN
sessionpage pg2 ON pg2.unique_id = s.page2
LEFT JOIN
sessionpage pg3 ON pg3.unique_id = s.page3
LEFT JOIN
sessionpage pg4 ON pg4.unique_id = s.page4
LEFT JOIN
sessionpage pg5 ON pg5.unique_id = s.page5
I think this will work.
But when there is no value on column (on table session), you should use null instead of 0 (zero).
Another tip, on the SELECT statement, if you don't want NULL values you can use:
COALESCE(pg5.page_url, ' ') as page5
Upvotes: 1