Reputation: 229
I have the following tables:
Users
UserId Name
--------- --------
1 a
2 b
3 c
1 d
7 e
UsersComments
commentId Message UserId
1 aa 1
2 bb 3
3 cc 2
4 dd 3
I want to create a view in which i can refer users to the amount of comments they have done, the idea is to add also some other interactions they have on the website so having a colum on users table its not a solution.
So far i have this query:
CREATE VIEW userstats AS ( (SELECT UserId FROM Users) , (SELECT count(commentId) FROM UsersComments GROUP BY UserId))
The problem is that there may be users that havent done any comment yet, the other problem is how to join the amount of comments per user to its corresponding id.
SELECT UserId, count(commentId) FROM UsersComments GROUP BY UserId
isnt a solution cause as i said, there will be other metrics for users such as likes or uploaded contents.
How could this be done?
Upvotes: 1
Views: 543
Reputation: 108370
It can be done without creating a view.
The question you seem to asking is "What query can return the specified resultset?"
Based on what information we can glean from the syntax you posted (does that query even run?)
This will return the specified resultset,
SELECT u.userid
, COUNT(c.userid) AS count_usercomments
FROM users u
LEFT
JOIN usercomments c
ON c.userid = u.userid
GROUP BY u.userid
If you are going to create a view, you first need a SELECT query that returns the resultset you want to return.
Because of the performance impact introduced by using views in MySQL (the performance impact can be negligible in some cases, but can range from significant to crippling in others), I'm not going to provide syntax for creating a view. (I believe that introducing a view unnecessarily has the potential to create a bigger problem than whatever problem is being solved by creating a view.
FOLLOWUP
Q: Why do you think that views are not a good idea?
A: I didn't say that views were not a good idea. What I said was: introducing an unnecessary view has the potential to create problems that are bigger than whatever problem the view was implemented as a solution for.
When we don't have an understanding of how MySQL processes views (which is much different than how views are processed in other databases), we can unwittingly create some significant performance problems.
Q: I would love to be able to just use a result set but making this queries take too long,
A: I would love a pony. And free pizza.
A VIEW
object is a query. That query produces a resultset whenever a query is executed against the VIEW.
Taking a query that "takes too long" and turning that into a view definition doesn't do anything to make that query faster.
Q: wouldnt it be faster if i already have a view constantly being updated?
A: Well, if we had a properly indexed table we could run queries against, that might make things faster.
But again, in MySQL, a VIEW
object is just a query. It is not a stored result set that is "constantly being updated".
When we run a query that references a view (whether that's a VIEW
object defined in the databse, or an inline view, defined in the SQL text of a query... the view query is executed and the rows returned are materialized into what MySQL refers to as a derived table.
As an example, here's a query that includes an inline view definition. (This is just an example for illustration, this is not an example of a query that we would actually want to use.)
SELECT v.col
FROM (
SELECT t.col
FROM hugetable t
) v
WHERE v.col = 'foo'
When that query is executed, MySQL first executes the query for the inline view v
. That query runs just the same as it would if we were to issue that query separately; the difference is that the resultset (i.e. the set of rows returned) from that query are temporarily stored as a "derived table". (That's the terminology that MySQL uses for it, "derived table".
MySQL will use the MEMORY engine to store that, if the resultset is small enough, and if it doesn't contain any column datatypes not supported by the MEMORY engine. Otherwise, MySQL uses the MyISAM storage engine, and spins that resultset out to disk.
Once the derived table is populated, then the outer query runs, using the "derived table" as the rowsource, just like a regular MyISAM table. There are no indexes defined on the derived table, so a full scan is basically the only access method that MySQL has for it. The outer query will check each row in the derived table to see if it meets the predicates in the outer query. (There are actually some improvements in MySQL 5.6.x, where the optimizer can create an index on the derived table; before 5.6, there are no indexes on the derived table.)
Once the query completes, the derived table is deallocated. The resultset is not persisted, it's gone.
A subsequent reference to the same view, even in the same session, will cause that view query to be executed again.
If we create a VIEW
object, for example:
CREATE VIEW myview
AS
SELECT t.col
FROM hugetable t
And we write a query that references the VIEW
, for example:
SELECT v.col
FROM myview v
WHERE v.col = 'foo'
The behavior is execution plan is exactly the same as with the inline view. (There's an extra step of looking up the view definition, to get the SQL for the query, and maybe some privilege checks, et al.), but aside from that, at execution time, it behaves exactly like the inline view.
Note that predicates in the outer query do not get pushed down into the view query. And there is no "stored result set" for the view. It's materialized each time the view is accessed.
Views are not a "bad idea" per se. There are times that a view is the best way to return a resultset. (I make frequent use of inline views my queries.)
The problems come when we don't understand how views are processed in MySQL. When we create VIEW
objects, those tend to get referenced in other queries where they aren't the most appropriate solution.
For example, this query returns an equivalent resultset to the query example above. But a difference here is that the predicate on col is now included in the view definition, MySQL can make effective use of an index to eliminate boatloads of rows, and access only the required rows. This produces a much smaller derived table (more likely to use the MEMORY engine), and the outer query will run much more quickly because there are fewer rows to scan.
SELECT v.col
FROM (
SELECT h.col
FROM hugetable h
WHERE h.col = 'foo'
) v
One issue with a VIEW
object is that it "hides" information about the query, and restricts our options in influencing the optimizer to produce a resultset efficiently.
Upvotes: 1
Reputation: 23719
CREATE VIEW userstats
AS
SELECT
Users.UserId,
COUNT(commentId) AS n_comments
FROM
Users
LEFT JOIN
UsersComments
ON
Users.UserId=UsersComments.UserId
GROUP BY
Users.UserId;
As Users and UsersComments tables share a column with the same name, you can use USING
keyword instead of ON
to make the query shorter:
...
LEFT JOIN
UsersComments
USING (UserId)
...
Upvotes: 2