Reputation: 3
I have a SQL Server table called AccessLog
that is inserted into every time a user clicks "Log In" on my website. The important fields are UserID
, BrowserName
and Date
. The intent is to get an idea of which browsers are used the most to access the site.
Here is a sample data set:
| UserID | BrowserName | Date |
|--------|-------------|-------|
| 1 | Chrome | 05/28 |
| 1 | Chrome | 05/29 |
| 2 | Firefox | 05/29 |
| 1 | Chrome | 05/30 |
| 3 | Firefox | 05/31 |
This is the SQL code I use (found here):
SELECT BrowserName
,(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER ()) pct
,COUNT(*) ct
FROM AccessLog
GROUP BY BrowserName
These are the results I get with that query:
| BrowserName | pct | ct |
|-------------|------|----|
| Chrome | 60.0 | 3 |
| Firefox | 40.0 | 2 |
But these are the results I want:
| BrowserName | pct | ct |
|-------------|------|----|
| Chrome | 33.3 | 1 |
| Firefox | 66.6 | 2 |
How can I modify my query to count each distinct UserID
and BrowserName
pair once, to not artificially inflate the counts?
Upvotes: 0
Views: 282
Reputation: 1269873
You want to use count(distinct)
. But that is not possible as a window function -- directly -- in SQL Server. So use the dense_rank()
trick:
SELECT BrowserName,
(COUNT(DISTINCT UserId) * 100.0 /
(DENSE_RANK() OVER (ORDER BY UserId ASC) +
DENSE_RANK() OVER (ORDER BY UserId DESC)
)
) as pct,
COUNT(DISTINCT UserId) as ct
FROM AccessLog
GROUP BY BrowserName
Upvotes: 0
Reputation: 7503
You can try the following by creating a cte
. Here is the demo.
with total_users as
(
select
BrowserName,
count(distinct UserId) as total_users
from myTable
group by
BrowserName
)
select
BrowserName,
round((total_users * 100.0)/sum(total_users) over (), 1) as pct,
total_users as ct
from total_users
Output:
| browsername | pct | ct |
| ----------- | ----- | --- |
| Chrome | 33.3 | 1 |
| Firefox | 66.6 | 2 |
Upvotes: 1