Reputation: 1
I'm trying to show a 'proportion' field for website user data where proportion will be the measure of how much a certain data source contributes to their conversion (whatever that may be, it is irrelevant).
This is the kind of output I'm hoping to achieve with SQL:
Month | ID | Country | Data Source |Proportion
Jan-15 | 1 | UK | GA | 0.5
Jan-15 | 1 | UK | Omniture | 0.5
Jan-15 | 2 | France | GA | 1
Jan-15 | 3 | Germany | GA | 0.25
Jan-15 | 3 | Germany | Omniture | 0.25
Jan-15 | 3 | Germany | Email | 0.25
Jan-15 | 3 | Germany | Moz | 0.25
Feb-15 | 1 | UK | GA | 0.5
Feb-15 | 1 | UK | Omniture | 0.5
Feb-15 | 2 | France | Omniture | 0.5
Feb-15 | 2 | France | GA | 0.5
Feb-15 | 3 | Germany | Omniture | 0.33
Feb-15 | 3 | Germany | Email | 0.33
Feb-15 | 3 | Germany | Moz | 0.33
Mar-15 | 1 | UK | Omniture | 0.5
Mar-15 | 1 | UK | GA | 0.5
Mar-15 | 2 | France | Omniture | 0.5
Mar-15 | 2 | France | Email | 0.5
And this is the SQL that I am currently working and failing with:
SELECT
MONTH(registrationDate), country, DataSource, 1/COUNT(ID)
FROM
data_table
WHERE
registrationDate IS NOT NULL
GROUP BY
MONTH(registrationDate), ID
This is only giving one instance of the proportion. Using the example above, user with ID 1 in January will only have one record with Proportion = 0.5.
Any help in showing this proportion value correctly shared between data sources would be much appreciated!
Upvotes: 0
Views: 65
Reputation: 1269593
You need to combine the results with the original data. Here is a method using JOIN
:
SELECT dt.*, ddt.value
FROM data_table dt JOIN
(SELECT MONTH(registrationDate) as mon, ID,
1.0/COUNT(applicantId) as value
FROM data_table
WHERE registrationDate IS NOT NULL
GROUP BY MONTH(registrationDate), ID
) ddt
ON ddt.id = dt.id AND
ddt.mon = MONTH(dt.registrationDate);
Your question has ID
, ApplicationId
, and RegistrationId
sprinkled throughout. I'm not sure what the right column(s) to use are.
EDIT:
To include year (which really is a good idea in all cases):
SELECT dt.*, ddt.value
FROM data_table dt JOIN
(SELECT YEAR(registrationDate) as yyyy, MONTH(registrationDate) as mon, ID,
1.0/COUNT(applicantId) as value
FROM data_table
WHERE registrationDate IS NOT NULL
GROUP BY YEAR(registrationDate), MONTH(registrationDate), ID
) ddt
ON ddt.id = dt.id AND
ddt.mon = MONTH(dt.registrationDate) AND
ddt.yyyy = YEAR(dt.registrationDate);
Upvotes: 1