mizichael
mizichael

Reputation: 69

Ratio of values from two subqueries evaluating to zero

Sorry for the confusing title, but essentially what I have is one base table, Shopping History, unique by shopper, visit, and item. Something like this:

Shopper | Visit ID | Item
John    |     A    | Milk
John    |     A    | Butter
John    |     B    | N/A
Bill    |     C    | Butter
Bill    |     D    | Eggs

The end goal with this table is to get it unique by shopper only, with one "ratio" column for each item, in which lies the percentage of visits for that shopper in which they purchased that item (or any item at all). For Shopping History, I'm looking for the end result to look like:

Shopper | Any Item | Milk | Butter | Eggs
John    |    50%   |  50% |   50%  |  0%
Bill    |    100%  |  0%  |   50%  |  50%

As such, here's what I currently have:

WITH ItemCounts AS
    (SELECT
        COUNT(DISTINCT VisitID) 'Count of Visits',
        Shopper
    FROM ShoppingHistory
    GROUP BY Shopper
)
SELECT
    CastedTable.[Shopper],
    FORMAT(SUM(CastedTable.[Has Item?]) / ItemCounts.[Count of Visits], 'P2') 'Item Rate',
    FORMAT(SUM(CastedTable.[Has Milk?]) / ItemCounts.[Count of Visits], 'P2') 'Milk Rate',
    FORMAT(SUM(CastedTable.[Has Butter?]) / ItemCounts.[Count of Visits], 'P2') 'Butter Rate',
    FORMAT(SUM(CastedTable.[Has Eggs?]) / ItemCounts.[Count of Visits], 'P2') 'Egg Rate'
FROM (SELECT
    ShoppingHistory.Shopper,
    ShoppingHistory.Visit,
    CASE WHEN ShoppingHistory.VisitID <> 'N/A' THEN 1 ELSE 0 END 'Has Item?",
    MAX(CASE WHEN ShoppingHistory.Item = 'Milk' THEN 1 ELSE 0 END 'Has Milk?",
    MAX(CASE WHEN ShoppingHistory.Item = 'Butter' THEN 1 ELSE 0 END 'Has Butter?",
    MAX(CASE WHEN ShoppingHistory.Item = 'Eggs' THEN 1 ELSE 0 END 'Has Eggs?"
FROM ShoppingHistory
GROUP BY
    ShoppingHistory.Shopper,
    ShoppingHistory.VisitID,
    'Has Item?'
) CastedTable
    INNER JOIN ItemCounts
        ON CastedTable.[Shopper] = ItemCounts.Shopper
GROUP BY [Shopper]

CastedTable gets me one row per shopper and visit, with one column per item acting as a flag, representing whether or not the shopper bought that item during that visit.

ItemCounts is just getting me a denominator for my percentage - the total number of visits per shopper.

Then, I want to sum each item flag and divide by the total number of visits, getting me the percentage of visits in which they bought that item (or, in the case of "Item Rate", the percentage of visits in which they bought any item).

However, my results are looking strange. Specifically, when I print out both the summed item count (SUM(CastedTable.[Has Item?])) and the number of visits (ItemCounts.[Count of Visits]) to ensure I'm not going crazy, I'm seeing something like this:

Shopper | Any Item | Milk | Butter | Eggs | Sum of Item Flag | Sum of Visits
John    |    0%    |  0%  |   0%   |  0%  |       1          |       2
Bill    |    0%    |  0%  |   0%   |  0%  |       2          |       2

Essentially, printing out the Sum of Flag and Sum of Visits columns, I'm seeing what I'd expect, and simply dividing the flag column by visit column would give me the percentages I want. However, in those percentage columns themselves I'm just seeing 0%.

That's a lot of background, but essentially I'm having trouble figuring out why I'm seeing these 0%. I'm fairly new to SQL and can't seem to put my finger on what I'm doing wrong.

Any help is greatly appreciated!

Upvotes: 1

Views: 48

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269603

SQL Server does integer division, so 1 / 2 = 0 rather than 0.5. I find that the simplest solution is to multiply by 1.0:

FORMAT(SUM(CastedTable.[Has Item?]) * 1.0 / ItemCounts.[Count of Visits], 'P2') Item_Rate,

Upvotes: 2

Related Questions