Shanka
Shanka

Reputation: 821

Joining multiple Tables and getting Count data in TSQL

I have three tables which need to be joined. I am able to join two of them and get the desired result, and this resulting table needs to be joined with another table.

Table1:

+----------+---------+------+
| Username | Country | Team |
+----------+---------+------+
| abc      | US      | New  |
| abc      | CAN     | New  |
| bcd      | US      | Old  |
+----------+---------+------+

Table2:

+----------+-------------+----------+------------+
| Username | CompanyCode | Document | Entry Date |
+----------+-------------+----------+------------+
| abc      |           1 |      112 | 24/06/2014 |
| abc      |           2 |      123 | 24/06/2014 |
| bcd      |           3 |      456 | 24/06/2014 |
| efg      |           4 |      984 | 24/06/2014 |
+----------+-------------+----------+------------+

I have written the following code..

SELECT Username, CompanyCode, Document, IIF(MONTH([Entry Date]) = 6 AND YEAR([Entry Date]) = 2014, 'TRUE', 'FALSE') AS [Posted], 
       COALESCE(tNew.Country, 'not there') AS DC, COALESCE(tNew.Team, 'not there') AS Team FROM Table2
OUTER APPLY
    (
        SELECT TOP 1 Country, Team FROM Table1
        WHERE Table1.[Username] = Table2.[Username]
    ) tNew

...which results in (Table99)...

+----------+--------------+----------+------------+--------+-----------+-----------+
| Username | Company Code | Document | Entry Date | Posted |  Country  |   Team    |
+----------+--------------+----------+------------+--------+-----------+-----------+
| abc      |            1 |      112 | 24/06/2014 | TRUE   | US        | New       |
| abc      |            2 |      123 | 24/06/2014 | TRUE   | US        | New       |
| bcd      |            3 |      456 | 24/06/2014 | TRUE   | US        | Old       |
| efg      |            4 |      984 | 24/06/2014 | TRUE   | not there | not there |
+----------+--------------+----------+------------+--------+-----------+-----------+

Now I have another table, Table3:

+--------------+--------------+
| Company Code | Company Name |
+--------------+--------------+
|            1 | MS           |
|            2 | APL          |
|            3 | GOO          |
|            4 | IBM          |
|            5 | AMZ          |
+--------------+--------------+

I want to join Table99 with Table3 on Company Code with Count of Document WHERE Posted = TRUE AND Country <> 'not there' resulting in...

+--------------+--------------+-----------------+
| Company Code | Company Name | Total Documents |
+--------------+--------------+-----------------+
|            1 | MS           |               1 |
|            2 | APL          |               1 |
|            3 | GOO          |               1 |
|            4 | IBM          |               0 |
|            5 | AMZ          |               0 |
+--------------+--------------+-----------------+

Upvotes: 2

Views: 60

Answers (3)

Stuart J Cuthbertson
Stuart J Cuthbertson

Reputation: 438

I think you have overcomplicated your query a bit. SQL's APPLY operator is primarily intended to be used with table-valued functions. More generally, it can be useful whenever there is no simple join condition between two tables.

In your case, though, there's a really simple join condition - the Username column in both Table1 and Table2. Joins are the essence of SQL and (generally) should be your first port of call whenever you need to combine information from multiple tables.

So in this case, given the end result you need to achieve, you can do this simply using two left joins, no APPLY:

with PostedDocs as (
    -- Define your requirements on the EntryDate value here
    select  CompanyCode
            ,Username
            ,Document
    from    Table2
    where   MONTH(EntryDate) = 6 
    and     YEAR(EntryDate) = 2014
    )
select      CO.CompanyCode
            ,CO.CompanyName
            ,TotalDocuments =count(distinct case when USR.Country is null then null else DOC.Document end) 

from        Table3      CO
left join   PostedDocs  DOC on CO.CompanyCode=DOC.CompanyCode
left join   Table1      USR on DOC.Username=USR.Username

group by    CO.CompanyCode
            ,CO.CompanyName
order by    CompanyCode asc

This gives exactly the result you want, for the inputs you provide in your question. It is a more SQL-oriented way of thinking and likely to perform better on large tables.

Note that I'm starting with Table3. It makes sense to do this because in your end result set, you want one row for each row in Table3. The left joins make the result set bigger, duplicating the Table3 rows, but then I aggregate them back up with group by.

The count function doesn't include null values, and you can take advantage of that here: where the left joins don't succeed, values will be null, so there's no need to convert to values like "not there".

A few tangential points

  • Generally in SQL we don't use spaces in column names. My answer has slightly different column names to your question, because I took the spaces out to make it easier to work with the column names.

  • It can be helpful to give tables meaningful aliases like I have done (DOC, USR, CO).

  • In your original code, here:

    OUTER APPLY
    (
        SELECT TOP 1 Country, Team FROM Table1
        WHERE Table1.[Username] = Table2.[Username]
    ) tNew
    

    you should be aware that SQL Server does not guarantee consistent result ordering unless you include an ORDER BY clause. So, if you were continuing to use this bit of code, at some point in the future the TOP 1 row returned for user abc might be the Canada one, not the US one. I'm not at all sure if that matters to you, but it might be important. In general, it is bad practice to use TOP 1 like this if you care about consistent results. You should order the results, or use DISTINCT or a row_number() function if those are more appropriate in your situation.

Upvotes: 1

Felix Pamittan
Felix Pamittan

Reputation: 31879

Do a LEFT JOIN on Table3 and the original query and then use conditional SUM to count:

SELECT
    t3.CompanyCode,
    t3.CompanyName,
    SUM(CASE WHEN t.Posted = 'TRUE' AND t.Country <> 'not there' THEN 1 ELSE 0 END)
FROM Table3 t3
LEFT JOIN (
    SELECT 
        Username, 
        CompanyCode, 
        Document,
        tnew.Country, 
        IIF(MONTH(EntryDate) = 6 AND YEAR(EntryDate) = 2014, 'TRUE', 'FALSE') AS [Posted], 
        COALESCE(tNew.Country, 'not there') AS DC,
        COALESCE(tNew.Team, 'not there') AS Team
    FROM Table2
    OUTER APPLY(
        SELECT TOP 1 Country, Team FROM Table1
        WHERE Table1.[Username] = Table2.[Username]
    ) tNew
) t
    ON t3.CompanyCode = t.CompanyCode
GROUP BY t3.CompanyCode, t3.CompanyName
ORDER BY t3.CompanyCode

TRY IT HERE

Upvotes: 2

user586399
user586399

Reputation:

You can deal with derived tables as if they are real tables:

SELECT Username, Table2.CompanyCode, Document, IIF(MONTH([Entry Date]) = 6 AND YEAR([Entry Date]) = 2014, 'TRUE', 'FALSE') AS [Posted], 
       COALESCE(tNew.Country, 'not there') AS DC, COALESCE(tNew.Team, 'not there') AS Team FROM Table2
OUTER APPLY
    (
        SELECT TOP 1 Country, Team FROM Table1
        WHERE Table1.[Username] = Table2.[Username]
    ) tNew

    JOIN Table3 ON Table2.CompanyCode = Table3.CompanyCode

Upvotes: 1

Related Questions