Reputation: 133

SQL Join Tables from Top 1

I have done much searching on StackOverflow and have been unable to adapt other SQL Join "Top 1" answers to my own case, so I am hoping someone else can point out what I am missing.

TableA

UniqueTableID    TaskID     ReferenceID
1                Task_1     Group_1
2                Task_2     Group_2
3                Task_2     Group_3
4                Task_3     Group_4
5                Task_3     Group_5
6                Task_4     Group_6

TableB

GroupID     GroupName
Group_1     Group_AAA
Group_2     Group_BBB
Group_3     Group_CCC
Group_4     Group_DDD
Group_5     Group_EEE
Group_6     Group_FFF

What I want is to return a GroupName for each TaskID (TableA.ReferenceID = TableB.GroupID). As for the TaskID's with multiple GroupName references, I don't care which one is returned, so I attempt to use TOP 1.

This query (and many variations thereof):

SELECT A.TaskID, B.GroupName
FROM [TableA] A
JOIN [TableB] B
ON B.GroupID = (
    SELECT TOP 1 [GroupID] 
    FROM [TableB] 
    WHERE [GroupID] = A.ReferenceID
)

Gives me this table:

Result

TaskID      GroupName
Task_1      Group_AAA
Task_2      Group_BBB
Task_2      Group_CCC
Task_3      Group_DDD
Task_3      Group_EEE
Task_4      Group_FFF

How can I get this instead:

Wanted Result

TaskID      GroupName
Task_1      Group_AAA
Task_2      Group_BBB
Task_3      Group_DDD
Task_4      Group_FFF

Thank you for your time and help!

Upvotes: 2

Answers (3)

Mohammed

Reputation: 313

Use this:

declare @data1 as table(UniqueTableID int primary key ,TaskID varchar(50),ReferenceID varchar(50))
insert into @data1 values (1,'Task_1','Group_1')
insert into @data1 values (2,'Task_2','Group_2')
insert into @data1 values (3,'Task_2','Group_3')
insert into @data1 values (4,'Task_3','Group_4')
insert into @data1 values (5,'Task_3','Group_5')
insert into @data1 values (6,'Task_4','Group_6')

declare @data2 as table(GroupID varchar(50),GroupName varchar(50))
insert into @data2 values ('Group_1','Group_AAA')
insert into @data2 values ('Group_2','Group_BBB')
insert into @data2 values ('Group_3','Group_CCC')
insert into @data2 values ('Group_4','Group_DDD')
insert into @data2 values ('Group_5','Group_EEE')
insert into @data2 values ('Group_6','Group_FFF')

SELECT B.TaskID,A.GroupName
FROM @data2 A INNER JOIN (select distinct TaskID,(select top 1 ReferenceID from @data1 d2 where d2.TaskID = d1.TaskID)as ReferenceID from @data1 d1) B
ON A.GroupID = B.ReferenceID

Upvotes: 0

Hogan

Reputation: 70523

I think the best way to do this is with row number.

While you can do a group by on the main table (like Ron Smith's answer), you have to group by all columns which are not GroupName. Joining to a table using row number you don't.

Here is an example of how to do it for your data:

SELECT A.TaskID, s.GroupName
FROM [TableA] A
JOIN (SELECT [GroupID],
             [GroupName],
             ROW_NUMBER() OVER (Partition By GroupID ORDER BY GroupName) AS RN
) as subselect s
ON A.ReferenceID = s.GroupID AND s.RN = 1

To explain:

You are using row number to give a unique number to each item in the sub-query. You can pick what to partition by (in this case GroupID) and what to order by (in this case GroupName). Because you only select the items with RN = 1 you will always get the lowest GroupName with this query. But you can change the order by if you want.

As I've set it up you will get the same results as your example, but I think it is clear how to change it.

Upvotes: 2

Ron Smith

Reputation: 3266

If you want the first GroupName alphabetically for each task, you can use min:

SELECT
    A.TaskID,
    min(B.GroupName) as GroupName
FROM [TableA] A
    JOIN [TableB] B
    ON B.GroupID = A.ReferenceID
GROUP BY A.TaskID