Reputation: 4169
I need to select only the first row from a query that joins tables A and B. On table B exist multiple records with same name. There are not identifiers in any of the two tables. I cannot change the scheme either because I do not own the DB.
TABLE A
NAME
TABLE B
NAME
DATA1
DATA2
Select Distinct A.NAME,B.DATA1,B.DATA2
From A
Inner Join B on A.NAME = B.NAME
This gives me
NAME DATA1 DATA2
sameName 1 2
sameName 1 3
otherName 5 7
otherName 8 9
but I need to retrieve only one row per name
NAME DATA1 DATA2
sameName 1 2
otherName 5 7
I was able to do this by adding the result into a temp table with a identity column and then select the minimum id per name.
The problem here is that I require to do this in one single statement.
Upvotes: 9
Views: 39874
Reputation: 1
The tag of this question indicates that it would be a solution for DB2, but this is very similar to MS-SQL server, if so try these solutions:
Using CROSS, it will be possible to display what exists only in both tables
select A.*, B.DATA1, B.DATA2
from A
cross apply (select top 1 * from B where B.name = A.name) B
But it is possible to change to OUTER to display what exists in A without the obligation to exist in B
select A.*, B.DATA1, B.DATA2
from A
OUTER apply (select top 1 * from B where B.name = A.name) B
In the structure of the apply statement, it would also be possible to include an ORDER statement, since there is no indication of the order of exits in table B
Upvotes: 0
Reputation: 2169
You can use row number to get one row for each name, try something like below
Select name,data1,data2 from
(Select A.NAME,B.DATA1,B.DATA2,row_number() over(partitioj by a.name order by a.name) rn
From A
Inner Join B on A.NAME = B.NAME) where rn=1
Upvotes: 0
Reputation: 3134
SELECT A.NAME, bb.DATA1, bb.DATA2
From A Inner Join B on A.NAME = B.NAME
WHERE B.DATA1 = (SELECT MIN(DATA1) FROM B WHERE NAME = A.NAME)
This will give your desired result, providing B.DATA1 values are unique within the set relating to table A.
If they're not unique, the only other way I know is using CROSS APPLY in MSSQL 2005 and above.
Upvotes: 0
Reputation: 5431
Using a GROUP BY may get you part way there, but beware. If you do something like this:
Select A.NAME, min(B.DATA1), min(B.DATA2)
From A Inner Join B on A.NAME = B.NAME
Group by A.NAME;
You will get the result you are looking for:
NAME DATA1 DATA2
sameName 1 2
otherName 5 7
But only because of the data you are testing with. If you change the data, so that instead of:
otherName 8 9
you had:
otherName 8 4
It would return:
NAME DATA1 DATA2
sameName 1 2
otherName 5 4
Note that otherName does not return DATA1 and DATA2 from the same record!
Update: A self-join with a comparison on one of the data values may help you, such as:
SELECT a.*, b.* FROM a,b
LEFT JOIN b b2 ON b.name = b2.name AND b.data2 < b2.data2
WHERE a.name = b.name AND b2.data2 IS NOT NULL;
However, this will only work if the values in DATA2 are unique per NAME.
Upvotes: 7
Reputation: 23798
This will work:
with temp as (
select A.NAME, B.DATA1, B.DATA2,
row_number() over (partition by A.NAME order by A.NAME) as rownum
from TABLEA A inner join TABLEB B
on A.NAME = B.NAME
)
select NAME, DATA1, DATA2 from temp where rownum = 1
If you want to select the least value of data1 and within it data2, then use this variation:
with temp as (
select A.NAME, B.DATA1, B.DATA2,
row_number() over (partition by A.NAME order by B.DATA1, B.DATA2) as rownum
from TABLEA A inner join TABLEB B
on A.NAME = B.NAME
)
select NAME, DATA1, DATA2 from temp where rownum = 1
Both the queries will give one row per name.
Upvotes: 10
Reputation: 11991
Try to dedupe B like this
SELECT A.NAME, bb.DATA1, bb.DATA2 FROM A JOIN B bb ON A.NAME = B.NAME WHERE NOT EXISTS (SELECT * FROM B WHERE NAME = bb.NAME AND (DATA1 > bb.DATA1 OR DATA1 = bb.DATA1 AND DATA2 > bb.DATA2))
Add more OR clauses if more DATAx columns exist.
If A contains duplicates too, simply use DISTINCT as in the OP.
Upvotes: 0
Reputation: 22254
If you can add to a temp table and then query from that, you can do it in one go.
WITH T AS (temp table select), RN AS (select min row-numbers from T) SELECT T.NAME, T.DATA1, T.DATA2 FROM T INNER JOIN RN on T.row_number = RN.row_number
There are many other ways to write this, but that's how I've been doing similar things.
Upvotes: 0
Reputation: 19738
Not sure if this will solve your problem or not, but you could try using the GROUP BY clause and group by one of the name columns.
Upvotes: 0