Reputation: 11479
I have two tables that have a one-to-many relationship. Table A has an ID column and table B has an A_ID column.
I want my output to contain one row per record in table A. This row should have some values from A, but also some from one of the rows in B. For each record in table A there are 0, 1, or more records in B - I want to be able to sort these and return just one (e.g. the record with the largest column Z).
Table A - Artists
╔═════╦════════════════╦════════════════════════╗
║ ID ║ Artist ║ Real name ║
╠═════╬════════════════╬════════════════════════╣
║ 383 ║ Bob Dylan ║ Robert Allen Zimmerman ║
║ 395 ║ Marilyn Manson ║ Brian Hugh Warner ║
║ 402 ║ David Bowie ║ David Robert Jones ║
╚═════╩════════════════╩════════════════════════╝
Table B - Tracks
╔══════╦═══════════╦══════════════════════╦════════════════╗
║ ID ║ Artist_ID ║ Track Name ║ Chart position ║
╠══════╬═══════════╬══════════════════════╬════════════════╣
║ 1458 ║ 383 ║ Maggie's Farm ║ 22 ║
║ 1598 ║ 383 ║ Like a Rolling Stone ║ 4 ║
║ 1674 ║ 395 ║ Personal Jesus ║ 13 ║
║ 1782 ║ 383 ║ Lay Lady Lay ║ 5 ║
╚══════╩═══════════╩══════════════════════╩════════════════╝
Using the above tables as an example, we might want to return the following:
╔════════════════╦══════════════════════╦══════════╗
║ Artist ║ Top charting ║ Position ║
╠════════════════╬══════════════════════╬══════════╣
║ Bob Dylan ║ Like a Rolling Stone ║ 4 ║
║ Marilyn Manson ║ Personal Jesus ║ 13 ║
║ David Bowie ║ null ║ null ║
╚════════════════╩══════════════════════╩══════════╝
Note that the result set contains two columns from the child table, but only for the row with the minimum chart position for each Artist_ID. (And since poor old Bowie doesn't have any tracks in table B, he gets null values in the columns from the child table.)
If I wanted to return a single column from B, I'd use a subquery in the select statement. (I suppose I could include multiple subqueries in the select statement, but this would presumably result in a severe performance hit and doesn't seem very elegant.)
Because I need to return multiple columns from the subquery, I could JOIN
it instead and treat it like a table. Unfortunately, this seems to lead to duplicate rows in my output - i.e. when there are multiple child records, each gets a row in the output.
What's the best way around this?
Upvotes: 0
Views: 407
Reputation: 7692
Something a little bit less cumbersome and more elegant:
select top (1) with ties *
from artists a
left join tracks t on a.Id = t.ArtistId
order by row_number() over(partition by a.Id order by t.ChartPos);
Upvotes: 1
Reputation: 2317
This will work, you just need to create a subquery to get the row number for each track for a given artist and select the first one (i.e. where the row number is 1). Copy and paste the below and it will work, then you can swap things out/edit for your purposes:
DECLARE @TableA TABLE (ID INT, Artist VARCHAR(100), RealName VARCHAR(100))
DECLARE @TableB TABLE (ID INT, Artist_ID INT, TrackName VARCHAR(100), ChartPosition INT)
INSERT INTO @TableA (ID, Artist, RealName)
VALUES (383, 'Bob Dylan', 'Robert Allen Zimmerman'),
(395, 'Marilyn Manson', 'Brian Hugh Warner'),
(402, 'David Bowie','David Robert Jones')
INSERT INTO @TableB (ID, Artist_ID, TrackName, ChartPosition)
VALUES
(1458, 383, 'Maggie''s Farm', 22),
(1598, 383, 'Like a Rolling Stone', 4),
(1674, 395, 'Personal Jesus', 13),
(1782, 383, 'Lay Lady Lay', 5)
SELECT
[Artist],
[Top Charting],
[Position]
FROM ( SELECT
ta.Artist AS [Artist],
tb.TrackName AS [Top Charting],
tb.ChartPosition AS [Position],
ROW_NUMBER() OVER (PARTITION BY ta.Artist ORDER BY tb.ChartPosition) [RowNum]
FROM @TableA ta
LEFT JOIN @TableB tb ON ta.ID = tb.Artist_ID
) Result
WHERE Result.RowNum = 1
Upvotes: 1
Reputation: 247630
Since you want to return the TOP chart_position
for each artist
, I would suggest looking at using a windowing function similar to row_number()
.
This function will create a unique sequence for each artist. This sequence can be generated in a specific order, in your case you will order by the
chart_position`. You'll then return the row with the "TOP" value or where the sequence equals 1.
The code would be similar to:
select artist,
Top_Chart = track_name,
Position = chart_position
from
(
select a.artist,
t.track_name,
t.chart_position,
seq = row_number() over(partition by a.id order by t.chart_position)
from artists a
left join tracks t
on a.id = t.artist_id
) d
where seq = 1;
If you aren't using a version of SQL Server that supports windowing functions (like SQL Server 2000), then you could write this query using a self-join to the tracks
table with an aggregate function to get the highest charting track, then join to the artists.
select a.artist,
Top_Chart = t.track_name,
Position = t.chart_position
from artists a
left join
(
select t.artist_id,
t.track_name,
t.chart_position
from tracks t
inner join
(
select chart_position = min(chart_position),
artist_id
from tracks
group by artist_id
) m
on t.artist_id = m.artist_id
and t.chart_position = m.chart_position
) t
on a.id = t.artist_id;
See SQL Fiddle with Demo. They both give the same result.
Upvotes: 1