Reputation: 1298
I have two tables that I've create an inner join between and for some strange reason it's returning more records that I expected.
Table 1 has 6 records in it including 2 records of 'Version 1', 2 records of 'Version 2', and 2 records of 'Version 3'. Table 2 has 10 records in it including 10 records of 'Version 1'. If I do a inner join between them on the Version fields I would have thought I'd only get 2 records: Table 1 and Table 2 only match on 'Version 1' of which there are only 2 in Table 1.
However, the following query returns 20 records (if anything I could of understood 60 records if it's converting to cross join, which also would be wrong):
SELECT * FROM TABLE2 T2 INNER JOIN TABLE1 T1 on T2.VersionKey = T1.VersionKey
Can anyone help explain what's happening?
I have another set of tables that are VERY similar which are giving me the expect results and I can't see the difference here.
Here are the basic table structures and data.
CREATE TABLE [dbo].[Table1] (
[ID] [int] IDENTITY(1,1) NOT NULL,
[VersionKey] [int] NULL,
CONSTRAINT [PK_Table1] PRIMARY KEY CLUSTERED ([ID] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[Table2] (
[ID] [int] IDENTITY(1,1) NOT NULL,
[VersionKey] [int] NULL,
CONSTRAINT [PK_Table2] PRIMARY KEY CLUSTERED ([ID] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
INSERT INTO [Table1] (VersionKey)
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 3
GO
INSERT INTO [Table2] (VersionKey)
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION
SELECT 1
GO
Upvotes: 1
Views: 913
Reputation: 1298
Credit goes to Martin above, but to clarify the answer for others following...
As eloquently explained above by others (thanks again for your help) the results are correct and, using my example, it should produce 20 records. However, here's the explanation as to why it's happening which was my original question.
The reason for the duplicate records is because the table you are joining to has duplicate records in the field you are joining on.
That solves the mystery for why I didn't see this in my other tables which contain the 2100 and 3 records respectively. There are no duplicates in the table with 3 records, hence, no duplicates in the final query results.
Why you normally may not see this is because you may typically be joining off the indexed primary key field like an ID field versus what I've done here. In those cases, since those records values will always likely be unique you'll never get duplicates.
Thanks again everyone for your help.
Here's my corrected code from above again with the bug fixed to reproduce the results.
CREATE TABLE [dbo].[Table1] (
[ID] [int] IDENTITY(1,1) NOT NULL,
[VersionKey] [int] NULL,
CONSTRAINT [PK_Table1] PRIMARY KEY CLUSTERED ([ID] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[Table2] (
[ID] [int] IDENTITY(1,1) NOT NULL,
[VersionKey] [int] NULL,
CONSTRAINT [PK_Table2] PRIMARY KEY CLUSTERED ([ID] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
INSERT INTO [Table1] (VersionKey)
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 3
GO
INSERT INTO [Table2] (VersionKey)
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1
GO
SELECT * FROM TABLE2 T2 INNER JOIN TABLE1 T1 on T2.VersionKey = T1.VersionKey
Upvotes: 0
Reputation: 452977
Your setup script has a bug.
INSERT INTO [Table2] (VersionKey)
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION
SELECT 1
That final UNION
needs to be UNION ALL
or you will remove duplicates and only insert 1 row not 10.
A fixed script and query results on SEDE is here https://data.stackexchange.com/stackoverflow/query/478676/fixed-script
It returns 20 rows, this is expected and correct.
Logically each row in the left table is compared to every row in the right table. If the join predicate evaluates to true then the match is preserved.
If the inner join predicate is something that is always true, e.g On 1=1
every row in the left ends up successfully matching every row on the right so you just end up with a cross join.
In your case both rows in Table1 with a VersionKey of 1 match 10 rows in Table2 so together they contribute 20 rows. The remaining rows in Table1 don't successfully join to anything and contribute 0 rows.
Upvotes: 1