Reputation: 83063
I have a table called TempAllAddresses
with the following columns - ID
, Address
, State
. I want to populate a new table with Address
, State
, and Count
. Count
should represent how many records there are in the TempAllAddresses table that have an address like this address followed by a wildcard. If that made no sense, here's an example to illustrate -
Let's say I have a record like this:
ID Address State
12345 13 Phoenix NY
What I want to do is insert a new record into a new table called AddressCount
that has 13 Phoenix for the Address
, NY for the State
, and the number of records in the table that have NY as the State and an address LIKE '13 Phoenix%' for the Count
.
I want to accomplish this with an inner join of TempAllAddresses on itself. This is what I've tried, but it doesn't seem to accomplish what I'm looking for:
SELECT t1.Address, t1.State, COUNT(t2.address) As NumEntities
FROM TempAllAddresses t1
INNER JOIN TempAllAddresses t2
ON t1.state = t2.state
AND T2.Address LIKE t1.address + '%'
GROUP BY t1.State, t1.Address
The Count is definitely off, though. It should be equivalent to running "SELECT COUNT(*) FROM TempAllAddresses WHERE State=thisRecordsState and Address LIKE thisRecordsAddress + '%'
". How can I accomplish this? What am I doing wrong?
Edit:
The count seems to be off in the following way - If I have a record like I mentioned above, and then I have 2 other records that also have a state of NY, and then have addresses of "13 Phoenix Road" and "13 Phoenix Rd", then I want to get in my final table a record like this:
13 Phoenix NY 3
Instead, I seem to be getting:
13 Phoenix NY 9
I'm not quite sure what's happening here... some sort of cartesian product? Permutations...? Can anyone explain this?
Edit 2: A further edit since I seem to be misunderstood (and really need a solution :( )... Here is a query with a correlated subselect that accomplishes what I'm looking for. I'd like to do the same thing with an inner join of the table on itself rather than a subselect.
SELECT Address, State,
(SELECT Count(*)
FROM TempAllAddresses innerQry
WHERE innerQry.address LIKE outerQry.address + '%'
AND innerQry.state = outerQry.state) As NumEntities
FROM TempAllAddresses outerQry
Basically, for each record, I want to get the number of records in the table that have the same state and an address that begins with this address (or is equal to... I do want to include this address as part of the count).
Upvotes: 4
Views: 503
Reputation: 1995
Here's two solutions, one using a CROSS APPLY and the other using an INNER JOIN like you wanted originally. I hope this helps. :)
DECLARE @TempAllAddresses TABLE
(
ID INT PRIMARY KEY IDENTITY(1, 1) NOT NULL
, [Address] VARCHAR(250) NOT NULL
, [State] CHAR(2) NOT NULL
)
INSERT INTO @TempAllAddresses
VALUES ('13 Phoenix', 'NY')
, ('13 Phoenix St', 'NY')
, ('13 Phoenix Street', 'NY')
, ('1845 Test', 'TN')
, ('1337 Street', 'WA')
, ('1845 T', 'TN')
SELECT
TempAddresses.ID
, TempAddresses.[Address]
, TempAddresses.[State]
, TempAddressesCounted.AddressCount
FROM @TempAllAddresses TempAddresses
CROSS APPLY
(
SELECT
COUNT(*) AS AddressCount
FROM @TempAllAddresses TempAddressesApply
WHERE TempAddressesApply.[Address] LIKE (TempAddresses.[Address] + '%')
AND TempAddressesApply.[State] = TempAddresses.[State]
) TempAddressesCounted
SELECT
TempAddresses.ID
, TempAddresses.[Address]
, TempAddresses.[State]
, COUNT(*) AS AddressCount
FROM @TempAllAddresses TempAddresses
INNER JOIN @TempAllAddresses TempAddressesJoin
ON TempAddressesJoin.[Address] LIKE (TempAddresses.[Address] + '%')
AND TempAddressesJoin.[State] = TempAddresses.[State]
GROUP BY TempAddresses.ID
, TempAddresses.[Address]
, TempAddresses.[State]
Upvotes: 1
Reputation: 73
Have you tried analytical functions - they are often the easiest solution. I am not familiar with your table structure, but it should be something like this:
SELECT t1.Address, t1.State,
COUNT(t2.address) OVER (PARTITION BY t2.state) As NumEntities
FROM TempAllAddresses t1
INNER JOIN TempAllAddresses t2
ON t1.state = t2.state
AND T2.Address LIKE t1.address + '%'
GROUP BY t1.State, t1.Address
You can even add ORDER BY
in the OVER
clause. See Oracle FAQs for some explanation.
Upvotes: 0
Reputation: 26098
Nested GroupBy:
SQL:
SELECT Address, State, count(1) As NumEntities
FROM (
SELECT min(t1.Address) as Address, t1.State
FROM TempAllAddresses t1
INNER JOIN TempAllAddresses t2
ON t1.state = t2.state
AND T2.Address LIKE t1.address + '%'
GROUP BY t1.State, t2.Address
) GROUP By State, Address
Upvotes: 0
Reputation:
There is double counting going on when there are multiple rows with exactly the same address.
Try:
SELECT t1.Address, t1.State, COUNT(t2.address) As NumEntities
FROM (select distinct Address, State from TempAllAddresses) t1
INNER JOIN TempAllAddresses t2
ON t1.state = t2.state
AND T2.Address LIKE t1.address + '%'
GROUP BY t1.State, t1.Address
Upvotes: 0
Reputation: 4871
QUERY A:
SELECT t1.Address, t1.State, COUNT(t2.address) As NumEntities
FROM TempAllAddresses t1
INNER JOIN TempAllAddresses t2
ON t1.state = t2.state
AND T2.Address LIKE t1.address + '%'
GROUP BY t1.State, t1.Address
is not equivalent to
QUERY B:
SELECT Address, State,
(SELECT Count(*)
FROM TempAllAddresses innerQry
WHERE innerQry.address LIKE outerQry.address + '%'
AND innerQry.state = outerQry.state) As NumEntities
FROM TempAllAddresses outerQry
because B produces 1 row for each row in the original table (TempAllAddresses
), whereas A will group together rows in the original table that have the same state and address. To solve this, GROUP BY t1.ID, t1.State, t1.Address
instead.
Upvotes: 0
Reputation: 8327
EDIT: [snip old stuff]
Try this:
SELECT t1.Address, t1.State, COUNT(distinct t2.id) As NumEntities
FROM TempAllAddresses t1
INNER JOIN TempAllAddresses t2
ON t1.state = t2.state
AND T2.Address LIKE t1.address + '%'
GROUP BY t1.State, t1.Address
Upvotes: 0
Reputation: 3996
Try this instead:
SELECT Orig_Address, State, COUNT(Similar_Address)
From
(
SELECT t1.Address Orig_Address,
t1.State State,
t2.address Similar_Address
FROM TempAllAddresses t1
INNER JOIN TempAllAddresses t2
ON t1.state = t2.state
AND T2.Address LIKE t1.address + '%'
AND t1.address <> t2.address
)
GROUP BY State, Orig_Address
EDIT: forgot to include the difference between t1.address and t2.address, as @Spiny Norman said, since you probably do not want to compare an address to itself.
HTH
Upvotes: 0