Reputation: 49
Looking for some guidance on where to start with a project that likely uses pattern matching in some form. Currently I have a table with a column of property identification numbers that resembles below:
PropertyID| Property Name|
A001 Jefferson
A002 Madison
A002-01 Madison Outhouse
A002-02 James Madison Statue
A003 Franklin
A004 Jackson
A004-field2 Jackson Fields
... ...
A2001 Tubman
The PropertyID field is the primary key for the dataset. The string before the hyphen represents a parent property, while the values after the hyphen indicate sub-properties associated with the parent. What I'm attempting to create is a new column called "Parent Property Name" That would look like below:
PropertyID| Property Name| | Parent Property Name|
A001 Jefferson Jefferson
A002 Madison Madison
A002-01 Madison Outhouse Madison
A002-02 James Madison Statue Madison
A003 Franklin Franklin
A004 Jackson Jackson
A004-field2 Jackson Fields Jackson
... ...
A2001 Tubman Tubman
Basically what I'm trying to accomplish is if the string is unique before the hyphen/does not have a hyphen return "Property Name" as "Parent Property Name", If there is a hyphen return the "Property Name" value of of the parent (i.e. matching PropertyID before hyphen that does not contain a hyphen itself.
Any thoughts on how I should approach this would be greatly appreciated.
Upvotes: 0
Views: 817
Reputation: 603
This can be achieved by a JOIN as follows: We are extracting the Substring part of the PropertyID before the "-" symbol and joining back to the same table with this. IF a record does not have "-" symbol, induce one so the substring part always returns something and also the CHARINDEX() -1 part does not return a negative value to the SUBSTRING function for length argument.
DECLARE @Temp AS TABLE ( PropertyID NVARCHAR(MAX), PropertyName NVARCHAR(MAX))
INSERT INTO @Temp ( PropertyID ,
PropertyName )
SELECT 'A001', 'Jefferson' UNION ALL
SELECT 'A002', 'Madison' UNION ALL
SELECT 'A002-01', 'Madison Outhouse' UNION ALL
SELECT 'A002-02', 'James Madison Statue' UNION ALL
SELECT 'A003', 'Franklin' UNION ALL
SELECT 'A004', 'Jackson' UNION ALL
SELECT 'A004-field2', 'Jackson Fields' UNION ALL
SELECT 'A2001', 'Tubman'
SELECT *, CHARINDEX('-',T1.PropertyID)
FROM @Temp T1 LEFT JOIN @Temp T2
ON SUBSTRING(T1.PropertyID,1,CHARINDEX('-',T1.PropertyID+'-')-1) = T2.PropertyID
Upvotes: 0
Reputation: 24470
This should do what you're after: SQL Fiddle
with CTE as
(
select PropertyId
, PropertyName
, PropertyId ParentId
, PropertyName ParentName
, PropertyId AncestorId
, PropertyName AncestorName
from Property
where PropertyId not like '%-%'
union all
select b.PropertyId
, b.PropertyName
, a.PropertyId ParentId
, a.PropertyName ParentName
, a.AncestorId
, a.AncestorName
from CTE a
inner join Property b
on b.PropertyId like a.PropertyId + '-%'
and b.PropertyId not like a.PropertyId + '-%-%' --avoid having grandchildren too soon
)
select *
from CTE
order by PropertyId
The first block of code gets all root elements; i.e. those without hyphens. These values are their own parents & ancestors. These are also the only values that will appear as ancestors of other results, since they're at the top level.
select PropertyId
, PropertyName
, PropertyId ParentId
, PropertyName ParentName
, PropertyId AncestorId
, PropertyName AncestorName
from Property
where PropertyId not like '%-%'
We then use the recursive bit to capture the child elements. This takes those elements from the table whose PropertyId is the same as the table's existing records' PropertyId, only with a hyphen and a new value on the end. We exclude those which have a second hyphen (or more) after this, to ensure we don't list a great grandchild as a child.
In this same block, we assign the values differently; rather than Property, Parent, and Ancestor receiving the same values, Property is the current property, Parent is the PropertyId of the parent record, and Ancestor is the Parent record's Ancestor.
select b.PropertyId
, b.PropertyName
, a.PropertyId ParentId
, a.PropertyName ParentName
, a.AncestorId
, a.AncestorName
from CTE a
inner join Property b
on b.PropertyId like a.PropertyId + '-%'
and b.PropertyId not like a.PropertyId + '-%-%'
I wondered if my use of LIKE
was appropriate, or if alternatives such as CHARINDEX
may perform better. In investigating I found this post which shows that CharIndex is best, but per the comments others have found different results, and running this myself I also saw inconsistent behaviours on different machines. So I suspect this is a scenario prone to premature optimisation; i.e. make sure what's there works; then if you need to optimise it test those optimisations out on the environment this code will be run on to ensure the stats are relevant.
That said, here's a version using CHARINDEX
, should you wish to test / compare. This is based on @GordonLinoff's solution. SQL Fiddle
with CTE as
(
select PropertyId
, PropertyName
, PropertyId ParentId
, PropertyName ParentName
, PropertyId AncestorId
, PropertyName AncestorName
, 0 LastHyphenCharIndex
from Property
where charindex('-', PropertyId) = 0
union all
select b.PropertyId
, b.PropertyName
, a.PropertyId
, a.PropertyName
, a.AncestorId
, a.AncestorName
, charindex('-',b.PropertyId, a.LastHyphenCharIndex)
from CTE a
inner join Property b
--on left(b.PropertyId, charindex('-',b.PropertyId, a.LastHyphenCharIndex)) = a.PropertyId + '-'
on left(b.PropertyId, nullif(charindex('-',b.PropertyId, a.LastHyphenCharIndex),0)-1) = a.PropertyId
)
select *
from CTE
order by PropertyId
Upvotes: 0
Reputation: 1270513
I think you can do this using a join
: SQL Fiddle
select t.*, tparent.PropertyId
from t join
t tparent
on left(t.PropertyId, charindex('-', t.PropertyId + '-') - 1) = tparent.PropertyId;
You can also use first_value()
: SQL Fiddle
select t.*,
first_value(propertyId) over (partition by left(t.PropertyId, charindex('-', t.PropertyId + '-') - 1)
order by PropertyId) as parent_propertyid
from t;
Upvotes: 1
Reputation: 50173
You can use subquery
:
select *,
(case when charindex ('-', ltrim(rtrim(PropertyID))) > 0
then (select top 1 t1.PropertyName
from table t1
where t1.PropertyID = left(t.PropertyId, charindex('-', t.PropertyId + '-') - 1)
) else PropertyName
end) as [Parent Property Name]
from table t;
Upvotes: 0