Reputation: 5384
I have two tables:
Keywords
where I store unique keywords.
CREATE TABLE [dbo].[Keywords]
[KeywordID] [int] IDENTITY(1,1) NOT NULL,
[Description] [varchar](200) NOT NULL
select * from Keywords
1 MVC
2 HTML
3 C#
4 ASP.NET MVC
5 MVC3
KeywordSynonymous
where I point certain keywords to be synonymous of others.
CREATE TABLE [dbo].[KeywordSynonymous]
[KeywordID] [int] NOT NULL,
[KeywordSynonymousID] [int] NOT NULL
Both fields are FK to the Keywords table and both combined fields are used as PK on this table.
In here I would to state that 'MVC' and 'MVC3' are synonymous, and maybe that 'MVC3' and 'ASP.NET MVC' are also synonymous.
select * from KeywordSynonymous
1 5
5 4
CONCEPTS
1)
If Keyword 'MVC' is synonymous of 'MVC3'
and 'MVC3' is synonymous of 'ASP.NET MVC'
then conceptually MVC is ALSO synonymous of 'ASP.NET MVC'
2)
If Keyword 'MVC' is synonymous of 'MVC3'
then it's also true the VICEVERSA and that 'MVC3 is synonymous of 'MVC'
QUESTION
Imagine on my website I'm doing a search and the user could type anything, but for our example he could type 'MVC' or 'MVC3' ...
How can I get with one SQL statement ALL the possible synonymous ensuring that both Concept 1 and 2 are met?
Meaning that:
>> if the user types 'MVC', my sql should return 'MVC, MVC3', 'ASP.NET MVC'.
>> if the user types 'MVC3', my sql should return 'MVC, MVC3', 'ASP.NET MVC'.
>> if the user types 'ASP.NETMVC', my sql should return 'MVC, MVC3', 'ASP.NET MVC'.
================================================================
UPDATE
I feel I have to add a bit about the website i am developing. It's a market place where young professionals will be able to sell their services using new ways to promote themselves.
Since we want to allow any profession, I cannot foresee at this time what "keywords" will define better each profession. So I will allow the users to define those keywords.
My problem is that I need to allow UserX to search for these young professionals by profession and keywords. I need to allow these users to match their searched keywords to existing keywords so that current and future searches will automatically match the right profiles.
That's why i don't have all the keywords in advance and surely cannot identify future keywords and their respective synonymous. I also cannot expect the users to match ALL existing keywords to all relevant ones ...so that's why i need Concept 1 working.
================================================================
STACKOVERFLOW TAGS
The Keyword's module should work very similar to StackOverflow Tags (Keywords) where if I set the TAGS to be SQL, you guys that are searching for TSQL or SQL SERVER ...should also see this post.
:-)
Upvotes: 1
Views: 852
Reputation: 5384
OK, so how about this one:
DECLARE @TempKeywordID TABLE (KeywordID int)
INSERT INTO @TempKeywordID (KeywordID)(select KeywordID from Keywords where [Description] = @SearchKeyword)
DECLARE @intFlag INT
SET @intFlag = 1
WHILE (@intFlag <=(Select Count(KeywordSynonymousID) from KeywordSynonymous)) --Loop for all records in KeywordSynonymous
BEGIN
INSERT INTO @TempKeywordID (KeywordID)(Select KeywordSynonymousID from KeywordSynonymous where KeywordID in (Select KeywordID from @TempKeywordID))
INSERT INTO @TempKeywordID (KeywordID)(Select KeywordID from KeywordSynonymous where KeywordSynonymousID in (Select KeywordID from @TempKeywordID))
SET @intFlag = @intFlag + 1
END
SELECT * FROM Keywords WHERE KeywordID IN (SELECT * FROM @TempKeywordID)
Upvotes: 0
Reputation: 15319
You should definitely use Common Table Expressions. This is ideal solution for your problem, because it does not change your current DB schema and most importantly, CTE is an elegant and the logical solution due to the recursiveness that your KeywordSynonymous
table has.
To accomplish this, its better to first create a view that selects all rows in KeywordSynonymous in both directions. In your case, this table returns the rows
select * from KeywordSynonymous
1 5
5 4
and what the view below will do is to show
select * from KeywordSynonymousAll
1 5 0
2 NULL 0
3 NULL 0
4 NULL 0
4 5 1
5 1 1
5 4 0
This view is the data structure that will simplify the recursive query. It adds a third column to identify when an reversion has been made. This is required to satisfy your concept number 2.
So, here it is the view:
create view KeywordSynonymousAll as
select KeywordID, KeywordSynonymousID, 0 as reversed
from KeywordSynonymous
union
select K.KeywordID, null as KeywordSynonymousID, 0 as reversed
from Keywords K
where not exists(select null
from KeywordSynonymous
where KeywordID = K.KeywordID)
union
select KeywordSynonymousID, KeywordID, 1 as reversed
from KeywordSynonymous
And the query
declare @search varchar(200);
set @search = 'MVC3'; -- TEST HERE for different search keywords
with Synonymous (keywordID, SynKeywordID) as (
-- initial state: Get the keywordId and KeywordSynonymousID for the description as @search
select K.keywordID, KS.KeywordSynonymousID
from Keywords K
inner join KeywordSynonymous KS on KS.KeywordID = K.keywordId
where K.Description = @search
union all
-- also initial state but with reversed columns (because we want lookup in both directions)
select KS.KeywordSynonymousID, K.keywordID
from Keywords K
inner join KeywordSynonymous KS on KS.KeywordSynonymousID = K.keywordId
where K.Description = @search
union all
select S.SynKeywordID, KS.KeywordSynonymousID
from Synonymous S
inner join KeywordSynonymousAll KS on KS.KeywordID = S.SynKeywordID
where KS.reversed = 0 -- to avoid infinite recursion
union all
select KS.KeywordSynonymousID, S.SynKeywordID
from Synonymous S
inner join KeywordSynonymousAll KS on KS.KeywordID = S.KeywordID
where KS.reversed = 1 -- to avoid infinite recursion
)
-- finally output the result
select distinct K.Description
from Synonymous S
inner join Keywords K on K.KeywordID = S.keywordID
For set @search = 'MVC3'
, the result set is
ASP.NET MVC
MVC
MVC3
The same result set happens for set @search = 'MVC'
and set @search = 'ASP.NET MVC'
For set @search = 'C#'
and set @search = 'HTML'
you get nothing
EDIT
In my previous post, I said that result set would be empty for C# and HTML. If you want also to return those values, then change the last part of query to:
-- finally output the result
select distinct T.Description
from (
select K.Description
from Synonymous S
inner join Keywords K on K.KeywordID = S.keywordID
union
select Description
from Keywords
where Description = @search) T
Now, for set @search = 'C#'
, the result set is
C#
and for set @search = 'HTML'
, the result set is
HTML
Hope this helps
Upvotes: 3
Reputation: 115530
Beacuse of your conditions (concepts), the Synonymous table is not normalized. This is main source of your problem and the complex queries/triggers needed to solve it.
I would keep the Keyword table:
CREATE TABLE [dbo].[Keywords]
[KeywordID] [int] IDENTITY(1,1) NOT NULL,
[Description] [varchar](200) NOT NULL
select * from Keywords
1 MVC
2 HTML
3 C#
4 ASP.NET MVC
5 MVC3
6 C sharp
and make the Synonymous table differently:
CREATE TABLE [dbo].[KeywordSynonymity]
[SynonymityID] [int] NOT NULL,
[KeywordID] [int] NOT NULL
select * from KeywordSynonymous
1 1 --- for the 1 (MVC) and 5 (MVC3)
1 5 --- being synonymous
2 3 --- for the 3 (C#) and 6 (C sharp)
2 6 --- being synonymous
Then to add that MVC3
and ASP.NET MVC
are also synonymous, you'll just have to add a row (1,4) in the Synonymity table.
If then - for unknown reasons but lets assume nonetheless that - you want to combine MVC3
and C#
as synonymous, you'll have to change all rows with SynonymityID=2 (synonymous to C#) to =1 (synonymous to MVC).
But all your queries will be simpler as the table is normalized.
Upvotes: 2
Reputation: 26280
1 is called Symmetric Relation and 2 is called Transitive Relation.
I suggest that you maintain them as you add new keywords. You can do it this way. When a keyword is added to database, if there is no synonym to it already, designate it "master" keyword. Otherwise link the new keyword to the existing master keyword.
Here is a stored procedure to add the new keywords this way:
CREATE PROCEDURE [dbo].[AddKeyword]
@newKeyword [varchar](200),
@synonymKeyword [varchar](200) = NULL
AS
BEGIN
SET NOCOUNT ON;
set transaction isolation level serializable
begin transaction
if EXISTS (select 1 from Keywords where [Description] = @newKeyword)
begin
commit transaction
return
end
declare @masterKeywordId int
select
@masterKeywordId = ISNULL(KeywordSynonymous.KeywordID, Keywords.KeywordID)
from
Keywords
left join
KeywordSynonymous
on
Keywords.KeywordID = KeywordSynonymous.KeywordSynonymousID
where
[Description] = @synonymKeyword
insert into Keywords VALUES (@newKeyword)
if @masterKeywordId is not null
insert into KeywordSynonymous VALUES (@masterKeywordId,SCOPE_IDENTITY())
commit transaction
END
In this stored procedure you pass a new keyword to add and optionally you also pass a known synonym. This synonym does not have to be "master". If it's present it's "master" keyword id will be looked up, and the newly created keyword will be linked against that "master" id.
And this is how you select them all in the end:
CREATE PROCEDURE [dbo].[GetSynonymKeywords]
@keyword [varchar](200)
AS
BEGIN
SET NOCOUNT ON;
declare @masterKeywordId int
select
@masterKeywordId = ISNULL(KeywordSynonymous.KeywordID, Keywords.KeywordID)
from
Keywords
left join
KeywordSynonymous
on
Keywords.KeywordID = KeywordSynonymous.KeywordSynonymousID
where
[Description] = @keyword
select
KeywordId,[Description]
from
Keywords
where
KeywordId = @masterKeywordId
union
select
Keywords.KeywordId,[Description]
from
KeywordSynonymous
join
Keywords
on
KeywordSynonymous.KeywordSynonymousID = Keywords.KeywordId
where
KeywordSynonymous.KeywordId = @masterKeywordId
END
This stored procedure first finds the keyword ID given a passed keyword. Then it looks up the "master" keyword for this id. Then it returns the master keyword and all keywords that are synonyms to this master keyword.
Example of adding new words:
EXEC [dbo].[AddKeyword] @newKeyword = N'MVC'
EXEC [dbo].[AddKeyword] @newKeyword = N'ASP.NET MVC', @synonymKeyword = 'MVC'
EXEC [dbo].[AddKeyword] @newKeyword = N'MVC3', @synonymKeyword = 'ASP.NET MVC'
Note that in the third line you could have specified 'MVC' as the synonym, it would have worked just as well.
Example of retrieving keywords:
[dbo].[GetSynonymKeywords] @keyword = N'MVC3'
[dbo].[GetSynonymKeywords] @keyword = N'ASP.NET MVC'
[dbo].[GetSynonymKeywords] @keyword = N'MVC3'
All three returns the same list of values.
I'm putting isolation level to serialized in the AddKeyword SP to make sure there is no concurrency problem feel free to modify it according your concurrency model, serialized might not be appropriate for you.
You also can pull out GetMasterId (the block that appears in both SPs) into an UDF if you so inclined, or do any other modification that fits your particular scenario.
Upvotes: 1
Reputation: 16673
To achieve at least #1 you could use recursive Common Table Expressions (CTE)
with the definition here
Upvotes: 2