Reputation: 13
I have a word list in a variable. The words are comma delimited. I am trying to save them to individual record in a database. I found another question which does this and works, but it saves every word. I tried to modify it so that I only save unique words, and count duplicate as I go. I think the logic below is correct, but my syntax is not working.
** is not concatenating the next word onto the end of the @WordsUsed variable
** doesn't seem to be doing anything at all ... I'm getting no records written to the words table
The entire code follows:
declare @SplitOn nvarchar(5) = ','
BEGIN
DECLARE @split_on_len INT = LEN(@SplitOn)
DECLARE @start_at INT = 1
DECLARE @end_at INT
DECLARE @data_len INT
DECLARE @WordsUsed varchar(max)
DECLARE @FoundWord int
DECLARE @Word varchar(100)
Set @WordsUsed = '**'
WHILE 1=1
BEGIN
SET @end_at = CHARINDEX(@SplitOn,@txt1,@start_at)
SET @data_len = CASE @end_at WHEN 0 THEN LEN(@txt1) ELSE @end_at-@start_at END
set @Word = SUBSTRING(@txt1,@start_at,@data_len)
SET @FoundWord = CHARINDEX('*' & @Word & '*', @WordsUsed)
if @FoundWord = 0 SET @WordsUsed = @WordsUsed & '*' & @Word & '*'
if @FoundWord = 0 INSERT INTO mydata.dbo.words (ProjectNumber, WordCount, Word) VALUES( '5', '1', @Word )
if @FoundWord > 0 Update mydata.dbo.words set WordCount = WordCount + 1 where projectnumber = 5 and word = @word
IF @end_at = 0 BREAK
SET @start_at = @end_at + @split_on_len
END
RETURN
END;
Upvotes: 0
Views: 565
Reputation: 5084
Here's a function you can try. It splits an unlimited-sized string based on the delimiter of your choice. The output of the function is a table - so you can then select distinct from that to store your word list. You'd call it something like:
insert YourWordlistTable ( Word ) --> just making up table/column names here
select distinct Data
from dbo.StringSplit( @yourVar, ',' )
Here's the definition of the function:
create function dbo.StringSplit
(
@string nvarchar( max ),
@delimiter nvarchar( 255 )
)
returns @t table ( Id int, Data nvarchar( 4000 ) )
as begin
with Split( startPosition, endPosition )
as
(
select
cast( 0 as bigint ) as startPosition,
charindex( @delimiter, @string ) as endPosition
union all
select
endPosition + 1, charindex( @delimiter, @string, endPosition + 1 )
from
Split
where
endPosition > 0
)
insert @t
select
row_number() over ( order by ( select 1 ) ) as Id,
substring( @string, startPosition, coalesce( nullif( endPosition, 0 ), len( @string ) + 1 ) - startPosition ) collate Latin1_General_CS_AS as Data
from
Split
option( maxrecursion 0 );
return;
end
I originally posted, then deleted, then reposted this when I realized I'd given an inline function I use that never got called on strings longer than 100 words. I've since modified it to support indefinite recursion - although it can't be an inline function this way.
Inline functions are generally faster because SQL can incorporate the inline function's statement into the query plan of the statements that call the function. However, that does not appear to be an option available in an inline function.
Upvotes: 1
Reputation: 162
why can't you use
select count(distinct value) from dbo.split(',',@txt1)
where @txt1 is he string will give you distinct count of words
If you are looking for word wise count
select value,count(1) from dbo.split(',',@txt1) group by value
shoud do this
Upvotes: 0