Shayne David
Shayne David

Reputation: 77

Group by column exact word or same first two words or same first three words

I am trying to group by the column with same exact word or same first two words or same first three words

tblsample:

name             
Brown sugar  
Brown sugar Creme
Tiger zoo
Tiger  
Blue red green
Blue red green yellow

Expected result:

name              cntnme
Brown sugar         2
Tiger zoo           1
Tiger               1
Blue red green      2

I only can group by exact same words.

Select  name, count(name) as cntnme from tblsample group by name

Upvotes: 4

Views: 478

Answers (3)

George Joseph
George Joseph

Reputation: 5932

The following is an example of how to do this

create table dbo.t(x varchar(30))

insert into dbo.t values('Brown sugar')
insert into dbo.t values('Brown sugar Creme')
insert into dbo.t values('Tiger zoo')
insert into dbo.t values('Tiger')
insert into dbo.t values('Blue red green')
insert into dbo.t values('Blue red green yellow')
insert into dbo.t values('Blue red green yellow green')

--First  i identify the records that have more than one word using space as a delimiter and then i look them up with words that match using a like condition.

with data
  as (select x, x as parent_word
        from dbo.t
        where charindex(' ',x) <> 0
      union all
      select b.x,a.parent_word
        from data a
        join dbo.t b 
          on a.x <> b.x
         and b.x like concat(a.x,'%')
      )
--In this block i find out the parent_word associated wit each of the words, and use min to identify them out. In this example (Blue,red green yellow),(Blue,red,green,yellow,green) and (Blue,red,gree) would all be segregated by the lowest value of parent which is (Blue,red,green)
    ,data_parent_word
    as (
        select x,min(parent_word) as min_value
          from data
        group by x
        )
 select min_value as word_val, count(x) as cnt
   from data_parent_word
 group by min_value
  union all
  select x,1
    from dbo.t 
   where charindex(' ',x) = 0

Finally i do a union all to get the records of what i obtained with the list of "single word"

This gives the output as follows

+----------------+----------+
|    word_val    |    cnt   |
+----------------+----------+
| Blue red green |        3 |
| Brown sugar    |        2 |
| Tiger zoo      |        1 |
| Tiger          |        1 |
+----------------+----------+

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270713

I am trying to group by the column with same exact word or same first two words or same first three words.

This suggests an approach like this:

with words2_3 as (
      select t.*
      from t
      where name like '% %' and name not like '% % % %'
     )
select w.name, count(*)
from t join
     words2_3 w
     on t.name + ' ' like w.name + ' %'
group by w.name;

Here is a db<>fiddle.

Your results are inconsistent with your problem statement, so these results do not agree with your statement.

Upvotes: 1

AlwaysLearning
AlwaysLearning

Reputation: 8829

Sounds like you want to use the SOUNDEX (Transact-SQL) operator, e.g.:

create table dbo.Names (
  Name nvarchar(50) not null
);
insert dbo.Names values
  ('Brown sugar'),
  ('Brown sugar Creme'),
  ('Tiger zoo'),
  ('Tiger'),
  ('Blue red green'),
  ('Blue red green yellow'),
  ('Blue red green yellow orange');

select Name=min(Name), CntNme=count(1)
from dbo.Names
group by soundex(Name);
Name            CntNme
--------------- ------
Blue red green  3
Brown sugar     2
Tiger           2

Upvotes: 3

Related Questions