super-user
super-user

Reputation: 1057

SQL String: Counting Words inside a String

I searched through many of the questions here but all I found with decent answer is for different language like Javascript etc.

I have a simple task in SQL that I can't seem to find a simple way to do. I just need to count the number of "words" inside a SQL string (a sentence). You can see why "words" is in quotes in my examples. The "words" are delimited by white space.

Sample sentences:

1. I am not your father.
2. Where are your brother,sister,mother?
3. Where are your brother, sister and mother?
4. Who are     you?

Desired answer:

1. 5
2. 4
3. 7
4. 3

As you can see, I need to count the "words" disregarding the symbols (I have to treat them as part of the word). So in sample no. 2: (1)Where (2)are (3)your (4)brother,sister,mother? = 4

I can handle the multiple whitespaces by doing a replace like this:
REPLACE(string, ' ', ' ') -> 2 whitespaces to 1 REPLACE(string, ' ', ' ') -> 3 whitespaces to 1 and so on..

What SQL function can I use to do this? I use SQL Server 2012 but needs a function that works in SQL Server 2008 as well.

Upvotes: 5

Views: 29837

Answers (5)

bendodge
bendodge

Reputation: 514

As of SQL Server 2016, you can use STRING_SPLIT() to accomplish this easily. Filtering out empty values avoids counting extra rows from consecutive delimiters.

SELECT COUNT([value])
FROM STRING_SPLIT('Where are your brother, sister and mother?', ' ')
WHERE [value] <> ''

SELECT COUNT([value])
FROM STRING_SPLIT('Who are     you?', ' ')
WHERE [value] <> ''

Results:

7
3

Upvotes: 0

Alireza Sattarzadeh
Alireza Sattarzadeh

Reputation: 1

I found this query more useful than the first. it omit extra characters and numbers and symbols, so it would count just words within a passage...

drop table if exists #t
create table #t (id int identity(1,1), c1 varchar(2000))
insert into #t (c1) 
values 
('Alireza Sattarzadeh Farkoush '),
('yes it is the   best .'),
('abc def ghja a the . asw'),
('?>< 123 ...!  z a b'),
('Wallex is   the greatest exchange in the .. world a after binance ...!')

select c1 , Count(*)
from (
select id, c1, value  
from #t t
cross apply (
select rtrim(ltrim(value)) as value from string_split(c1,' ')) a 
where len(value) > 1 and value like '%[a-Z]%' 
) Final 
group by c1

Upvotes: 0

Madhivanan
Madhivanan

Reputation: 13700

To handle multiple spaces too, use the method shown here

Declare @s varchar(100)
set @s='Who are     you?'
set @s=ltrim(rtrim(@s))

while charindex('  ',@s)&gt;0
Begin
    set @s=replace(@s,'  ',' ')
end

select len(@s)-len(replace(@s,' ',''))+1 as word_count

https://exploresql.com/2018/07/31/how-to-count-number-of-words-in-a-sentence/

Upvotes: 1

t-clausen.dk
t-clausen.dk

Reputation: 44316

This is a minor improvement of @ZoharPeled's answer. This can also handle 0 length values:

DECLARE @t AS TABLE(id int identity(1,1), string varchar(100))

INSERT INTO @t VALUES
  ('I am not your father.'),
  ('Where are your brother,sister,mother?'),
  ('Where are your brother, sister and mother?'),
  ('Who are     you?'),
  ('')

;WITH CTE AS
(
  SELECT
    Id,
    REPLACE(REPLACE(string,' ', '><'), '<>', '') string
  FROM @t
)
SELECT 
  id,
  LEN(' '+string)-LEN(REPLACE(string, '><', ' ')) CountWords
FROM CTE

Upvotes: 3

Zohar Peled
Zohar Peled

Reputation: 82474

Here is one way to do it:

Create and populate sample table (Please save is this step in your future questions)

DECLARE @T AS TABLE
(
    id int identity(1,1),
    string varchar(100)
)

INSERT INTO @T VALUES
('I am not your father.'),
('Where are your brother,sister,mother?'),
('Where are your brother, sister and mother?'),
('Who are     you?')

Use a cte to replace multiple spaces to a single space (Thanks to Gordon Linoff's answer here)

;WITH CTE AS
(
SELECT  Id,
        REPLACE(REPLACE(REPLACE(string, ' ', '><' -- Note that there are 2 spaces here
                               ), '<>', ''
                       ), '><', ' '
                ) as string
FROM @T
)

Query the CTE - length of the string - length of the string without spaces + 1:

SELECT id, LEN(string) - LEN(REPLACE(string, ' ', '')) + 1 as CountWords
FROM CTE 

Results:

id  CountWords
1   5
2   4
3   7
4   3

Upvotes: 13

Related Questions