M S
M S

Reputation: 1

How to count no. of occurrences of characters in a string in sql?

I want count of all characters that appear in word. For example if I write Peter, I want output like this

P 1
e 2
t 1
e 2
r 1

Any help is appreciated.

Upvotes: 0

Views: 124

Answers (2)

Alan Burstein
Alan Burstein

Reputation: 7918

What John Posted above is known as a Unigram function; AKA N-Gram of size 1. For a N-Grams function that supports tokens of any size see: Nasty Fast N-Grams Part 1. There's an 8K and VARCHAR(MAX) version. Using ngrams8k:

DECLARE @string VARCHAR(100) = 'Peter';

SELECT   ng.Position, ng.Token, COUNT(ng.Token) OVER (PARTITION BY ng.Token) AS cnt
FROM     dbo.ngrams8k(@string,1) AS ng
ORDER BY ng.Position;

Returns:

Position    Token      cnt
----------- ---------- -----------
1           P          1
2           e          2
3           t          1
4           e          2
5           r          1

Note that the second parameter in the function is the token size, here's a basic example of what you can do beyond unigrams. Let's say we wanted to search a string for all instances of a specific substring, "ABC" in this example:

DECLARE 
  @string VARCHAR(8000) = '123ABC xxx yyy ABCDEABC...ABB',
  @searchFor VARCHAR(100) = 'ABC';

SELECT ng.* 
FROM   dbo.Ngrams8k(@string,LEN(@searchFor)) AS ng
WHERE  ng.Token = @searchFor;

Returns:

position    token
----------- -------
4           ABC
16          ABC
21          ABC

Upvotes: 0

John Cappelletti
John Cappelletti

Reputation: 81960

If open to a TVF. Being a Table-Valued Function, it can be used in a CROSS APPLY

Example

Select *
      ,Cnt = sum(1) over (partition by RetVal) 
 From  [dbo].[tvf-Str-Parse-Char]('Peter') 
 Order By RetSeq

Returns

RetSeq  RetVal  Cnt
1       P       1
2       e       2
3       t       1
4       e       2
5       r       1

The Function if Interested

CREATE FUNCTION [dbo].[tvf-Str-Parse-Char] (@String varchar(max))
Returns Table
As
Return (
    with   cte1(N) As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
           cte2(N) As (Select Top (IsNull(DataLength(@String),0)) Row_Number() over (Order By (Select NULL)) From cte1 a,cte1 b,cte1 c,cte1 d,cte1 e,cte1 f)

    Select RetSeq=N
          ,RetVal=Substring(@String,N,1) 
     From  cte2
) 
--Max 1 Million Observations
--Select * from [dbo].[tvf-Str-Parse-Char]('this is a string') 

Upvotes: 3

Related Questions