Dave Sexton
Dave Sexton

Reputation: 11188

How to extract a unique list of characters from one or columns in a table

As part of a data cleansing exercise I need to provide a unique list of characters used within a couple of name fields. So for example suppose I have a person table with two columns called first_name and last_name with only one row like so:

first_name last_name 
Elizabeth  Smith

I want to be able to produce the following result:

letter
a
b
e
E
h
i
l
m
S
t
z

I am thinking a recursive common table expression is the way to go.

Upvotes: 2

Views: 173

Answers (2)

Igor Borisenko
Igor Borisenko

Reputation: 3866

Try this SQL-Server 2005+

declare @tbl table (v1 varchar(100), v2 varchar(100))

insert into @tbl values ('asd','ssft'),(null,'la')

;with cte
as
(SELECT isnull(t.v1,'')+ISNULL(t.v2,'') as V, 1 as i, SUBSTRING(isnull(t.v1,'')+ISNULL(t.v2,''),1,1) Vi
FROM @tbl t
UNION ALL
SELECT V, CTE.i+1, SUBSTRING(V,i+1,1)
FROM cte
WHERE LEN(CTE.v)>=CTE.i+1
)
SELECT DISTINCT Vi --COLLATE Latin1_General_CS_AS
FROM cte
ORDER BY Vi --COLLATE Latin1_General_CS_AS

Also, you may need to specify CASE Sensitive collation with COLLATE Latin1_General_CS_AS (commented in the code) if you want both capital and little letters to appear in result set.

Upvotes: 1

Kyle Hale
Kyle Hale

Reputation: 8120

One possible approach: Create a letters table and hook it to the names using CHARINDEX() and just return matches.

with letters (letter) as (

select TOp 253 CHAR(ROW_NUMBER() OVER (ORDER BY (SELECT 1)) )
 from 
 master.dbo.spt_values t1 CROSS JOIN master.dbo.spt_values t2 
order by
 ROW_NUMBER() OVER (ORDER BY (SELECT 1))
),

name (name) as (

select  'Elizabeth Smith' COLLATE Latin1_General_CS_AS as name 
)

select  letters.* from 
letters
inner join name on charindex(letter,  name) > 0 
and letter <> ' '

Upvotes: 1

Related Questions