Reputation: 27
I want to split a string into multiple values based on a special symbol. e.g., Here is the string
JdwnrhþTHIMPHUþOTHþþ10/1991þ02/02/2011þBHUTAN
I want it to be:
Jdwnrh THIMPHU OTH 10/1991 02/02/2011 BHUTAN
I am using the following SQL:
DECLARE @delimiter VARCHAR(50)
SET @delimiter='þ'
;WITH CTE AS
(
SELECT
CAST('<M>' + REPLACE(REPLACE(CAST(DATA as nvarchar(MAX)), @delimiter , '</M><M>'), '&', '&') + '</M>' AS XML)
AS BDWCREGPREVADDR_XML
FROM [JACS_RAVEN_DATA_OLD].dbo.BDWCREGPREVADDR
)
SELECT
BDWCREGPREVADDR_XML.value('/M[1]', 'varchar(50)') As streetNo,
BDWCREGPREVADDR_XML.value('/M[2]', 'varchar(50)') As suburb,
BDWCREGPREVADDR_XML.value('/M[3]', 'varchar(3)') As stateCode,
BDWCREGPREVADDR_XML.value('/M[4]', 'varchar(10)') As postalCode,
BDWCREGPREVADDR_XML.value('/M[7]', 'varchar(50)') As country,
BDWCREGPREVADDR_XML.value('/M[5]', 'varchar(50)') As dateFrom,
BDWCREGPREVADDR_XML.value('/M[6]', 'varchar(50)') As dateTo
FROM CTE
GO
The query works well on all the strings other than the one provided as an example. For above the string, the query returns the following:
'Jdwnrh' ' ' 'IMPHU' 'O' ' ' '10/1991' '02/02/2011' 'BHUTAN'
It seems the code takes letters 'TH' as a new attribute and split the string on it. Does anyone know how to resolve this issue?
Upvotes: 3
Views: 82
Reputation: 1
the key(þ) is wrong ,if you change another word ,it's ok. when use key(þ) and key(z), there are two results: enter image description here enter image description here I think maybe the key(þ) has some special meaning. hope to help you
Upvotes: 0
Reputation: 103525
This seems to be related to your collation. In Latin1_General_CS_AS
, the þ
character is considered equivalent to th
(because it's an Old English letter that sounds like "th" when pronounced).
print replace('thornþ' collate Latin1_General_CS_AS,'þ','1')
' output: 1orn1
This is not the case for all collations; for example, in Latin1_General_BIN
they are separate:
print replace('thornþ' collate Latin1_General_BIN,'þ','1')
' output: thorn1
So perhaps you could look at changing the collation of the column which contains the þ characters.
Upvotes: 2