MGM
MGM

Reputation: 65

ASCII (?) Special characters issue

I have a contact table where it stores names of the contact persons. So our research team copies the names from websites and pastes it into application, while copying we got some special characters got stored into database. Check below examples which will reflect as "?"(ASCII code=63) while extracted to text file. Examples of contact Last names are listed as follows.

EX: 1) Sheffield 2) Griffin-Smith 3) Lhoёst

Is there a way to query list of all special characters available with ASCII code as 63 in the "Last_name" column of my contact table so that I could figure out and send them to researchers to update the correct names.

Thanks in advance!

Upvotes: 0

Views: 5711

Answers (6)

Yeshim
Yeshim

Reputation: 11

I would prefer to leave a comment but my reputation wasn't enough to comment Narti's answer. Anyway although MGM marked Narti's answer as correct answer, he/she commented that it does not exactly answer his/her question. Well it answered my exact same question.

I modified my script that didn't replace ascii 63 and I was able to replace it by casting and then replacing.

From this: (This doesn't work)

update TheTable set last_name = replace(last_name, char(63),'');

To this:(When you cast and make it an actual '?', now you can replace '?' and remove it

update TheTable set last_name = replace(Cast(last_name AS VARCHAR(1000)), '?','');

Just wanted to add this to this thread.

Thanks,

Upvotes: 1

Dan Guzman
Dan Guzman

Reputation: 46415

Be aware that ASCII characters are those in the 0-127 code point range. Characters in the 128-255 range (sometimes casually called extended ASCII characters) depend on the code page of the column collation. When you insert characters that do not map to a character in the target collation code page, SQL Server will either change the character to '?' or map it to a close alternate character (e.g. 'Ǜ' to 'U'). I mention this because it is not only '?' that may be an issue. This character loss occurs when data is converted so it is not possible to identify the original character after data has been inserted.

The script below will list most problem characters according to your default database collation. Note that this checks all mathematically possible UCS-2 code points rather than the subset of valid UCS-2 characters.

WITH 
      t4 AS (SELECT n FROM (VALUES(0),(0),(0),(0)) t(n))
    , t256 AS (SELECT 0 AS n FROM t4 AS a CROSS JOIN t4 AS b CROSS JOIN t4 AS c CROSS JOIN t4 AS d)
    , t64k AS (SELECT ROW_NUMBER() OVER (ORDER BY (a.n)) - 1 AS num FROM t256 AS a CROSS JOIN t256 AS b)
SELECT 
      num AS UnicodeCodePoint
    , NCHAR(num) AS UnicodeChar
    , CAST(NCHAR(num) AS char(1)) AS AsciiChar
FROM t64k
WHERE 
    CAST(NCHAR(num) AS char(1)) = '?';

Upvotes: 0

MGM
MGM

Reputation: 65

create function fnSpecialChars(@MyString as varchar(100)) returns bit as 
begin
    declare @result as bit
    declare @i as int
    declare @n as int
    set @result = 0
    set @i = 1 
    if @MyString = '' or @MyString is null 
        set @result = 0 
    else
        begin
            set @n = datalength(@MyString)
            while @i <= @n and @result = 0
                begin
                    if ascii(substring(@MyString, @i, 1)) = 63 --between 128 and 255 
                        set @result = 1
                    else
                        set @i = @i + 1
                end
        end
    return @result
end
GO




SELECT * FROM tbl_crm_h_contact WHERE dbo.fnSpecialChars(last_nm) = 1

Upvotes: 0

Narti
Narti

Reputation: 181

If you want to find all entries which contain non ASCII Characters you can do the following:

select * from TheTable where Last_name != Cast(Last_name AS VARCHAR(1000)) 

Upvotes: 3

Ajay2707
Ajay2707

Reputation: 5808

You can directly use that character.

If you know that it is in keyboard like 63 is ascii of ? then run this query as

declare @t table (col varchar(50))

insert into @t values('ajay'),('bbb?bb'),('bbb?cc'),('55?bb')

select * from @t where col like '%?%'

Else

you can use regular expression for this.

To find out not alphabet, simple query as

SELECT Col1
FROM TABLE
WHERE Col1 like '%[^a-Z0-9]%'

Find all special characters in a column in SQL Server 2008

Check if field contains special character in SQL

The logic behind '%[^a-Z0-9]%' is,

  1. a-Z0-9 is search all alphabetic character
  2. ^ is used to not operator, means no alphabets
  3. Both % give occurance, zero or more.

For escaping charcter search.

http://web.archive.org/web/20150519072547/http://sqlserver2000.databases.aspfaq.com:80/how-do-i-search-for-special-characters-e-g-in-sql-server.html

Upvotes: 0

Guffa
Guffa

Reputation: 700840

The character with ASCII code 63 is not some strange character, it's the question mark. (It's no surprise that it is shown as a question mark, as it is a question mark.)

To find the records that contains the question marks, you can use the like operator:

select * from TheTable where Last_name like '%?%'

Upvotes: 0

Related Questions