npalle
npalle

Reputation: 71

RTF to SQL with special characters (ñ and accent)

I have a column with data type rtf in SQL and I want to convert it to type varchar(max) rtf can contain accent and character Ñ

I need only the text from this data.

Example RTF:

{\rtf1\ansi\ansicpg1252\deff0\deflang11274{\fonttbl{\f0\fnil\fprq1\fcharset0 Verdana;}{\f1\fswiss\fprq2\fcharset0 Verdana;}{\f2\fnil\fcharset0 MS Sans Serif;}{\f3\fnil\fcharset2 Symbol;}} {\colortbl ;\red8\green0\blue0;\red0\green0\blue0;\red0\green0\blue255;} {*\generator Msftedit 5.41.21.2510;}\viewkind4\uc1\pard{\pntext\f3\'B7\tab}{*\pn\pnlvlblt\pnf3\pnindent0{\pntxtb\'B7}}\qj\cf1\f0\fs16 Pasaje a\'e9reo de \b AEROLINEAS ARGENTINAS\b0 (Buenos Aires/Salvador/Buenos Aires) en clase \b\ldblquote V\rdblquote \cf2\b0\i sin equipaje incluido\i0\par \pard{\pntext\f3\'B7\tab}{*\pn\pnlvlblt\pnf3\pnindent0{\pntxtb\'B7}}Traslados en servicio regular\par {\pntext\f3\'B7\tab}7 noches de alojamiento con r\'e9gimen seg\'fan se indica\par \pard{\pntext\f3\'B7\tab}{*\pn\pnlvlblt\pnf3\pnindent0{\pntxtb\'B7}}\qj\cf0 Asistencia al viajero \b TRAVEL ACE DIAMANTE \b0 con seguro de cancelaci\'f3n \b CON RESTRICCION DE CAUSA \b0 para pasajeros con edad de hasta 70 a\'f1os cumplidos. A partir de los 71 a\'f1os se deber\

FUNTION USED:

ALTER FUNCTION [dbo].[RTF2TXT]
(@In VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
    If isnull(@In,'') = '' return ''
    If @In not like '{\rtf%' return @In
    Declare @Len int
    Declare @Loc int = 1
    Declare @Char char(1) = ''
    Declare @PrevChar char(1) = ''
    Declare @NextChar char(1) = ''
    Declare @InMarkup int = 0
    Declare @InBrackets int = -1
    Declare @Out varchar(max) = ''

        Set @Len = len(@In)
        While @Loc < @Len begin
            Set @PrevChar = @Char
            Set @Char = SUBSTRING(@In, @Loc, 1)
            If @Loc < @Len set @NextChar = SUBSTRING(@In, @Loc + 1, 1) else set @NextChar = ''
            Set @Loc = @Loc + 1
            If @Char = '{' and @PrevChar != '\' begin
                Set @InBrackets = @InBrackets + 1
                Continue
            End
            If @Char = '}' and @PrevChar != '\' begin
                Set @InBrackets = @InBrackets - 1
                Continue
            End
            If @Char = '\' and @PrevChar != '\' and @NextChar not in ('\','{','}','~','-','_') begin
                Set @InMarkup = 1
                continue
            End
            If @Char = ' ' or @Char = char(13) begin
                Set @InMarkup = 0

            End
            If @InMarkup > 0 or @InBrackets > 0 continue

            Set @Out = @Out + @Char

        End

    Set @Out = replace(@Out, '\\', '\')
    Set @Out = replace(@Out, '\{', '{')
    Set @Out = replace(@Out, '\}', '}')
    Set @Out = replace(@Out, '\~', ' ')
    Set @Out = replace(@Out, '\-', '-')
    Set @Out = replace(@Out, '\_', '-')
    SET @Out = REPLACE(@Out, '\pard', '');
    SET @Out = REPLACE(@Out, '\ulnone', '');
    SET @Out = REPLACE(@Out, '\ul', '');
    SET @Out = REPLACE(@Out, '{', '');
     SET @Out = REPLACE(@Out, '*', '');
    SET @Out = REPLACE(@Out, 'Msftedit 5.41.21.2510;}', '');



    WHILE ASCII(@Out) < 33
    BEGIN
    set @Out = substring(@Out,2,len(@Out))
    END

    set @Out = reverse(@Out)

    WHILE ASCII(@Out) < 33
    BEGIN
    set @Out = substring(@Out,2,len(@Out))
    END

    set @Out = reverse(@Out)

    RETURN LTRIM(RTRIM(@Out))
End

Upvotes: 0

Views: 1221

Answers (1)

David Browne - Microsoft
David Browne - Microsoft

Reputation: 89156

Many common collations use a code page that contains Ñ. So just make sure your column (or database) uses one of them.

In particular Latin1_General_CI_AS is the default collation on a Windows box set to 'Spanish (United States)'.

Of course you can always use nvarchar(max) instead. And on SQL 2019+ you can also use varchar(max) one of the new UTF8 encodings.

Upvotes: 1

Related Questions