Reputation: 71
I have a column with data type rtf in SQL and I want to convert it to type varchar(max)
rtf can contain accent and character Ñ
I need only the text from this data.
Example RTF:
{\rtf1\ansi\ansicpg1252\deff0\deflang11274{\fonttbl{\f0\fnil\fprq1\fcharset0 Verdana;}{\f1\fswiss\fprq2\fcharset0 Verdana;}{\f2\fnil\fcharset0 MS Sans Serif;}{\f3\fnil\fcharset2 Symbol;}} {\colortbl ;\red8\green0\blue0;\red0\green0\blue0;\red0\green0\blue255;} {*\generator Msftedit 5.41.21.2510;}\viewkind4\uc1\pard{\pntext\f3\'B7\tab}{*\pn\pnlvlblt\pnf3\pnindent0{\pntxtb\'B7}}\qj\cf1\f0\fs16 Pasaje a\'e9reo de \b AEROLINEAS ARGENTINAS\b0 (Buenos Aires/Salvador/Buenos Aires) en clase \b\ldblquote V\rdblquote \cf2\b0\i sin equipaje incluido\i0\par \pard{\pntext\f3\'B7\tab}{*\pn\pnlvlblt\pnf3\pnindent0{\pntxtb\'B7}}Traslados en servicio regular\par {\pntext\f3\'B7\tab}7 noches de alojamiento con r\'e9gimen seg\'fan se indica\par \pard{\pntext\f3\'B7\tab}{*\pn\pnlvlblt\pnf3\pnindent0{\pntxtb\'B7}}\qj\cf0 Asistencia al viajero \b TRAVEL ACE DIAMANTE \b0 con seguro de cancelaci\'f3n \b CON RESTRICCION DE CAUSA \b0 para pasajeros con edad de hasta 70 a\'f1os cumplidos. A partir de los 71 a\'f1os se deber\
FUNTION USED:
ALTER FUNCTION [dbo].[RTF2TXT]
(@In VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
If isnull(@In,'') = '' return ''
If @In not like '{\rtf%' return @In
Declare @Len int
Declare @Loc int = 1
Declare @Char char(1) = ''
Declare @PrevChar char(1) = ''
Declare @NextChar char(1) = ''
Declare @InMarkup int = 0
Declare @InBrackets int = -1
Declare @Out varchar(max) = ''
Set @Len = len(@In)
While @Loc < @Len begin
Set @PrevChar = @Char
Set @Char = SUBSTRING(@In, @Loc, 1)
If @Loc < @Len set @NextChar = SUBSTRING(@In, @Loc + 1, 1) else set @NextChar = ''
Set @Loc = @Loc + 1
If @Char = '{' and @PrevChar != '\' begin
Set @InBrackets = @InBrackets + 1
Continue
End
If @Char = '}' and @PrevChar != '\' begin
Set @InBrackets = @InBrackets - 1
Continue
End
If @Char = '\' and @PrevChar != '\' and @NextChar not in ('\','{','}','~','-','_') begin
Set @InMarkup = 1
continue
End
If @Char = ' ' or @Char = char(13) begin
Set @InMarkup = 0
End
If @InMarkup > 0 or @InBrackets > 0 continue
Set @Out = @Out + @Char
End
Set @Out = replace(@Out, '\\', '\')
Set @Out = replace(@Out, '\{', '{')
Set @Out = replace(@Out, '\}', '}')
Set @Out = replace(@Out, '\~', ' ')
Set @Out = replace(@Out, '\-', '-')
Set @Out = replace(@Out, '\_', '-')
SET @Out = REPLACE(@Out, '\pard', '');
SET @Out = REPLACE(@Out, '\ulnone', '');
SET @Out = REPLACE(@Out, '\ul', '');
SET @Out = REPLACE(@Out, '{', '');
SET @Out = REPLACE(@Out, '*', '');
SET @Out = REPLACE(@Out, 'Msftedit 5.41.21.2510;}', '');
WHILE ASCII(@Out) < 33
BEGIN
set @Out = substring(@Out,2,len(@Out))
END
set @Out = reverse(@Out)
WHILE ASCII(@Out) < 33
BEGIN
set @Out = substring(@Out,2,len(@Out))
END
set @Out = reverse(@Out)
RETURN LTRIM(RTRIM(@Out))
End
Upvotes: 0
Views: 1221
Reputation: 89156
Many common collations use a code page that contains Ñ. So just make sure your column (or database) uses one of them.
In particular Latin1_General_CI_AS is the default collation on a Windows box set to 'Spanish (United States)'.
Of course you can always use nvarchar(max)
instead. And on SQL 2019+ you can also use varchar(max)
one of the new UTF8 encodings.
Upvotes: 1