Reputation: 4179
I have a column in a SQL Server database that stores a text block in the following fashion:
<HTML><HEAD><style type="text/css">BODY,TD,TH,BUTTON,INPUT,SELECT,TEXTAREA{FONT-SIZE: 10pt; COLOR: black; FONT-FAMILY: Arial,Helvetica;}BODY{MARGIN: 5px;}P,DIV,UL,OL,BLOCKQUOTE{MARGIN-TOP: 0px; MARGIN-BOTTOM: 0px;}</style></HEAD><BODY> <p style="MARGIN-TOP: 0px; MARGIN-BOTTOM: 0px">Patient is a 84 year old female. Patient's histpry includes the following:</p> <p style="MARGIN-TOP: 0px; MARGIN-BOTTOM: 0px">&nbsp;</p></BODY></HTML>
All I want to bring back from this particular example above would be:
Patient is a 84 year old female. Patient's histpry includes the following:
I honestly do not even know where to start. Are there any HTML-escape-type functions in SQL Server 2014? I do not have access to CLI, and the code will need to run inside a stored procedure that I have been tasked with creating.
Upvotes: 3
Views: 2431
Reputation: 67291
With HTML you can never be sure that the cast to XML will succeed. But after replacing &nbsp; with simple blanks, you might try this:
Declare @S varchar(max)='<HTML><HEAD><style type="text/css">BODY,TD,TH,BUTTON,INPUT,SELECT,TEXTAREA{FONT-SIZE: 10pt; COLOR: black; FONT-FAMILY: Arial,Helvetica;}BODY{MARGIN: 5px;}P,DIV,UL,OL,BLOCKQUOTE{MARGIN-TOP: 0px; MARGIN-BOTTOM: 0px;}</style></HEAD><BODY> <p style="MARGIN-TOP: 0px; MARGIN-BOTTOM: 0px">Patient is a 84 year old female. Patient''s histpry includes the following:</p> <p style="MARGIN-TOP: 0px; MARGIN-BOTTOM: 0px">&nbsp;</p></BODY></HTML>'
SELECT CAST(REPLACE(@S,'&nbsp;',' ') AS XML).value('(//p/text())[1]','nvarchar(max)');
The result
Patient is a 84 year old female. Patient's histpry includes the following:
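Because the cast can fail on HTML that is not well-formed XML, a more defensive variant might use TRY_CAST (available since SQL Server 2012), which returns NULL instead of raising an error. The following is only a sketch: the table variable @Notes and its columns Id and Body are hypothetical stand-ins for the real table.

```sql
-- Hypothetical sample data; replace @Notes with the real table.
DECLARE @Notes TABLE (Id int, Body varchar(max));
INSERT INTO @Notes (Id, Body) VALUES
 (1, '<HTML><BODY><p>First note</p><p>&nbsp;</p></BODY></HTML>'),
 (2, '<HTML><BODY><p>Unclosed<br></BODY></HTML>');  -- not well-formed XML

SELECT n.Id,
       -- NULL when the body could not be converted to XML
       x.Doc.value('(//p/text())[1]', 'nvarchar(max)') AS FirstParagraph
FROM @Notes AS n
CROSS APPLY (SELECT TRY_CAST(REPLACE(n.Body, '&nbsp;', ' ') AS xml) AS Doc) AS x;
```

Rows whose HTML cannot be parsed come back with a NULL FirstParagraph, so they can be found and handled separately instead of aborting the whole stored procedure.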
Upvotes: 2
Reputation: 81930
If you are open to a table-valued function (TVF), consider the following.
Tired of extracting strings (left, right, charindex, patindex, reverse, etc.), I modified a split/parse function to accept two non-like delimiters, in this case > and </.
Also, being a TVF, it is easy to incorporate into a CROSS APPLY if your data is in a table.
Example
Declare @S varchar(max)='<HTML><HEAD><style type="text/css">BODY,TD,TH,BUTTON,INPUT,SELECT,TEXTAREA{FONT-SIZE: 10pt; COLOR: black; FONT-FAMILY: Arial,Helvetica;}BODY{MARGIN: 5px;}P,DIV,UL,OL,BLOCKQUOTE{MARGIN-TOP: 0px; MARGIN-BOTTOM: 0px;}</style></HEAD><BODY> <p style="MARGIN-TOP: 0px; MARGIN-BOTTOM: 0px">Patient is a 84 year old female. Patient''s histpry includes the following:</p> <p style="MARGIN-TOP: 0px; MARGIN-BOTTOM: 0px">&nbsp;</p></BODY></HTML>'
Select *
From [dbo].[tvf-Str-Extract](replace(@S,'&nbsp;',' '),'>','</')
Where RetVal<>' '
and RetVal not like 'BODY,%'
Returns
RetSeq RetPos RetVal
2 284 Patient is a 84 year old female. Patient's histpry includes the following:
Note: The WHERE is optional and may have to be tweaked to suit your actual needs. Just for fun, try it without the WHERE. Also, in this example we trapped the &nbsp;, but as you know, there may be many other entities, e.g. &mdash;.
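If more than one entity shows up in the data, one simple option is to nest the REPLACE calls. This is a minimal sketch; the sample string is made up, and mapping &mdash; to a plain hyphen is an assumption about what the output should look like.

```sql
-- Hypothetical sample containing two different HTML entities.
DECLARE @S varchar(max) = '<p>Fever&nbsp;&mdash;&nbsp;resolved</p>';

-- Inner REPLACE handles &nbsp;, outer REPLACE handles &mdash;.
SELECT REPLACE(REPLACE(@S, '&nbsp;', ' '), '&mdash;', '-') AS Cleaned;
-- Cleaned: <p>Fever - resolved</p>
```

The cleaned string can then be passed to the function in place of the single replace() used in the example.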
The Function if Interested
CREATE FUNCTION [dbo].[tvf-Str-Extract] (@String varchar(max),@Delimiter1 varchar(100),@Delimiter2 varchar(100))
Returns Table
As
Return (
with cte1(N) As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
cte2(N) As (Select Top (IsNull(DataLength(@String),0)) Row_Number() over (Order By (Select NULL)) From (Select N=1 From cte1 N1,cte1 N2,cte1 N3,cte1 N4,cte1 N5,cte1 N6) A ),
cte3(N) As (Select 1 Union All Select t.N+DataLength(@Delimiter1) From cte2 t Where Substring(@String,t.N,DataLength(@Delimiter1)) = @Delimiter1),
cte4(N,L) As (Select S.N,IsNull(NullIf(CharIndex(@Delimiter1,@String,s.N),0)-S.N,8000) From cte3 S)
Select RetSeq = Row_Number() over (Order By N)
,RetPos = N
,RetVal = left(RetVal,charindex(@Delimiter2,RetVal)-1)
From (
Select *,RetVal = Substring(@String, N, L)
From cte4
) A
Where charindex(@Delimiter2,RetVal)>1
)
/*
Max Length of String 1MM characters
Declare @String varchar(max) = 'Dear [[FirstName]] [[LastName]], ...'
Select * From [dbo].[tvf-Str-Extract] (@String,'[[',']]')
*/
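Once the function exists, the CROSS APPLY usage mentioned earlier might look roughly like this. The table variable @Notes and its columns Id and Body are hypothetical; substitute the real table and column.

```sql
-- Hypothetical sample data standing in for the real table.
DECLARE @Notes TABLE (Id int, Body varchar(max));
INSERT INTO @Notes (Id, Body) VALUES
 (1, '<BODY><p>First note</p><p>&nbsp;</p></BODY>'),
 (2, '<BODY><p>Second note</p></BODY>');

SELECT n.Id, e.RetSeq, e.RetVal
FROM @Notes AS n
-- The function runs once per row; &nbsp; is blanked out first.
CROSS APPLY [dbo].[tvf-Str-Extract](REPLACE(n.Body, '&nbsp;', ' '), '>', '</') AS e
-- Drop fragments that contain nothing but blanks.
WHERE LTRIM(RTRIM(e.RetVal)) <> '';
```

As in the single-variable example, the WHERE clause will likely need adjusting to the markup that actually appears in the column.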
Upvotes: 2