Nemo
Nemo

Reputation: 1111

Parse SQL Server Data

I used SSIS and [SharePointListAdapters][1] [1]: http://sqlsrvintegrationsrv.codeplex.com/releases to import data from my SharePoint 2010 list and put it into SQL Server 2008 table. TThere are roughly 500 rows of data. Now the challenge is to parse data appropriately. I have a couple of columns that have html tags around them.

For e.g., Column Project Desc has data like Project Desc

<div class="ExternalClass914BB5DA5CB142EB854C739EAACEE3BB">
    <div>Import Data from SharePoint list to Database.
    </div>
</div>

On googling, I found the functio below. It works as expected. My questions is how do I edit this function to take the column name called Project Desc as a parameter so that all of the data in that particular column will have html tags discarded.

create function dbo.StripHTML( @text varchar(max) ) returns varchar(max) as 
begin 
    declare @textXML xml 
    declare @result varchar(max) 
    set @textXML = @text; 
    with doc(contents) as 
    ( 
        select chunks.chunk.query('.') from @textXML.nodes('/') as chunks(chunk) 
    ) 
    select @result = contents.value('.', 'varchar(max)') from doc 
    return @result 
end 
go 
select dbo.StripHTML('<div class="ExternalClassB1D90504EAFF42BE8A8056E686F4E195"> <div>Import Data from SharePoint list to Database.</div></div>') 

Upvotes: 1

Views: 1265

Answers (3)

Jeff Moden
Jeff Moden

Reputation: 3494

I found this answer on SQLServerCentral.com. Because it's an iTVF (Inline Table Valued Function), it works faster than the function you're using right now. About twice as fast, as a matter of fact. Ref: http://www.sqlservercentral.com/Forums/FindPost1198135.aspx

CREATE FUNCTION dbo.StripHtmlTags2
        (@HtmlText XML )
RETURNS TABLE
     AS 
 RETURN
(
SELECT
    @HtmlText.value('(.)[1]', 'nvarchar(max)') AS result
)

Upvotes: 1

Remus Rusanu
Remus Rusanu

Reputation: 294287

SQL and T-SQL have no means to parse HTML. Do not confuse HTML with XML...

You can use something like HTML Agility Pack to parse HTML from managed code, be it in your SSIS package or as a SQLCLR deployed function.

Upvotes: 1

MatBailie
MatBailie

Reputation: 86715

Once you've created the function, you just use it as if it was a normal field in a table, and put the source table and field in the place that the parameter goes...

SELECT
  dbo.StripHTML( yourTable.yourColumn ) AS yourResults
FROM
  yourTable

(You don't edit the function, you just supply it with data from your table.)

Upvotes: 1

Related Questions