Wyatt Shipman
Wyatt Shipman

Reputation: 1789

SSIS XML Task with Untyped XML

I am pretty new to loading data from XML data sources into SQL Server, but I have had success with a properly formatted XML data source in the past. I have a web service that is being called through SSIS from a third party solution to get data out. This web service spits the data out like below:

<?xml version="1.0" encoding="utf-16"?>
<EpsTableEx xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <ErrorString />
  <ErrorNum>1</ErrorNum>
  <Data>
    <ArrayOfString>
      <string>ObjectId</string>
      <string>Form_Type</string>
      <string>Owner</string>
      <string>CompletedDate</string>
      <string>Delivered</string>
      <string>Name</string>
      <string>EventID</string>
    </ArrayOfString>
    <ArrayOfString>
      <string>183122</string>
      <string>Form1</string>
      <string>91b</string>
      <string>2015-03-02</string>
      <string>2015-04-22</string>
      <string>Onboarding</string>
      <string>21966</string>
    </ArrayOfString>
    <ArrayOfString>
      <string>183152</string>
      <string>Form1</string>
      <string>2879d</string>
      <string>2015-03-02</string>
      <string>2015-04-22</string>
      <string>Onboarding</string>
      <string>21967</string>
    </ArrayOfString>
  </Data>
</EpsTableEx>

The first ArrayOfString is the column headers, instead of the each element being a column header.

In another package with a properly formatted XML web service, I am using an XML Task Editor to get down to the data level with an XPATH Operation, which doesn't work for me here. This results in the ArrayOfString and String being stripped out and all of the data is concatenated together.

I have tried:

  1. Changing the XPATH Operation to different levels of the XML path without success.
  2. Altering the XSD file to try to force it to think the text within the ArrayOfString tag would be the text to insert into the table and then parse it later, but no data came through.
  3. Altering the XSD file to have String1, String2, etc. for the fields defined, hoping it will map to the generic string tags for each item.

I am using SQL Server 2014 and Data Tools to build the SSIS Package.

EDIT: The goal would be to have a solution within SSIS that can be ran nightly.

Edit2: The data would be loaded into a table like below:

CREATE TABLE [dbo].[FormXML](

[ObjectID] [nvarchar](255) NULL,
[Form_Type] [nvarchar](255) NULL,
[Owner] [nvarchar](255) NULL,
[CompletedDate] [nvarchar](255) NULL,
[Delivered] [nvarchar](255) NULL,
[Name] [nvarchar](255) NULL,
[EventID] [nvarchar](255) NULL,
[ADD_DTTM] [datetime] NULL DEFAULT (getdate()))

I don't care about the ErrorString or ErrorNum nodes. I adapted the suggested XML query to insert the data into the table using an Execute SQL Task.

The XML that is getting put into the variable in SSIS is +400k characters. I read online that there was a 2GB limit to a string variable according to this article (http://www.sqlservercentral.com/articles/SQL+Server/97947/). I suspect this is my issue, and I wouldn't be having this problem if i could load the XML from the web service into an XML Variable instead of a string variable.

Upvotes: 0

Views: 679

Answers (2)

Wyatt Shipman
Wyatt Shipman

Reputation: 1789

I solved this using 2 different variables. I fed the XML Output of the Web service into a string variable, let's call it String1. Then I created another variable, called String2, and created an expression to format the String1 into a usable XML String.

My Expression below replaces all instances of "\n", "\r", and " xmlns=\"http://Eprise\"" to get the string formatted properly. Then I used the SUBSTRING expression to only get the porting of the string within the data tags. The FINDSTRING located where the data tag started and the LEN helped find how long the string should actually be.

replace(
    replace(
        replace(
            substring(@[User::String1], 
                      FINDSTRING( @[User::String1] , "<Data", 1) , 
                      LEN( 
                        substring(@[User::String1], 
                                  FINDSTRING( @[User::String1] , 
                                            "<Data", 
                                            1),
                                  LEN(@[User::String1])
                                  )
                        ) -13 
                    )
            ,"\n","")
        ,"\r","")
    ," xmlns=\"http://Eprise\"","")

I had to populate an initial value in String1 that was at least 13 characters so that I could trim the end off of String1 to remove the last closing tag. I also needed the initial value to contain the text "String2 to evaluate to true initially.

Once String2 was formatted, I used and Execute SQL Task. I added a Parameter, using String2 as an Input direction of NVARCHAR type with a name of 0 and length of 2147483647, which is 2 gb, the max for a string in SSIS.

I tweaked the SQL Query from @Shnugo to:

declare @x as xml

set @x=?

insert into Database.dbo.Table ([ObjectID] ,[Form_Type],[Owner],[CompletedDate],[Delivered],[Name],[EventID])

SELECT block.value('string[1]', 'varchar(max)') AS ObjectId
    ,block.value('string[2]', 'varchar(max)') AS Form_Type
    ,block.value('string[3]', 'varchar(max)') AS OWNER
    ,block.value('string[4]', 'varchar(max)') AS CompletedDated
    ,block.value('string[5]', 'varchar(max)') AS Delivered
    ,block.value('string[6]', 'varchar(max)') AS NAME
    ,block.value('string[7]', 'varchar(max)') AS EventID
FROM @x.nodes('/Data/ArrayOfString[position()>1]') AS ArrayOfString(block)

The ? pulls in the value from the parameter which was my formatted XML String and then inserted the data into the table using the XML Query.

Upvotes: 0

Gottfried Lesigang
Gottfried Lesigang

Reputation: 67291

Assuming that this XML is carrying a "table" with data rows I'd suggest not to take the column names from the first block. They can be "hard coded" probably:

Just paste this into an empty SQL query window and execute. Adapt it to your needs...

declare @x XML='<EpsTableEx xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <ErrorString />
  <ErrorNum>1</ErrorNum>
  <Data>
    <ArrayOfString>
      <string>ObjectId</string>
      <string>Form_Type</string>
      <string>Owner</string>
      <string>CompletedDate</string>
      <string>Delivered</string>
      <string>Name</string>
      <string>EventID</string>
    </ArrayOfString>
    <ArrayOfString>
      <string>183122</string>
      <string>Form1</string>
      <string>91b</string>
      <string>2015-03-02</string>
      <string>2015-04-22</string>
      <string>Onboarding</string>
      <string>21966</string>
    </ArrayOfString>
    <ArrayOfString>
      <string>183152</string>
      <string>Form1</string>
      <string>2879d</string>
      <string>2015-03-02</string>
      <string>2015-04-22</string>
      <string>Onboarding</string>
      <string>21967</string>
    </ArrayOfString>
  </Data>
</EpsTableEx>';

SELECT ArrayOfString.block.value('string[1]','int') AS ObjectId
      ,ArrayOfString.block.value('string[2]','varchar(max)') AS Form_Type
      ,ArrayOfString.block.value('string[3]','varchar(max)') AS [Owner]
      ,ArrayOfString.block.value('string[4]','date') AS CompletedDated
      ,ArrayOfString.block.value('string[5]','date') AS Delivered
      ,ArrayOfString.block.value('string[6]','varchar(max)') AS [Name]
      ,ArrayOfString.block.value('string[7]','int') AS EventID
FROM @x.nodes('/EpsTableEx/Data/ArrayOfString[position()>1]') AS ArrayOfString(block)

Upvotes: 1

Related Questions