MitchW
MitchW

Reputation: 13

Loading (large) XML data into Azure SQL database table

I have been working on a solution for transferring a large XML file into Azure SQL database, thus far unsuccessfully. The XML file is accessed via a fixed URL. I have no previous experience using SQL or Azure before this project, so I apologise if I've missed something obvious.

I setup the database with appropriate tables, I just need to populate the tables with my data.

The XML file comes in the following format, however there are several thousand jobs. The file, when downloaded, is approximately 1.2Mb however I want to access it from the URL not local computer.

<Response api-method="Current">
  <Status>OK</Status>
     <Jobs>
        <Job>
           <ID>N1234</ID>
           <Name>Job name here</Name>
           <Description/>Job description here</Description
           <StartDate>2018-10-08T00:00:00</StartDate>
           <DueDate>2018-10-21T00:00:00</DueDate>
           <Manager>
              <ID>12345</ID>
              <Name>John Smith</Name>
           </Manager>
        </Job>
        <Job>
            Repeat above several thousand times
        </Job>
     </Jobs>
</Response>

So far my most successful method works using some built-in Azure Logic Apps. I am using HTML GET to load the XML data, then passing it into the Liquid app which converts XML into JSON. I then expect JSON format can be easily loaded into a database table. This Logic App successfully converts the XML to JSON, but only when I strip the file down to ~80 jobs. Hence I know that my logic app works and that the file size is my issue.

XML to JSON Logic App

Does anyone have a suggestion for a better approach or a method of allowing larger files to be processed? I have also tried using SSMS, implementing T-SQL BulkInsert, OpenRowSet, etc but I've abandoned this approach for the time being.

Small selection of sources I've used while trying to find a solution:

Many thanks

Upvotes: 1

Views: 2686

Answers (1)

Gottfried Lesigang
Gottfried Lesigang

Reputation: 67341

I have no experience on how do I load an XML from an URL via Azure SQL Server. In normal SQL-Server there are some weird approaches, but I'd recommend do read the URL from another application.
But - as you managed to load the file somehow - this does not seem to be your issue anyway.

Why do you feel the need to convert this to JSON? SQL-Server can deal with native XML pretty well. The standard transfer format is NVARCHAR(MAX), which is an UCS-2-string, which is almost the same as UTF-16 or simple 2-byte-unicode. Any .Net-string can be sent to SQL-Server as is. The size limit of NVARCHAR(MAX) is big enough for your needs...

Just create a function like the following and call it via your reading application. You can pass the parameter as a string:

This function will accept a 2-byte_encoded unicode string and convert this to XML implicitly:

CREATE FUNCTION dbo.ReadTheXml(@xml XML)
RETURNS TABLE
AS
RETURN
    SELECT @xml.value('(/Response/Status/text())[1]','nvarchar(max)') AS Response_Status
          ,job.value('(ID/text())[1]','nvarchar(max)') AS Job_ID
          ,job.value('(Name/text())[1]','nvarchar(max)') AS Job_Name
          ,job.value('(Description/text())[1]','nvarchar(max)') AS Job_Description
          ,job.value('(StartDate/text())[1]','datetime') AS Job_StartDate
          ,job.value('(DueDate/text())[1]','datetime') AS Job_DueDate
          ,job.value('(Manager/ID/text())[1]','int') AS Job_Manager_ID
          ,job.value('(Manager/Name/text())[1]','nvarchar(max)') AS Job_Manager_Name
    FROM @xml.nodes('/Response/Jobs/Job') A(job)
GO

--Assuming you've got the XML loaded into a string already you can test it like this:

DECLARE @xml NVARCHAR(MAX) =         --the xml as NVARCHAR(MAX) string
N'<Response api-method="Current">
  <Status>OK</Status>
     <Jobs>
        <Job>
           <ID>N1234</ID>
           <Name>Job name here</Name>
           <Description>Job description here</Description>
           <StartDate>2018-10-08T00:00:00</StartDate>
           <DueDate>2018-10-21T00:00:00</DueDate>
           <Manager>
              <ID>12345</ID>
              <Name>John Smith</Name>
           </Manager>
        </Job>
        <Job>
           <ID>blah</ID>
           <Name>One more</Name>
           <Description>This is one more description</Description>
           <StartDate>2018-10-08T00:00:00</StartDate>
           <DueDate>2018-10-21T00:00:00</DueDate>
           <Manager>
              <ID>12345</ID>
              <Name>John Smith</Name>
           </Manager>
        </Job>
     </Jobs>
</Response>';

--Use the function exactly the same way as you'd use a table:

SELECT * INTO #tmpStagingTable 
FROM dbo.ReadTheXml(@xml); --pass in the XML as string

--call the result from the staging table
SELECT * FROM #tmpStagingTable;  --create a staging table *on the fly*
GO

--clean up for testing
DROP FUNCTION dbo.ReadTheXml;
DROP TABLE #tmpStagingTable;

The result should be inserted into a staging table. Do any needed cleansing and business logic against this staging table and proceed from there.

The result

+-----------------+--------+---------------+------------------------------+-------------------------+-------------------------+----------------+------------------+
| Response_Status | Job_ID | Job_Name      | Job_Description              | Job_StartDate           | Job_DueDate             | Job_Manager_ID | Job_Manager_Name |
+-----------------+--------+---------------+------------------------------+-------------------------+-------------------------+----------------+------------------+
| OK              | N1234  | Job name here | Job description here         | 2018-10-08 00:00:00.000 | 2018-10-21 00:00:00.000 | 12345          | John Smith       |
+-----------------+--------+---------------+------------------------------+-------------------------+-------------------------+----------------+------------------+
| OK              | blah   | One more      | This is one more description | 2018-10-08 00:00:00.000 | 2018-10-21 00:00:00.000 | 12345          | John Smith       |
+-----------------+--------+---------------+------------------------------+-------------------------+-------------------------+----------------+------------------+

Upvotes: 2

Related Questions