KeyboardFriendly
KeyboardFriendly

Reputation: 1798

Querying a document in SQL Server 2012 stored as xml

I have been trying to learn Open XML, and I successfully generated a free text 2007 .docx file in Visual Studio c#. After that I saved the file in the .xml format in Word and then Imported the .xml file into sql server 2012 into a table called XML to a column called xml: here is the code for the table:

CREATE TABLE [dbo].[XML](
    [XML_ID] [int] IDENTITY(1,1) NOT NULL,
    [XML] [xml] NOT NULL,
 CONSTRAINT [PK_XML] PRIMARY KEY CLUSTERED 
(
    [XML_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

GO

Here is the insert I ran

INSERT INTO XML(XML)
SELECT * FROM OPENROWSET(
   BULK 'C:\Users\bbt2d\Desktop\Document.xml',
   SINGLE_BLOB) AS x;

When I ran a select * the xml file showed up, but when I tried querying some of the data from the paragraphs I wasn't getting anywhere.

Here is a partial snipper of the xml that is stored in sql server: Was wondering if I went about the process wrong, or how would you query out text from the document.

?mso-application progid="Word.Document"?>
<pkg:package xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage">
  <pkg:part pkg:name="/_rels/.rels" pkg:contentType="application/vnd.openxmlformats-package.relationships+xml" pkg:padding="512">
    <pkg:xmlData>
      <Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
        <Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/extended-properties" Target="docProps/app.xml" />
        <Relationship Id="rId2" Type="http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties" Target="docProps/core.xml" />
        <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml" />
      </Relationships>
    </pkg:xmlData>
  </pkg:part>
  <pkg:part pkg:name="/word/_rels/document.xml.rels" pkg:contentType="application/vnd.openxmlformats-package.relationships+xml" pkg:padding="256">
    <pkg:xmlData>
      <Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
        <Relationship Id="rId8" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/fontTable" Target="fontTable.xml" />
        <Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/settings" Target="settings.xml" />
        <Relationship Id="rId7" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footer" Target="footer1.xml" />
        <Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles" Target="styles.xml" />
        <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/numbering" Target="numbering.xml" />
        <Relationship Id="rId6" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/endnotes" Target="endnotes.xml" />
        <Relationship Id="rId5" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/footnotes" Target="footnotes.xml" />
        <Relationship Id="rId10" Type="http://schemas.microsoft.com/office/2007/relationships/stylesWithEffects" Target="stylesWithEffects.xml" />
        <Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/webSettings" Target="webSettings.xml" />
        <Relationship Id="rId9" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme" Target="theme/theme1.xml" />
      </Relationships>
    </pkg:xmlData>
  </pkg:part>
  <pkg:part pkg:name="/word/document.xml" pkg:contentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml">
    <pkg:xmlData>
      <w:document xmlns:ve="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml">
        <w:body>
          <w:p w:rsidR="00C32D34" w:rsidRDefault="00C32D34">
            <w:pPr>
              <w:pStyle w:val="Heading1" />
              <w:spacing w:line="240" w:lineRule="auto" />
              <w:jc w:val="center" />
              <w:rPr>
                <w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman" />
                <w:sz w:val="44" />
                <w:szCs w:val="44" />
              </w:rPr>
            </w:pPr>
            <w:r>
              <w:rPr>
                <w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman" />
                <w:sz w:val="44" />
                <w:szCs w:val="44" />
              </w:rPr>
              <w:t>School District of Marshfield</w:t>
            </w:r>
          </w:p>
          <w:p w:rsidR="00C32D34" w:rsidRDefault="00C32D34">
            <w:pPr>
              <w:autoSpaceDE w:val="0" />
              <w:autoSpaceDN w:val="0" />
              <w:adjustRightInd w:val="0" />
              <w:spacing w:after="0" w:line="240" w:lineRule="auto" />
              <w:rPr>
                <w:rFonts w:ascii="Cambria" w:hAnsi="Cambria" w:cs="Cambria" />
                <w:b />
                <w:bCs />
                <w:sz w:val="28" />
                <w:szCs w:val="28" />
              </w:rPr>
            </w:pPr>
          </w:p>
          <w:p w:rsidR="00C32D34" w:rsidRDefault="00DE219B">
            <w:pPr>
              <w:autoSpaceDE w:val="0" />
              <w:autoSpaceDN w:val="0" />
              <w:adjustRightInd w:val="0" />
              <w:spacing w:after="0" w:line="240" w:lineRule="auto" />
              <w:jc w:val="center" />
              <w:rPr>
                <w:rFonts w:ascii="Cambria" w:hAnsi="Cambria" w:cs="Cambria" />
                <w:b />
                <w:bCs />
                <w:sz w:val="28" />
                <w:szCs w:val="28" />
              </w:rPr>
            </w:pPr>
            <w:r>
              <w:rPr>
                <w:rFonts w:ascii="Cambria" w:hAnsi="Cambria" w:cs="Cambria" />
                <w:b />
                <w:bCs />
                <w:sz w:val="28" />
                <w:szCs w:val="28" />
              </w:rPr>
              <w:t>Exploring French</w:t>
            </w:r>
            <w:r w:rsidR="00C32D34">
              <w:rPr>
                <w:rFonts w:ascii="Cambria" w:hAnsi="Cambria" w:cs="Cambria" />
                <w:b />
                <w:bCs />
                <w:sz w:val="28" />
                <w:szCs w:val="28" />
              </w:rPr>
              <w:t xml:space="preserve"> &amp; Spanish</w:t>
            </w:r>

Upvotes: 1

Views: 2028

Answers (2)

marc_s
marc_s

Reputation: 755361

The XML you posted isn't complete - assuming your table's XML is complete, you can try something like this (BTW: as a side-note: I would recommend to never use reserved words for your columns; don't call your column XML - that's a reserved keyword; use XmlDoc or something like that....)

-- define the two XML namespaces we need to use to get to the data in question
;WITH XMLNAMESPACES ('http://schemas.microsoft.com/office/2006/xmlPackage' as pkg, 
                     'http://schemas.openxmlformats.org/wordprocessingml/2006/main' as w)
select
    XmlContent.value('(//w:t)[2]', 'varchar(100)')
FROM dbo.YourTable
where ....(some condition here).....

The really interesting part is finding out the proper XPath expression to get to your data. Unfortunately, the text you're interested in doesn't seem to be in a specifically marked paragraph, e.g. the <w:r> and <w:rPr> tags surrounding it don't have any specific attributes you can check for - so all you can do is figure out that it's the second <w:t> node in that document that you're interested in, and get to its text that way.

Upvotes: 3

Kiran1016
Kiran1016

Reputation: 914

i allways suggest to use Xquery when you are dealing with XML datatypes in sql server,Below is the url which gives you clear understanding of Xquery. http://www.youtube.com/watch?v=mj4qodGsgDA

Upvotes: 1

Related Questions