Reputation: 764
How is books.xml
converted to CSV
and written to a file?
PS /home/nicholas/xml>
PS /home/nicholas/xml> $books = Import-Clixml books.xml
PS /home/nicholas/xml>
PS /home/nicholas/xml> $books.InnerXml
<?xml version="1.0"?><catalog><book id="bk101"><author>Gambardella, Matthew</author><title>XML Developer's Guide</title><genre>Computer</genre><price>44.95</price><publish_date>2000-10-01</publish_date><description>An in-depth look at creating applications
with XML.</description></book><book id="bk102"><author>Ralls, Kim</author><title>Midnight Rain</title><genre>Fantasy</genre><price>5.95</price><publish_date>2000-12-16</publish_date><description>A former architect battles corporate zombies,
an evil sorceress, and her own childhood to become queen
of the world.</description></book><book id="bk103"><author>Corets, Eva</author><title>Maeve Ascendant</title><genre>Fantasy</genre><price>5.95</price><publish_date>2000-11-17</publish_date><description>After the collapse of a nanotechnology
society in England, the young survivors lay the
foundation for a new society.</description></book><book id="bk104"><author>Corets, Eva</author><title>Oberon's Legacy</title><genre>Fantasy</genre><price>5.95</price><publish_date>2001-03-10</publish_date><description>In post-apocalypse England, the mysterious
agent known only as Oberon helps to create a new life
for the inhabitants of London. Sequel to Maeve
Ascendant.</description></book><book id="bk105"><author>Corets, Eva</author><title>The Sundered Grail</title><genre>Fantasy</genre><price>5.95</price><publish_date>2001-09-10</publish_date><description>The two daughters of Maeve, half-sisters,
battle one another for control of England. Sequel to
Oberon's Legacy.</description></book><book id="bk106"><author>Randall, Cynthia</author><title>Lover Birds</title><genre>Romance</genre><price>4.95</price><publish_date>2000-09-02</publish_date><description>When Carla meets Paul at an ornithology
conference, tempers fly as feathers get ruffled.</description></book><book id="bk107"><author>Thurman, Paula</author><title>Splish Splash</title><genre>Romance</genre><price>4.95</price><publish_date>2000-11-02</publish_date><description>A deep sea diver finds true love twenty
thousand leagues beneath the sea.</description></book><book id="bk108"><author>Knorr, Stefan</author><title>Creepy Crawlies</title><genre>Horror</genre><price>4.95</price><publish_date>2000-12-06</publish_date><description>An anthology of horror stories about roaches,
centipedes, scorpions and other insects.</description></book><book id="bk109"><author>Kress, Peter</author><title>Paradox Lost</title><genre>Science Fiction</genre><price>6.95</price><publish_date>2000-11-02</publish_date><description>After an inadvertant trip through a Heisenberg
Uncertainty Device, James Salway discovers the problems
of being quantum.</description></book><book id="bk110"><author>O'Brien, Tim</author><title>Microsoft .NET: The Programming Bible</title><genre>Computer</genre><price>36.95</price><publish_date>2000-12-09</publish_date><description>Microsoft's .NET initiative is explored in
detail in this deep programmer's reference.</description></book><book id="bk111"><author>O'Brien, Tim</author><title>MSXML3: A Comprehensive Guide</title><genre>Computer</genre><price>36.95</price><publish_date>2000-12-01</publish_date><description>The Microsoft MSXML3 parser is covered in
detail, with attention to XML DOM interfaces, XSLT processing,
SAX and more.</description></book><book id="bk112"><author>Galos, Mike</author><title>Visual Studio 7: A Comprehensive Guide</title><genre>Computer</genre><price>49.95</price><publish_date>2001-04-16</publish_date><description>Microsoft Visual Studio 7 is explored in depth,
looking at how Visual Basic, Visual C++, C#, and ASP+ are
integrated into a comprehensive development
environment.</description></book></catalog>
PS /home/nicholas/xml>
PS /home/nicholas/xml> $books | Export-Csv books.csv
PS /home/nicholas/xml>
PS /home/nicholas/xml> cat ./books.csv
"xml","catalog"
"version=""1.0""","System.Xml.XmlElement"
PS /home/nicholas/xml>
It would seem that $csv
is converted, but how is it written as a CSV
file?
Converting to JSON
didn't go much better:
PS /home/nicholas/xml>
PS /home/nicholas/xml>
PS /home/nicholas/xml> $books | ConvertTo-Json
WARNING: Resulting JSON is truncated as serialization has exceeded the set depth of 2.
[
[],
[
[
"System.Xml.XmlElement",
"System.Xml.XmlElement",
"System.Xml.XmlElement",
..
even with a depth of 9 or greater.
Here's the output of $books
:
PS /home/nicholas/xml>
PS /home/nicholas/xml> $books
xml catalog
--- -------
version="1.0" catalog
PS /home/nicholas/xml> $books.ChildNodes
Version : 1.0
Encoding :
Standalone :
Value : version="1.0"
InnerText : version="1.0"
Name : xml
LocalName : xml
NodeType : XmlDeclaration
PreviousSibling :
NextSibling : catalog
ParentNode : #document
ChildNodes : {}
Attributes :
OwnerDocument : #document
FirstChild :
LastChild :
HasChildNodes : False
NamespaceURI :
Prefix :
IsReadOnly : False
OuterXml : <?xml version="1.0"?>
InnerXml :
SchemaInfo : System.Xml.Schema.XmlSchemaInfo
BaseURI :
PreviousText :
book : {book, book, book, book…}
PS /home/nicholas/xml>
which is far from CSV
.
Upvotes: 0
Views: 170
Reputation: 34421
Try following :
using assembly System
using assembly System.Collections
using assembly System.Xml.Linq
using assembly System.IO
$inputFilename = "c:\temp\test.xml"
$outputFilename = "c:\temp\test.csv"
$doc = [System.Xml.Linq.XDocument]::Load($inputFilename)
$writer = New-Object System.IO.StreamWriter($outputFilename)
$header = "id,author,title,genre,price,publish,description"
$writer.Writeline($header)
$books = $doc.Descendants("book")
foreach($book in $books)
{
$id = $book.Attribute("id").Value
$author = $book.Element("author").Value
$title = $book.Element("title").Value
$genre = $book.Element("genre").Value
$price = $book.Element("price").Value
$date = $book.Element("publish_date").Value
$description = $book.Element("description").Value
$lineArray = @($id,$author,$title,$genre,$price,$date,$description)
Write-Host $lineArray
$line = [String]::Join(",", $lineArray)
Write-Host $line
$writer.Writeline($line)
}
$writer.Flush()
$writer.Close()
I update code below
Added double quotes around the field because data contained commas.
Merged description field to remove carriage return and extra spaces using Regex.
using assembly System using assembly System.Collections using assembly System.Xml.Linq using assembly System.IO using assembly System.Text.RegularExpressions
$inputFilename = "c:\temp\test.xml" $outputFilename = "c:\temp\test.csv"
$doc = [System.Xml.Linq.XDocument]::Load($inputFilename) $writer = New-Object System.IO.StreamWriter($outputFilename) $headerArray = @("id","author","title","genre","price","publish","description") $header = [String]::Join(""",""", $headerArray) $writer.Writeline("""" + $header + """")
$books = $doc.Descendants("book")
foreach($book in $books) {
$id = $book.Attribute("id").Value
$author = $book.Element("author").Value
$title = $book.Element("title").Value
$genre = $book.Element("genre").Value
$price = $book.Element("price").Value
$date = $book.Element("publish_date").Value
$description = $book.Element("description").Value
$description = [System.Text.RegularExpressions.Regex]::Replace($description, "\s+", " ");
$lineArray = @($id,$author,$title,$genre,$price,$date,$description)
#Write-Host $lineArray
$line = """" + [String]::Join(""",""", $lineArray) + """"
#Write-Host $line
$writer.Writeline($line)
} $writer.Flush() $writer.Close()
Upvotes: 1
Reputation: 437813
Import-Clixml
is not meant for importing arbitrary XML documents; it is solely designed to work with XML documents that contain CLIXML data, which is an XML-based serialization format for representing instances of .NET types for cross-process communication, such as produced by Export-Clixml
. CLIXML is typically used behind the scenes in PowerShell remoting (and, more generally, cross-process communication).
Perhaps surprisingly, there is no general-purpose cmdlet for importing arbitrary XML documents (however, there is a cmdlet for querying and extracting data from XML documents, namely Select-Xml
).
You have to work with the [xml]
type (System.Xml.XmlDocument
) to import XML data into a DOM, which then allows you to transform the data to a different format:
# Load and parse the XML file into a DOM.
($xmlDoc = [xml]::new()).Load((Convert-Path books.xml))
# Export the book child elements of the catalog element to a CSV file.
$xmlDoc.catalog.book | Export-Csv books.csv
# Parse the CSV file into objects ([pscustomobject] instances)
# and output them to the console with friendly formatting.
Import-Csv books.csv
Note: In Windows PowerShell, consider using -NoTypeInformation
with Export-Csv
, as well as the -Encoding
parameter to control the output character encoding, which defaults to ASCII(!). In PowerShell (Core) 7+, fortunately, -NoTypeInformation
is now implied and the default character encoding is (BOM-less) UTF-8.
Output:
id : bk101
author : Gambardella, Matthew
title : XML Developer's Guide
genre : Computer
price : 44.95
publish_date : 2000-10-01
description : An in-depth look at creating applications
with XML.
id : bk102
author : Ralls, Kim
title : Midnight Rain
genre : Fantasy
price : 5.95
publish_date : 2000-12-16
description : A former architect battles corporate zombies,
an evil sorceress, and her own childhood to become queen
of the world.
id : bk103
author : Corets, Eva
title : Maeve Ascendant
genre : Fantasy
price : 5.95
publish_date : 2000-11-17
description : After the collapse of a nanotechnology
society in England, the young survivors lay the
foundation for a new society.
id : bk104
author : Corets, Eva
title : Oberon's Legacy
genre : Fantasy
price : 5.95
publish_date : 2001-03-10
description : In post-apocalypse England, the mysterious
agent known only as Oberon helps to create a new life
for the inhabitants of London. Sequel to Maeve
Ascendant.
id : bk105
author : Corets, Eva
title : The Sundered Grail
genre : Fantasy
price : 5.95
publish_date : 2001-09-10
description : The two daughters of Maeve, half-sisters,
battle one another for control of England. Sequel to
Oberon's Legacy.
id : bk106
author : Randall, Cynthia
title : Lover Birds
genre : Romance
price : 4.95
publish_date : 2000-09-02
description : When Carla meets Paul at an ornithology
conference, tempers fly as feathers get ruffled.
id : bk107
author : Thurman, Paula
title : Splish Splash
genre : Romance
price : 4.95
publish_date : 2000-11-02
description : A deep sea diver finds true love twenty
thousand leagues beneath the sea.
id : bk108
author : Knorr, Stefan
title : Creepy Crawlies
genre : Horror
price : 4.95
publish_date : 2000-12-06
description : An anthology of horror stories about roaches,
centipedes, scorpions and other insects.
id : bk109
author : Kress, Peter
title : Paradox Lost
genre : Science Fiction
price : 6.95
publish_date : 2000-11-02
description : After an inadvertant trip through a Heisenberg
Uncertainty Device, James Salway discovers the problems
of being quantum.
id : bk110
author : O'Brien, Tim
title : Microsoft .NET: The Programming Bible
genre : Computer
price : 36.95
publish_date : 2000-12-09
description : Microsoft's .NET initiative is explored in
detail in this deep programmer's reference.
id : bk111
author : O'Brien, Tim
title : MSXML3: A Comprehensive Guide
genre : Computer
price : 36.95
publish_date : 2000-12-01
description : The Microsoft MSXML3 parser is covered in
detail, with attention to XML DOM interfaces, XSLT processing,
SAX and more.
id : bk112
author : Galos, Mike
title : Visual Studio 7: A Comprehensive Guide
genre : Computer
price : 49.95
publish_date : 2001-04-16
description : Microsoft Visual Studio 7 is explored in depth,
looking at how Visual Basic, Visual C++, C#, and ASP+ are
integrated into a comprehensive development
environment.
Upvotes: 1