Honkytonk
Honkytonk

Reputation: 13

Converting a CSV file into appropriate XML format

I'm trying to get some tab-delimited CSV files into an XML via PowerShell, and I'm struggling to make sense of it all. I'm 100% sure it's something really simple I'm missing, so help would be massively appreciated. I'm very close but just can't get over the final hurdle.

I essentially want this script to iterate through a tab-delimited CSV it's pointed at ($path), take each line off of the CSV and put the data into an XML sheet ($xmlpath).

CSV example:

"Name"  "Item"  "Purchase Price"
"Jimmy Smits"   "Egg"   "£40"
"Edward Price"  "KumQuat"   "£6000"
"Timmy Mallet"  "Bug"   "£2"
"Edgar Allen Poe"   "Weird" "0"
"2Pac Shakur"   "Eggnog"    ""

XML output template example:

<registration>
    <Individual>
        <Name></Name>
        <Item></Item>
        <Purchase Price></Purchase Price>
    </Individual>
</registration>

And this is what I've been poking about with

$data = Import-Csv -Path $path -Delimiter "`t"

$xmlnames = $data | Select-Object "Name"
$xmlitems = $data | Select-Object "Item"
$xmlprice = $data | Select-Object "Purchase Price"

$entryTemplate = @'
<individual>
    <Name>$($xmlnames.Name)</Name>
    <Item>$($xmlitems.Item)</Item>
    <Purchase Price>$($xmlprice."Purchase Price")</Purchase Price>
</individual>
'@

$xml = $data | ForEach-Object {
    $ExecutionContext.InvokeCommand.ExpandString($entrytemplate)
}
$xml | Out-File $xmlpath

Which then puts out;

<Individual>
    <Name>Jimmy Smits Edward Price Timmy Mallet Edgar Allen Poe 2Pac Shakur</Name>
    <Item>System.Object IList.Item(int index) {get;set;}</Item>
    <Purchase Price>£40 £6000 £2 0 </Purchase Price>
</Individual>

So somewhere I'm messing up stupidly on two things;

  1. The $data | Select-Object 'thing' is giving me all of the list of 'things' rather than iterating through each of the 'things' and listing them one per XML entry. Yet is giving me the correct number of entries?
  2. My 'Items' are being listed as system objects (not values), despite being treated identically to everything else.

Upvotes: 1

Views: 1124

Answers (2)

rokumaru
rokumaru

Reputation: 1244

You can also use "LINQ to XML" to create xml.

using namespace "System.Xml.Linq"
Add-Type -AssemblyName System.Xml.Linq

[XElement]::new("registration", @(
    Import-Csv -Path $path -Delimiter "`t" | foreach {
        [XElement]::new("Individual",
            [XElement]::new("Name", $_.Name),
            [XElement]::new("Item", $_.Item),
            [XElement]::new("PurchasePrice", $_."Purchase Price"))
    }
)).Save($xmlpath)

Upvotes: 1

Ansgar Wiechers
Ansgar Wiechers

Reputation: 200203

The three statements

$xmlnames = $data | Select-Object "Name"
$xmlitems = $data | Select-Object "Item"
$xmlprice = $data | Select-Object "Purchase Price"

give you lists of custom objects with just one property. Using these lists in your template inserts all values into a tag with each iteration. What you want is just the value from the current iteration, hence creating the lists before doesn't do you any good. What you actually want is the current object during each iteration of the loop.

In case of $xmlitems you're also running into a PowerShell gotcha, because the variables $xmlnames, $xmlitems, and $xmlprice are arrays. When using dot-access on an array PowerShell is doing something that is called member enumeration. Essentially $xmlnames.Name won't give you the value of the property Name of the array object, but the value of the property Name of all array elements.

Member enumeration only works if the array object itself doesn't have a member of that name, though. Which brings us back to $xmlitems. For that variable you're trying to get the value of the property Item of the array elements. However, the array object actually has a method Item(), so $xmlitems.Item tries to invoke that method instead of the Item property of the array elements. But since the method is overloaded and the invocation is missing parentheses ($xmlitems.Item instead of $xmlitems.Item()) PowerShell shows you the method definition instead.

Remove the Select-Object statements and replace the array variables in your XML fragment template with the "current object" variable $_ and the code will do what you want.

$entryTemplate = @'
<individual>
    <Name>$($_.Name)</Name>
    <Item>$($_.Item)</Item>
    <Purchase Price>$($_."Purchase Price")</Purchase Price>
</individual>
'@

Upvotes: 2

Related Questions