Reputation: 2201
I am looking for a PowerShell function that converts XML into a PSCustomObject, which can then be exported as JSON. For this I created this small XML test object:
[xml]$Xml = @"
<Action name="Test" id="1">
<Text>sample</Text>
<sub name="s1" id="2" />
<sub name="s2" id="3" />
<end details="no" />
</Action>
"@
This gives me an XML DocumentElement, which I need to convert into the same object as the one produced by this call:
$Json = convertfrom-json @"
{
"Action": {
"name": "Test", "id": "1", "Text": "sample",
"sub": [
{"name": "s1","id": "2"},
{"name": "s2","id": "3"}
],
"End": {"details": "no"}
}
}
"@
Is there any smart way to get this done? I tested multiple functions from similar questions here but nothing really works as expected.
Upvotes: 4
Views: 1526
Reputation: 27756
EDIT: There is a very succinct solution for PowerShell 7 or newer; scroll down to the bottom of this answer.
Because of the ambiguities, there is no standard way of converting XML to JSON. So you really have to roll your own function that interprets the XML in the way that matches your desired output.
Here is a generic solution:
Function ConvertFrom-MyXml( [xml.XmlNode] $node ) {
    # Create an ordered hashtable
    $ht = [ordered] @{}
    # Copy the XML attributes to the hashtable
    $node.Attributes.ForEach{ $ht[ $_.Name ] = $_.Value }
    $node.ChildNodes.ForEach{
        if( $_.FirstChild -is [xml.XmlText] ) {
            # Add content of XML text node
            Add-DictionaryArrayItem -Dict $ht -Key $_.LocalName -Value $_.FirstChild.InnerText
        }
        elseif( $_ -is [xml.XmlElement] ) {
            # Add nested hashtable for the XML child elements (recursion)
            Add-DictionaryArrayItem -Dict $ht -Key $_.LocalName -Value (ConvertFrom-MyXml $_)
        }
    }
    $ht # Output
}
Function Add-DictionaryArrayItem( $Dict, $Key, $Value ) {
    if( $Dict.Contains( $Key ) ) {
        $curValue = $Dict[ $Key ]
        # If existing value is not already a list...
        if( $curValue -isnot [Collections.Generic.List[object]] ) {
            # ...turn it into a list.
            $curValue = [Collections.Generic.List[object]] @($curValue)
            $Dict[ $Key ] = $curValue
        }
        # Add next value to the list. This updates the list in the hashtable,
        # because $curValue is a reference.
        $curValue.Add( $Value )
    }
    else {
        # Key doesn't exist in the hashtable yet, so simply add it.
        $Dict[ $Key ] = $Value
    }
}
[xml]$Xml = @"
<Action name="Test" id="1">
<Text>sample</Text>
<sub name="s1" id="2" />
<sub name="s2" id="3" />
<end details="no" />
</Action>
"@
ConvertFrom-MyXml $Xml | ConvertTo-Json -Depth 100
Output:
{
"Action": {
"name": "Test",
"id": "1",
"Text": "sample",
"sub": [
{
"name": "s1",
"id": "2"
},
{
"name": "s2",
"id": "3"
}
],
"end": {
"details": "no"
}
}
}
ConvertFrom-MyXml outputs an ordered hashtable. There is no need to convert to PSCustomObject, as ConvertTo-Json works with hashtables as well. So we can keep the code simpler.
ConvertFrom-MyXml loops over attributes and elements (recursively) of the given XML node. It calls the helper function Add-DictionaryArrayItem to create an array if a key already exists in the hashtable. Actually this is not a raw, fixed-size array (like @(1,2,3) creates), but a dynamically resizable List, which behaves very similarly to an array but is much more efficient when adding many elements.
Note that a single sub element won't be turned into an array. If some elements should always be converted to arrays, you'd have to pass some kind of schema to the function (e.g. a list of element names) or add metadata to the XML itself.
As suggested by OP, here is an alternative version of the code that consists of only a single function:
Function ConvertFrom-MyXml( [xml.XmlNode] $node ) {
    $ht = [ordered] @{}
    $node.Attributes.ForEach{ $ht[ $_.Name ] = $_.Value }
    foreach( $child in $node.ChildNodes ) {
        $key = $child.LocalName
        $value = if( $child.FirstChild -is [xml.XmlText] ) {
            $child.FirstChild.InnerText
        } elseif( $child -is [xml.XmlElement] ) {
            ConvertFrom-MyXml $child
        } else {
            continue
        }
        if( $ht.Contains( $key ) ) {
            $curValue = $ht[ $key ]
            if( $curValue -isnot [Collections.Generic.List[object]] ) {
                $curValue = [Collections.Generic.List[object]] @($curValue)
                $ht[ $key ] = $curValue
            }
            $curValue.Add( $value )
        }
        else {
            $ht[ $key ] = $value
        }
    }
    $ht # Output
}
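The schema idea mentioned above could look roughly like the following. This is only a sketch: the function name ConvertFrom-MyXmlWithArrays and its -ArrayElements parameter are hypothetical and not part of the code above. Elements whose names are listed in -ArrayElements are always stored as a list, even when they occur only once:

```powershell
# Hypothetical variant: -ArrayElements names the elements that must always
# become a list (an assumed parameter, not part of the original function).
function ConvertFrom-MyXmlWithArrays {
    param(
        [xml.XmlNode] $Node,
        [string[]] $ArrayElements = @()
    )
    $ht = [ordered] @{}
    foreach ($attr in $Node.Attributes) { $ht[$attr.Name] = $attr.Value }
    foreach ($child in $Node.ChildNodes) {
        if ($child -isnot [xml.XmlElement]) { continue }
        $key = $child.LocalName
        $value = if ($child.FirstChild -is [xml.XmlText]) {
            $child.FirstChild.InnerText
        } else {
            ConvertFrom-MyXmlWithArrays -Node $child -ArrayElements $ArrayElements
        }
        $isList = $ht.Contains($key) -and $ht[$key] -is [Collections.Generic.List[object]]
        if (-not $isList -and ($ht.Contains($key) -or $key -in $ArrayElements)) {
            # Start a list, seeded with the value already stored (if any)
            $list = [Collections.Generic.List[object]]::new()
            if ($ht.Contains($key)) { $list.Add($ht[$key]) }
            $ht[$key] = $list
        }
        if ($ht[$key] -is [Collections.Generic.List[object]]) {
            $ht[$key].Add($value)
        } else {
            $ht[$key] = $value
        }
    }
    $ht # Output
}

# Usage: a single sub element is still emitted as a JSON array
[xml] $single = '<Action name="Test"><sub name="s1" id="2" /></Action>'
$result = ConvertFrom-MyXmlWithArrays -Node $single -ArrayElements 'sub'
$json = $result | ConvertTo-Json -Depth 100 -Compress
$json
```

Passing the names as a plain string array keeps the sketch short; a real schema might also need to distinguish elements with the same name at different nesting levels.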
The PowerShell 7+ solution makes use of Newtonsoft.Json.JsonConvert.
$xml = [xml]@"
<Action name="Test" id="1">
<Text>sample</Text>
<sub name="s1" id="2" />
<sub name="s2" id="3" />
<end details="no" />
</Action>
"@
# Convert XML to JSON
$json = [Newtonsoft.Json.JsonConvert]::SerializeXmlNode($xml, [Newtonsoft.Json.Formatting]::Indented)
This outputs:
{
"Action": {
"@name": "Test",
"@id": "1",
"Text": "sample",
"sub": [
{
"@name": "s1",
"@id": "2"
},
{
"@name": "s2",
"@id": "3"
}
],
"end": {
"@details": "no"
}
}
}
It is relatively easy to get rid of the @ prefix of the XML attributes, though this may cause collisions between XML attribute and element names, potentially making the JSON invalid:
$json = $json -replace '"@([^"\\]+)":', '"$1":'
Normally I'm strongly against using RegEx with serialized forms of complex data, as it is very hard to do safely. In the case of converting XML to JSON, the above RegEx should be pretty safe, because XML attribute names are not allowed to contain double quotation marks nor backslashes (which would cause the .NET XML parser to fail). If anyone proves me wrong, please drop a comment.
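To illustrate both the collision and the role of the backslash exclusion, here is a small sketch on hand-written JSON strings. The sample inputs are made up for illustration; the first one is shaped like the output for an element that has both an id attribute and an id child element:

```powershell
# Hand-written sample inputs (assumptions for illustration)
$collision = '{"Item":{"@id":"1","id":"2"}}'
$escaped   = '{"Text":"say \"@x\": hi"}'

# Same replacement as above: strip the @ prefix from attribute names
$pattern = '"@([^"\\]+)":'
$collisionResult = $collision -replace $pattern, '"$1":'
$escapedResult   = $escaped   -replace $pattern, '"$1":'

$collisionResult   # {"Item":{"id":"1","id":"2"}}
$escapedResult     # {"Text":"say \"@x\": hi"}  (unchanged)
```

The first result contains the key id twice within the same object, so which value survives (or whether the input is rejected) depends on the JSON parser. The second string is left untouched because the escaped quote inside the text value is preceded by a backslash, which the character class excludes.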
Upvotes: 3
Reputation: 2201
Based on the accepted solution, I made some minor adjustments to get exactly the same types as the built-in ConvertFrom-Json command. I also made some speed improvements. Here is the updated code:
Function ConvertFrom-MyXml($node) {
    $ht = [ordered] @{}
    # Keys whose values were collected into a list (repeated elements)
    $arrKeys = [System.Collections.Generic.List[string]]::new()
    foreach($attr in $node.Attributes) { $ht[$attr.Name] = $attr.Value }
    foreach($child in $node.ChildNodes) {
        if( $child -isnot [xml.XmlElement] ) { continue }
        $key = $child.LocalName
        if( $child.FirstChild -is [xml.XmlText] ) {
            $value = $child.FirstChild.InnerText
        } else {
            $value = ConvertFrom-MyXml $child
        }
        if( $ht.Contains($key) ) {
            $curValue = $ht[$key]
            # Type check instead of testing .Count, which is 1 (not $null) for strings
            if( $curValue -isnot [System.Collections.Generic.List[object]] ) {
                $curValue = [System.Collections.Generic.List[object]] @($curValue)
                $arrKeys.Add($key)
                $ht[$key] = $curValue
            }
            $curValue.Add($value)
        } else {
            $ht[$key] = $value
        }
    }
    # Convert the lists to object[] only after all children were processed,
    # so that .Add() above is never called on a fixed-size array
    foreach($arrKey in $arrKeys) { $ht[$arrKey] = [object[]]$ht[$arrKey] }
    [PsCustomObject]$ht
}
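For reference, these are the types the adjustments above are aiming for. The sketch below only inspects what ConvertFrom-Json returns for a small hand-written sample and checks that casting an ordered hashtable yields the same object type; the sample data and variable names are made up for illustration:

```powershell
# Types produced by ConvertFrom-Json for a JSON object and a JSON array
$fromJson   = ConvertFrom-Json '{"name":"Test","sub":[{"name":"s1"},{"name":"s2"}]}'
$objectType = $fromJson.GetType().Name
$arrayType  = $fromJson.sub.GetType().Name

# Casting an ordered hashtable produces the same object type
$ht = [ordered] @{ name = 'Test' }
$castType = ([pscustomobject] $ht).GetType().Name

"$objectType / $arrayType / $castType"   # PSCustomObject / Object[] / PSCustomObject
```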
Upvotes: 0
Reputation: 303
Take a look at this; it may help:
class sub {
    [string] $name;
    [int] $id;
}
class end {
    [string] $details;
}
class Action {
    [string] $Text;
    [sub] $sub1;
    [sub] $sub2;
    [end] $end;
    [string] $name;
    [int] $id;
}
<#
<Action name="Test" id="1">
<Text>sample</Text>
<sub name="s1" id="2" />
<sub name="s2" id="3" />
<end details="no" />
</Action>
#>
$firstitem = [Action]@{
    text = 'sample';
    name = "test";
    id = "1";
    sub1 = @{
        name = "s1";
        id = "2";
    }
    sub2 = @{
        name = "s2";
        id = "3";
    }
    end = @{
        details = "no";
    }
}
$firstitem | ConvertTo-Json
<#
Output =
{
"Text": "sample",
"sub1": {
"name": "s1",
"id": 2
},
"sub2": {
"name": "s2",
"id": 3
},
"end": {
"details": "no"
},
"name": "test",
"id": 1
}
#>
Upvotes: 0
Reputation: 60045
Might not be exactly what you're looking for but I would personally do this with classes:
class Sub {
    [string] $Name
    [int] $Id
    Sub([string] $Name, [int] $Id) {
        $this.Name = $Name
        $this.Id = $Id
    }
}
# Windows PowerShell will not like it named End :)
class End2 {
    [string] $Details
    End2 ([string] $Details) {
        $this.Details = $Details
    }
}
class Action {
    [string] $Name
    [int] $Id
    [string] $Text
    [Sub[]] $Sub
    [End2] $End
    Action () { }
    Action ([string] $Name, [int] $Id, [string] $Text, [object[]] $Sub, [End2] $End) {
        $this.Name = $Name
        $this.Id = $Id
        $this.Text = $Text
        $this.Sub = @( $Sub )
        $this.End = $End
    }
    [string] ToJson() {
        return @{ Action = $this } | ConvertTo-Json -Depth 99
    }
}
Now you can instantiate your Action class and convert it to JSON like this:
[Action]::new(
    'Test', 1, 'Sample',
    @(
        [Sub]::new('s1', 2)
        [Sub]::new('s2', 3)
    ),
    'No'
).ToJson()
Or like this:
([Action]@{
    Name = 'Test'
    Id = 1
    Text = 'Sample'
    Sub = @(
        [Sub]::new('s1', 2)
        [Sub]::new('s2', 3)
    )
    End = 'No'
}).ToJson()
Both would output the following JSON:
{
"Action": {
"Name": "Test",
"Id": 1,
"Text": "Sample",
"Sub": [
{
"Name": "s1",
"Id": 2
},
{
"Name": "s2",
"Id": 3
}
],
"End": {
"Details": "No"
}
}
}
Upvotes: 0