Reputation: 4793
I want to read in an XML file, modify an element, and then save it back to the file. What is the best way to do this while preserving the formatting and keeping the original line terminator (CRLF vs. LF)?
Here is what I have, but it doesn't do that:
$xml = [xml]([System.IO.File]::ReadAllText($fileName))
$xml.PreserveWhitespace = $true
# Change some element
$xml.Save($fileName)
The problem is that extra newlines (i.e. empty lines in the XML) are removed, and afterwards the file has mixed LF and CRLF line endings.
Upvotes: 37
Views: 34718
Reputation: 8005
You can use the PowerShell [xml] object and set $xml.PreserveWhitespace = $true, or do the same thing using .NET XmlDocument:
# NOTE: Full path to file is *highly* recommended
$f = Convert-Path '.\xml_test.xml'
# Using .NET XmlDocument
$xml = New-Object System.Xml.XmlDocument
$xml.PreserveWhitespace = $true
# Or using PS [xml] (older PowerShell versions may need to use psbase)
$xml = New-Object xml
$xml.PreserveWhitespace = $true
#$xml.psbase.PreserveWhitespace = $true # Older PS versions
# Load with preserve setting
$xml.Load($f)
$n = $xml.SelectSingleNode('//file')
$n.InnerText = 'b'
$xml.Save($f)
Just make sure to set PreserveWhitespace before calling XmlDocument.Load or XmlDocument.LoadXml.
NOTE: This does not preserve white space between XML attributes! White space inside XML attribute values seems to be preserved, but not the white space between attributes. The documentation talks about preserving "white space nodes" (node.NodeType = System.Xml.XmlNodeType.Whitespace) and not attributes.
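To see what "white space nodes" means in practice, here is a minimal sketch that loads the same XML string with and without PreserveWhitespace and counts the child nodes of the root element. The sample XML and the variable names are made up for illustration.

```powershell
# Sample document (hypothetical): two elements separated by line breaks and an empty line
$sample = "<root>`n  <file>a</file>`n`n  <other/>`n</root>"

# Default behaviour: PreserveWhitespace is $false, white space nodes are dropped on load
$plain = New-Object System.Xml.XmlDocument
$plain.LoadXml($sample)

# PreserveWhitespace must be set before LoadXml/Load to have any effect on parsing
$kept = New-Object System.Xml.XmlDocument
$kept.PreserveWhitespace = $true
$kept.LoadXml($sample)

$plainCount = $plain.DocumentElement.ChildNodes.Count
$keptCount  = $kept.DocumentElement.ChildNodes.Count

# Count only the nodes whose type is Whitespace
$wsCount = @($kept.DocumentElement.ChildNodes |
    Where-Object { $_.NodeType -eq [System.Xml.XmlNodeType]::Whitespace }).Count

"Without PreserveWhitespace: $plainCount child nodes"
"With PreserveWhitespace:    $keptCount child nodes ($wsCount of them white space)"
```

The extra nodes in the second document are the line breaks and indentation between elements, which is why the empty lines survive a later Save.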
Upvotes: 61
Reputation: 2153
When reading XML, empty lines are ignored by default. To preserve them, set the PreserveWhitespace property before reading the file:
Create an XmlDocument object and configure PreserveWhitespace:
$xmlDoc = [xml]::new()
$xmlDoc.PreserveWhitespace = $true
Load the document:
$xmlDoc.Load($myFilePath)
or
$xmlDoc.LoadXml($(Get-Content $myFilePath -Raw))
Upvotes: 7
Reputation: 27491
I don't see the line endings changing (\r\n), except the last one goes away. However, the encoding goes from ASCII to UTF8 with BOM.
$a = get-content -raw file.xml
$a -replace '\r','r' -replace '\n','n'
<?xml version="1.0" encoding="utf-8"?>rn<Configuration>rn <ViewDefinitions />rn</Configuration>rn
[xml]$b = get-content file.xml
$b.save('file.xml')
$a = get-content -raw file.xml
$a -replace '\r','r' -replace '\n','n'
<?xml version="1.0" encoding="utf-8"?>rn<Configuration>rn <ViewDefinitions />rn</Configuration>
# https://gist.github.com/jpoehls/2406504
get-fileencoding file.xml
UTF8
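The get-fileencoding function above comes from the linked gist and is not built in. If you only need to know whether a file starts with a UTF-8 byte order mark, a small check like the following may be enough. This is a sketch: Test-Utf8Bom is a hypothetical helper name, and the temporary file exists only for the demonstration.

```powershell
# Hypothetical helper: returns $true if the file starts with the UTF-8 BOM (EF BB BF)
function Test-Utf8Bom {
    param([Parameter(Mandatory=$true)][string]$Path)
    $bytes = [System.IO.File]::ReadAllBytes($Path)
    return ($bytes.Length -ge 3 -and
            $bytes[0] -eq 0xEF -and $bytes[1] -eq 0xBB -and $bytes[2] -eq 0xBF)
}

# Demonstration with a temporary file, written once with and once without a BOM
$tmp = [System.IO.Path]::GetTempFileName()
try {
    [System.IO.File]::WriteAllText($tmp, '<a/>', (New-Object System.Text.UTF8Encoding($true)))
    $withBom = Test-Utf8Bom -Path $tmp

    [System.IO.File]::WriteAllText($tmp, '<a/>', (New-Object System.Text.UTF8Encoding($false)))
    $withoutBom = Test-Utf8Bom -Path $tmp
} finally {
    Remove-Item $tmp
}
```

Running the check on a file before and after $xml.Save shows whether the save introduced a BOM.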
Upvotes: 0
Reputation: 788
If you would like to correct the CRLF that gets transformed to LF for text nodes after you call the Save method on the XmlDocument, you can use an XmlWriterSettings instance. This uses the same XmlWriter approach as MilesDavies192's answer, but also sets the encoding to UTF-8 and keeps indentation.
$xml = [xml]([System.IO.File]::ReadAllText($fileName))
$xml.PreserveWhitespace = $true
# Change some element
#Settings object will instruct how the xml elements are written to the file
$settings = New-Object System.Xml.XmlWriterSettings
$settings.Indent = $true
#NewLineChars will affect all newlines
$settings.NewLineChars ="`r`n"
#Set an optional encoding, UTF-8 is the most used (without BOM)
$settings.Encoding = New-Object System.Text.UTF8Encoding( $false )
$w = [System.Xml.XmlWriter]::Create($fileName, $settings)
try{
$xml.Save( $w )
} finally{
$w.Dispose()
}
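The snippet above hard-codes CRLF. To keep whatever terminator the file already uses, as the question asks, one option is to inspect the original text first and pass the result to NewLineChars. The following is a sketch under assumptions: Get-LineTerminator is a hypothetical helper, it treats a file with any CRLF as a CRLF file, and the temporary file stands in for $fileName.

```powershell
# Hypothetical helper: pick CRLF if the text contains any CRLF, otherwise LF
function Get-LineTerminator {
    param([Parameter(Mandatory=$true)][AllowEmptyString()][string]$Text)
    if ($Text.Contains("`r`n")) { return "`r`n" }
    return "`n"
}

# Demonstration file (stand-in for $fileName) written with CRLF line endings
$path = [System.IO.Path]::GetTempFileName()
$content = "<?xml version=`"1.0`" encoding=`"utf-8`"?>`r`n<root>`r`n  <file>a</file>`r`n`r`n</root>`r`n"
[System.IO.File]::WriteAllText($path, $content)

# Detect the terminator before the document is modified
$newline = Get-LineTerminator -Text ([System.IO.File]::ReadAllText($path))

# Load with white space preserved, then change an element
$xml = New-Object System.Xml.XmlDocument
$xml.PreserveWhitespace = $true
$xml.Load($path)
$xml.SelectSingleNode('//file').InnerText = 'b'

# Ask the writer to emit every newline using the detected terminator
$settings = New-Object System.Xml.XmlWriterSettings
$settings.NewLineHandling = [System.Xml.NewLineHandling]::Replace
$settings.NewLineChars = $newline
$settings.Encoding = New-Object System.Text.UTF8Encoding($false)   # UTF-8 without BOM

$w = [System.Xml.XmlWriter]::Create($path, $settings)
try {
    $xml.Save($w)
} finally {
    $w.Dispose()
}

$saved = [System.IO.File]::ReadAllText($path)
Remove-Item $path
```

A file with genuinely mixed line endings would be normalized to CRLF by this approach, so check the input first if that matters.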
Upvotes: 11
Reputation: 710
If you save using an XmlWriter, the default settings use two spaces as the indent string (indentation itself is off unless Indent is set) and replace line endings with CR/LF. To change these options, create the writer with an XmlWriterSettings object configured for your needs.
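$fileXML = New-Object System.Xml.XmlDocument
# Try to read the file as XML. Let the errors go if it's not.
[void]$fileXML.Load($file)
$writerXML = [System.Xml.XmlWriter]::Create($file)
try {
$fileXML.Save($writerXML)
} finally {
# Dispose the writer so the output is flushed and the file handle is released
$writerXML.Dispose()
}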
Upvotes: 2