ygoe

Reputation: 20354

Get multiple text elements from XML

How can I get multiple text elements from an XML document in PowerShell?

Here's an example:

<log>
  <logentry revision="152">
    <author>me</author>
    <date>2014-03-28T14:54:27.443978Z</date>
    <msg>Summary 1

* Note 1
* Note 2</msg></logentry>
  <logentry revision="153">
    <author>me</author>
    <date>2014-03-28T16:24:43.438847Z</date>
    <msg>Summary 2</msg>
  </logentry>
  <logentry revision="154">
    <author>me</author>
    <date>2014-03-31T16:00:01.590373Z</date>
    <msg>Summary 3</msg>
  </logentry>
  <logentry revision="155">
    <author>me</author>
    <date>2014-04-01T09:28:09.744015Z</date>
    <msg>Summary 4

* Note 3
* Note 4
    </msg>
  </logentry>
</log>

This is the output of svn log for specific revisions. I want to condense the log messages added since the last script run into a text file for manual summarising. I can already read the existing file, parse the last revision, and call svn log for the new revisions. I'd like to get the following text output from the XML document above:

Summary 1
* Note 1
* Note 2
Summary 2
Summary 3
Summary 4
* Note 3
* Note 4

Note the inconsistent final newline in the "logentry/msg" elements. All empty lines should be removed, but all other line breaks must be kept. Each "msg" element must also start on a new line, rather than having multiple messages glued together on one output line (which is roughly what I get now).

Here's my current code:

$newMsgs = ($xml.log.logentry.msg).Replace("`n`n", "`n").Trim()

But it doesn't put each "msg" on a separate line. I also don't understand exactly what it does or when it will break. I am familiar with the BCL from C#, but not so much with PowerShell and its own way of solving things.

Upvotes: 0

Views: 232

Answers (1)

Robert Westerlund

Reputation: 4838

You can split the messages on the newline character and then filter out the lines that have no content. To also drop lines that consist only of whitespace, trim each line before filtering. Here's an example:

$xml.log.logentry.msg -split "`n" | Foreach { $_.Trim() } | Where { $_ }
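To make the stages of that pipeline easier to follow, here is the same one-liner broken into steps. This is only a sketch: the two-entry XML string and the variable names `$messages`, `$lines` and `$result` are made up for illustration and are not part of the original script.

```powershell
# Hypothetical two-entry sample standing in for the real svn log output.
[xml]$xml = "<log><logentry><msg>Summary 1`n`n* Note 1</msg></logentry><logentry><msg>Summary 2`n   </msg></logentry></log>"

# Property access on a collection of elements yields one string per <msg>.
$messages = $xml.log.logentry.msg

# -split applied to an array splits every element and returns one flat array of lines.
$lines = $messages -split "`n"

# Trim each line, then keep only the non-empty ones (an empty string is falsy).
$result = $lines | ForEach-Object { $_.Trim() } | Where-Object { $_ }

$result
```

Because the lines come out as separate pipeline objects, each one ends up on its own output line regardless of which `msg` element it came from.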

As a side note, you have a minor error in your sample xml. The first msg element is never closed.

Here's a full sample using your sample xml and filtering using the script above:

[xml]$xml = @"
<log>
  <logentry revision="152">
    <author>me</author>
    <date>2014-03-28T14:54:27.443978Z</date>
    <msg>Summary 1

* Note 1
* Note 2</msg>
  </logentry>
  <logentry revision="153">
    <author>me</author>
    <date>2014-03-28T16:24:43.438847Z</date>
    <msg>Summary 2</msg>
  </logentry>
  <logentry revision="154">
    <author>me</author>
    <date>2014-03-31T16:00:01.590373Z</date>
    <msg>Summary 3</msg>
  </logentry>
  <logentry revision="155">
    <author>me</author>
    <date>2014-04-01T09:28:09.744015Z</date>
    <msg>Summary 4

* Note 3
* Note 4
    </msg>
  </logentry>
</log>
"@

$xml.log.logentry.msg -split "`n" | Foreach { $_.Trim() } | Where { $_ }

This yields the requested output:

Summary 1
* Note 1
* Note 2
Summary 2
Summary 3
Summary 4
* Note 3
* Note 4
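As for what the code in the question does: the sketch below reproduces it on a hypothetical two-element array (`$messages` stands in for `$xml.log.logentry.msg`). `Replace` and `Trim` are string methods, so PowerShell 3.0 and later call them once per array element. The result is still an array, and an array converted to a single string is joined with spaces by default, which is a likely cause of the "glued together" output.

```powershell
# Hypothetical stand-in for $xml.log.logentry.msg (one string per <msg> element).
$messages = "Summary 1`n`n* Note 1", "Summary 2"

# The approach from the question: each method runs on every element of the array.
$newMsgs = $messages.Replace("`n`n", "`n").Trim()

# Still an array with one entry per message, not one string.
$count = $newMsgs.Count

# Converting the array to a string joins the entries with a space (the default $OFS).
$glued = "$newMsgs"

# Joining explicitly with a newline keeps each message on its own line.
$joined = $newMsgs -join "`n"
$joined
```

Note also that `Replace("`n`n", "`n")` makes a single pass, so three or more consecutive newlines still leave an empty line behind, and lines containing only spaces are not removed at all. The split, trim and filter pipeline above avoids both cases.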

Upvotes: 2
