Reputation: 97
I have a file named collection.xml which contains a list of movies in my Easter movie collection, shown below. Once I get these lines, I'm trying to use Split to split them on the strings "Movies/" and "</Path>". This would result in movie names like:
The Easter Bunny is Coming to Town (2006).mp4
I've been trying various permutations of Split() and the -split operator. How can I split the output below to get just the movie names as shown above?
Get-Content .\collection.xml | Select-String Path
<Path>/volume1/Media Library/Movies/Here Comes Peter Cottontail (1971).mp4</Path>
<Path>/volume1/Media Library/Movies/Here Comes Peter Cottontail - The Movie (2005).mp4</Path>
<Path>/volume1/Media Library/Movies/The Easter Bunny is Coming to Town (2006).mp4</Path>
<Path>/volume1/Media Library/Movies/Its The Easter Beagle Charlie Brown (2008).mp4</Path>
<Path>/volume1/Media Library/Movies/Hop (2011).mp4</Path>
<Path>/volume1/Media Library/Movies/Peter Rabbit (2018).mp4</Path>
Upvotes: 1
Views: 105
Reputation: 31296
It's straightforward if you treat this as an XML file full of filenames. You can do it in a single line; I've broken it across several lines for ease of reading:
Option 1:
# Load the file as XML, select every Path element, and strip directory and extension
([xml](Get-Content temp.txt)).SelectNodes("//Path") | ForEach-Object {
    [IO.Path]::GetFileNameWithoutExtension($_.'#text')
}
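This effectively loads the file as an XML document, selects every Path element, and returns each element's file name without its directory or extension. Tune the XPath to suit your XML file.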
Option 2:
Pretty much the same, but using more native XML cmdlets, which may make easier reading:
(Select-Xml -XPath '//Path' -Path .\temp.txt).Node | ForEach-Object {
    [IO.Path]::GetFileNameWithoutExtension($_.'#text')
}
Again, tune the XPath to suit your XML file.
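As a rough illustration of what tuning the XPath can mean, here is a minimal sketch. The Collection, Movie and Poster element names are made up for the example, since the question doesn't show the full layout of collection.xml; the sketch also uses Select-Xml's -Content parameter with an inline string so it runs without a file.

```powershell
# Hypothetical sample: the real collection.xml layout is not shown in the question.
$sample = @'
<Collection>
  <Movie>
    <Path>/volume1/Media Library/Movies/Hop (2011).mp4</Path>
  </Movie>
  <Poster>
    <Path>/volume1/Media Library/Posters/Hop (2011).jpg</Path>
  </Poster>
</Collection>
'@

# '//Path' matches every Path element, wherever it sits in the document.
$allPaths = @((Select-Xml -Content $sample -XPath '//Path').Node.'#text')

# '//Movie/Path' matches only Path elements that are direct children of a Movie element.
$moviePaths = @((Select-Xml -Content $sample -XPath '//Movie/Path').Node.'#text')

$moviePaths
```

If your file only ever contains movie paths, '//Path' is probably all you need.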
There are various ways to structure both of these to your taste (and your exact XML format) by moving the ".Node" and ".'#text'" selectors inside (or outside) the foreach; for example, we can remove the brackets around select-xml in the line above by shifting Node within the foreach:
Select-Xml -XPath '//Path' -Path .\temp.txt | ForEach-Object {
    [IO.Path]::GetFileNameWithoutExtension($_.Node.'#text')
}
...and variations on a theme. Your XML file structure can have a bearing on this; anything else is personal preference and readability.
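One thing to watch: GetFileNameWithoutExtension drops the ".mp4", whereas the question asks for names that keep it. A minimal sketch of the same idea using [IO.Path]::GetFileName instead follows; the Collection root element is an assumption made only so the sample is well-formed XML, and the two paths are taken from the question.

```powershell
# The <Collection> root is hypothetical; the two paths come from the question.
$sample = @'
<Collection>
  <Path>/volume1/Media Library/Movies/Hop (2011).mp4</Path>
  <Path>/volume1/Media Library/Movies/Peter Rabbit (2018).mp4</Path>
</Collection>
'@

# GetFileName strips the directory part but leaves the extension in place.
$names = ([xml]$sample).SelectNodes('//Path') | ForEach-Object {
    [IO.Path]::GetFileName($_.'#text')
}

$names
```

Against the real file you would swap the inline string for Get-Content, as in Option 1.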
Upvotes: 1
Reputation: 801
Since I don't know what your full .xml file looks like, I made this with just the info you gave, and some simple regex.
I'm making a couple assumptions here
$banana = Get-Content C:\Temp\collection.xml | Select-String Path
foreach ($line in $banana)
{
    # Load the line as an XML object, expand the Path property, and remove the characters we don't want.
    ([xml]$line).Path -replace "^\/.+Movies\/|\..+$"
}
Those hieroglyphics after the -replace mean this:
^ : Start of the line
\/ : Literal / character
. : Any character (except line terminators)
+ : At least one, but up to infinite
Movies : The literal string "Movies"
\/ : Literal / character
| : Or
\. : Literal . (period) character
.+ : . and + combined, meaning any character at least once, but up to infinite
$ : End of the line
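One caveat with the pattern above: \..+$ starts matching at the first period in the string, so a title that itself contains a period would be cut short. The sketch below shows this with a made-up title (it is not in the question's collection), a slightly stricter variant, and the -split form the question asked about, which keeps the ".mp4". Treat it as an illustration under those assumptions rather than a definitive fix.

```powershell
# Hypothetical title containing a period; it is not in the original collection.
$path = '/volume1/Media Library/Movies/Mr. Example (2020).mp4'

# Pattern from the answer: everything from the first period onward is removed.
$original = $path -replace '^\/.+Movies\/|\..+$'

# Variant: [^.]+ cannot cross a period, so only the final extension is removed.
$stricter = $path -replace '^\/.+Movies\/|\.[^.]+$'

# The -split form from the question, applied to a raw line; this keeps ".mp4".
$line = '<Path>/volume1/Media Library/Movies/Hop (2011).mp4</Path>'
$withExtension = ($line -split 'Movies/|</Path>')[1]

$original        # Mr
$stricter        # Mr. Example (2020)
$withExtension   # Hop (2011).mp4
```

None of the six titles in the question contains a period, so the original pattern works for that list as posted.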
Upvotes: 0