subject_x

Reputation: 421

How do you use PowerShell to extract EPUB metadata (XML)?

I'm not new to PowerShell, but I am new to XML parsing. I want to extract the title, creator, and publisher from the OPF file, which is just an XML file. The book below is Moby-Dick from Google's EPUB 3 sample collection.

<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" xml:lang="en" unique-identifier="pub-id" prefix="cc: http://creativecommons.org/ns#">
    <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title id="title">Moby-Dick</dc:title>
        <meta refines="#title" property="title-type">main</meta>
        <dc:creator id="creator">Herman Melville</dc:creator>
        <meta refines="#creator" property="file-as">MELVILLE, HERMAN</meta>
        <meta refines="#creator" property="role" scheme="marc:relators">aut</meta>
        <dc:identifier id="pub-id">code.google.com.epub-samples.moby-dick-basic</dc:identifier>
        <dc:language>en-US</dc:language>
        <meta property="dcterms:modified">2012-01-18T12:47:00Z</meta>
        <dc:publisher>Harper &amp; Brothers, Publishers</dc:publisher>
        <dc:contributor id="contrib1">Dave Cramer</dc:contributor>
        <meta refines="#contrib1" property="role" scheme="marc:relators">mrk</meta>
        <dc:rights>This work is shared with the public using the Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) license.</dc:rights>        
        <link rel="cc:license" href="http://creativecommons.org/licenses/by-sa/3.0/"/>
        <meta property="cc:attributionURL">http://code.google.com/p/epub-samples/</meta>
    </metadata>
</package>

I've tried:

[xml]$opf = gc path/to/package.opf
$opf.package.metadata

I'm only able to get the tag and attribute information with this and not the text.

Upvotes: 2

Views: 1304

Answers (1)

Magnus Lindhe

Reputation: 7327

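Elements that carry attributes, such as dc:title and dc:creator (both have an id attribute), come back as XmlElement objects, so you need the '#text' property to read their text. dc:publisher has no attributes, so PowerShell returns it directly as a string: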

[xml] $opf = gc .\moby.opf

$title = $opf.package.metadata.title.'#text'
$creator = $opf.package.metadata.creator.'#text'
$publisher = $opf.package.metadata.publisher

Write-Host "$title written by $creator and published by $publisher"
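
If you would rather not track which elements need '#text', one alternative is to query by namespace with Select-Xml and read InnerText, which behaves the same whether or not the element has attributes. This is a minimal sketch, not the only way to do it: the inline XML is a trimmed copy of the sample from the question standing in for the real file, and Get-DcValue is a hypothetical helper name introduced only for this example.

```powershell
# Trimmed inline copy of the question's OPF sample (assumption: stands in for package.opf).
[xml]$opf = @'
<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
    <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title id="title">Moby-Dick</dc:title>
        <dc:creator id="creator">Herman Melville</dc:creator>
        <dc:publisher>Harper &amp; Brothers, Publishers</dc:publisher>
    </metadata>
</package>
'@

# Map the "dc" prefix to the Dublin Core namespace declared in the OPF file.
$ns = @{ dc = 'http://purl.org/dc/elements/1.1/' }

# Hypothetical helper: return the text of the first dc:<Name> element.
function Get-DcValue([string]$Name) {
    (Select-Xml -Xml $opf -XPath "//dc:$Name" -Namespace $ns |
        Select-Object -First 1).Node.InnerText
}

$title     = Get-DcValue 'title'
$creator   = Get-DcValue 'creator'
$publisher = Get-DcValue 'publisher'

Write-Output "$title written by $creator and published by $publisher"
```

InnerText also decodes entities, so the publisher comes back with a literal & rather than &amp;amp;. To run this against a real book, replace the here-string with [xml]$opf = Get-Content path/to/package.opf.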

Upvotes: 3
