Blair

Can PyPDF access any XMP metadata?

Using pypdf 4.2.0, I would like to access XMP metadata from a PDF file. The xmp_metadata property allows this (see the pypdf documentation on Read the Docs) and exposes many standard items as properties (e.g., dc_date). However, the data exposed this way is not always complete: many metadata items that I can see in a PDF reader cannot be read through these properties.

So, my question is this: can other metadata elements be accessed in some way?

I suspect that the XmpInformation.get_element method would allow this. If so, can anyone explain how to use it, perhaps by example?
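For context, here is a minimal sketch of how I imagine get_element might be called. It assumes the signature get_element(about_uri, namespace, name) and that the call yields xml.dom.minidom nodes (attributes or elements) taken from each rdf:Description whose rdf:about matches about_uri. The helper names node_text and read_xmp_values are my own and not part of pypdf, and I have not confirmed this against every pypdf version.

```python
from typing import List
from xml.dom import minidom

# Namespace URI taken from the xmlns:prism declaration in the XMP packet.
PRISM_NS = "http://prismstandard.org/namespaces/basic/2.0/"


def node_text(node: minidom.Node) -> str:
    """Return an attribute's value, or all text found below an element node."""
    if node.nodeType == node.ATTRIBUTE_NODE:
        return node.value
    if node.nodeType in (node.TEXT_NODE, node.CDATA_SECTION_NODE):
        return node.data
    return "".join(node_text(child) for child in node.childNodes)


def read_xmp_values(pdf_path: str, namespace: str, name: str) -> List[str]:
    """Hypothetical helper: collect the text of every matching XMP item.

    Assumes XmpInformation.get_element(about_uri, namespace, name) yields
    minidom nodes; "" is used as about_uri because the packet uses rdf:about="".
    """
    from pypdf import PdfReader  # imported here so node_text works without pypdf

    xmp = PdfReader(pdf_path).xmp_metadata
    if xmp is None:  # the file carries no XMP packet
        return []
    return [node_text(n).strip() for n in xmp.get_element("", namespace, name)]
```

If the assumptions hold, read_xmp_values("paper.pdf", PRISM_NS, "doi") would be the call I am after, where "paper.pdf" is a placeholder file name.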

If pypdf cannot access other metadata elements, which other Python packages should I consider? EDIT: see this answer.

Additional information

As an example, here is the XMP metadata embedded in the PDF file of a scientific paper published in the IOP Publishing journal Metrologia.

I would just like to know if it is possible to access some of this information (e.g., prism:doi) with the help of pypdf.

<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 5.2-c003 61.141987, 2011/02/22-12:03:51">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:pdfx="http://ns.adobe.com/pdfx/1.3/"
 xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/" 
xmlns:xap="http://ns.adobe.com/xap/1.0/" 
xmlns:xapRights="http://ns.adobe.com/xap/1.0/rights/" 
xmlns:dc="http://purl.org/dc/elements/1.1/" 
xmlns:dcterms="http://purl.org/dc/terms/" 
xmlns:pdf="http://ns.adobe.com/pdf/1.3/" 
xmlns:prism="http://prismstandard.org/namespaces/basic/2.0/" 
xmlns:fr="http://www.crossref.org/fundref.xsd" 
xmlns="http://www.crossref.org/schema/4.3.3"
xmlns:crossmark="http://crossref.org/crossmark/1.0/">
    <rdf:Description rdf:about=""
       xmlns:pdfx="http://ns.adobe.com/pdfx/1.3/">
    <pdfx:doi>10.1088/0026-1394/52/4/613</pdfx:doi>
    <pdfx:robots>noindex</pdfx:robots>
    <pdfx:CrossMarkMajorVersionDate>2015-8-3</pdfx:CrossMarkMajorVersionDate>
    <pdfx:CrossmarkDomainExclusive>true</pdfx:CrossmarkDomainExclusive>
    <pdfx:CrossMarkDomains><rdf:Seq><rdf:li>iop.org</rdf:li></rdf:Seq></pdfx:CrossMarkDomains>
     </rdf:Description>
    <rdf:Description rdf:about=""
       xmlns:xap="http://ns.adobe.com/xap/1.0/">
    <xap:CreatorTool>IOPP</xap:CreatorTool> 
    </rdf:Description>
    <rdf:Description rdf:about=""
       xmlns:xapRights="http://ns.adobe.com/xap/1.0/rights/">
    <xapRights:Marked>True</xapRights:Marked> 
    </rdf:Description>
      <rdf:Description rdf:about=""
            xmlns:dc="http://purl.org/dc/elements/1.1/">
         <dc:format>application/pdf</dc:format>
         <dc:title>
            <rdf:Alt>
               <rdf:li xml:lang="x-default">Comment on &#x2018;Dimensionless units in the SI&#x2019;</rdf:li>
            </rdf:Alt>
         </dc:title>
         <dc:creator>
            <rdf:Seq><rdf:li>B P Leonard</rdf:li>
</rdf:Seq>
         </dc:creator>
         <dc:publisher>
            <rdf:Bag>
               <rdf:li>IOP Publishing</rdf:li>
            </rdf:Bag>
         </dc:publisher>
     <dc:identifier>doi:10.1088/0026-1394/52/4/613</dc:identifier>
     <dc:description>Metrologia, 52 (2015) 613. doi: 10.1088/0026-1394/52/4/613</dc:description>   
      </rdf:Description>
      <rdf:Description rdf:about=""
            xmlns:prism="http://prismstandard.org/namespaces/basic/2.0/">
         <prism:aggregationType>journal</prism:aggregationType>
         <prism:publicationName>Metrologia</prism:publicationName>
         <prism:copyright>&#x00A9; 2015 BIPM &amp; IOP Publishing Ltd</prism:copyright>
     <prism:issn>0026-1394</prism:issn>
         <prism:startingPage>613</prism:startingPage> 
         <prism:endingPage>616</prism:endingPage>
         <prism:pageRange>613</prism:pageRange> 
         <prism:doi>10.1088/0026-1394/52/4/613</prism:doi> 
         <prism:url>http://dx.doi.org/10.1088/0026-1394/52/4/613</prism:url>
        </rdf:Description>
    <rdf:Description rdf:about=""
       xmlns:crossmark="http://crossmark.crossref.org">
         <crossmark:MajorVersionDate>2015-8-3</crossmark:MajorVersionDate>
         <crossmark:CrossmarkDomainExclusive>true</crossmark:CrossmarkDomainExclusive>
         <crossmark:DOI>10.1088/0026-1394/52/4/613</crossmark:DOI>
         <crossmark:CrossMarkDomains><rdf:Seq><rdf:li>iop.org</rdf:li></rdf:Seq></crossmark:CrossMarkDomains>
     </rdf:Description>
   </rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>
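To make clear what I am trying to get out of a packet like this, here is a rough sketch that parses the XML directly with the standard library. It does not use pypdf at all. It assumes the raw packet is already available as a string (I believe it may be reachable through the xmp_metadata object's underlying stream, but that is an assumption), and the function name xmp_simple_properties is my own. Only simple text properties are collected; containers such as rdf:Seq or rdf:Alt are skipped.

```python
import xml.etree.ElementTree as ET
from typing import Dict

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PRISM_NS = "http://prismstandard.org/namespaces/basic/2.0/"


def xmp_simple_properties(packet: str) -> Dict[str, str]:
    """Map '{namespace-uri}name' to text for each simple XMP property.

    Children of rdf:Description that themselves contain child elements
    (rdf:Seq, rdf:Alt, rdf:Bag containers) are skipped in this sketch.
    """
    root = ET.fromstring(packet)
    props: Dict[str, str] = {}
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        for child in desc:
            if len(child) == 0 and child.text and child.text.strip():
                props[child.tag] = child.text.strip()
    return props


# Trimmed copy of the packet above, reduced to two PRISM items.
SAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
        xmlns:prism="http://prismstandard.org/namespaces/basic/2.0/">
   <prism:issn>0026-1394</prism:issn>
   <prism:doi>10.1088/0026-1394/52/4/613</prism:doi>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""

props = xmp_simple_properties(SAMPLE)
print(props[f"{{{PRISM_NS}}}doi"])
```

Keys are built from the namespace URI rather than the prefix, so the lookup does not depend on which prefix the producer chose.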
