Reputation: 397
What is the proper way to index PDF files? I would like to add semantic information to them so that search engines can present the files more accurately and precisely (for example, a particular image or passage of text inside the PDF file). I am thinking about using ontologies that search engines already understand, such as Schema.org.
Upvotes: 5
Views: 908
Reputation: 4603
How about using schema.org to link to the PDF file from a web page like this:
<div itemscope itemtype="http://schema.org/Article">
  <img itemprop="thumbnailUrl" src="http://www.example.com/how_to_build_a_web_app.jpg"/>
  <a itemprop="url" href="http://www.example.com/how_to_build_a_web_app.pdf">
    <span itemprop="name">How to Build a Web App</span></a>
  by <span itemprop="author">John Smith</span>
  <div itemprop="description">This short e-book explains what a web application
  is and how to build one.</div>
</div>
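This lets you associate a title, author, thumbnail image and textual description with the article in the PDF. Note that the markup lives on the linking web page, not inside the PDF file itself.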
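If you have many PDFs, the same properties could also be generated rather than hand-written. The following is only a rough sketch in Python: it emits the equivalent description as JSON-LD, another syntax for schema.org data, instead of microdata. The function names `pdf_article_jsonld` and `to_script_tag` are made up for this illustration, and the values are the example ones from the markup above.

```python
import json


def pdf_article_jsonld(name, author, description, pdf_url, thumbnail_url):
    """Build a schema.org Article description of a PDF as a JSON-LD dict.

    Hypothetical helper: it mirrors the itemprop values used in the
    microdata example (name, author, description, url, thumbnailUrl).
    """
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "name": name,
        "author": {"@type": "Person", "name": author},
        "description": description,
        "url": pdf_url,
        "thumbnailUrl": thumbnail_url,
    }


def to_script_tag(data):
    """Wrap the JSON-LD dict in a script element for the linking web page."""
    # Escape "</" so a stray "</script>" inside a value cannot end the element.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


article = pdf_article_jsonld(
    name="How to Build a Web App",
    author="John Smith",
    description="This short e-book explains what a web application "
                "is and how to build one.",
    pdf_url="http://www.example.com/how_to_build_a_web_app.pdf",
    thumbnail_url="http://www.example.com/how_to_build_a_web_app.jpg",
)
print(to_script_tag(article))
```

The printed script element would go in the HTML of the page that links to the PDF, in place of (or alongside) the microdata attributes.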
Upvotes: 3