Reputation: 643
Let's say I have a list of ISBN values:
9781887902694
9780072227109
9780672323843
9780782121797
9781565924031
9780735713338
9780735713338
...
How would I use shell scripting/bash to retrieve the Title, Date Published, Author, and Publisher (from a website like bookfinder4u.com)? I'm newish to bash and so I'm not sure how to proceed.
Upvotes: 1
Views: 773
Reputation: 98861
If you're able to run php, you can use:
bookDetails.php
<?php
error_reporting(E_ALL);
ini_set('display_errors', 1);
// You may want to uncomment the lines below to make sure the script doesn't time out.
//set_time_limit(0);
//ignore_user_abort(true);
// Collect HTML parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
// Read all the ISBNs from a text file, one per line, skipping empty lines.
$isbns = file("isbn.txt", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
// Loop over all the ISBNs.
foreach ($isbns as $isbn) {
    $html = file_get_contents("http://www.bookfinder4u.com/IsbnSearch.aspx?isbn=" . urlencode(trim($isbn)));
    if ($html === false) {
        // Skip this ISBN if the page could not be downloaded.
        continue;
    }
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    $xpath = new DOMXPath($dom);
    // Title and author are the 2nd and 3rd elements with class "t9".
    $t9 = $xpath->query('//*[@class="t9"]');
    $title = $t9->item(1)->nodeValue;
    $author = $t9->item(2)->nodeValue;
    // c14n() serializes each <br> as <br></br>, so split on the closing tag.
    $publisherFormat = $xpath->query('//*[@id="format_pub_listprice"]')->item(0)->c14n();
    $matches = preg_split('%</br>%', $publisherFormat);
    $publisher = strip_tags($matches[0]);
    $format = strip_tags($matches[1]);
    // Pull the list price (currency and amount) out of the price cell.
    $price = $xpath->query('//*[@class="t8"]')->item(1)->nodeValue;
    preg_match_all('/List price:\s*?(.*?[\d\.]+)/', $price, $priceMatches, PREG_PATTERN_ORDER);
    $price = $priceMatches[1][0];
    echo $title."\n";
    echo $author."\n";
    echo $publisher."\n";
    echo $format."\n";
    echo $price."\n\n";
}
Assuming that isbn.txt contains:
9781887902694
9780072227109
9780672323843
The output will be:
Javascript: Concepts & Techniques; Programming Interactive Web Sites
By: Tina Spain McDuffie
Publisher: Franklin Beedle & Assoc - 2003-01
Format: Paperback
EUR 48.32

J2ME: The Complete Reference
By: James Keogh
Publisher: McGraw-Hill - 2003-02-27
Format: Paperback
EUR 57.94

Sams Teach Yourself J2ee in 21 Days with CDROM (Sams Teach Yourself...in 21 Days)
By: Martin Bond Dan Haywood Peter Roxburgh
Publisher: Sams - 2002-04
Format: Paperback
EUR 43.92
Run it from the shell:
php bookDetails.php
Upvotes: 0
Reputation: 36
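#!/bin/bash
# Require an ISBN as the first argument.
if [ -z "$1" ] ; then echo "Usage: $0 <ISBN number>" >&2 ; exit 1 ; fi
# -s: silent, -L: follow redirects; the ISBN stays inside the quotes so it cannot be word-split.
curl -sL "http://www.bookfinder4u.com/IsbnSearch.aspx?isbn=${1}&mode=direct"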
That will get you the page, but parsing the response with grep and sed looks like it would be really messy. If you know of an API that returns JSON or XML, that would be easier.
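Since the question starts from a whole list of ISBNs, the single-ISBN script above could be wrapped in a loop. The following is only a rough sketch: the function names (isbn_url, unique_isbns, fetch_all), the one-file-per-ISBN output layout, and the one-second pause are my own assumptions, not anything the site requires. It only downloads the pages; extracting the title, author, and publisher is still a separate step.

```shell
#!/bin/bash
# Hypothetical helper: build the same lookup URL that the script above uses.
isbn_url() {
    printf 'http://www.bookfinder4u.com/IsbnSearch.aspx?isbn=%s&mode=direct\n' "$1"
}

# Print the ISBNs from a file (one per line), dropping carriage returns,
# blank lines, and repeated values while keeping first-seen order.
unique_isbns() {
    tr -d '\r' < "$1" | awk 'NF && !seen[$1]++ { print $1 }'
}

# Download the result page for every ISBN in the list file ($1)
# into the output directory ($2) as <isbn>.html.
fetch_all() {
    mkdir -p "$2"
    unique_isbns "$1" | while IFS= read -r isbn; do
        curl -sL "$(isbn_url "$isbn")" -o "$2/$isbn.html"
        sleep 1   # assumed pause between requests, to go easy on the site
    done
}
```

For example, `fetch_all isbn.txt pages` would save one HTML file per distinct ISBN under pages/, so a duplicate entry in the list is fetched only once.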
Upvotes: 1