Reputation: 822
I am scrapping a website for news and saving that in a database, however this 'job' or function will be running every hour and therefore to reduce number of records I keep it down to 20 records, now of course I don't want to keep old news therefore I want to either insert new query and delete previous one or update existing one. Number of records is always 20, but how will that be done? Surely, if I do insert+delete, id number will change each time. However if I do an update, how will I tell it to update first 1-20 id's.
Here's how it looks like in a database:
And here's code so far:
function oldham_chronicle(){
$client = new Client();
$crawler = $client->request('GET', 'http://www.oldham-chronicle.co.uk/news-features');
$crawler->filter('div[id=content]>.teaser-50')->each(function ($node) {
$test = $node->filter('.plain')->text();
$test2 = $node->filter('.dateonline')->text();
$news = new News();
$news->title = $test;
$news->datepublished = $test2;
$news->save();
});
return view('chronicle');
Upvotes: 1
Views: 3259
Reputation: 314
I recommend you to update, you can try with this
function oldham_chronicle(){
$client = new Client();
$crawler = $client->request('GET', 'http://www.oldham-chronicle.co.uk/news-features');
$crawler->filter('div[id=content]>.teaser-50')->each(function ($node, $key) {
$test = $node->filter('.plain')->text();
$test2 = $node->filter('.dateonline')->text();
$id = $key + 1;
$news = News::where('id', $id)->first();
// if news is null
if (!$news) {
$news = new News();
}
$news->title = $test;
$news->datepublished = $test2;
$news->save();
});
return view('chronicle');
I haven't test my code, because I don't have the data.
Upvotes: 3