Reputation: 436
I've been using the Batch API successfully to do processing that would normally lead to PHP timeouts or out of memory errors, and it's been working nicely.
I've looked through the code a little, but I'm still unclear about what's happening behind the scenes.
Could someone familiar with the process describe how it works?
Upvotes: 4
Views: 1266
Reputation: 29679
To avoid PHP timeouts, the browser periodically requests, through AJAX, the URL (http://example.com/batch?id=$id) that causes the batch operations to be executed.
See _batch_page(), which is the function called by system_batch_page(), the menu callback for the "batch" path.
function _batch_page() {
  $batch = &batch_get();
  // Retrieve the current state of batch from db.
  if (isset($_REQUEST['id']) && $data = db_result(db_query("SELECT batch FROM {batch} WHERE bid = %d AND token = '%s'", $_REQUEST['id'], drupal_get_token($_REQUEST['id'])))) {
    $batch = unserialize($data);
  }
  else {
    return FALSE;
  }
  // Register database update for end of processing.
  register_shutdown_function('_batch_shutdown');
  $op = isset($_REQUEST['op']) ? $_REQUEST['op'] : '';
  $output = NULL;
  switch ($op) {
    case 'start':
      $output = _batch_start();
      break;
    case 'do':
      // JS-version AJAX callback.
      _batch_do();
      break;
    case 'do_nojs':
      // Non-JS progress page.
      $output = _batch_progress_page_nojs();
      break;
    case 'finished':
      $output = _batch_finished();
      break;
  }
  return $output;
}
In _batch_progress_page_nojs(), you will notice the following code.
$url = url($batch['url'], array('query' => array('id' => $batch['id'], 'op' => $new_op)));
drupal_set_html_head('<meta http-equiv="Refresh" content="0; URL=' . $url . '">');
$output = theme('progress_bar', $percentage, $message);
return $output;
Setting the "Refresh" meta tag causes the browser to reload the page immediately, so each page load triggers the next round of batch processing.
Drupal 7 contains similar code; the difference is that it has been ported to use the new functions Drupal 7 provides.
// Merge required query parameters for batch processing into those provided by
// batch_set() or hook_batch_alter().
$batch['url_options']['query']['id'] = $batch['id'];
$batch['url_options']['query']['op'] = $new_op;
$url = url($batch['url'], $batch['url_options']);
$element = array(
  '#tag' => 'meta',
  '#attributes' => array(
    'http-equiv' => 'Refresh',
    'content' => '0; URL=' . $url,
  ),
);
drupal_add_html_head($element, 'batch_progress_meta_refresh');
return theme('progress_bar', array('percent' => $percentage, 'message' => $message));
When JavaScript is enabled, the code that does all the work is in the batch.js file.
/**
 * Attaches the batch behavior to progress bars.
 */
Drupal.behaviors.batch = function (context) {
  // This behavior attaches by ID, so is only valid once on a page.
  if ($('#progress.batch-processed').size()) {
    return;
  }
  $('#progress', context).addClass('batch-processed').each(function () {
    var holder = this;
    var uri = Drupal.settings.batch.uri;
    var initMessage = Drupal.settings.batch.initMessage;
    var errorMessage = Drupal.settings.batch.errorMessage;
    // Success: redirect to the summary.
    var updateCallback = function (progress, status, pb) {
      if (progress == 100) {
        pb.stopMonitoring();
        window.location = uri+'&op=finished';
      }
    };
    var errorCallback = function (pb) {
      var div = document.createElement('p');
      div.className = 'error';
      $(div).html(errorMessage);
      $(holder).prepend(div);
      $('#wait').hide();
    };
    var progress = new Drupal.progressBar('updateprogress', updateCallback, "POST", errorCallback);
    progress.setProgress(-1, initMessage);
    $(holder).append(progress.element);
    progress.startMonitoring(uri+'&op=do', 10);
  });
};
The polling of the batch URL starts with progress.startMonitoring(uri+'&op=do', 10). The batch.js file depends on the functionality exposed by Drupal.progressBar, which is defined in the progress.js file.
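As a rough model of what this polling amounts to (this is not the actual progress.js code), the sketch below keeps requesting the "do" URL until the reported percentage reaches 100, then yields the "finished" URL, as updateCallback does. The names pollUntilDone, sendRequest, and makeFakeServer are hypothetical, and the fake server completing 25% per request is an assumption made only for illustration; the real code uses asynchronous AJAX POST requests and timers.

```javascript
// Simplified, synchronous model of the polling loop; not the progress.js source.
// `sendRequest` stands in for the AJAX POST to uri + '&op=do' and must
// return an object such as { percentage: 40, message: '...' }.
function pollUntilDone(uri, sendRequest, maxRequests) {
  var requests = 0;
  while (requests < maxRequests) {
    requests++;
    var status = sendRequest(uri + '&op=do');
    if (status.percentage >= 100) {
      // Mirrors updateCallback: stop monitoring and go to the summary page.
      return { requests: requests, redirect: uri + '&op=finished' };
    }
  }
  throw new Error('Batch did not finish within ' + maxRequests + ' requests');
}

// Hypothetical server: every request completes another 25% of the work.
function makeFakeServer() {
  var done = 0;
  return function (url) {
    done = Math.min(100, done + 25);
    return { percentage: done, message: 'Processed ' + done + '%' };
  };
}

var result = pollUntilDone('/batch?id=1', makeFakeServer(), 10);
console.log(result.requests, result.redirect);
```

The point of the model is that no single request has to finish the whole batch: each one does a slice of work and reports progress, and the browser decides when to stop asking.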
Drupal 7 uses similar code, with slightly different versions of the batch.js and progress.js files.
(function ($) {
/**
 * Attaches the batch behavior to progress bars.
 */
Drupal.behaviors.batch = {
  attach: function (context, settings) {
    $('#progress', context).once('batch', function () {
      var holder = $(this);
      // Success: redirect to the summary.
      var updateCallback = function (progress, status, pb) {
        if (progress == 100) {
          pb.stopMonitoring();
          window.location = settings.batch.uri + '&op=finished';
        }
      };
      var errorCallback = function (pb) {
        holder.prepend($('<p class="error"></p>').html(settings.batch.errorMessage));
        $('#wait').hide();
      };
      var progress = new Drupal.progressBar('updateprogress', updateCallback, 'POST', errorCallback);
      progress.setProgress(-1, settings.batch.initMessage);
      holder.append(progress.element);
      progress.startMonitoring(settings.batch.uri + '&op=do', 10);
    });
  }
};
})(jQuery);
The differences are that, since Drupal 7, all the jQuery code is wrapped in (function ($) { })(jQuery); and that the jQuery Once plugin is included with Drupal 7. Drupal 7 also sets WAI-ARIA attributes for compatibility with screen readers; this also happens in the HTML added from JavaScript code, such as the following, found in the progress.js file.
// The WAI-ARIA setting aria-live="polite" will announce changes after users
// have completed their current activity and not interrupt the screen reader.
this.element = $('<div class="progress" aria-live="polite"></div>').attr('id', id);
this.element.html('<div class="bar"><div class="filled"></div></div>' +
                  '<div class="percentage"></div>' +
                  '<div class="message"> </div>');
When serving a batch page, Drupal registers _batch_shutdown() as a shutdown callback; when PHP shuts down, for example because of a timeout, the function saves the batch array to the database.
// Drupal 6.
function _batch_shutdown() {
  if ($batch = batch_get()) {
    db_query("UPDATE {batch} SET batch = '%s' WHERE bid = %d", serialize($batch), $batch['id']);
  }
}
// Drupal 7.
function _batch_shutdown() {
  if ($batch = batch_get()) {
    db_update('batch')
      ->fields(array('batch' => serialize($batch)))
      ->condition('bid', $batch['id'])
      ->execute();
  }
}
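The effect of loading the batch at the start of a request and saving it at shutdown can be sketched with a small JavaScript model. This is not Drupal code: store stands in for the {batch} table, JSON stands in for serialize()/unserialize(), and the finally block only approximates a PHP shutdown function. The names saveBatch and handleRequest, and the queue/processed fields, are hypothetical.

```javascript
// Simplified model, not Drupal code: `store` stands in for the {batch} table.
var store = new Map();

function saveBatch(batch) {
  store.set(batch.id, JSON.stringify(batch));
}

// One HTTP request: load the saved state, do a bounded amount of work,
// and always write the state back before the request ends.
function handleRequest(id, itemsPerRequest) {
  var batch = JSON.parse(store.get(id));
  try {
    for (var i = 0; i < itemsPerRequest && batch.queue.length > 0; i++) {
      batch.processed.push(batch.queue.shift());
    }
  } finally {
    // Plays the role of _batch_shutdown(): runs however the work above ends.
    saveBatch(batch);
  }
  return batch.queue.length === 0;
}

saveBatch({ id: 7, queue: ['a', 'b', 'c', 'd', 'e'], processed: [] });
var requestCount = 0;
var finished = false;
while (!finished) {
  finished = handleRequest(7, 2);
  requestCount++;
}
console.log(requestCount, JSON.parse(store.get(7)).processed);
```

Because the state is written back at the end of every request, the next request can resume from where the previous one stopped instead of starting over.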
Upvotes: 5
Reputation: 27553
From a great example implementation:
Each batch operation callback will iterate over and over until $context['finished'] is set to 1. After each pass, batch.inc will check its timer and see if it is time for a new http request, i.e. when more than 1 minute has elapsed since the last request.
An entire batch that processes very quickly might only need a single http request even if it iterates through the callback several times, while slower processes might initiate a new http request on every iteration of the callback.
This means you should set your processing up to do in each iteration only as much as you can do without a php timeout, then let batch.inc decide if it needs to make a fresh http request.
In other words: you must split up your batch of tasks into chunks (or single tasks) that won't time out. Drupal will end its current call and open a new HTTP request if it sees the PHP timeout nearing.
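The loop described in the quote can be modeled roughly as follows. This is a JavaScript sketch, not the batch.inc source: runRequest, makeOperation, the fake clock, and the 1000 ms budget are all assumptions chosen to make the behavior visible. Only context.finished and the per-operation sandbox idea correspond to what a batch operation callback works with.

```javascript
// Simplified model of the loop described above; not the batch.inc source.
// Calls `operation` repeatedly until it reports completion or the time
// budget for this request is used up. Returns the number of passes made.
function runRequest(operation, context, budgetMs, now) {
  var start = now();
  var passes = 0;
  do {
    operation(context);
    passes++;
  } while (context.finished < 1 && now() - start < budgetMs);
  return passes;
}

// Hypothetical operation: handles `chunk` items per pass out of `total`,
// and each pass "costs" `costMs` on a fake clock so the run is deterministic.
function makeOperation(total, chunk, costMs, clock) {
  return function (context) {
    if (context.sandbox.progress === undefined) {
      context.sandbox.progress = 0;
    }
    context.sandbox.progress = Math.min(total, context.sandbox.progress + chunk);
    clock.ms += costMs;
    // 1 means done; anything lower asks to be called again.
    context.finished = context.sandbox.progress / total;
  };
}

var clock = { ms: 0 };
var context = { sandbox: {}, finished: 0 };
var operation = makeOperation(35, 10, 400, clock);
var passesPerRequest = [];
while (context.finished < 1) {
  // Each call to runRequest models one HTTP request.
  passesPerRequest.push(runRequest(operation, context, 1000, function () {
    return clock.ms;
  }));
}
console.log(passesPerRequest);
```

In this model the first request fits three passes into its budget and the second request finishes the remaining items, which is why each pass should stay small: the timer is only checked between passes, never in the middle of one.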
Upvotes: 1