masu.mo

Reputation: 793

Wikipedia API: how to get the number of revisions of a page?

Does anyone know how to get the number of revisions of a Wikipedia page using the MediaWiki API? I have read the API documentation below, but can't find a relevant endpoint:
Revision API

Upvotes: 9

Views: 3652

Answers (5)

Gras Double

Reputation: 16383

As stated in Wint's answer, the best solution is probably to use the REST API.

If you have to use the usual Action API, however, the only option is to page through the revisions and count them, which will be slow on pages with long histories.

I have just written some JavaScript for this:

/* jshint esversion: 6 */
/* globals Promise, mw */

function countRevisions( pageTitle ) {
    return new Promise( function ( resolve, reject ) {
        mw.loader.using( 'mediawiki.api', function () {
            const api = new mw.Api();
            let count = 0;
            function makeRequest( apiContinue ) {
                const params = {
                    action: 'query',
                    prop: 'revisions',
                    titles: pageTitle,
                    rvprop: '', // we don't need any property
                    rvlimit: 'max', // let the API pick the highest limit this user is allowed
                    formatversion: 2,
                };
                if ( apiContinue ) {
                    Object.assign( params, apiContinue );
                }
                api.get( params ).done( function ( data ) {
                    if ( !data.query ) {
                        reject();
                        return;
                    }
                    const revisions = data.query.pages[ 0 ].revisions;
                    if ( revisions ) {
                        count += revisions.length;
                    }
                    if ( data[ 'continue' ] ) {
                        makeRequest( data[ 'continue' ] );
                    } else {
                        resolve( count );
                    }
                } ).fail( function () {
                    reject();
                } );
            }
            makeRequest();
        } );
    } );
}

countRevisions( 'Page title' ).then( function ( count ) {
    /* ... */
} );

Upvotes: 0

Wint

Reputation: 11

With the REST API provided by newer MediaWiki versions, you can use the "Get page history counts" endpoint to get the number of revisions of a page.

For example,

GET https://en.wikipedia.org/w/rest.php/v1/page/Jupiter/history/counts/edits?from=384955912&to=406217369

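This request returns a JSON response like the following. The from and to parameters restrict the count to edits between those two revisions; omit them to count every edit to the page: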

{
    "count": 110,
    "limit": false
}

Zero coding at all.
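
If you do want to call this endpoint from a script, here is a minimal Python sketch. The base URL and path come from the example request above; the function names, the User-Agent string, and the title handling (spaces to underscores, then percent-encoding) are my own assumptions, not anything the API prescribes. As far as I know, from and to are optional, and the limit field in the response indicates that the count was capped, so check it before trusting the number on very busy pages.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the example request above.
REST_BASE = "https://en.wikipedia.org/w/rest.php/v1"


def history_count_url(title, count_type="edits", from_rev=None, to_rev=None):
    """Build the "Get page history counts" URL for a page title."""
    # Assumption: titles use underscores for spaces; everything else is percent-encoded.
    encoded_title = urllib.parse.quote(title.replace(" ", "_"), safe="")
    url = f"{REST_BASE}/page/{encoded_title}/history/counts/{count_type}"
    query = {}
    if from_rev is not None:
        query["from"] = from_rev
    if to_rev is not None:
        query["to"] = to_rev
    if query:
        url += "?" + urllib.parse.urlencode(query)
    return url


def fetch_history_count(title, **kwargs):
    """Send the request and return the parsed JSON body (this makes a network call)."""
    request = urllib.request.Request(
        history_count_url(title, **kwargs),
        # Placeholder User-Agent; replace it with something that identifies your script.
        headers={"User-Agent": "revision-count-example/0.1"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Usage (not executed here because it needs network access):
#   result = fetch_history_count("Jupiter")
#   print(result["count"], result["limit"])
```

Building the URL in a separate function keeps the network call isolated, so the URL logic can be checked without hitting Wikipedia.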

Upvotes: 1

Paul

Reputation: 4168

Here is code to get the number of revisions of a page (in this case, the JSON Wikipedia page):

import requests

BASE_URL = "https://en.wikipedia.org/w/api.php"
TITLE = 'JSON'

parameters = {'action': 'query',
              'format': 'json',
              'continue': '',
              'titles': TITLE,
              'prop': 'revisions',
              'rvprop': 'ids|userid',
              'rvlimit': 'max'}

total_revisions = 0

while True:
    # Fetch one batch of revisions (at most rvlimit per request)
    wp_call = requests.get(BASE_URL, params=parameters)
    wp_call.raise_for_status()
    response = wp_call.json()

    for page in response['query']['pages'].values():
        # A missing page has no 'revisions' key, so default to an empty list
        total_revisions += len(page.get('revisions', []))

    if 'continue' in response:
        # Send the continuation values back to request the next batch
        parameters.update(response['continue'])
    else:
        break

print(parameters['titles'], total_revisions)

You can check the result here: https://en.wikipedia.org/w/index.php?title=JSON&action=info#Edit_history

(accessible from the sidebar of the corresponding Wikipedia page: Tools - Page information)

Upvotes: 4

Pedro Nunes

Reputation: 420

Retrieve the revisions and write a method to count them (the response is just XML).

MediaWiki Revisions: Example

api.php ? action=query & prop=revisions & titles=API|Main%20Page & rvprop=timestamp|user|comment|content


<api>
<query>
<pages>
  <page pageid="1191" ns="0" title="API">
    <revisions>
      <rev user="Harryboyles" timestamp="2006-10-31T05:39:01Z" comment="revert unexplained change: see talk ...">
        ...content...
      </rev>
    </revisions>
  </page>
  <page pageid="11105676" ns="0" title="Main Page">
    <revisions>
      <rev user="Ryan Postlethwaite" timestamp="2007-06-26T19:05:06Z" comment="rv - what was that for?">
        ...content...
      </rev>
    </revisions>
  </page>
</pages>
</query>
</api>
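
As a rough illustration of the counting step, here is a minimal Python sketch that counts the rev elements per page in a response shaped like the one above. It assumes the response is complete, well-formed XML; the function name and the abbreviated sample document are mine. Note that one response only holds one batch of revisions, so for a full count you would add up the results across continued queries.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the example response (comments and content shortened).
SAMPLE_RESPONSE = """<api>
<query>
<pages>
  <page pageid="1191" ns="0" title="API">
    <revisions>
      <rev user="Harryboyles" timestamp="2006-10-31T05:39:01Z" comment="...">...content...</rev>
    </revisions>
  </page>
  <page pageid="11105676" ns="0" title="Main Page">
    <revisions>
      <rev user="Ryan Postlethwaite" timestamp="2007-06-26T19:05:06Z" comment="...">...content...</rev>
    </revisions>
  </page>
</pages>
</query>
</api>"""


def count_revisions_per_page(xml_text):
    """Return a dict mapping each page title to the number of <rev> elements it contains."""
    root = ET.fromstring(xml_text)
    counts = {}
    for page in root.iter("page"):
        title = page.get("title")
        # Pages without a <revisions> element simply contribute zero.
        counts[title] = counts.get(title, 0) + len(page.findall("./revisions/rev"))
    return counts


print(count_revisions_per_page(SAMPLE_RESPONSE))
```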

Upvotes: 2

Bergi

Reputation: 665536

The only possibility is to retrieve all revisions and count them. You might need to continue the query for that.

Bug 17993 is about including a count, but is still unsolved.

Upvotes: 10
