lipid

Reputation: 251

How to use the Wikipedia API to get the page view statistics of a particular page in Wikipedia?

The stats.grok.se tool provides the pageview statistics of a particular page in Wikipedia. Is there a way to get the same information through the Wikipedia API? What does the page views counter property actually mean?

Upvotes: 22

Views: 17189

Answers (6)

Black.Lee

Reputation: 354

This question was asked 6 years ago. At that time the official site offered no such API.

That has since changed.

A simple example:

https://en.wikipedia.org/w/api.php?action=query&format=json&prop=pageviews&titles=Buckingham+Palace%7CBank+of+England%7CBritish+Museum

See the documentation:

prop=pageviews

Shows per-page pageview data (the number of daily pageviews for each of the last pvipdays days). The result format is page title (with underscores) => date (Ymd) => count.
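As a rough sketch of how the request above could be built and its result summed in Python: the helper names below are made up for illustration, and the response shape (`query` → `pages` → page → `pageviews`) is an assumption based on how `action=query` responses are usually laid out, so check it against a real response before relying on it.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_pageviews_url(titles, days=None):
    """Build a prop=pageviews query URL for one or more page titles."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "pageviews",
        # Multiple titles are joined with "|" (encoded as %7C).
        "titles": "|".join(titles),
    }
    if days is not None:
        # pvipdays is the parameter named in the documentation quoted above.
        params["pvipdays"] = days
    return API_ENDPOINT + "?" + urlencode(params)


def total_views_by_title(response):
    """Sum the daily counts per page.

    Assumes (not confirmed by this answer) that the decoded JSON looks like
    {"query": {"pages": {<id>: {"title": ..., "pageviews": {<date>: <count>}}}}}.
    Days with a null count are skipped.
    """
    pages = response.get("query", {}).get("pages", {})
    totals = {}
    for page in pages.values():
        daily = page.get("pageviews") or {}
        totals[page.get("title")] = sum(v for v in daily.values() if v is not None)
    return totals
```

Calling `build_pageviews_url(["Buckingham Palace", "Bank of England", "British Museum"])` reproduces the example URL given above.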

Upvotes: 2

Anomie

Reputation: 94794

No, there is not.

The counter property returned from prop=info would tell you how many times the page was served by the server itself. It is disabled on Wikipedia and other Wikimedia wikis because the aggressive Squid/Varnish caching means only a tiny fraction of page views would reach the actual server and affect that counter, and even then the extra database write load for updating the counter would probably be prohibitive.

The stats.grok.se tool uses anonymized logs from the cache servers to calculate page views; the raw log files are available from http://dammit.lt/wikistats. If you need an API to access the data from stats.grok.se, you should contact the operator of stats.grok.se to request one be created.


Note this was written 4 years ago, and an API has since been created (see this answer). There's not yet a way to access that via api.php, though.

Upvotes: 7

Tgr

Reputation: 28160

The Pageview API was released a few days ago: https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/{project}/{access}/{agent}/{article}/{granularity}/{start}/{end}

For example https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/all-access/all-agents/Foo/daily/20151010/20151012 will give you

{
  "items": [
    {
      "project": "en.wikipedia",
      "article": "Foo",
      "granularity": "daily",
      "timestamp": "2015101000",
      "access": "all-access",
      "agent": "all-agents",
      "views": 79
    },
    {
      "project": "en.wikipedia",
      "article": "Foo",
      "granularity": "daily",
      "timestamp": "2015101100",
      "access": "all-access",
      "agent": "all-agents",
      "views": 81
    }
  ]
}
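A minimal Python sketch of the same call, assuming the URL template and the response shape shown above. The function names and the User-Agent string are placeholders, not part of the API, and sending a descriptive User-Agent is my understanding of Wikimedia's etiquette rather than something this answer states.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def build_per_article_url(project, access, agent, article, granularity, start, end):
    """Fill in the per-article URL template from the answer above."""
    # Titles use underscores for spaces; percent-encode everything else.
    encoded_article = quote(article.replace(" ", "_"), safe="")
    return "/".join([BASE, project, access, agent, encoded_article, granularity, start, end])


def views_by_timestamp(payload):
    """Map each item's timestamp to its view count."""
    return {item["timestamp"]: item["views"] for item in payload.get("items", [])}


def fetch_json(url, user_agent="pageview-example/0.1 (placeholder contact)"):
    """Fetch and decode a JSON response (performs a network request)."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request) as response:
        return json.load(response)


# Offline usage with the two items from the response shown above.
SAMPLE = {
    "items": [
        {"timestamp": "2015101000", "views": 79},
        {"timestamp": "2015101100", "views": 81},
    ]
}
daily = views_by_timestamp(SAMPLE)
total = sum(daily.values())
```

To query the live service, pass the result of `build_per_article_url("en.wikipedia", "all-access", "all-agents", "Foo", "daily", "20151010", "20151012")` to `fetch_json`.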

Upvotes: 26

jim smith

Reputation: 2454

You can get the daily JSON for the last 30 days like this:

http://stats.grok.se/json/en/latest30/Britney_Spears

Upvotes: 3

Vipul Naik

Reputation: 155

There doesn't seem to be any API; however, you can make HTTP requests to stats.grok.se and parse the HTML or JSON result to extract the page view counts.

I created a website http://wikipediaviews.org that does exactly that in order to facilitate easier comparison for multiple pages across multiple months and years. To speed things up, and minimize the number of requests to stats.grok.se, I keep all past query results stored locally.

The code I used is available at http://github.com/vipulnaik/wikipediaviews.

The file with the actual retrieval code is in https://github.com/vipulnaik/wikipediaviews/blob/master/backend/pageviewqueries.inc

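function getpageviewsonline($page, $month, $language)
{
  // Build the stats.grok.se URL for this page, month and language.
  $url = getpageviewsurl($page,$month,$language);
  // Download the HTML of the statistics page.
  $html = file_get_contents($url);
  // Capture the token that follows "has been viewed", i.e. the view count.
  preg_match('/(?<=\bhas been viewed)\s+\K[^\s]+/',$html,$numberofpageviews);
  return $numberofpageviews[0];
}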

The code for getpageviewsurl is in https://github.com/vipulnaik/wikipediaviews/blob/master/backend/stringfunctions.inc:

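function getpageviewsurl($page,$month,$language)
{
  // stats.grok.se expects underscores instead of spaces in titles.
  $page = str_replace(" ","_",$page);
  // Percent-encode apostrophes so they survive in the URL.
  $page = str_replace("'","%27",$page);
  return "http://stats.grok.se/" . $language . "/" . $month . "/" . $page;
}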
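For readers who do not use PHP, the same scraping idea can be sketched in Python. This is only an illustration: Python's `re` module has no `\K`, so a capture group stands in for it. The function names and the sample sentence are made up. Only the phrase "has been viewed" comes from the regex above.

```python
import re

# Capture the first non-space token after "has been viewed".
VIEW_COUNT_RE = re.compile(r"\bhas been viewed\s+(\S+)")


def build_stats_url(page, month, language):
    """Mirror getpageviewsurl: underscores for spaces, %27 for apostrophes."""
    page = page.replace(" ", "_").replace("'", "%27")
    return "http://stats.grok.se/" + language + "/" + month + "/" + page


def extract_view_count(html):
    """Return the view count as a string, or None if the phrase is absent."""
    match = VIEW_COUNT_RE.search(html)
    return match.group(1) if match else None


# Hypothetical page text, used only to exercise the regex.
sample_html = "<p>Britney_Spears has been viewed 123456 times in 201301.</p>"
count = extract_view_count(sample_html)
```

Unlike the PHP version, this returns `None` when the page does not contain the expected phrase, so a layout change fails visibly instead of raising an index error.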

PS: In case the link to wikipediaviews.org doesn't work, it's because I registered the domain quite recently. Try http://wikipediaviews.subwiki.org instead in the interim.

Upvotes: 1

Aadil

Reputation: 189

You can look into the stats here. Has anyone used an API to get the pageview stats? I have also looked into the available raw data, but could not find a way to extract the pageview count from it.

Upvotes: 2
