me-esmee

Reputation: 101

How do I get a list of the links on a Wikipedia page (article), together with the number of times each link was clicked?

This is my code so far. It gives me a list of how many people viewed each page (article), but I wondered whether it is also possible to get a list of the links on a Wikipedia page (article) and how many times each of those links was clicked.

String[] articles = {"Hitler", "SOA", "Albert_Einstein"};
void setup() {
  for (int i = 0; i < articles.length; i++) {

    String article = articles[i];

    String start = "20160101"; // YYYYMMDD
    String end = "20170101"; // YYYYMMDD

    // documentation: https://wikimedia.org/api/rest_v1/?doc#!/Pageviews_data/get_metrics_pageviews_per_article_project_access_agent_article_granularity_start_end
    String query = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/all-access/all-agents/"+article+"/daily/"+start+"/"+end;

    JSONObject json = loadJSONObject(query);
    JSONArray items = json.getJSONArray("items");

    int totalviews = 0;

    for (int j = 0; j < items.size(); j++) {
        JSONObject item = items.getJSONObject(j);
        int views = item.getInt("views");
        totalviews += views;
    }

    println(article+" "+totalviews);
  }
}

Upvotes: 2

Views: 1948

Answers (2)

Ainali

Reputation: 1631

To get the links from an article, use action=query in the API together with prop=links.

In your example: https://en.wikipedia.org/w/api.php?action=query&format=json&prop=links&meta=&titles=Albert+Einstein%7CHitler%7CSOA&pllimit=500

Do note that this does not return all the results (you can only get 500 at a time), so you need to make further requests, passing the plcontinue value you received as a parameter in the next request.
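The request-and-continue loop described above could be sketched in Processing roughly like this (untested; `collectLinks` is my own helper name, and `formatversion=2` is used so that `pages` comes back as a JSON array rather than an object keyed by page ID):

```processing
void setup() {
  ArrayList<String> links = collectLinks("Albert_Einstein");
  println(links.size() + " links found");
}

ArrayList<String> collectLinks(String article) {
  ArrayList<String> links = new ArrayList<String>();
  String base = "https://en.wikipedia.org/w/api.php?action=query&format=json"
    + "&formatversion=2&prop=links&pllimit=500&titles=" + article;
  String plcontinue = null;

  while (true) {
    String query = base;
    if (plcontinue != null) {
      // The continuation token can contain pipes and spaces, so escape them.
      query += "&plcontinue=" + plcontinue.replace("|", "%7C").replace(" ", "%20");
    }
    JSONObject json = loadJSONObject(query);

    // With formatversion=2, the links sit under query -> pages[0] -> links.
    JSONObject page = json.getJSONObject("query").getJSONArray("pages").getJSONObject(0);
    if (page.hasKey("links")) {
      JSONArray linkArr = page.getJSONArray("links");
      for (int i = 0; i < linkArr.size(); i++) {
        links.add(linkArr.getJSONObject(i).getString("title"));
      }
    }

    // No "continue" block means we have received the last batch.
    if (!json.hasKey("continue")) break;
    plcontinue = json.getJSONObject("continue").getString("plcontinue");
  }
  return links;
}
```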

Upvotes: 3

Kevin Workman

Reputation: 42176

Break your problem down into smaller steps.

Create a single program that just returns all of the links on a wikipedia page. Make sure you have that program working perfectly, and post an MCVE if you get stuck.

Separately from that, create a program that takes a hardcoded URL and returns the number of views that URL has. Again, post an MCVE if you get stuck. When that works, move up to a program that takes a hardcoded ArrayList of URLs and returns the pageviews for each one.

Then, when you have both working separately, you can start thinking about combining them.
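The combined step might look roughly like this (untested sketch): wrap the pageviews request from the question in a helper and feed it each link title. Here `getLinks` stands for whatever link-fetching routine you built in the first step (it is hypothetical, not an existing function). Note that this gives the total views of each linked article, not the number of clicks on the link itself; per-link click counts would need a different data source, such as the Wikipedia clickstream dataset.

```processing
void setup() {
  for (String link : getLinks("Albert_Einstein")) {  // getLinks: your own helper
    // The REST API expects underscores instead of spaces in article titles.
    String title = link.replace(" ", "_");
    println(title + " " + fetchViews(title, "20160101", "20170101"));
  }
}

// Same pageviews query as in the question, factored into a reusable function.
int fetchViews(String article, String start, String end) {
  String query = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
    + "en.wikipedia/all-access/all-agents/" + article + "/daily/" + start + "/" + end;
  JSONObject json = loadJSONObject(query);
  JSONArray items = json.getJSONArray("items");
  int totalviews = 0;
  for (int j = 0; j < items.size(); j++) {
    totalviews += items.getJSONObject(j).getInt("views");
  }
  return totalviews;
}
```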

Upvotes: 1
