Nick Nick Nick

Reputation: 189

How to use Signature or User ID to get user's information?

I'm new to Wikimedia, and I'm using the Wiki API for my project. My dataset looks like this:

rev_id | comment | timestamp | page_id | page_title | user_id | user_text
--- | --- | --- | --- | --- | --- | ---
352194497 | Welcome to Wikipedia | 2010-03-26T18:16:48Z | 26709696 | 116.197.206.138 | 8356162 | Mlpearc

I'm trying to find some user information about these comment posters. However, I found that the "user_text" here is not the username but the signature. If I use the official API demo get_users.py to get the information, I get an error, because some signatures have a space in them, whereas usernames are all single words. In the code below, for example, I can get the information for Catrope and Bob using Catrope|Bob, but it doesn't work if I use Catrope|Tide rolls, where Tide rolls is the signature.

import requests

S = requests.Session()

URL = "https://en.wikipedia.org/w/api.php"

PARAMS = {
    "action": "query",
    "format": "json",
    "list": "users",
    "ususers": "Catrope|Tide rolls",
    "usprop": "blockinfo|groups|editcount|registration|emailable|gender"
}

R = S.get(url=URL, params=PARAMS)
DATA = R.json()

USERS = DATA["query"]["users"]

for u in USERS:
    print(str(u["name"]) + " has " + str(u["editcount"]) + " edits.")

So my question is: is there any way to get user information from the signature using the API? And since we also have page_id and user_id here, would this information be helpful? Thank you so much in advance!

Update: I originally used Bob Ben here as a fake ID; it has now been replaced by a real one. The problem was solved by using _ to replace the space. (Thanks to AXO for the reminder.)

Upvotes: 1

Views: 264

Answers (1)

AXO

Reputation: 9086

You haven't mentioned the error and traceback that you're getting. The code sample should work fine as long as the username exists, even if the username has a space in it.

But the user account "Bob Ben" is not registered. In such cases the API replies with {'name': 'Bob Ben', 'missing': ''}.

So your code could be:

for u in USERS:
    if 'missing' not in u:
        print(u["name"] + " has " + str(u["editcount"]) + " edits.")
    else:
        print(u["name"], "is not registered.")
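Since the dataset also has a user_id column, the same check can be applied when looking users up by ID. The sketch below assumes those IDs are English Wikipedia user IDs and uses the ususerids parameter of list=users; build_userid_params and describe_user are hypothetical helper names, and the 'invalid' branch assumes the API flags malformed names (such as IP addresses) with that key.

```python
def build_userid_params(user_ids):
    """Build query parameters that look users up by numeric ID."""
    return {
        "action": "query",
        "format": "json",
        "list": "users",
        # ususerids takes pipe-separated numeric IDs instead of names.
        "ususerids": "|".join(str(i) for i in user_ids),
        "usprop": "editcount|registration",
    }


def describe_user(u):
    """Return a one-line summary for one entry of DATA["query"]["users"]."""
    # An entry may carry only a name or only an ID, so fall back in order.
    label = u.get("name", u.get("userid", "unknown"))
    if "missing" in u:
        return f"{label} is not registered."
    if "invalid" in u:  # assumed flag for names the API rejects
        return f"{label} is not a valid username."
    return f"{label} has {u['editcount']} edits."
```

The dictionary from build_userid_params([8356162]) can be passed as params to the S.get call in the question, and describe_user can then be applied to each entry of the returned users list.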

BTW, if for some reason you prefer not to use a space, you may use _ (underscore) instead. A blank space is equivalent to an underscore.
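If you are feeding a whole user_text column into ususers, it may help to normalize the names first so that "Tide rolls" and "Tide_rolls" are not requested twice. This is only a sketch: batch_usernames is a hypothetical helper, and the default batch size of 50 is an assumption about the per-request limit for ordinary (non-bot) clients.

```python
def batch_usernames(names, batch_size=50):
    """Yield pipe-joined strings suitable for the ususers parameter."""
    # Treat "Tide_rolls" and "Tide rolls" as the same name and drop
    # repeats while keeping first-seen order.
    seen = {}
    for name in names:
        key = name.replace("_", " ").strip()
        if key:
            seen.setdefault(key, None)
    unique = list(seen)
    # Split into chunks so no single request exceeds the assumed limit.
    for start in range(0, len(unique), batch_size):
        yield "|".join(unique[start:start + batch_size])
```

Each yielded string can be used as the "ususers" value in PARAMS, with one request per batch.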

Regarding "user information", I'm not sure what kind of information you're looking for. According to API:Users, one may get blockinfo|groups|groupmemberships|implicitgroups|rights|editcount|registration|emailable|gender|centralids|cancreate using the usprop parameter. But if you need other information, for example what is on the user page, you'll probably need to use one of the methods mentioned in API:Get the contents of a page to fetch the contents of the user page, and then write a program to look for the information you need.
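As a rough illustration of the user-page route, one option is action=query with prop=revisions, which returns the current wikitext of the page. The helper names below are hypothetical, and the parsing assumes the formatversion=2 response layout (pages as a list, content under slots → main → content).

```python
def build_user_page_params(username):
    """Build query parameters that fetch the wikitext of User:<username>."""
    return {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "prop": "revisions",
        "titles": "User:" + username,
        "rvprop": "content",
        "rvslots": "main",
    }


def extract_wikitext(data):
    """Return the page's wikitext, or None if the page has no revisions."""
    pages = data.get("query", {}).get("pages", [])
    # A nonexistent user page comes back without a "revisions" key.
    if not pages or "revisions" not in pages[0]:
        return None
    return pages[0]["revisions"][0]["slots"]["main"]["content"]
```

The parameters can be sent with the same S.get(url=URL, params=...) call used in the question, and extract_wikitext can be applied to R.json(); what you then search for in the wikitext depends on the information you need.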

Upvotes: 2
