Saku

Reputation: 403

How to download files from Wikimedia Commons via the API?

How can I download a large number of audio (.ogg) files from Wikimedia Commons? Is it possible using the MediaWiki API?

Upvotes: 4

Views: 2580

Answers (1)

Termininja

Reputation: 7036

You can use the MediaWiki API to get the download URLs, not only for .ogg files but for any other image or media file uploaded to Wikimedia Commons. From the response you can then easily download each file. Here is an example in C#:

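using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using System.Xml.Linq;

private static void GetFiles(List<string> fileNames)
{
    // Wikimedia asks API clients to identify themselves; this value is a placeholder, replace it
    const string userAgent = "CommonsDownloader/1.0 (you@example.com)";

    // Build the API request for all file names (escaped so spaces, '&', etc. are safe)
    var titles = string.Join("|", fileNames.Select(name => "File:" + name));
    var url = "https://commons.wikimedia.org/w/api.php?action=query&format=xml" +
        "&prop=imageinfo&iiprop=url&titles=" + Uri.EscapeDataString(titles);
    var request = (HttpWebRequest)WebRequest.Create(url);
    request.UserAgent = userAgent;
    using (var webResponse = (HttpWebResponse)request.GetResponse())
    using (var reader = new StreamReader(webResponse.GetResponseStream()))
    using (var client = new WebClient())
    {
        var response = reader.ReadToEnd();

        // Get all file URLs by parsing the XML response
        var links = XElement.Parse(response).Descendants("ii")
            .Select(x => x.Attribute("url").Value);
        foreach (var link in links)
        {
            // The URL is percent-encoded, so decode its last segment to get the file name
            var fileName = Uri.UnescapeDataString(link.Substring(link.LastIndexOf('/') + 1));

            // Set the header before every download, then save the current file to disk
            client.Headers[HttpRequestHeader.UserAgent] = userAgent;
            client.DownloadFile(link, fileName);
        }
    }
}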

Usage:

//list of files to download
var fileNames = new List<string>() {
    "Flag of France.svg", "Black scorpion.jpg", "Stop.png",         //image
    "Jingle Bells.ogg", "Bach Astier 15.flac",                      //audio
    "Cable Car.webm", "Lion.ogv",                                   //video
    "Animalibrí.gif",                                               //animation
};

GetFiles(fileNames);

Note: The API limits the number of titles you can pass in a single request:

Maximum number of values is 50 (500 for bots).

So, if you need to download more files, you will have to split the list into parts and send a separate request for each part.
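
Splitting the list could look roughly like the sketch below. The class name CommonsBatching and the method SplitIntoBatches are hypothetical names introduced only for illustration, and the default batch size of 50 assumes the non-bot limit quoted above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CommonsBatching
{
    // Hypothetical helper: splits the file names into consecutive groups
    // of at most batchSize entries (50 is the assumed non-bot API limit).
    public static List<List<string>> SplitIntoBatches(
        IReadOnlyList<string> fileNames, int batchSize = 50)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batches = new List<List<string>>();
        for (var start = 0; start < fileNames.Count; start += batchSize)
        {
            // Take the next slice; the last one may be shorter than batchSize
            batches.Add(fileNames.Skip(start).Take(batchSize).ToList());
        }
        return batches;
    }
}
```

Each batch can then be passed to GetFiles in turn, for example with foreach (var batch in CommonsBatching.SplitIntoBatches(fileNames)) GetFiles(batch); so that every request stays within the limit.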

Upvotes: 7
