ozz

Reputation: 183

Import Wikipedia articles using wget or curl (on Windows)

I have a folder of Wikipedia articles (XML format).

I want to import these files through the web interface (Special:Import). Currently I do it with iMacros, but this often hangs, needs a lot of resources (memory), and can only process one file at a time. So I am looking for a better solution.

So far I have figured out that I have to log in to get an edit token, which is needed to upload the file.

I have already read this, but I got stuck. To get this running I need two wget/curl command lines:

  1. to log in and get the edit token (post user and password to the form, get the edit token back)
  2. to push the file to the form (post the edit token and the file content to the form)
  3. Building the loop to process more than one file is something I can do on my own.

Upvotes: 0

Views: 446

Answers (1)

Nemo

Reputation: 2544

First of all, let's be clear: the web interface is not the right way to do this. MediaWiki installation requirements include shell access to the server, which would allow you to use importDump.php as needed for heavier imports.

Second, if you want to import a Wikipedia article from the web interface then you shouldn't be downloading the XML directly: Special:Import can do that for you. Set

$wgImportSources = array( 'wikipedia' );

or whatever (see manual), visit Special:Import, select Wikipedia from the dropdown, enter the title to import, confirm.

Third, if you want to use the command line, then why not use the MediaWiki web API, for which plenty of clients are available? Most clients handle tokens for you.
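If you would rather not install a client, the API route can be sketched with nothing but the Python standard library. Treat the following as a rough outline, not a tested recipe: the `API_URL`, the helper names and the `interwikiprefix` value are placeholders I made up, and it assumes a MediaWiki version recent enough to hand out login and CSRF tokens via `meta=tokens` and to accept an uploaded file in the `xml` parameter of `action=import`. Check api.php on your own wiki before relying on any of it.

```python
import json
import urllib.parse
import urllib.request
import uuid
from http.cookiejar import CookieJar
from pathlib import Path

API_URL = "https://wiki.example.org/w/api.php"  # hypothetical; use your wiki's api.php


def encode_form(fields):
    """Encode plain fields as an application/x-www-form-urlencoded body."""
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return body, "application/x-www-form-urlencoded"


def encode_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        part = (f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n")
        chunks.append(part.encode("utf-8"))
    head = (f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{file_field}"; '
            f'filename="{filename}"\r\n'
            "Content-Type: application/xml\r\n\r\n")
    chunks.append(head.encode("utf-8") + file_bytes + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def post(opener, body, content_type):
    """POST a prepared body to api.php and return the decoded JSON reply."""
    request = urllib.request.Request(
        API_URL, data=body, headers={"Content-Type": content_type})
    with opener.open(request) as response:
        return json.load(response)


def fetch_token(opener, token_type):
    """Ask the API for a token of the given type ("login" or "csrf")."""
    fields = {"action": "query", "meta": "tokens",
              "type": token_type, "format": "json"}
    data = post(opener, *encode_form(fields))
    return data["query"]["tokens"][token_type + "token"]


def login(user, password):
    """Log in and return an opener that carries the session cookies."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar()))
    fields = {"action": "login", "lgname": user, "lgpassword": password,
              "lgtoken": fetch_token(opener, "login"), "format": "json"}
    result = post(opener, *encode_form(fields))["login"]["result"]
    if result != "Success":
        raise RuntimeError(f"login failed: {result}")
    return opener


def import_file(opener, path, csrf_token):
    """Upload one XML dump through action=import."""
    # "interwikiprefix" is an assumption: some versions require it for uploads.
    fields = {"action": "import", "format": "json",
              "token": csrf_token, "interwikiprefix": "wikipedia"}
    body, content_type = encode_multipart(
        fields, "xml", path.name, path.read_bytes())
    return post(opener, body, content_type)


def import_folder(folder, user, password):
    """Log in once, then import every *.xml file found in the folder."""
    opener = login(user, password)
    csrf_token = fetch_token(opener, "csrf")
    for path in sorted(Path(folder).glob("*.xml")):
        print(path.name, import_file(opener, path, csrf_token))
```

Something like `import_folder("C:/dumps", "MyUser", "my-password")` would then cover your steps 1 to 3 in one go (folder and credentials are of course made up). The cookie jar is what keeps the session alive between the login and the upload, which is the part that is easy to miss with separate wget/curl calls.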

Finally, if you really insist on using wget/curl over index.php, you can get a token by visiting api.php?action=query&meta=tokens in your browser (check api.php for the exact instructions for your MediaWiki version) and then do something like

curl -d "&action=submit&[email protected]" .../index.php?title=Special:Import

(intentionally partial code so that you don't run it without knowing what you're doing).

Upvotes: 1
