Reputation: 11
I want to fetch raw wikitext from a set of web pages, which may or may not be generated by MediaWiki.
I am trying to programmatically inspect the HTML of each page, determine whether it uses MediaWiki, and fetch the raw wikitext if it does; otherwise I skip the page. So far it seems MediaWiki pages tend to have:
a <meta> tag with name=generator and content=MediaWiki...
an alt=Powered by MediaWiki attribute
Is it a good approach to look for one of these markers and then try fetching the raw wikitext with the query parameter action=raw, or is there a better way to do this?
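For reference, the check described above could be sketched roughly like this in Python, using only the standard library. The function names `looks_like_mediawiki` and `raw_wikitext_url` are made up for illustration, and the sketch assumes the alt text sits on an `<img>` element and that the wiki's `index.php` URL is already known:

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class _MediaWikiMarkerParser(HTMLParser):
    """Scans start tags for the two markers mentioned in the question."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names; values may be None.
        a = {name: (value or "") for name, value in attrs}
        if tag == "meta":
            is_generator = a.get("name", "").lower() == "generator"
            if is_generator and a.get("content", "").startswith("MediaWiki"):
                self.found = True
        elif tag == "img" and a.get("alt", "") == "Powered by MediaWiki":
            # Assumption: the alt text belongs to an <img> element.
            self.found = True


def looks_like_mediawiki(html: str) -> bool:
    """Return True if the HTML contains either MediaWiki marker."""
    parser = _MediaWikiMarkerParser()
    parser.feed(html)
    parser.close()
    return parser.found


def raw_wikitext_url(index_php_url: str, title: str) -> str:
    """Build the action=raw URL for a page title (hypothetical helper)."""
    return f"{index_php_url}?{urlencode({'title': title, 'action': 'raw'})}"
```

Either marker counts as a match here. A site can hide or fake both, so a positive result is a hint rather than proof, and the action=raw request may still fail.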
Thanks
Upvotes: 1
Views: 42
Reputation: 134
Right. Try fetching this page:
http://<host to test>/<path, if any>/index.php?title=Special:Version
If the response mentions "MediaWiki" and "PHP" (and "Lua", on wikis that have the Scribunto extension installed), the site is powered by MediaWiki.
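A rough Python sketch of that probe, standard library only. The helper names are hypothetical, the plain substring test is just one simple way to check the response, and "Lua" is left out of the required strings on the assumption that not every wiki lists it:

```python
from urllib.request import urlopen


def special_version_url(host: str, path: str = "") -> str:
    """Build the Special:Version URL for a host and optional script path."""
    path = path.strip("/")
    prefix = f"http://{host}/{path}/" if path else f"http://{host}/"
    return prefix + "index.php?title=Special:Version"


def version_page_indicates_mediawiki(body: str) -> bool:
    """Substring check on the response body, as suggested in the answer."""
    return "MediaWiki" in body and "PHP" in body


def is_mediawiki_site(host: str, path: str = "", timeout: float = 10.0) -> bool:
    """Fetch Special:Version and apply the check; network errors count as 'no'."""
    try:
        with urlopen(special_version_url(host, path), timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except OSError:  # covers URLError, HTTPError and timeouts
        return False
    return version_page_indicates_mediawiki(body)
```

For example, `is_mediawiki_site("example.org", "w")` would request `http://example.org/w/index.php?title=Special:Version`. The script path differs between installations, so it may be necessary to try a few candidates.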
Upvotes: 0