Minions

Reputation: 5477

Alexa site rank API

Today I was working with the Alexa API to get a site's popularity rank using this code:

import urllib.request, sys, re

site = 'https://stackoverflow.com/questions/'
xml = urllib.request.urlopen('http://data.alexa.com/data?cli=10&dat=s&url=%s'%site).read()
try: rank = int(re.search(r'<POPULARITY[^>]*TEXT="(\d+)"', xml).groups()[0])
except: rank = -1
print('Your rank for %s is %d!\n' % (site, rank))

It was working perfectly, but it suddenly stopped. I checked the API link manually:

http://data.alexa.com/data?cli=10&dat=s&url=https://stackoverflow.com/questions/

and it just returns the word "Okay" rather than an XML string. What is the problem?

Upvotes: 2

Views: 7322

Answers (4)

Ernest

Reputation: 331

Alexa rank has moved and is now offered through a paid API: https://awis.alexa.com/developer-guide. That said, it is not expensive: https://aws.amazon.com/marketplace/pp/B07Q71HJ3H

Upvotes: 0

me2ulab

Reputation: 61

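This might be what you are looking for:

import urllib.request

from bs4 import BeautifulSoup

url = 'wikipedia.com'
# Fetch the mini site-info page and read the rank from the first link in the first table.
page = urllib.request.urlopen('https://www.alexa.com/minisiteinfo/' + url)
rank_str = BeautifulSoup(page, 'html.parser').table.a.get_text()
# Strip thousands separators before converting, e.g. "1,234" -> 1234.
rank_int = int(rank_str.replace(',', ''))
print(rank_int)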

Upvotes: 6

Carlos Alves Jorge

Reputation: 1985

That "Okay" means that the IP you are running the script from has been blacklisted by Alexa.

If you run it from a different IP, it will work. That said, I have no idea what request rate or limit causes an IP to be blacklisted.
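
Whatever the cause, it may help to tell the plain "Okay" body apart from a real XML reply before trying to extract a rank, so the script does not silently report -1. A minimal sketch, assuming the body is exactly the word "Okay" (possibly with surrounding whitespace); the function name classify_response and its return labels are hypothetical:

```python
def classify_response(body: bytes) -> str:
    """Label a raw response body as 'okay', 'xml', or 'unknown'."""
    # Decode leniently: we only inspect the text, we do not parse it here.
    text = body.decode("utf-8", errors="replace").strip()
    if text.lower() == "okay":
        # The placeholder reply described in the question.
        return "okay"
    if text.startswith("<"):
        # Looks like markup; hand it to an XML parser next.
        return "xml"
    return "unknown"


print(classify_response(b"Okay\n"))         # okay
print(classify_response(b"<ALEXA></ALEXA>"))  # xml
```

An "okay" result can then be logged or retried later instead of being treated as a missing rank.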

Upvotes: 1

BigGerman

Reputation: 525

That link worked fine for me when I tried it in Chrome and in Postman. Are you saying that the regex is returning "Okay"?

Also, the response from that link is XML, not JSON. Instead of parsing XML with a regex, I would suggest using Python's xml module.

Edit: I just tried your code and it worked, although I needed to convert the response to a string (it came in as a bytes-like object) before passing it to the regex.
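
As a rough sketch of the XML-parser approach: xml.etree.ElementTree accepts the bytes returned by read() directly, so no manual string conversion is needed. The only thing taken from the question is that the reply contains a POPULARITY element with a TEXT attribute; the sample document below is made up for illustration, and extract_rank is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def extract_rank(body: bytes) -> int:
    """Return the rank from the first POPULARITY element, or -1 if absent."""
    try:
        # fromstring() accepts bytes, so the raw urlopen().read() result works.
        root = ET.fromstring(body)
    except ET.ParseError:
        # Not XML at all, e.g. the plain "Okay" reply.
        return -1
    for node in root.iter("POPULARITY"):
        try:
            return int(node.get("TEXT", ""))
        except ValueError:
            continue  # TEXT missing or not a number; try the next element
    return -1


# Hypothetical sample; the real response may contain more elements.
sample = b'<ALEXA VER="0.9"><SD><POPULARITY URL="example.com/" TEXT="57"/></SD></ALEXA>'
print(extract_rank(sample))   # 57
print(extract_rank(b"Okay"))  # -1
```

Catching only ET.ParseError and ValueError, rather than using a bare except, keeps other failures such as network errors visible.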

Upvotes: 0
