Making parallel GET requests to one site in Python

I'm trying to scrape one site. The idea is simple: I make a GET request to the user page www.link/username, and depending on whether the response (HTML text) contains a certain element, I take one action or another.

But I need to check a large number of usernames (~3000) in parallel, as often as possible. I have a list of proxies (good proxies, not public ones), and I set the headers with a user-agent and referer.
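
For reference, requests expects headers as a dict and proxies as a dict keyed by scheme, not a bare string; a minimal sketch with placeholder values:

HEADERS = {
    "User-Agent": "Mozilla/5.0 ...",  # placeholder user-agent
    "Referer": "https://www.link/",   # placeholder referer
}

PROXY = {  # the dict form requests expects for one proxy
    "http": "http://user:pass@1.2.3.4:8080",
    "https": "http://user:pass@1.2.3.4:8080",
}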

What I do:

  1. Create a thread for each username; every 50 usernames share one proxy (see the sketch after this list).
  2. Each thread checks its username in a loop, sleeping for a randomly chosen interval between checks.
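
For steps 1-2, a minimal sketch of that setup, assuming a USERNAMES list, a PROXIES list with at least one proxy dict per 50 usernames, and the username_monit function from the code below:

import threading

def start_monitors(usernames, proxies, per_proxy=50):
    threads = []
    for i, username in enumerate(usernames):
        proxy = proxies[i // per_proxy]  # each block of 50 usernames shares one proxy
        t = threading.Thread(target=username_monit, args=(username, proxy), daemon=True)
        t.start()
        threads.append(t)
    return threads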

At the beginning everything is okay and I get correct responses, but after a few iterations the responses start coming back wrong and my program stops doing what it should.

Can you please help me figure out how to make that many requests at the same time using Python requests?
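
One common pattern for this with plain requests is a bounded thread pool plus one Session per proxy, so TCP/TLS connections are reused instead of reopened on every request. A minimal sketch, assuming URL, HEADERS, PROXIES (a list of proxy dicts), and USERNAMES are defined as above; note that sharing a Session across threads is common practice for simple GETs but is not officially documented as thread-safe:

from concurrent.futures import ThreadPoolExecutor

import requests

def make_session(proxy):
    s = requests.Session()
    s.headers.update(HEADERS)
    s.proxies.update(proxy)  # proxy is a {'http': ..., 'https': ...} dict
    return s

def check(session, username):
    r = session.get(URL + username, timeout=10)
    return username, "tgme_page_extra" not in r.text  # True -> seems unclaimable

sessions = [make_session(p) for p in PROXIES]

with ThreadPoolExecutor(max_workers=50) as pool:
    futures = [
        pool.submit(check, sessions[(i // 50) % len(sessions)], name)
        for i, name in enumerate(USERNAMES)
    ]
    for future in futures:
        username, unclaimable = future.result()
        # act on the result here

The max_workers value caps how many requests are in flight at once, which also makes it easier to stay under whatever rate the proxies and the target site will tolerate.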

Some code:

import random
import time

import requests

import config  # provides CHECK_INTERVAL; URL and HEADERS are module-level constants

responses = {}  # shared mapping of username -> last Response

def check_username(username, proxy=None):
    try:
        # proxy should be a requests proxies dict, e.g. {'https': 'http://host:port'}
        responses[username] = requests.get(URL + username, headers=HEADERS, proxies=proxy)
    except Exception as e:
        print(e)
        time.sleep(7)
        return  # no response to inspect this round
    if "tgme_page_extra" not in responses[username].text:  # username seems unclaimable
        pass  # action
    else:
        pass  # another action

def username_monit(username, proxy):
    while True:  # check the username, then sleep for a random interval
        check_username(username, proxy)
        time.sleep(random.choice(config.CHECK_INTERVAL))

Upvotes: 0

Views: 113

Answers (0)
