Michael T Johnson
Michael T Johnson

Reputation: 689

Loop through url with Selenium Webdriver

The below request finds the contest id's for the day. I am trying to pass that str into the driver.get url so it will go to each individual contest url and download each contests CSV. I would imagine you have to write a loop but I'm not sure what that would look like with a webdriver.

import time
from selenium import webdriver
import requests
import datetime

req = requests.get('https://www.draftkings.com/lobby/getlivecontests?sport=NBA') 
data = req.json()

for ids in data:
    contest = ids['id']

driver = webdriver.Chrome()  # Optional argument, if not specified will search path.
driver.get('https://www.draftkings.com/account/sitelogin/false?returnurl=%2Flobby');
time.sleep(2) # Let DK Load!

search_box = driver.find_element_by_name('username')
search_box.send_keys('username')
search_box2 = driver.find_element_by_name('password')
search_box2.send_keys('password')
submit_button = driver.find_element_by_xpath('//*[@id="react-mobile-home"]/section/section[2]/div[3]/button/span')
submit_button.click()
time.sleep(2) # Let Page Load, If not it will go to Account!


driver.get('https://www.draftkings.com/contest/exportfullstandingscsv/' + str(contest) + '') 

Upvotes: 2

Views: 3388

Answers (5)

mariowritescode
mariowritescode

Reputation: 764

You do not need to send load selenium for x nos of times to download x nos of files. Requests and selenium can share cookies. This means you can login to site with selenium, retrieve the login details and share them with requests or any other application. Take a moment to check out httpie, https://httpie.org/doc#sessions it seems you manually control sessions like requests does.

For requests look at: http://docs.python-requests.org/en/master/user/advanced/?highlight=sessions For selenium look at: http://selenium-python.readthedocs.io/navigating.html#cookies

Looking at the Webdriver block,you can add proxies and load the browser headless or live: Just comment the headless line and it should load the browser live, this makes debugging easy, easy to understand movements and changes to site api/html.

import time
from selenium import webdriver
from selenium.common.exceptions import WebDriverException
import requests
import datetime
import shutil



LOGIN = 'https://www.draftkings.com/account/sitelogin/false?returnurl=%2Flobby'
BASE_URL = 'https://www.draftkings.com/contest/exportfullstandingscsv/'
USER = ''
PASS = ''

try:
    data = requests.get('https://www.draftkings.com/lobby/getlivecontests?sport=NBA').json()
except BaseException as e:
    print(e)
    exit()


ids = [str(item['id']) for item in data]

# Webdriver block
driver = webdriver.Chrome()
options.add_argument('headless')
options.add_argument('window-size=800x600')
# options.add_argument('--proxy-server= IP:PORT')
# options.add_argument('--user-agent=' + USER_AGENT)

try:
    driver.get(URL)
    driver.implicitly_wait(2)
except WebDriverException:
    exit()

def login(USER, PASS)
    '''
    Login to draftkings.
    Retrieve authentication/authorization.

    http://selenium-python.readthedocs.io/waits.html#implicit-waits
    http://selenium-python.readthedocs.io/api.html#module-selenium.common.exceptions

    '''

    search_box = driver.find_element_by_name('username')
    search_box.send_keys(USER)

    search_box2 = driver.find_element_by_name('password')
    search_box2.send_keys(PASS)

    submit_button = driver.find_element_by_xpath('//*[@id="react-mobile-home"]/section/section[2]/div[3]/button/span')
    submit_button.click()

    driver.implicitly_wait(2)

    cookies = driver.get_cookies()
    return cookies


site_cookies = login(USER, PASS)

def get_csv_files(id):
    '''
    get each id and download the file.
    '''

    session = rq.session()

    for cookie in site_cookies:
        session.cookies.update(cookies)

    try:
        _data = session.get(BASE_URL + id)
        with open(id + '.csv', 'wb') as f:
            shutil.copyfileobj(data.raw, f)
    except BaseException:
        return


map(get_csv_files, ids)

Upvotes: 1

Goran
Goran

Reputation: 269

Try in following order:

import time
from selenium import webdriver
import requests
import datetime

req = requests.get('https://www.draftkings.com/lobby/getlivecontests?sport=NBA')
data = req.json()



driver = webdriver.Chrome()  # Optional argument, if not specified will search path.
driver.get('https://www.draftkings.com/account/sitelogin/false?returnurl=%2Flobby')
time.sleep(2) # Let DK Load!

search_box = driver.find_element_by_name('username')
search_box.send_keys('Pr0c3ss')
search_box2 = driver.find_element_by_name('password')
search_box2.send_keys('generic1!')
submit_button = driver.find_element_by_xpath('//*[@id="react-mobile-home"]/section/section[2]/div[3]/button/span')
submit_button.click()
time.sleep(2) # Let Page Load, If not it will go to Account!

for ids in data:
    contest = ids['id']
    driver.get('https://www.draftkings.com/contest/exportfullstandingscsv/' + str(contest) + '')

Upvotes: 2

Buaban
Buaban

Reputation: 5127

I think you can set the URL of a contest to an a element in the landing page, and then click on it. Then repeat the step with other ID.

See my code below.

req = requests.get('https://www.draftkings.com/lobby/getlivecontests?sport=NBA') 
data = req.json()
contests = []

for ids in data:
    contests.append(ids['id'])

driver = webdriver.Chrome()  # Optional argument, if not specified will search path.
driver.get('https://www.draftkings.com/account/sitelogin/false?returnurl=%2Flobby');
time.sleep(2) # Let DK Load!

search_box = driver.find_element_by_name('username')
search_box.send_keys('username')
search_box2 = driver.find_element_by_name('password')
search_box2.send_keys('password')
submit_button = driver.find_element_by_xpath('//*[@id="react-mobile-home"]/section/section[2]/div[3]/button/span')
submit_button.click()
time.sleep(2) # Let Page Load, If not it will go to Account!

for id in contests:
    element = driver.find_element_by_css_selector('a')
    script1 = "arguments[0].setAttribute('download',arguments[1]);"
    driver.execute_script(script1, element, str(id) + '.pdf')
    script2 = "arguments[0].setAttribute('href',arguments[1]);"
    driver.execute_script(script2, element, 'https://www.draftkings.com/contest/exportfullstandingscsv/' + str(id))
    time.sleep(1)
    element.click()
    time.sleep(3)

Upvotes: 0

Shoo
Shoo

Reputation: 43

May be its time to decompose it a bit.
Create few isolated functions, which are:
0. (optional) Provide authorisation to target url.
1. Collecting all needed id (first part of your code).
2. Exporting CSV for specific id (second part of your code).
3. Loop through list of id and call func #2 for each.

Share chromedriver as input argument for each of them to save driver state and auth-cookies.
Its works fine, make code clear and readable.

Upvotes: 0

Boki_LV
Boki_LV

Reputation: 21

will this help

for ids in data:
    contest = ids['id']
    driver.get('https://www.draftkings.com/contest/exportfullstandingscsv/' + str(contest) + '') 

Upvotes: 0

Related Questions