Cajuu'

Reputation: 1166

Search for a specific word across all CSS selectors on a page using XPath in Python

Suppose I have the following three links (there are more):

https://rapidevolution.clickfunnels.com/jv-page-2  
http://Listhubpro.com/jv  
http://viralautopilotfunnels.com/jv

I'd like to find a way to click the submit button after entering the name and email into the form fields on each page.

I've managed to enter the name and email on all the pages, but for some reason I am not able to click the button. Either there are several buttons, or the CSS selectors differ from page to page.

My code so far:

lista = [
    'https://rapidevolution.clickfunnels.com/jv-page-2',
    'http://Listhubpro.com/jv',
    'http://viralautopilotfunnels.com/jv',
]

for url in lista:
    not_found = False
    name_required = True
    email_required = True
    button_required = True

    driver.get(url)
    time.sleep(2)

    try:
        name_box = driver.find_element_by_xpath("//​input​[@*[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'name')]]")
        name_box.click()
        name_box.clear()
        name_box.send_keys('MyName')
    except:
        not_found = True

    try:
        email_box = driver.find_element_by_xpath("//​input​[@*[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'email')]]")
        email_box.click()
        email_box.clear()
        email_box.send_keys('[email protected]')
    except:
        not_found = True

    if not_found:
        print "here"
        for element in driver.find_elements_by_xpath("//input[@type='text']"):
            if name_required:
                try:
                    name_box = element.find_element_by_xpath(".[@*[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'name')]]")
                    name_box.click()
                    name_box.clear()
                    name_box.send_keys('MyName')
                    name_required = False
                    continue
                except:
                    pass

            if email_required:
                try:
                    email_box = element.find_element_by_xpath(".[@*[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'email')]]")
                    email_box.click()
                    email_box.clear()
                    email_box.send_keys('[email protected]')
                    email_box.send_keys(Keys.Enter)
                    email_required = False
                    break
                except:
                    pass

            if (not name_required) and (not email_required) and (not button_required):
                break

    for element1 in driver.find_element_by_xpath("//​div​[@*[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'button')]]"):
        if button_required:
            try:
                button = element1.find_element_by_xpath("//*[@type='submit']").click()
                button.click()
                button.send_keys(Keys.ENTER)
                button_required = False
                continue
            except:
                try:
                    button1 = element1.find_element_by_xpath(".[@*[contains(., 'button')]]").click()
                    button1.click()
                    button1.send_keys(Keys.ENTER)
                    button_required = False
                except:
                    pass

    time.sleep(2)
    print button_required

Upvotes: 1

Views: 83

Answers (2)

sideshowbarker

Reputation: 88166

The XPath expressions on lines 18, 26, and 62 of your code contain Unicode zero-width space (U+200B) characters. You should remove those.

If you configure your code editor to show non-printing characters, you’ll see that your code at line 18 looks like this:

name_box = driver.find_element_by_xpath("//<200b>input<200b>[@*[contains…

Here <200b> stands for a Unicode zero-width space character.

The same applies to the XPath expressions on lines 26 and 62. Because of those characters, the expressions are never going to match anything. Please remove the zero-width space characters and see if your code works the way you expect.
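As a rough illustration of the cleanup, the sketch below strips zero-width characters from an XPath string before it is handed to the driver. The helper name clean_xpath and the exact set of characters removed are assumptions made for this example; retyping the expressions by hand in your editor works just as well.

```python
# -*- coding: utf-8 -*-
import re

# Invisible characters that often sneak in when code is copied from a web page:
# zero-width space, zero-width non-joiner, zero-width joiner, word joiner, BOM.
INVISIBLE = re.compile(u"[\u200b\u200c\u200d\u2060\ufeff]")


def clean_xpath(expr):
    """Return expr with zero-width characters removed (hypothetical helper)."""
    return INVISIBLE.sub(u"", expr)


# The name-field expression from the question, with the stray U+200B
# characters written as explicit escapes so they are visible here.
dirty = (u"//\u200binput\u200b[@*[contains(translate(., "
         u"'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'name')]]")
clean = clean_xpath(dirty)
```

The cleaned string can then be passed to driver.find_element_by_xpath(clean) in place of the original literal.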

As for the documents listed in the question, your XPath expression //div[@*[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'button')]] works as expected with https://rapidevolution.clickfunnels.com/jv-page-2. It returns 4 div elements.

Upvotes: 2

Gijs

Reputation: 10891

This doesn't exactly answer your question, but if you want to sign up for the mailing list, it's probably easier to mimic the request to the server that signs you up. I looked at the first link, and an HTTP POST request is issued when the button is pressed, containing the credentials just entered. You can use the requests library to rebuild that request.

Edit: a bit more details

With the second link, I'm actually redirected to another page where I have to enter the data again. Then, after pressing the button, the browser debugger shows the following request being sent (as a curl command):

curl 'http://gopartnerpro.us11.list-manage.com/subscribe/post' -H 'Host: gopartnerpro.us11.list-manage.com' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:40.0) Gecko/20100101 Firefox/40.0' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Accept-Language: nl,en-US;q=0.7,en;q=0.3' --compressed -H 'Referer: http://gopartnerpro.us11.list-manage1.com/subscribe?u=7296d4e9339f32fccc465e451&id=2783407c84' -H 'Connection: keep-alive' --data 'u=7296d4e9339f32fccc465e451&id=2783407c84&MERGE1=Gib&MERGE0=bla%40gmail.com&b_7296d4e9339f32fccc465e451_2783407c84=&submit=Subscribe+to+list'

You can see my name is Gib and my email address is bla@gmail.com. If you replace those with the ones you want to subscribe and repeat this request, you've probably subscribed someone else. I say probably, because there are also the u and id parameters: one of them is a reference to the mailing list, but the other one probably refers to the user session. Experimentation is needed to figure out exactly what happens.

You would need to do all this tinkering for every subscription page, which may or may not be feasible. In return, you would end up with a relatively compact and robust way of subscribing though.
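A minimal sketch of rebuilding that captured request in Python might look like the following. The URL, field names, and Referer header are copied from the curl command above; the function names build_payload and subscribe are made up for this example, and whether the server accepts a replayed request with the same u and id values is untested.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode        # Python 2

SUBSCRIBE_URL = "http://gopartnerpro.us11.list-manage.com/subscribe/post"
REFERER = ("http://gopartnerpro.us11.list-manage1.com/subscribe"
           "?u=7296d4e9339f32fccc465e451&id=2783407c84")


def build_payload(name, email):
    """Form fields copied from the captured request; only name and email vary."""
    return {
        "u": "7296d4e9339f32fccc465e451",
        "id": "2783407c84",
        "MERGE1": name,
        "MERGE0": email,
        "b_7296d4e9339f32fccc465e451_2783407c84": "",
        "submit": "Subscribe to list",
    }


def subscribe(name, email):
    """Send the POST (assumes the third-party requests package is installed)."""
    import requests  # imported here so build_payload works without it
    return requests.post(
        SUBSCRIBE_URL,
        data=build_payload(name, email),
        headers={"Referer": REFERER},
        timeout=10,
    )


# Inspect the form-encoded body without sending anything.
payload = build_payload("Gib", "bla@gmail.com")
encoded = urlencode(payload)
```

Calling subscribe("MyName", some_address) would issue the actual request; comparing encoded against the --data string from the browser debugger is a cheap way to check the payload first.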

Upvotes: 0
