
python html selenium web-scraping beautifulsoup

Reputation: 49

How to find element by <span> without class or title in selenium?

I need to get the text inside some span tags, but the span tags do not have any class or title. They look like this:

<span>kirnath@me.com</span>
<span>kirnath2@me.com</span>
<span>kirnath3@me.com</span>

I have tried using:

driver.find_elements_by_xpath('//*[contains(text(), 'kirnath@me.com')]')

But I got this error:

SyntaxError: Failed to execute 'evaluate' on 'Document': The string '//*[contains(text(), kirnath@me.com)]' is not a valid XPath expression.

I need to get:

kirnath@me.com
kirnath2@me.com
kirnath3@me.com

Upvotes: 0

Answers (3)

S A

Reputation: 1985

You are using single quotes both around the whole Python string and around the text inside the XPath, so the inner quotes end the string early. Use double quotes for the text inside, or put a backslash before each inner single quote.

Try this:

driver.find_elements_by_xpath('//*[contains(text(), "kirnath@me.com")]')

driver.find_elements_by_xpath('//*[contains(text(), \'kirnath@me.com\')]')

This will only return the element with the text kirnath@me.com.
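To check the quoting without a browser, it can help to look at the strings Python actually builds. The sketch below compares the two literals above and adds a helper, `xpath_text_contains`, which is a hypothetical name of my own and not part of Selenium. It picks a quote style for the needle and falls back to XPath's `concat()` when the needle contains both quote characters, since XPath 1.0 has no escape character inside string literals.

```python
# The two Python literals from the answer; both hand Selenium an XPath
# expression whose inner string is properly quoted.
xpath_double = '//*[contains(text(), "kirnath@me.com")]'
xpath_escaped = '//*[contains(text(), \'kirnath@me.com\')]'


def xpath_text_contains(needle):
    """Hypothetical helper: build a contains(text(), ...) XPath for needle."""
    if '"' not in needle:
        # No double quote in the needle, so double quotes can wrap it.
        literal = f'"{needle}"'
    elif "'" not in needle:
        # Needle has double quotes only, so single quotes can wrap it.
        literal = f"'{needle}'"
    else:
        # Both quote characters are present: split on the double quote and
        # glue the pieces back together with concat().
        pieces = [f'"{part}"' for part in needle.split('"')]
        literal = "concat(" + ", '\"', ".join(pieces) + ")"
    return f"//*[contains(text(), {literal})]"


print(xpath_double)
print(xpath_escaped)
print(xpath_text_contains("kirnath@me.com"))
```

The result of `xpath_text_contains(...)` can be passed to the same XPath lookup as the hand-written strings.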

To find any email address you can use

driver.find_elements_by_xpath('//*[contains(text(), "@") and contains(text(), ".")]')

This will find all the elements that contain text with @ and .
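Because that XPath matches any text containing both characters (a price like "3.50 @ store" would match too), you may want to filter the returned texts afterwards. Here is a rough sketch; `EMAIL_RE` and `keep_emails` are names I made up, and the pattern is a deliberately simple approximation, not a full email validator.

```python
import re

# Rough shape check only: something@something.something with no spaces.
# This is an assumption for illustration, not an RFC-compliant pattern.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def keep_emails(texts):
    """Return the stripped strings that look like a single email address."""
    cleaned = (text.strip() for text in texts)
    return [text for text in cleaned if EMAIL_RE.fullmatch(text)]


candidates = ["kirnath@me.com", "Price: 3.50 @ store", " kirnath2@me.com ", "v1.2"]
print(keep_emails(candidates))
```

With Selenium you would pass in the `.text` of each element returned by the XPath lookup.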

Getting every span element on the page is not ideal. Even though the span tags don't have an id or class, their parent nodes might have some unique identifier.

Can you provide the page source with some levels of parent nodes?

Upvotes: 0

QHarr

Reputation: 84465

If you want all spans, grab the list of webElements and use a list comprehension to extract the .text from each into a list. If you don't want all spans, look for something that limits the match to the ones you need, for example a relationship to another element or a position in the document. You could also substring match on .text if the spans share a consistently present substring.

span_texts = [item.text for item in driver.find_elements_by_css_selector('span')]

XPath substring match:

driver.find_elements_by_xpath('//span[contains(text(), "me.com")]')
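Note that the `find_elements_by_*` helpers were deprecated in Selenium 4 and later removed, so on a current install the same lookups go through `driver.find_elements(by, value)`. A minimal sketch follows, assuming a Selenium 4 style driver. `span_texts` is a hypothetical helper name, and the locator strategies are written as plain strings (the values behind `By.CSS_SELECTOR` and `By.XPATH`) only so the snippet has no imports.

```python
# Locator strategy names; these are the string values of By.CSS_SELECTOR and
# By.XPATH from selenium.webdriver.common.by.
CSS_SELECTOR = "css selector"
XPATH = "xpath"


def span_texts(driver, substring=None):
    """Hypothetical helper: text of every <span>, or only of matching spans.

    Assumes substring, when given, contains no double-quote character.
    """
    if substring is None:
        # All spans on the page.
        elements = driver.find_elements(CSS_SELECTOR, "span")
    else:
        # Only spans whose text contains the substring.
        xpath = f'//span[contains(text(), "{substring}")]'
        elements = driver.find_elements(XPATH, xpath)
    return [element.text for element in elements]
```

In real code you would normally write `from selenium.webdriver.common.by import By` and pass `By.XPATH` instead of the raw string, then call something like `span_texts(driver, "me.com")`.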

You could also use the :contains pseudo-class, available in bs4 from 4.7.1, on the HTML from driver.page_source, and specify a substring to match on for the span tags. Newer Soup Sieve releases spell it :-soup-contains and deprecate the :contains alias.

from bs4 import BeautifulSoup as bs

soup = bs(driver.page_source, 'lxml')

data = [item.text for item in soup.select('span:contains("@me.com")')]
print(data)

Upvotes: 2

dede

Reputation: 726

Like this?

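inp = "bla <span>kirnath@me.com</span> blub"

# Locate the first opening tag and the closing tag that follows it.
p1 = inp.find("<span>")
p2 = inp.find("</span>", p1)
if p1 >= 0 and p2 > p1:
    # Slice out everything between the two tags.
    print(inp[p1 + len("<span>"):p2])

output is:

kirnath@me.com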

Edit: or like this for more matches

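inp = "bla <span>kirnath@me.com</span><span>kirnath2@me.com</span><span>kirnath3@me.com</span> blub"

def find_all(inp):
    """Return the text between every <span>...</span> pair, in order."""
    res = []
    p = 0
    while True:
        p1 = inp.find("<span>", p)
        # Search for the closing tag only after the opening tag.
        p2 = inp.find("</span>", p1)
        if p1 >= 0 and p2 > p1:
            res.append(inp[p1 + len("<span>"):p2])
            # Continue scanning after this closing tag.
            p = p2 + len("</span>")
        else:
            return res

print(find_all(inp))

output is:

['kirnath@me.com', 'kirnath2@me.com', 'kirnath3@me.com']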
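Plain string searching only works while every tag is written exactly as `<span>`; a span with an attribute or different casing would be missed. As a hedged alternative, here is a sketch using the standard library's `html.parser`. `SpanTextParser` and `span_texts_from_html` are names I made up, and the sketch assumes spans are not nested inside each other.

```python
from html.parser import HTMLParser


class SpanTextParser(HTMLParser):
    """Collect the text inside each <span> (assumes spans are not nested)."""

    def __init__(self):
        super().__init__()
        self.texts = []
        self._buffer = None  # None means "not inside a span right now"

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._buffer is not None:
            self.texts.append("".join(self._buffer).strip())
            self._buffer = None


def span_texts_from_html(html):
    """Hypothetical helper: return the text of every span in an HTML string."""
    parser = SpanTextParser()
    parser.feed(html)
    parser.close()
    return parser.texts


sample = 'bla <span class="x">kirnath@me.com</span><SPAN>kirnath2@me.com</SPAN> blub'
print(span_texts_from_html(sample))
```

With Selenium, the input could be `driver.page_source`.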

Upvotes: 0
