Reputation: 685
import requests
from lxml import etree
req = requests.get(url)
tree = etree.HTML(req.text)
Now, instead of using XPath via tree.xpath(...), I would like to know whether we can search by class name or id, as we do in BeautifulSoup:
soup.find('div',attrs={'class':'myclass'})
I'm looking for something similar in lxml.
Upvotes: 1
Views: 4497
Reputation: 386362
You say that you don't want to use xpath but don't explain why. If the goal is to search for a tag with a given class, you can do that easily with xpath.
For example, to find a div with the class "foo" you could do something like this:
tree.xpath("//div[@class='foo']")
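One caveat worth sketching: `@class='foo'` compares the whole attribute value, so it will not match an element that carries several classes. The snippet below is a minimal illustration, assuming some made-up sample markup (the HTML string and variable names are hypothetical, not from the question), of an exact match, a whole-word class match, and a lookup by id.

```python
from lxml import etree

# Hypothetical sample markup, used only for illustration.
html = """
<html><body>
  <div class="foo">exact</div>
  <div class="foo bar">multi</div>
  <div class="foobar">other</div>
  <div id="main">by id</div>
</body></html>
"""
tree = etree.HTML(html)

# Exact comparison: matches only when the whole attribute value is "foo".
exact = tree.xpath("//div[@class='foo']")

# Token match: pad the class list with spaces so "foo" matches as a whole word.
token = tree.xpath(
    "//div[contains(concat(' ', normalize-space(@class), ' '), ' foo ')]"
)

# Lookup by id is an ordinary attribute test.
by_id = tree.xpath("//div[@id='main']")

exact_texts = [d.text for d in exact]
token_texts = [d.text for d in token]
id_texts = [d.text for d in by_id]
```

The token form is closer to what BeautifulSoup's class matching does, since it also finds the div whose class attribute is "foo bar" while still skipping "foobar".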
Upvotes: 1
Reputation: 25974
The far more concise way to do that in bs4 is to use a CSS selector:
soup.select('div.myclass') # == soup.find_all('div',attrs={'class':'myclass'})
lxml provides cssselect as a module (which actually compiles XPath expressions) and as a convenience method on Element objects.
import lxml.html
tree = lxml.html.fromstring(req.text)
for div in tree.cssselect('div.myclass'):
    pass  # stuff
Or optionally you can pre-compile the expression and apply that to your Element:
from lxml.cssselect import CSSSelector
selector = CSSSelector('div.myclass')
selection = selector(tree)
Upvotes: 2