Reputation: 151
The Short:
How can I retrieve only tag names with .xpath() in Scrapy?
The Long:
I am currently using a scrapy.Spider and calling response.selector.remove_namespaces() in the parse() function to keep things simple.
I am trying to do something like this, but with Scrapy: "Iterate on XML tags and get elements' xpath in Python".
However, I can't seem to figure out how to retrieve only the names of the tags. What is the .xpath() expression to grab just the tag names?
Upvotes: 1
Views: 200
Reputation: 17291
As far as I am aware, Scrapy's Selector class has no dedicated attribute or method that returns just the tag name.
That being said, every selector has a re method (and a re_first variant), so you can apply a regular expression to the serialized element and capture the tag name.
For example:
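for selector in response.xpath("//*"):
    # Anchor at the start so only this element's own opening tag is captured,
    # whether or not the tag has attributes.
    print(selector.re_first(r'^<([\w.-]+)'))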
Upvotes: 1