Reputation: 388
I'm trying to crawl a website for a bot I'm working on. I'm not very experienced with XPath. Right now I can get some information, but the website I'm crawling has guides (guides for a video game) and I want to get the title of a guide. My code doesn't output anything for that part. Here is my code:
import requests
from lxml import html

name = input("> ")
page = requests.get("http://www.mobafire.com/league-of-legends/champions")
tree = html.fromstring(page.content)
# champ_list: list of champion names (its definition is not shown here)
for index, champ in enumerate(champ_list):
    if name == champ:
        # XPath positions are 1-based, hence index + 1
        y = tree.xpath(".//*[@id='browse-build']/a[{}]/@href".format(index + 1))
        print(y)
guide = requests.get("http://www.mobafire.com/league-of-legends/champion/ashe-13")
builds = html.fromstring(guide.content)
print(builds)
for title in builds.xpath(".//*[@id='browse-build']/table/tbody/tr[1]/td/text()"):
    print(title)
From the input, it searches a list, and from that list it extracts a link, which would go in the guide variable. From that page I want to get the title of the first guide, but it doesn't output anything. I get a status code 200, so I know the URL itself is fine. I tried nesting this:
guide = requests.get("http://www.mobafire.com/league-of-legends/champion/ashe-13")
builds = html.fromstring(guide.content)
print(builds)
for title in builds.xpath(".//*[@id='browse-build']/table/tbody/tr[1]/td/text()"):
    print(title)
inside the for loop above, but that doesn't do anything either; the program just finishes. The URLs above show the site I'm getting the information from. I don't know what the right approach would be. If there is anything else I should add, please tell me. Thanks for any help.
Upvotes: 0
Views: 52
Reputation: 171
The site has a namespace defined (xmlns="http://www.w3.org/1999/xhtml"). You have to add that namespace to these XPath expressions. For more info, see "Xml Namespace breaking my xpath!".
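To illustrate the namespace point, here is a minimal sketch. It assumes the document is parsed as XML, and it uses the standard library's xml.etree.ElementTree on a made-up XHTML fragment instead of the real page, so the markup, the prefix "x" and the href value are all hypothetical. Whether the same issue affects lxml.html, which parses the page as HTML, is not confirmed here.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Made-up, well-formed XHTML fragment standing in for the real page.
doc = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<body><div id="browse-build"><a href="/guide-1">Guide</a></div></body>'
    '</html>'
)
root = ET.fromstring(doc)

# Without a prefix, the path looks for elements in no namespace,
# so nothing matches in a namespaced XML tree.
plain = root.findall(".//div[@id='browse-build']/a")

# With a prefix mapped to the XHTML namespace, the same path matches.
ns = {"x": XHTML_NS}
prefixed = root.findall(".//x:div[@id='browse-build']/x:a", ns)

print(len(plain))
print([a.get("href") for a in prefixed])
```

With lxml, the equivalent would be passing a namespaces mapping to xpath(), for example tree.xpath("//x:div", namespaces={"x": XHTML_NS}).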
Upvotes: 1
Reputation: 21643
As noted in comments, an id must be unique. The first of these constructions works. The fact that the page's HTML doesn't actually contain a tbody might explain why the second doesn't.
>>> for item in builds.xpath(""".//table[@class='browse-table']/tr[1]/td/text()"""):
...     item
...
'Season 7 Guides'
>>> for item in builds.xpath(""".//table[@class='browse-table']/tbody/tr[1]/td/text()"""):
...     item
...
I don't know whether this provides a path to the results you want, however, since you didn't specify them.
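The tbody effect can be reproduced without the network. Browsers add a tbody element to the DOM even when the markup has none, so an XPath copied from the browser's inspector often contains a step that the downloaded source lacks. The following is a sketch only: the table markup is made up to resemble the structure discussed above, and it uses the standard library's xml.etree.ElementTree rather than lxml so that it runs without extra packages.

```python
import xml.etree.ElementTree as ET

# Made-up markup: rows sit directly under <table>, with no <tbody>.
markup = (
    "<div>"
    "<table class='browse-table'>"
    "<tr><td>Season 7 Guides</td></tr>"
    "<tr><td>Another row</td></tr>"
    "</table>"
    "</div>"
)
root = ET.fromstring(markup)

# Path that follows the actual markup: table -> tr -> td.
without_tbody = [
    td.text
    for td in root.findall(".//table[@class='browse-table']/tr[1]/td")
]

# Path that assumes a <tbody> step which the markup does not contain.
with_tbody = [
    td.text
    for td in root.findall(".//table[@class='browse-table']/tbody/tr[1]/td")
]

print(without_tbody)
print(with_tbody)
```

If a query returns nothing, checking the raw response text for the elements named in the path (for example, searching guide.text for "<tbody") is a quick way to see whether the path matches the source rather than the browser's DOM.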
Upvotes: 1