Reputation: 1179
I am using Python with Selenium (PhantomJS webdriver) to parse websites, and I have a problem with it.
I want to get the current song from this radio site: http://www.eskago.pl/radio/eska-warszawa.
xpath:
/html/body/div[3]/div[1]/section[2]/div/div/div[2]/ul/li[2]/a[2]
This xpath does not work with Python Selenium.
Error:
Traceback (most recent call last):
  File "parser4.py", line 41, in <module>
    p.loop()
  File "parser4.py", line 37, in loop
    self.eska(self.url_eskawarszawa)
  File "parser4.py", line 27, in eska
    driver.find_element_by_xpath('/html/body/div[3]/div[1]/section[2]/div/div/div[2]/ul/li[2]/a[2]')
  File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 230, in find_element_by_xpath
    return self.find_element(by=By.XPATH, value=xpath)
  File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 662, in find_element
    {'using': by, 'value': value})['value']
  File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 173, in execute
    self.error_handler.check_response(response)
  File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 164, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: u'{"errorMessage":"Unable to find element with xpath \'/html/body/div[3]/div[1]/section[2]/div/div/div[2]/ul/li[2]/a[2]\'","request":{"headers":{"Accept":"application/json","Accept-Encoding":"identity","Connection":"close","Content-Length":"148","Content-Type":"application/json;charset=UTF-8","Host":"127.0.0.1:55583","User-Agent":"Python-urllib/2.7"},"httpVersion":"1.1","method":"POST","post":"{\"using\": \"xpath\", \"sessionId\": \"e2fa7700-1bea-11e4-bd11-83e129ae286e\", \"value\": \"/html/body/div[3]/div[1]/section[2]/div/div/div[2]/ul/li[2]/a[2]\"}","url":"/element","urlParsed":{"anchor":"","query":"","file":"element","directory":"/","path":"/element","relative":"/element","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/element","queryKey":{},"chunks":["element"]},"urlOriginal":"/session/e2fa7700-1bea-11e4-bd11-83e129ae286e/element"}}' ; Screenshot: available via screen
Does anyone have an idea what is wrong with this?
Edit: thanks for the answers, I finally found a solution to my problem. The xpath was correct (but in fact fragile).
I switched to the Firefox driver and saw the problem: an ad.
I would have had to skip it first, so I decided to use another page without this ad: http://www.eskago.pl/radio
Finally, thanks to alecxe, I use this:
driver.find_element_by_xpath('//a[@class="radio-tab-button"]/span/strong').click()
element = driver.find_element_by_xpath('//p[@class="onAirStreamId_999"]/strong')
print element.text
and it works perfectly.
Upvotes: 0
Views: 1821
Reputation: 473853
The xpath you provided is very fragile, so it is no wonder you get a NoSuchElementException.
Instead, rely on the class name of the a tag that contains the currently playing song:
<a class="playlist_small" href="http://www.eskago.pl/radio/eska-warszawa?noreload=yes">
<img style="width:41px;" src="http://t-eska.cdn.smcloud.net/common/l/Q/s/lQ2009158Xvbl.jpg/ru-0-ra-45,45-n-lQ2009158Xvbl_jessie_j_bang_bang.jpg" alt="">
<strong>Jessie J, Ariana Grande, Nicki Minaj</strong>
<span>Bang Bang</span>
</a>
Here's the sample code:
element = driver.find_element_by_xpath('//a[@class="playlist_small"]/strong')
print element.text
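To illustrate why the class-based locator holds up better than a positional path, here is a minimal offline sketch. It does not use Selenium or the real page: the markup is a hypothetical, simplified stand-in (made well-formed so the standard library can parse it), and the names PAGE, PAGE_WITH_AD, POSITIONAL, BY_CLASS and artist are made up for this example. It assumes an ad is injected as an extra div at the top of the body.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup modelled on the snippet above.
# It is NOT the real page; it only mirrors the <a class="playlist_small"> part.
PAGE = """
<html><body>
  <div><ul>
    <li><a class="logo" href="#">Eska</a></li>
    <li>
      <a class="cover" href="#">cover</a>
      <a class="playlist_small" href="#">
        <strong>Jessie J, Ariana Grande, Nicki Minaj</strong>
        <span>Bang Bang</span>
      </a>
    </li>
  </ul></div>
</body></html>
"""

# The same page after an ad block is injected as the first child of <body>
# (an assumption made for the sake of the demonstration).
PAGE_WITH_AD = PAGE.replace("<body>", "<body><div class='ad'>ad</div>")

# Positional path: depends on the exact order of every ancestor element.
POSITIONAL = "./body/div[1]/ul/li[2]/a[2]/strong"
# Class-based path: depends only on the class of the link.
BY_CLASS = ".//a[@class='playlist_small']/strong"


def artist(markup, path):
    """Return the text of the first node matching path, or None if nothing matches."""
    node = ET.fromstring(markup).find(path)
    return None if node is None else node.text


print(artist(PAGE, POSITIONAL))          # matches on the original layout
print(artist(PAGE_WITH_AD, POSITIONAL))  # None: the extra div shifted the indexes
print(artist(PAGE_WITH_AD, BY_CLASS))    # still matches
```

The positional path stops matching as soon as one extra element appears, which is the same failure mode as the NoSuchElementException in the question, while the class-based path is unaffected.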
Another way to retrieve the currently playing song is to mimic the JSONP request the website makes for the playlist:
>>> import requests
>>> import json
>>> import re
>>> response = requests.get('http://static.eska.pl/m/playlist/channel-999.jsonp')
>>> json_data = re.match(r'jsonp\((.*?)\);', response.content).group(1)
>>> songs = json.loads(json_data)
>>> current_song = songs[0]
>>> [artist['name'] for artist in current_song['artists']]
[u'David Guetta', u'Showtek', u'Vassy']
>>> current_song['name']
u'Bad'
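The unwrapping step can also be tried without any network access. The sketch below is only an illustration: the SAMPLE string is hand-written to resemble the structure shown in the session above (a list of songs, each with a name and a list of artists), and unwrap_jsonp and JSONP_RE are hypothetical helper names, not part of requests or of the site's API. Unlike the session above, it uses a greedy match anchored at the end of the payload, so a ");" inside a song title would not cut the JSON short.

```python
import json
import re

# Matches callback(...) with an optional trailing semicolon.
# The greedy group runs up to the LAST closing parenthesis in the payload.
JSONP_RE = re.compile(r'^\s*\w+\((.*)\)\s*;?\s*$', re.DOTALL)


def unwrap_jsonp(text):
    """Strip the JSONP callback wrapper and return the decoded JSON value."""
    match = JSONP_RE.match(text)
    if match is None:
        raise ValueError("not a JSONP payload")
    return json.loads(match.group(1))


# Hand-written sample shaped like the response above (assumed structure).
SAMPLE = (
    'jsonp([{"name": "Bad", "artists": ['
    '{"name": "David Guetta"}, {"name": "Showtek"}, {"name": "Vassy"}]}]);'
)

songs = unwrap_jsonp(SAMPLE)
current_song = songs[0]
artists = [a["name"] for a in current_song["artists"]]

print(current_song["name"])
print(", ".join(artists))
```

With a live response, the decoded text of the HTTP body would be passed to unwrap_jsonp in place of SAMPLE.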
Upvotes: 3
Reputation: 9019
As alecxe mentioned, that xpath is going to break if there are any changes in the structure of the page.
A much simpler xpath expression that will work is this: //li[2]/a[2]
Upvotes: 1