Fuzzyma

Reputation: 8484

XPath expression returns empty output

My xidel command is the following:

xidel "https://www.iec-iab.be/nl/contactgegevens/c360afae-29a4-dd11-96ed-005056bd424d" -e '//div[@class="consulentdetail"]'

This should extract all data in the divs with class consulentdetail. Nothing special, I thought, but it won't print anything.

Can anyone help me find my mistake?

//EDIT: When I use the same expression in Firefox it finds the desired tags

Upvotes: 2

Views: 200

Answers (1)

Markus

Reputation: 3397

The site you are connecting to evidently checks the user agent string and serves a different page depending on the value it receives.

If you instruct xidel to send a user agent string that impersonates a browser, e.g. Firefox on Windows 10, your query starts to work:

> ./xidel --silent  --user-agent="Mozilla/5.0 (Windows NT 10.0; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0" "http://www.iec-iab.be/nl/contactgegevens/c360afae-29a4-dd11-96ed-005056bd424d" -e '//div[@class="consulentdetail"]'
Lidnummer11484 2 N 73
TitelAccountant, Belastingconsulent
TaalNederlands
Accountant sinds4/04/2005
Belastingconsulent sinds4/04/2005
AdresStationsstraat 2419550 HERZELE
Telefoon+32 (53) 41.97.02
Fax+32 (53) 41.97.03
AdresStationsstraat 2419550 HERZELE
Telefoon+32 (53) 41.97.02
Fax+32 (53) 41.97.03
GSM+32 (474) 29.00.67
Websitehttp://abbeloosschinkels.be
E-mail

<!--
document.write("<a href=mailto:");document.write(decrypt(unescCtrlCh("5yÿÃ^à(pñ_!13!­[îøû!13!5ãév¦Ãçj|°W"),"Iate1milrve%ster"));document.write(">");document.write(decrypt(unescCtrlCh("5yÿÃ^à(pñ_!13!­[îøû!13!5ãév¦Ãçj|°W"),"Iate1milrve%ster"));document.write("</a>");
-->

As a rule of thumb, when doing Web scraping and getting weird results:

  1. Check the page in a browser with JavaScript disabled.
  2. Send a user agent string simulating a Web browser.
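
One quick way to test the second point before touching the XPath is to compare how much data the server returns with and without a browser-like user agent. The following is only a minimal sketch: it assumes curl is installed, and `page_size` is a hypothetical helper name made up for this illustration, not part of xidel or curl.

```shell
# Hypothetical helper: print the number of bytes the server returns for a URL.
# $1 = URL, $2 = optional user agent string (sent with curl's -A option).
page_size() {
    url=$1
    ua=${2-}
    if [ -n "$ua" ]; then
        # Browser-like request: send the given user agent string.
        curl --silent --location -A "$ua" "$url" | wc -c
    else
        # Plain request: curl sends its own default user agent.
        curl --silent --location "$url" | wc -c
    fi
}
```

Calling `page_size "$url"` and then `page_size "$url" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0"` gives two byte counts. If they differ markedly, the server is probably varying its response by user agent, and passing `--user-agent` to xidel as shown above is worth trying.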

Upvotes: 1
