Reputation: 737
I have a problem with jsoup. A few months ago I deployed a WAR file with crawlers that extract data from certain websites. The crawlers worked as expected, but then they started to fail. I thought the websites had changed, but that wasn't the case.
While debugging the crawlers, I found that the data is not parsed correctly because it is in a different currency (say, I'm getting Canadian dollars instead of US dollars).
I'm not sure why this suddenly changed. I'm pretty sure that I set the user agent to get the currency for a specific country, but now it seems to be ignored.
I tried a few things to see if anything changed, such as setting Java system properties like user.country to US by default. No results.
Note: my tests run on a local server, where the data is always in US dollars; the production server is located in Australia.
I'm looking for advice on what to change to avoid this situation when creating a web scraper/web crawler.
Upvotes: 0
Views: 75
Reputation: 11712
This might be an IP address issue. You say that the production server is located in Australia. It seems likely to me that the target sites render their pages according to the originating IP address, which would result in prices being displayed in Australian dollars. To avoid this, I see no other option than using a US IP address. You can achieve this by routing your server's requests through a proxy that is located in the US.
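As a rough illustration of the proxy idea, here is a minimal sketch that uses only the JDK's `java.net.http.HttpClient` (Java 11+), so it runs without extra dependencies. The proxy host and port (`us-proxy.example.com`, `8080`) and the class and method names are placeholders I made up; you would substitute a US proxy you actually have access to. The `Accept-Language` header is an extra guess on my part: some sites take it into account, many do not, so don't rely on it alone.

```java
import java.net.InetSocketAddress;
import java.net.ProxySelector;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class ProxiedFetch {

    // Hypothetical proxy endpoint: replace with a US-based proxy you control.
    static final String PROXY_HOST = "us-proxy.example.com";
    static final int PROXY_PORT = 8080;

    // Builds a client that sends every HTTP/HTTPS request through the given proxy,
    // so the target site sees the proxy's IP address instead of the server's.
    static HttpClient buildClient(String proxyHost, int proxyPort) {
        return HttpClient.newBuilder()
                .proxy(ProxySelector.of(new InetSocketAddress(proxyHost, proxyPort)))
                .connectTimeout(Duration.ofSeconds(10))
                .build();
    }

    // Builds a GET request. The Accept-Language value is an assumption:
    // it may or may not influence the currency a given site displays.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", "Mozilla/5.0")
                .header("Accept-Language", "en-US,en;q=0.9")
                .GET()
                .build();
    }

    // Fetches the page body through the proxy; the HTML can then be handed to a parser.
    static String fetch(String url) throws Exception {
        HttpResponse<String> response = buildClient(PROXY_HOST, PROXY_PORT)
                .send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

If you prefer to stay inside jsoup, `Jsoup.connect(url).proxy(host, port)` (available since jsoup 1.9.1) should give you the same routing, and the string returned by `fetch` above can be passed to `Jsoup.parse(html, baseUri)`.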
Upvotes: 1