ddelizia

Reputation: 1571

JSOUP Unsupported charset exception

I'm using jsoup to read the following page:

http://valencia.loquo.com/cs/vivienda/piso-en-alquiler/312

Using the following code:

Document doc = Jsoup.connect("http://valencia.loquo.com/cs/vivienda/piso-en-alquiler/312").get();

and I get this error:

java.nio.charset.UnsupportedCharsetException: ISO-LATIN-1

I inspected the HTTP response headers:

Status Code: 200
Date: Sun, 23 Oct 2011 20:10:02 GMT
Content-Encoding: gzip
X-Pad: avoid browser bug
Connection: Keep-Alive
Content-Length: 13890
Server: Apache/2.2.3 (Debian)
Vary: Accept-Encoding
Content-Type: text/html; charset=iso-latin-1
Keep-Alive: timeout=5, max=100

As you can see, the response declares charset=iso-latin-1, which is probably why I get the error. I can still see the HTML response body, though. Is there any way to avoid this error and get the document (with the standard charset)?

Thanks in advance for your help

Danilo

Upvotes: 1

Views: 1267

Answers (2)

Sean Patrick Floyd

Reputation: 298838

You can always download the document without JSoup, convert the encoding programmatically (here's a link to the cookbook) and pass the converted String to JSoup.
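A minimal sketch of that approach, using only the JDK for the download and decoding steps. The class and method names (CharsetFallback, fetch, resolve, decode) are made up for illustration, and falling back to ISO-8859-1 for any label the JVM does not recognise is an assumption that happens to fit this page, not a general rule:

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical helper, not part of jsoup.
final class CharsetFallback {

    // Download the raw response body without jsoup (needs Java 9+ for readAllBytes).
    static byte[] fetch(String url) throws IOException {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            return in.readAllBytes();
        }
    }

    // Map the label from the Content-Type header to a Charset.
    // Assumption: any missing, malformed, or unknown label (such as
    // "iso-latin-1") is treated as ISO-8859-1.
    static Charset resolve(String label) {
        if (label == null) {
            return StandardCharsets.ISO_8859_1;
        }
        try {
            return Charset.forName(label.trim());
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            return StandardCharsets.ISO_8859_1;
        }
    }

    // Decode the raw bytes into a String using the resolved charset.
    static String decode(byte[] body, String label) {
        return new String(body, resolve(label));
    }
}
```

The decoded String can then be handed to jsoup, for example with Jsoup.parse(html, baseUri), so jsoup never has to look up the charset name itself.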

Upvotes: 1

Andrew Thompson

Reputation: 168825

See ISO_8859_1.

ISO Latin Alphabet No. 1, a.k.a. ISO-LATIN-1

Upvotes: 1
