Reputation: 1707
I'm trying to match a dropdown option:
Cabina Económica
against a String imported from a properties file.
I was having problems using
"//a[text()='" + cabin + "']"
and so changed it to:
final String translateFrom = "ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÜÉÈÊÀÁÂÒÓÔÙÚÛÇÅÏÕÑŒäöüéèêàáâòóôùúûçåïõñœ";
final String translateTo = "abcdefghijklmnopqrstuvwxyzaoueeeaaaooouuucaionoaoueeeaaaooouuucaiono";
"//a[translate(text(),'"+translateFrom+"','"+translateTo+"')=translate('"+cabin+"', '"+translateFrom+"', '"+translateTo+"')]";
which works perfectly when I test it in Eclipse, but fails when I run it under the Windows 7 console:
main() Terminating due to error/exception: Unable to locate element: ....)=translate('Cabina Econ├│mica'....
If I print out the dropdown option from the page, under the Windows console it shows as:
Cabina Econ≤mica
≤ seems to be ASCII F3, which matches what I see when I examine both Strings under Eclipse.
But ├│, the value being read from the properties file, whilst it is F3 under Eclipse, seems to be C3B3 under the Windows console.
F3 is the Unicode value for ó; C3B3 is its UTF-8 value.
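To make the F3 versus C3B3 distinction concrete, here is a minimal sketch that prints the code point of ó and the bytes it becomes when encoded as UTF-8. The class and method names are hypothetical, and the character is written as the escape \u00F3 so the result does not depend on how the source file itself is encoded.

```java
import java.nio.charset.StandardCharsets;

public class CodePointVsUtf8 {

    // Returns the UTF-8 encoding of s as upper-case hex digits.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String o = "\u00F3"; // the single character ó
        // The Unicode code point of the character.
        System.out.println(Integer.toHexString(o.codePointAt(0))); // prints f3
        // The two bytes that represent it in a UTF-8 file.
        System.out.println(utf8Hex(o)); // prints C3B3
    }
}
```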
Why does reading the properties file under Eclipse (via Spring) give a different result to reading it under the Windows console, and what do I need to do to make these equal?
The webpage I'm reading is defined with
<meta ... charset=utf-8>
so I assume that something (Selenium?) is translating it to utf-16 or utf-32 (where ó = x'f3') before I see it.
Whereas Spring's property file seems to be read as UTF-8 under the console but as UTF-16/32 under Eclipse.
Further research suggests this might be something to do with Spring's property file loading. I've opened a new question at:
and think it best to delete this one (unless anyone objects?)
Upvotes: 0
Views: 559
Reputation: 38821
Uncertain but possible answer with info:
Actually nothing above 7F is ASCII; a Windows console window (often inaccurately called 'DOS' prompt or window) uses the Windows 'OEM' (legacy) code page, usually 437, in which F3 is the character ≤. And the two characters ├│ are C3 B3, which you correctly identify as the UTF-8 for Unicode F3 ó. It is possible to fix the Windows console display by explicitly encoding to IBM437, but you need to do this only for the console display and not elsewhere, including not Windows files, because files use either the so-called 'ANSI' (really CP1252) single-byte code or one of several Unicode encodings (UTF-8 or UTF-16 in either endianness).
Java's default encoding for I/O (particularly but not only files) on Windows is CP1252, while on Unix it is often though not always UTF-8. Is your Eclipse on Unix? My Eclipse (Indigo) on Windows defaults CP1252 for plain Java, but I don't know if Spring does anything to override that. If it uses the default to read your file, you can set that default with system property file.encoding=utf-8.
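As a rough illustration of how the charset used for reading changes the result, the sketch below loads the same UTF-8 bytes through java.util.Properties twice, once with a UTF-8 reader and once with a single-byte reader. Everything here is an assumption for demonstration: the key name cabin, the in-memory "file", and the class and method names are hypothetical, and ISO-8859-1 stands in for a single-byte default such as CP1252 (the two agree on bytes C3 and B3). It does not show how Spring actually loads the file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class PropertiesEncodingDemo {

    // Loads the (hypothetical) "cabin" property, decoding the bytes with the given charset.
    static String loadCabin(byte[] fileBytes, Charset charset) {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(fileBytes), charset)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("cabin");
    }

    public static void main(String[] args) {
        // Simulated properties file saved as UTF-8: ó is stored as the bytes C3 B3.
        byte[] fileBytes = "cabin=Cabina Econ\u00F3mica\n".getBytes(StandardCharsets.UTF_8);

        String asUtf8 = loadCabin(fileBytes, StandardCharsets.UTF_8);
        String asSingleByte = loadCabin(fileBytes, StandardCharsets.ISO_8859_1);

        // Decoded as UTF-8, C3 B3 becomes the one character U+00F3.
        System.out.println(asUtf8.equals("Cabina Econ\u00F3mica"));              // prints true
        // Decoded one byte at a time, C3 B3 becomes the two characters U+00C3 U+00B3.
        System.out.println(asSingleByte.equals("Cabina Econ\u00C3\u00B3mica")); // prints true
    }
}
```

If the value really is being read with a single-byte default, the string handed to the XPath contains two characters where the page contains one, which would be consistent with the failed match reported in the question.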
Upvotes: 0
Reputation: 5022
Check the console encoding in Eclipse's preferences. It's probably not the same encoding as the one used by the Windows console.
Upvotes: 2
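One way to compare the two environments is to run the same small program both from Eclipse and from the Windows console and see what each JVM reports. This is only a diagnostic sketch with a hypothetical class name; it shows the JVM's default charset and the file.encoding system property, not the code page of the console window itself.

```java
import java.nio.charset.Charset;

public class EncodingReport {

    // Summarises the charset settings this JVM was started with.
    static String describe() {
        return "default charset = " + Charset.defaultCharset().name()
                + ", file.encoding = " + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the two runs print different values, that difference is a likely candidate for why the properties file decodes differently in each.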