Reputation: 13
I'm told to write a code that get a string text and check if its encoding is equal the specific encoding that we want or not. I've searched a lot but I didn't seem to find anything. I found a method (getEncoding()) but it just works with files and that is not what I want. and also I'm told that i should use java library not methods of mozilla or apache. I really appreciate any help. thanks in advance.
Upvotes: 0
Views: 776
Reputation: 4558
What you are thinking of is "Internationalization". There are libraries for this like, Loc4j
, but you can also get this using java.util.Locale
in Java. However in general text is just text. It is a token with a certain value. No localization information is stored in the character. This is why a file normally provides the encoding in the header. A console or terminal can also provide localization using certain commands/functions.
Unless you know the source encoding and the token used you will have a limited ability to guess what encoding is used in the other end. If you still would want to do this you will need to go into deeper areas such as decryption where this kind of stuff usually is done using statistic analysis. This in turn requires databases on the usage of different tokens and depending on the quality of the text, databases and algorithms a specific amount of text is required. Special stuff, like writing Swedish with eg. US encoding (like using a
for å
and ä
or o
for ö
) will require more advanced analysis.
EDIT
Since I got a comment that encoding and internationalization is different entities I will add some comments. It is possible to work with different encodings working plainly with English (like some English special characters). It is also possible to work with encodings using for example Charset
. However for many applications using different encodings it may still be efficient to use Locale
, since this library can do a lot of operations on text with different encodings.
Upvotes: 1
Reputation: 13
Thanks for ur answers and contribution but these two link did the trick. I had already seen these two pages but it didn't seem to work for me cause I was thinking about get the encoding directly and then compare it with the specific one. This is one of them
Upvotes: 0