Nate Glenn

Reputation: 6744

Eclipse turns Japanese into garbage during refactoring

I have several Java files, encoded in UTF-8, that contain Japanese strings. I use Eclipse, and whenever Eclipse touches them in any automated way (during a refactoring, for example), it turns the Japanese into garbage. A good example is JAWJAW, the Java Japanese WordNet interface. The code on the website shows the Japanese characters correctly, but if you load the project into Eclipse, everything fails because the characters are garbled (mojibake).

Does anyone know how to fix this?

Upvotes: 3

Views: 2660

Answers (2)

Sudheesh.M.S

Reputation: 508

The main cause is that a font with Unicode support is missing from the system fonts. Do the following to fix it:

  • Download the Arial Unicode MS font and put it in the Windows\Fonts directory.
  • Change the default text encoding in Eclipse to UTF-8 by navigating to

    Window->Preferences->General->Workspace->Text file encoding->Other->UTF-8

  • Set the Text Font attribute to Arial Unicode MS by navigating to

    Window->Preferences->General->Appearance->Colors and Fonts->Basic->Text Font (select it)->Edit

Upvotes: 0

VonC

Reputation: 1323303

What is the default encoding for your project?
Future versions of Eclipse (such as e4) may default to UTF-8, which would avoid any automatic conversion into "garbage".
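
To see why a mismatched default encoding produces "garbage", here is a minimal sketch, not anything Eclipse itself runs. It takes the UTF-8 bytes of a Japanese string and decodes them with a different charset. The class name `MojibakeDemo` and the helper `decodeUtf8BytesAs` are made up for illustration, and ISO-8859-1 merely stands in for whatever platform-specific encoding the workspace happens to use.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Hypothetical helper: encode text as UTF-8, then decode the bytes
    // with the given charset, as an editor with the wrong setting would.
    static String decodeUtf8BytesAs(String text, Charset assumed) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, assumed);
    }

    public static void main(String[] args) {
        String original = "\u65e5\u672c\u8a9e"; // the three characters of "Nihongo"

        // Wrong charset: each of the 9 UTF-8 bytes becomes its own character.
        String garbled = decodeUtf8BytesAs(original, StandardCharsets.ISO_8859_1);
        System.out.println("lengths: " + original.length() + " -> " + garbled.length());
        System.out.println("unchanged with ISO-8859-1? " + garbled.equals(original));

        // Matching charset: the text survives the round trip.
        String intact = decodeUtf8BytesAs(original, StandardCharsets.UTF_8);
        System.out.println("unchanged with UTF-8? " + intact.equals(original));

        // The charset this JVM falls back to when none is specified.
        System.out.println("JVM default: " + Charset.defaultCharset());
    }
}
```

If the workspace or project encoding does not match the file's real encoding, any automated rewrite of the file can bake this kind of mis-decoding into the saved source.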

See bug 108668 for more on that discussion:

No solution will be perfect. However in the long term I think the current platform specific approach is clearly inferior to a platform-independent UTF-8 default.


+1 UTF-8 should be the obvious default character set for all text files. I had a problem with Eclipse when I was using an English Windows XP system and trying to open a file with Chinese characters in it. As you can imagine, the display was completely messed up, and Eclipse didn't tell me what I needed to do.
I had to spend time googling for answers. I had to put -Dfile.encoding=UTF-8 in eclipse.ini so that it behaves correctly.


Making UTF-8 the default is not the right solution for the problem you were having.


+1 for embedding encoding in the character stream wherever we can (like XML, HTTP, some kinds of file systems). Encoding is meta-info for the data and belongs to the data, not to a separate user-changeable setup.
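
Plain Java source files cannot carry their own encoding, so the closest equivalent in code is to name the charset explicitly instead of relying on a user-changeable default. The following is a rough sketch of that idea under assumed names (`ExplicitCharsetDemo` and `writeAndReadBack` are hypothetical). It uses only standard `java.nio.file.Files` calls.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

class ExplicitCharsetDemo {
    // Hypothetical helper: write lines to a temp file as UTF-8 and read
    // them back as UTF-8, never consulting the platform default charset.
    static List<String> writeAndReadBack(List<String> lines) throws IOException {
        Path tmp = Files.createTempFile("utf8-demo", ".txt");
        try {
            Files.write(tmp, lines, StandardCharsets.UTF_8);
            return Files.readAllLines(tmp, StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(tmp); // clean up the temporary file
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = Arrays.asList("\u65e5\u672c\u8a9e", "plain ASCII");
        List<String> readBack = writeAndReadBack(lines);
        System.out.println("round trip preserved text: " + lines.equals(readBack));
    }
}
```

Because both the write and the read name UTF-8, the result does not depend on `file.encoding` or on the operating system's locale.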

Upvotes: 3
