Leos Literak
Leos Literak

Reputation: 9406

JPA: accented character is saved as ? in database

Our application is running on weblogic 10g and persists data into oracle 9 database via JPA backed by default eclipselink implementation.

Entity is simple:

@Entity
@Table(name="PLUGINDATA")
@SequenceGenerator(name="plugin_seq", sequenceName="PLUGINDATASEQ", initialValue = 1000, allocationSize = 50)
public class PluginData implements Serializable {
    @Column(name = "data", length = 4000, nullable = false)
    private String data;

@PersistenceContext(unitName = "XYZ")
protected javax.persistence.EntityManager entityManager;

entityManager.persist(entity);

When I pass string containing accented characters, they are OK in EJB:

83, 116, -59, -103, 101, 100, 111, -60, -115, 101, 115, 107, -61, -67, 32, 107, 114, 97, 106

but they are stored corrupted in database:

83, 116, 63, 63, 101, 100, 111, -60, 63, 101, 115, 107, 63, 63, 32, 107, 114, 97, 106

e.g. accented characters are replaced by question mark: St??edo�?esk?? kraj

I do not understand it, the code is OK and datasource setting is same like for other applications that work correctly. Any idea?

Edit 1:

This is SID NLS setting (very same like another application database)

select DECODE(parameter, 'NLS_CHARACTERSET', 'CHARACTER SET', 'NLS_LANGUAGE', 'LANGUAGE', 'NLS_TERRITORY', 'TERRITORY') name, value 
from v$nls_parameters
where parameter in ('NLS_CHARACTERSET', 'NLS_LANGUAGE', 'NLS_TERRITORY');

LANGUAGE    ENGLISH
TERRITORY   CZECH REPUBLIC
CHARACTER SET   EE8ISO8859P2

Edit 2:

WebLogic / Data source

Connection pool tab has following attributes:

URL: jdbc:oracle:thin:@hostname:1521:SID 
Driver: oracle.jdbc.xa.client.OracleXADataSource

Oracle tab: everything is unchecked

Upvotes: 0

Views: 2199

Answers (3)

Leos Literak
Leos Literak

Reputation: 9406

Ok, it works finally. You all were right, that it is the locale problem. Nonetheless it was weblogic issue, not database. I tried sample code running on different weblogic against same database and it was running smooth. Then I compared shell environment and the correct weblogic had LANG set to UTF-8. When I added following line to setDomainEnv.sh and restarted weblogic, it started to work correctly. Thank you for your assistence.

export LANG=en_US.utf8

Upvotes: 0

Jason C
Jason C

Reputation: 40386

Looking at your original string vs. stored string, the pattern is that all characters > 0x7f are stored as 0x3f (63). This is typical of a conversion to US-ASCII (or some other character set that has no code points for bytes 0x80-0xff).

It is most certainly a character encoding issue somewhere. The characters you specified that didn't encode are not in ISO-8859-2, which is the character set you have NLS configured for.

You could try ISO-8859-1 (WE8ISO8859P1), e.g. FRENCH_FRANCE.WE8ISO8859P1 (or CZECH_CZECH REPUBLIC.WE8ISO8859P1 if you want to keep the language as Czech). WE8MSWIN1252 is a more complete set than 8859-1.

You could do CZECH_CZECH REPUBLIC.UTF8 too if your database supports it.

Upvotes: 0

user2688837
user2688837

Reputation: 11

If it might be a problem in the database. oracle databases doesn't support by default accented characters. you have to go to your database run this " select * from NLS_DATABASE_PARAMETERS ;"

if you have NLS_LANGUAGE setted to AMERICAN you can not have special characters (é,è,ê etc..) in your tables

To resolve this problem

In windows : HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\KEY_O10R2\NLS_LANG=FRENCH_FRANCE.WE8MSWIN1252

In Linux : set NLS_LANG=FRENCH_FRANCE.WE8PC850

Upvotes: 1

Related Questions