Reputation: 4870
We are working on an application that needs to save data in Gujarati.
The technologies used in the application are listed below.
My JSP is configured with
<%@ page language="java" contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%>
And
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
Hibernate configuration is
<prop key="hibernate.connection.useUnicode">true</prop>
<prop key="hibernate.connection.characterEncoding">UTF-8</prop>
<prop key="hibernate.connection.charSet">UTF-8</prop>
MySQL URL is
jdbc:mysql://host:port/dbName?useUnicode=true&connectionCollation=utf8_general_ci&characterSetResults=utf8
The POJO has a String field to store that data.
The MySQL column is a VARCHAR with charset=utf8 and collation=utf8_general_ci.
When I try to save any non-English (Gujarati) text, it shows garbage characters like ��� for "ગુજ".
Is there any other configuration that I have missed?
Upvotes: 5
Views: 3644
Reputation: 1789
Your applicationContext file should look like the one below.
To make a Spring MVC application support internationalization, register two beans:
SessionLocaleResolver: register a SessionLocaleResolver bean and name it exactly "localeResolver". It resolves the locale from a predefined attribute in the user's session. Note: if you do not register any "localeResolver", the default AcceptHeaderLocaleResolver is used, which resolves the locale from the Accept-Language header of the HTTP request.
LocaleChangeInterceptor: register a LocaleChangeInterceptor and reference it from every handler mapping that needs to support multiple languages. The "paramName" property is the name of the request parameter used to set the locale.
<bean id="localeResolver"
      class="org.springframework.web.servlet.i18n.SessionLocaleResolver">
    <property name="defaultLocale" value="en" />
</bean>
<bean id="localeChangeInterceptor"
      class="org.springframework.web.servlet.i18n.LocaleChangeInterceptor">
    <property name="paramName" value="language" />
</bean>
<bean class="org.springframework.web.servlet.mvc.support.ControllerClassNameHandlerMapping" >
    <property name="interceptors">
        <list>
            <ref bean="localeChangeInterceptor" />
        </list>
    </property>
</bean>
<!-- Register the bean -->
<bean class="com.common.controller.WelcomeController" />
<!-- Register the welcome.properties -->
<bean id="messageSource"
      class="org.springframework.context.support.ResourceBundleMessageSource">
    <property name="basename" value="welcome" />
</bean>
<bean id="viewResolver"
      class="org.springframework.web.servlet.view.InternalResourceViewResolver" >
    <property name="prefix">
        <value>/WEB-INF/pages/</value>
    </property>
    <property name="suffix">
        <value>.jsp</value>
    </property>
</bean>
native2ascii is a handy tool built into the JDK that converts a file containing non-ASCII characters into Unicode escape sequences (\uXXXX).
Native2ascii example
Create a file named "source.txt", put some Chinese characters in it, and save it as UTF-8.
Use the native2ascii command to convert it into Unicode-escaped form.
C:\>native2ascii -encoding utf8 c:\source.txt c:\output.txt
native2ascii reads all the characters from "c:\source.txt", decoding the file as UTF-8, and writes the escaped characters to "c:\output.txt".
Open "c:\output.txt" and you will see the escaped characters, e.g. \ufeff\u6768\u6728\u91d1
welcome.properties
welcome.springmvc = \u5feb\u4e50\u5b66\u4e60
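The conversion that native2ascii performs can also be sketched in plain Java, which may be handy on JDKs that no longer ship the tool (it was removed in Java 9). This is only an illustration: the class name UnicodeEscaper and the method toUnicodeEscapes are hypothetical, and the sketch escapes every non-ASCII UTF-16 code unit, which is an assumption about the desired output rather than a full reimplementation of the tool.

```java
final class UnicodeEscaper {

    // Replace every non-ASCII UTF-16 code unit with a backslash-u escape
    // of four lowercase hex digits; ASCII characters pass through unchanged.
    static String toUnicodeEscapes(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 128) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The same four Chinese characters used in welcome.properties above,
        // written as escapes so the source file stays pure ASCII.
        String value = "\u5feb\u4e50\u5b66\u4e60";
        System.out.println("welcome.springmvc = " + toUnicodeEscapes(value));
    }
}
```

Running it should print the same line as the welcome.properties entry shown above.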
Load the above string and store the value in the database.
If you want to display it inside a JSP page, remember to add the line
<%@ page contentType="text/html;charset=UTF-8" %>
at the top of the JSP page; otherwise the page may not display the UTF-8 (Chinese) characters properly.
Upvotes: 2
Reputation: 491
I faced the same problem while inserting Tamil characters into the database. After a lot of searching I found a working solution that solved my problem, and I am sharing it here. I hope it helps clear up your doubts about non-English characters.
INSERT INTO
STUDENT(name,address)
VALUES
(N'பெயர்', N'முகவரி');
I am using a sample table since you have not provided your table structure or field names.
Upvotes: 7
Reputation: 28519
Another tip: don't rely only on setting characterEncoding as a Hibernate property (<prop key="hibernate.connection.characterEncoding">UTF-8</prop>); make sure you also add it explicitly as a connection parameter on the DB URL, like so:
jdbc:mysql://host:port/dbName?useUnicode=true&characterEncoding=UTF-8&connectionCollation=utf8_general_ci&characterSetResults=utf8
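A quick way to confirm which parameters a URL actually carries is to split its query string. The sketch below is hypothetical helper code (JdbcUrlCheck and queryParams are made-up names, and it does no URL decoding); it only shows that the URL from the question has no characterEncoding parameter while the one above does.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class JdbcUrlCheck {

    // Split the part after '?' into name/value pairs, keeping their order.
    static Map<String, String> queryParams(String jdbcUrl) {
        Map<String, String> params = new LinkedHashMap<>();
        int q = jdbcUrl.indexOf('?');
        if (q < 0) {
            return params;
        }
        for (String pair : jdbcUrl.substring(q + 1).split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }

    public static void main(String[] args) {
        // URL as posted in the question (host, port, dbName are placeholders).
        String fromQuestion = "jdbc:mysql://host:port/dbName?useUnicode=true"
                + "&connectionCollation=utf8_general_ci&characterSetResults=utf8";
        // URL as suggested in this answer.
        String suggested = "jdbc:mysql://host:port/dbName?useUnicode=true"
                + "&characterEncoding=UTF-8&connectionCollation=utf8_general_ci"
                + "&characterSetResults=utf8";

        System.out.println(queryParams(fromQuestion).containsKey("characterEncoding"));
        System.out.println(queryParams(suggested).get("characterEncoding"));
    }
}
```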
Also, since there are so many layers where the encoding could be lost, try to isolate the failing layer and update the question, e.g. whether it happens on storing to the DB or at some point before.
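One possible way to isolate the layer is to log the UTF-8 bytes of the string as hex just before it is handed to Hibernate, and compare that with what the database reports for the stored column. The helper below is a minimal sketch; EncodingProbe and utf8Hex are hypothetical names, not part of any framework mentioned here.

```java
import java.nio.charset.StandardCharsets;

final class EncodingProbe {

    // Render the UTF-8 encoding of a string as uppercase hex.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The three Gujarati code points from the question (U+0A97, U+0AC1, U+0A9C),
        // written as escapes so the source file encoding cannot interfere.
        String value = "\u0A97\u0AC1\u0A9C";
        System.out.println(utf8Hex(value));
    }
}
```

If the value is still intact at this point, the output is E0AA97E0AB81E0AA9C; anything longer suggests the text was already damaged before it reached the persistence layer.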
Upvotes: 3
Reputation: 23415
There are a couple of things you might have missed. I had the same problem with MySQL on Linux; what I had to do was edit my.cnf like this:
[client]
default-character-set = utf8
[mysqld]
character-set-server = utf8
On CentOS, for example, this file is located at /etc/my.cnf; on Windows (my PC) it is C:\ProgramData\MySQL\MySQL Server 5.5\my.ini. Please note that ProgramData might be hidden.
Also, if you are using Tomcat, you have to specify UTF-8 for URI encoding. Just edit server.xml and modify your main Connector element:
<Connector port="8080" protocol="HTTP/1.1"
connectionTimeout="20000"
URIEncoding="UTF-8"
redirectPort="8443" />
Also make sure you have added a character encoding filter to your application:
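import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.annotation.WebFilter;
import javax.servlet.http.HttpServletRequest;

// Forces UTF-8 on every request before any parameter is read.
@WebFilter(filterName = "CharacterEncodingFilter", urlPatterns = {"/*"})
public class CharacterEncodingFilter implements Filter {
    @Override
    public void init(FilterConfig filterConfig) throws ServletException {
        // no initialisation needed
    }

    @Override
    public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse, FilterChain filterChain)
            throws IOException, ServletException {
        HttpServletRequest request = (HttpServletRequest) servletRequest;
        // Must run before the first getParameter() call to take effect.
        request.setCharacterEncoding("UTF-8");
        servletResponse.setContentType("text/html; charset=UTF-8");
        filterChain.doFilter(request, servletResponse);
    }

    @Override
    public void destroy() {
        // nothing to release
    }
}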
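To see why the request encoding matters, the following sketch decodes the same percent-encoded form value twice: once as UTF-8 and once as ISO-8859-1, which is what a servlet container typically falls back to when no encoding has been set. It uses only the JDK's URLDecoder; FormDecodingDemo and decode are hypothetical names, and the sample value is the "ગુજ" string from the question.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class FormDecodingDemo {

    // Decode a percent-encoded form value using the given charset name.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // How a browser would submit the three Gujarati characters from a UTF-8 page.
        String posted = "%E0%AA%97%E0%AB%81%E0%AA%9C";

        String asUtf8 = decode(posted, "UTF-8");        // three Gujarati characters
        String asLatin1 = decode(posted, "ISO-8859-1"); // nine garbage characters

        System.out.println(asUtf8.length());   // 3
        System.out.println(asLatin1.length()); // 9
    }
}
```

The nine-character result is the kind of garbage that ends up in the database when the request is decoded with the wrong charset.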
Hope this helps.
Upvotes: 4
Reputation: 142278
I am assuming you want ગુજ (GA JA with vowel sign U)?
I think you somehow specified "latin5". (Yes I see you have UTF-8 everywhere, but "latin5" is the only way I can make things work.)
CONVERT(CONVERT(UNHEX('C3A0C2AAC297C3A0C2ABC281C3A0C2AAC29C')
USING utf8) USING latin5) = 'ગુજ'
Plus you ended up with "double encoding"; I suspect this is what happened:
SET NAMES latin5 was used, but it lied by claiming that the client had latin5 encoding; and
the column was declared CHARACTER SET utf8 (good).
If possible, it would be better to start over: empty the tables, and be sure to use SET NAMES utf8 or otherwise establish utf8 when connecting from your client to the database. Then repopulate the tables.
If you would rather try to recover the existing data, this might work:
UPDATE ... SET col = CONVERT(BINARY(CONVERT(
CONVERT(UNHEX(col) USING utf8)
USING latin5)) USING utf8);
But you would need to do that for each messed up column in each table.
A partial test of that code is to do
SELECT CONVERT(BINARY(CONVERT(
CONVERT(UNHEX(col) USING utf8)
USING latin5)) USING utf8)
FROM table;
I say "partial test" because looking right may not prove that it is right.
After the UPDATE, SELECT HEX(col) should return E0AA97E0AB81E0AA9C for ગુજ. Note that most Gujarati hex should be of the form E0AAyy or E0AByy. You might also find 20 for a blank space.
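The double encoding and its reversal can be reproduced outside MySQL with a few lines of Java. This is a sketch under one stated assumption: Java's ISO-8859-1 stands in for MySQL's latin5, which works here because the two map the bytes involved in ગુજ (E0, AA, AB, 97, 81, 9C) to the same code points. DoubleEncodingDemo and its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

final class DoubleEncodingDemo {

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Simulate the fault: UTF-8 bytes are misread as a single-byte charset.
    static String misread(String original) {
        return new String(original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Undo it: turn each character back into one byte, then decode as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u0A97\u0AC1\u0A9C"; // the three code points of the sample text

        String garbled = misread(original);
        // What a utf8 column ends up storing after the misread text is encoded again.
        System.out.println(hex(garbled.getBytes(StandardCharsets.UTF_8)));

        String fixed = repair(garbled);
        System.out.println(hex(fixed.getBytes(StandardCharsets.UTF_8)));
    }
}
```

The first line printed is C3A0C2AAC297C3A0C2ABC281C3A0C2AAC29C, the double-encoded hex used in the CONVERT example, and the second is E0AA97E0AB81E0AA9C, the correct hex quoted above.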
I apologize for not being more certain. I have been tackling Character Set issues for a decade, but this is a new variant.
Upvotes: 5