Reputation: 10558
I have a CSV file that contains English words followed by their Hindi translations. I am trying to read the file and do some further processing with it. The CSV file looks like this:
English,,Hindi,,,
,,,,,
Cat,,बिल्ली,,,
Rat,,चूहा,,,
abandon,,छोड़ देना,त्याग देना,लापरवाही की स्वतन्त्रता,जाने देना
I am trying to read the CSV file line by line and display its contents. The code snippet (Java) is as follows:
//Step 2. Read csv file and get the string.
FileInputStream fis = null;
BufferedReader br = null;
try {
    fis = new FileInputStream(new File(csvFile));
} catch (FileNotFoundException e1) {
    // TODO Auto-generated catch block
    e1.printStackTrace();
}
boolean startSeen = true;
if(fis != null) {
    try {
        br = new BufferedReader(new InputStreamReader(fis, "UTF-8"));
    } catch (UnsupportedEncodingException e2) {
        // TODO Auto-generated catch block
        e2.printStackTrace();
        System.out.print("Unsupported encoding");
    }
    String line = null;
    if(br != null) {
        try {
            while((line = br.readLine()) != null) {
                if(line.contains("English") == true) {
                    startSeen = true;
                }
                if((startSeen == true) && (line != null)) {
                    StringBuffer sbuf = new StringBuffer();
                    //Step 3. Parse the line.
                    sbuf.append(line);
                    System.out.println(sbuf.toString());
                }
            }
        } catch (IOException e1) {
            // TODO Auto-generated catch block
            e1.printStackTrace();
        }
    }
}
However, the following output is what I get:
English,,Hindi,,,
,,,,,
Cat,,??????,,,
Rat,,????,,,
abandon,,???? ????,????? ????,???????? ?? ???????????,???? ????
My Java is not that great, and although I have gone through a number of posts on SO, I need more help figuring out the exact cause of this problem.
Upvotes: 7
Views: 6816
Reputation: 6637
As discussed in the answers above, the solution takes two steps:
1) Save your text file as UTF-8.
2) Set the encoding of your Java source file to UTF-8. In Eclipse, right-click the Java file, then choose Properties -> Resource -> Text file encoding -> Other -> UTF-8.
See the screenshot at http://howtodoinjava.com/2012/11/27/how-to-compile-and-run-java-program-written-in-another-language/
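If it is unclear whether those settings took effect, one quick check is to print the charset the JVM falls back to whenever no encoding is passed explicitly. This is only a diagnostic sketch; the class name DefaultCharsetCheck and the helper describe() are made up for illustration.

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {
    // Hypothetical helper: summarizes the charset settings the JVM reports.
    static String describe() {
        return "default charset = " + Charset.defaultCharset().name()
                + ", file.encoding = " + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        // If this does not report UTF-8, readers and writers created
        // without an explicit charset will not use UTF-8 either.
        System.out.println(describe());
    }
}
```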
Upvotes: 0
Reputation: 854
To read a text file, it is better to use a character stream, for example java.util.Scanner, rather than a raw FileInputStream. As for the encoding, first make sure that the text file you want to read is actually saved as UTF-8. I also noticed that on my system I had to save my Java source file as UTF-8 as well for the Hindi characters to display properly.
However, I want to suggest a simpler way to read the CSV file, as follows:
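// Pass the charset explicitly; otherwise Scanner uses the platform default.
Scanner scan = new Scanner(new File(csvFile), "UTF-8");
while (scan.hasNextLine()) {
    System.out.println(scan.nextLine());
}
scan.close();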
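For comparison, here is a minimal sketch of the same read using java.nio with the charset spelled out and the reader closed automatically. The class name Utf8CsvReader and the helper readLines() are hypothetical, and the sketch assumes the file really is saved as UTF-8.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class Utf8CsvReader {
    // Hypothetical helper: reads every line of the file, decoding the bytes as UTF-8.
    static List<String> readLines(Path csvPath) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader reader = Files.newBufferedReader(csvPath, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java Utf8CsvReader <path-to-csv>");
            return;
        }
        for (String line : readLines(Paths.get(args[0]))) {
            System.out.println(line);
        }
    }
}
```

This only guarantees that the text is decoded correctly in memory; whether it prints correctly still depends on the console.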
Upvotes: 5
Reputation: 135992
I think your console cannot display Hindi characters. Try
System.out.println("Cat,,बिल्ली,,,");
to test.
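If the literal also prints as question marks, one thing to try is writing through a PrintStream that encodes as UTF-8 instead of the platform default. The following is a sketch under that assumption; the class name Utf8Console and the helper utf8Stream() are invented for illustration, and the terminal itself still has to be set to UTF-8 and have a font with Devanagari glyphs.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Console {
    // Hypothetical helper: wraps an output stream so that text is encoded as UTF-8.
    static PrintStream utf8Stream(OutputStream target) throws UnsupportedEncodingException {
        return new PrintStream(target, true, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        PrintStream out = utf8Stream(System.out);
        // Unicode escapes spell the Hindi word for "cat", so this test does
        // not depend on the encoding of the source file.
        out.println("Cat,,\u092C\u093F\u0932\u094D\u0932\u0940,,,");
    }
}
```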
Upvotes: 2