jensengar

Reputation: 6167

Java Replace unicode chars in string

I have a program that reads in a file. In this file there are some crazy chars that I have never seen before. The purpose of this file is to parse certain information into SQL statements.

When I get to this line in the file, "read “Details for …(the name of the title”" (notice the horizontal ellipsis and the left/right quotes), the output is this:

�Details for �(the name of the title�

I just want to replace those chars with chars that I define. I have tried:

st = st.replaceAll("…","...");
st = st.replaceAll("\u2026","...");

This is how I read the file:

 FileInputStream file = new FileInputStream(filePath);
 DataInputStream in = new DataInputStream(file); 
 BufferedReader br = new BufferedReader(new InputStreamReader(in));

And other things that I can't even remember. How can I do this seemingly simple task?

Upvotes: 0

Views: 2008

Answers (2)

Paul Vargas

Reputation: 42030

You need to specify the encoding when reading the file, before replacing the special chars:

FileInputStream inputStream = new FileInputStream("input.txt");
// Specify the encoding
InputStreamReader streamReader = new InputStreamReader(inputStream, "UTF-8");
BufferedReader in = new BufferedReader(streamReader);
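For completeness, here is a minimal sketch of the whole round trip: read with an explicit charset, then replace the typographic characters. It assumes the file really is UTF-8, which the question does not confirm. The class name `ReadAndReplace` and the helper `toAscii` are made up for illustration. `String.replace` is used instead of `replaceAll` because no regex is needed.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class ReadAndReplace {
    // Hypothetical helper: swaps typographic punctuation for ASCII stand-ins.
    static String toAscii(String s) {
        return s.replace("\u2026", "...")   // horizontal ellipsis
                .replace("\u201C", "\"")    // left double quote
                .replace("\u201D", "\"");   // right double quote
    }

    // Reads every line with an explicit charset (assumed UTF-8 here)
    // and applies the replacements to each decoded line.
    static List<String> readLines(String filePath) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new InputStreamReader(
                new FileInputStream(filePath), StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(toAscii(line));
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Write a small UTF-8 sample file so the sketch runs on its own.
        Path tmp = Files.createTempFile("sample", ".txt");
        String sample = "read \u201CDetails for \u2026(the name of the title\u201D";
        Files.write(tmp, sample.getBytes(StandardCharsets.UTF_8));
        System.out.println(readLines(tmp.toString()));
        Files.delete(tmp);
    }
}
```

If the replacements still have no effect, the characters were probably already lost while decoding, which points to a different file encoding rather than a problem with `replace`.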

Upvotes: 1

ChristopheD

Reputation: 116157

Unless it's absolutely necessary you don't really have to drop those weird (yet still meaningful) characters...

Have a look at the documentation for InputStreamReader and specify the right encoding when reading your file.
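To find the "right" encoding, it can help to decode the same raw bytes with a few candidate charsets and compare the results. The sketch below is only a guess at what happened in the question: one � per special character is what you would typically see when single-byte windows-1252 punctuation is decoded as UTF-8. The class name `EncodingCheck` and the sample bytes are illustrative, and `windows-1252` is not one of the charsets every JVM is required to support, although common JDKs ship it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingCheck {
    // Decodes raw bytes with the given charset; malformed input
    // is replaced with U+FFFD by the String constructor.
    static String decode(byte[] raw, Charset cs) {
        return new String(raw, cs);
    }

    // Renders a string as code points so the console encoding cannot hide anything.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Assumed sample: left quote, "Det", ellipsis, right quote in windows-1252.
        byte[] raw = {(byte) 0x93, 'D', 'e', 't', (byte) 0x85, (byte) 0x94};

        // 0x93, 0x85 and 0x94 are not valid on their own in UTF-8.
        System.out.println("UTF-8:        " + codePoints(decode(raw, StandardCharsets.UTF_8)));

        // The same bytes decoded with the (assumed) real encoding of the file.
        System.out.println("windows-1252: " + codePoints(decode(raw, Charset.forName("windows-1252"))));
    }
}
```

Whichever charset turns the bytes into U+201C, U+2026 and U+201D is the one to pass to `InputStreamReader`.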

Upvotes: 0
