ThE uSeFuL

Reputation: 1534

Find a Unicode character in a txt file - Android

I am reading a txt file that contains Unicode characters, and I need to find out whether a specific Unicode character exists in it. My code so far is as follows:

    try {
        BufferedReader reader = new BufferedReader(
            new InputStreamReader(getAssets().open("DistinctWords.txt"), "UTF-8"));

        int i = 0;
        String mLine = reader.readLine();
        while (mLine != null) {
            // process line
            // unicode value taken from http://codepoints.net/U+0D85
            if (mLine.contains("\u0D85")) {
                i++;
            }
            mLine = reader.readLine();
        }

        reader.close();
        Log.i("tula", "Ayanna - " + String.valueOf(i));
    } catch (IOException e) {
        // log the exception
    }

Problem: The value of "i" is always "0". When I open the same text file in Notepad I can see the letter, but my code fails to find it.

Upvotes: 2

Views: 812

Answers (1)

HalR

Reputation: 11073

Like TronicZomB says, I think you need to be looking for the actual character, like:

    while (mLine != null) {
        // process line
        if (mLine.contains("අ")) {
            i++;
        }
        mLine = reader.readLine();
    }

You will want to use an editor that can handle the proper encoding:

  • Notepad on Windows can save a file as UTF-8, but you have to change the file's encoding from ANSI to UTF-8 yourself.
  • On Mac OS X you can use TextEdit. In the preferences, on the Open and Save tab, you can set the document encoding.
  • On Linux, StarOffice supposedly works, but I haven't used it.
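
To see why the encoding of the file matters, here is a minimal sketch in plain Java (not Android-specific). It uses in-memory bytes in place of the asset file, and the class name `CodePointSearch` and method `countLinesContaining` are made up for illustration. It counts lines containing U+0D85 twice: once for text saved as UTF-8, and once for the same text saved as UTF-16LE but still decoded as UTF-8, which is roughly what I'd expect to happen if the file on disk isn't really UTF-8.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CodePointSearch {

    // Counts the lines that contain the given code point,
    // decoding the stream with the given charset.
    public static int countLinesContaining(InputStream in, Charset charset, int codePoint)
            throws IOException {
        int count = 0;
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // String.indexOf(int) accepts a Unicode code point
                if (line.indexOf(codePoint) >= 0) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample text: two lines contain U+0D85, one does not
        String text = "\u0D85\u0DB8\u0DCA\u0DB8\u0DCF\nabc\n\u0D85\n";

        byte[] savedAsUtf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] savedAsUtf16 = text.getBytes(StandardCharsets.UTF_16LE);

        // File bytes and reader agree on UTF-8: the character is found
        System.out.println(countLinesContaining(
                new ByteArrayInputStream(savedAsUtf8), StandardCharsets.UTF_8, 0x0D85));

        // File bytes are UTF-16LE but the reader assumes UTF-8: the character is lost
        System.out.println(countLinesContaining(
                new ByteArrayInputStream(savedAsUtf16), StandardCharsets.UTF_8, 0x0D85));
    }
}
```

The first call prints 2 and the second prints 0, even though both byte arrays show the same letters when opened in an editor that detects the encoding. If your count stays at 0, re-saving the file as UTF-8 is the first thing I would try.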

Upvotes: 2
