UTF-8 string indices in Python not compatible with Java

Question

I have a text file with the following content:

 🔴🔴🔴🔴🔴
==================\0No. 4♨ ==
📌 
✅IHappy Holi
✅Ground Floor or Second Floor
9910080224
emailaddress@gmail.com

I have Python code running on the server that finds the indices of a match. I want to send these indices along with the text so the client can highlight the match. This is the code:

import re
f = open('data.json', 'r')
text = f.readline().strip().decode('UTF-8').encode('UTF-8')
f.close()

for m in re.finditer(r'emailaddress', text, flags=re.IGNORECASE): 
    s = m.start()
    e = m.end()
    print s, e
    print text[s:e]

The output is:

123 135
emailaddress

On the client side, I have Java code (on Android). However, these indices don't work at all.

public class HelloWorld {
    public static void main(String[] args) {
        String text = "🔴🔴🔴🔴🔴
==================\0No. 4♨ ==
📌 
✅IHappy Holi
✅Ground Floor or Second Floor
9910080224
emailaddress@gmail.com";
        System.out.println(text.substring(115));
    }
}

And the output is:

l.com

I am sure I am making some mistake in the encoding of the strings. Can someone help me with that?
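
A likely cause, sketched under assumptions rather than confirmed against the original file: in Python 2, `text` above is a UTF-8 byte string, so `m.start()` and `m.end()` count bytes, while Java's `String.substring` counts UTF-16 code units. Each 🔴 takes 4 bytes in UTF-8 but 2 code units in UTF-16, so the two offsets drift apart. The Python 3 sketch below matches on a Unicode string and converts the resulting code point offsets to UTF-16 offsets before they are sent to the client. The names `utf16_offset` and `find_utf16_spans` and the `sample` string are hypothetical, made up for illustration.

```python
import re


def utf16_offset(text, index):
    """Return how many UTF-16 code units text[:index] occupies.

    `index` is a code point offset, which is what Python 3 str
    indexing and re match positions use.
    """
    # utf-16-le emits exactly 2 bytes per code unit and no BOM.
    return len(text[:index].encode("utf-16-le")) // 2


def find_utf16_spans(pattern, text):
    """Find all matches and return (start, end) pairs in UTF-16 units."""
    spans = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        spans.append((utf16_offset(text, m.start()),
                      utf16_offset(text, m.end())))
    return spans


# Hypothetical sample: two emoji outside the BMP, one BMP symbol, then the match.
sample = "\U0001F534\U0001F534 \u2705 emailaddress@gmail.com"

code_point_start = sample.index("emailaddress")
byte_start = len(sample[:code_point_start].encode("utf-8"))
spans = find_utf16_spans(r"emailaddress", sample)

print(code_point_start)  # 5  (Python 3 str offset)
print(byte_start)        # 13 (UTF-8 byte offset, what the Python 2 code reports)
print(spans)             # [(7, 19)] (offsets usable with Java's substring)
```

On the Java side, such a pair could then be passed directly to `text.substring(start, end)`, assuming the client receives exactly the same text as the server searched.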
