Reputation: 3694
I know there are many questions about UnicodeDecodeError, but I can't find one that explains my issue. The error showed up unexpectedly in an environment that I can't reproduce.
All I want is to compare two strings byte by byte (I don't want any encoding or decoding).
Note that I'm using Python 2.7, and str1 comes from open('..', 'r').read() on Linux.
Any advice would be appreciated.
def diff_str(str1, str2):
    minlen = min(len(str1), len(str2))
    if str1 == str2:
        return "All %d bytes same" % minlen
    for diff_pos in xrange(minlen):
        if str1[diff_pos] != str2[diff_pos]:
            break
    k = 100
    to_ret = "(%d vs %d) chars\n" % (len(str1), len(str2))
    to_ret += "diff starts at %d:\n" % diff_pos
    # the error is raised here
    to_ret += str1[diff_pos:diff_pos+k] + "\n"
    to_ret += str2[diff_pos:diff_pos+k] + "\n"
    return to_ret
Upvotes: 0
Views: 125
Reputation: 17623
First, please post the strings (or files) you are comparing that trigger the error.
If you want to compare characters, try decoding first:
for uchar in your_string.decode('utf-8'):
    pass  # compare chars here
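To make the decoding idea concrete, here is a minimal sketch that decodes both inputs and returns the position of the first differing character. The function name first_text_diff is hypothetical, and it assumes both inputs are byte strings that really are valid UTF-8; if they are not, decode() itself raises UnicodeDecodeError.

```python
def first_text_diff(raw1, raw2, encoding="utf-8"):
    """Return the index of the first differing character, or None if equal.

    Assumption: raw1 and raw2 are byte strings encoded with `encoding`.
    """
    # Decode both sides so that characters are compared with characters.
    text1 = raw1.decode(encoding)
    text2 = raw2.decode(encoding)
    for pos, (ch1, ch2) in enumerate(zip(text1, text2)):
        if ch1 != ch2:
            return pos
    # No mismatch in the common part: the texts differ only if one is longer.
    if len(text1) != len(text2):
        return min(len(text1), len(text2))
    return None
```

Note that the returned position counts characters, not bytes, so it can be smaller than the byte offset when the text contains multi-byte characters.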
Do you want a binary comparison? Then read the file in fixed-size chunks:
oneBuf = bytes(yourfile.read(1024))
Then compare the byte buffers.
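Here is a sketch of what a byte-only comparison could look like. One possible cause of the error, which can't be confirmed without the actual inputs, is that one argument is a str and the other a unicode object: Python 2 then implicitly decodes the str as ASCII when the two are concatenated. The function below (diff_bytes is a hypothetical name, not the asker's code) rejects anything that is not a byte string and shows the differing region with repr(), so the report itself stays pure ASCII.

```python
def diff_bytes(data1, data2, context=100):
    """Describe the first difference between two byte strings.

    Hypothetical helper; runs on Python 2.7 and Python 3.
    """
    # In Python 2.7 `bytes` is an alias for `str`, so this rejects `unicode`.
    if not isinstance(data1, bytes) or not isinstance(data2, bytes):
        raise TypeError("both arguments must be byte strings")
    if data1 == data2:
        return "All %d bytes same" % len(data1)
    minlen = min(len(data1), len(data2))
    # Default covers the case where one input is a prefix of the other.
    diff_pos = minlen
    for i in range(minlen):
        # Slicing (not indexing) yields byte strings on both Python 2 and 3.
        if data1[i:i + 1] != data2[i:i + 1]:
            diff_pos = i
            break
    lines = [
        "(%d vs %d) bytes" % (len(data1), len(data2)),
        "diff starts at %d:" % diff_pos,
        # repr() escapes non-ASCII bytes, so nothing is decoded implicitly.
        repr(data1[diff_pos:diff_pos + context]),
        repr(data2[diff_pos:diff_pos + context]),
    ]
    return "\n".join(lines)
```

When the data comes from files, opening them with mode 'rb' keeps the contents as raw bytes on both Python versions.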
Upvotes: 1