Reputation: 2141
I have two strings that by all indications look identical:
x1 = 'N C Soft - NCSOFT_Guild Wars 2 December 2013 :: BNLX_AD_Parallax_160x600'
x2 = 'N C Soft - NCSOFT_Guild Wars 2 December 2013 :: BNLX_CT_Parallax_160X600'
However, checking for equality shows they are not.
In [312]: if x1 != x2:
   .....:     print 'yep'
   .....:
yep
I also tried copying both strings out of the command prompt and then pasting them back in as new variables, but they are still not equal. I'm 80% sure it's because they're encoded in a weird way, with some odd characters inserted that I can't see, but type() shows both as plain strings.
Is there any way I can see the "real" string? Any help is appreciated.
Upvotes: 8
Views: 13981
Reputation: 1121266
They are not the same; using difflib.ndiff() shows how these two values differ very clearly:
>>> import difflib
>>> print '\n'.join(difflib.ndiff([x1], [x2]))
- N C Soft - NCSOFT_Guild Wars 2 December 2013 :: BNLX_AD_Parallax_160x600
?                                                      ^^             ^
+ N C Soft - NCSOFT_Guild Wars 2 December 2013 :: BNLX_CT_Parallax_160X600
?                                                      ^^             ^
In general, when in doubt use repr() to look at the representation. Python 2 will use escapes for any non-printable or non-ASCII character in the string, so any 'funny' characters will stand out like a sore thumb. In Python 3, use the ascii() function for the same result, because repr() there is less conservative and Unicode is rife with character combinations that look the same at first glance.
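As a rough illustration of the repr() versus ascii() point in Python 3, here is a small sketch. The lookalike string is made up for the example (it is not from the question): its second letter is CYRILLIC CAPITAL LETTER ES (U+0421), which renders like a Latin "C" in most fonts.

```python
# Python 3 sketch. 'lookalike' is a hypothetical string built for this example:
# its second character is U+0421 (Cyrillic), not the Latin letter C.
plain = 'NCSOFT'
lookalike = 'N\u0421SOFT'

same = plain == lookalike          # False, although both render alike
shown_by_repr = repr(lookalike)    # printable non-ASCII is left as-is, so it still blends in
shown_by_ascii = ascii(lookalike)  # every non-ASCII character is escaped

print(same)
print(shown_by_ascii)  # 'N\u0421SOFT'
```

The escaped form makes the odd character obvious, whereas the repr() output would look just like 'NCSOFT' on screen.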
For strings where you still cannot see what changes between the two, the above difflib tool can also help point out what exactly changed.
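If the caret markers are hard to read for long strings, one possible variation is to list only the differing spans with their offsets. This is a minimal sketch using difflib.SequenceMatcher in Python 3; the helper name changed_spans is made up for this example and is not part of difflib.

```python
import difflib

x1 = 'N C Soft - NCSOFT_Guild Wars 2 December 2013 :: BNLX_AD_Parallax_160x600'
x2 = 'N C Soft - NCSOFT_Guild Wars 2 December 2013 :: BNLX_CT_Parallax_160X600'

def changed_spans(a, b):
    """Return (tag, offset in a, text from a, text from b) for each non-equal span.

    Hypothetical helper for illustration only.
    """
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, i1, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != 'equal'
    ]

for span in changed_spans(x1, x2):
    # ascii() escapes anything non-ASCII, so invisible characters would show up here too.
    print(ascii(span))
```

For the two strings from the question this reports a replacement of 'AD' by 'CT' at offset 53 and of 'x' by 'X' at offset 68.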
Upvotes: 23