Solijoli

Reputation: 474

Pandas Series unique method showing values looking the same

I have a pandas DataFrame. When I call .unique() on one of its columns, some of the returned values look identical. How can I see how these values differ? I tried indexing into the result of unique(), but the values just print as the strings shown below. Thanks for the help.

df["MyColumn"].unique()
array(['yi̇', 'yd', 'yi'], dtype=object)
_______________________________________
df["MyColumn"].unique()[0]
'yi̇'
_______________________________________
df["MyColumn"].unique()[2]
'yi'

Upvotes: 1

Views: 176

Answers (2)

jezrael

Reputation: 862511

You can check the Unicode code points to see the difference. Here the first value has an extra code point, 775 (U+0307, COMBINING DOT ABOVE), after the i, as mentioned in the comment by Er Bharath Ram:

u = ['yi̇', 'yd', 'yi']
print([list(map(ord, s)) for s in u])
[[121, 105, 775], [121, 100], [121, 105]]
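
To make those numbers easier to read, the standard-library unicodedata module can give each code point its official name. This is only a minimal sketch using the same three strings as above. The helper name describe is made up for illustration, and the combining mark is written as the escape \u0307 so that it stays visible in the source:

```python
import unicodedata

def describe(s):
    # Return a (code point, Unicode name) pair for every character in s.
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in s]

# Same values as in the question; "\u0307" is the extra code point 775.
u = ["yi\u0307", "yd", "yi"]

for s in u:
    # ascii() escapes non-ASCII characters, so look-alike strings print differently.
    print(ascii(s), describe(s))
```

The first string should come out as three entries ending in COMBINING DOT ABOVE, while the last one has only LATIN SMALL LETTER Y and LATIN SMALL LETTER I.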

Upvotes: 2

Arco Bast

Reputation: 3892

On closer inspection you see the difference:

'yi̇' # the i letter has two dots
'yi' # normal i letter

So you are looking at two different Unicode strings that look very similar: the first one has an extra combining dot after the i.
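
If the goal is to find or merge such look-alike values, one possible approach is to check each string for combining marks with unicodedata.combining and, if that suits your data, drop them. This is a sketch under that assumption, not the only way to do it. The helper names has_combining_mark and strip_combining_marks are hypothetical, and stripping is lossy (it would also turn an accented letter such as é into e):

```python
import unicodedata

def has_combining_mark(s):
    # unicodedata.combining() returns a non-zero class for combining characters.
    return any(unicodedata.combining(ch) for ch in s)

def strip_combining_marks(s):
    # Decompose first (NFD) so precomposed letters are split, then drop the marks.
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

values = ["yi\u0307", "yd", "yi"]  # same values as in the question

# Which values carry a hidden combining mark?
print([ascii(v) for v in values if has_combining_mark(v)])

# Unique values once the marks are removed.
print(sorted(set(strip_combining_marks(v) for v in values)))
```

On a DataFrame the same helper could be applied with df["MyColumn"].map(strip_combining_marks).unique(), assuming the column holds only strings.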

Upvotes: 1
