What does it mean that two strings have the same linguistic meaning?

Question

In the Swift documentation for comparing strings, I found the following:

Two String values (or two Character values) are considered equal if their extended grapheme clusters are canonically equivalent. Extended grapheme clusters are canonically equivalent if they have the same linguistic meaning and appearance, even if they are composed from different Unicode scalars behind the scenes.

The documentation then gives the following example of two strings that are "canonically equivalent":

For example, LATIN SMALL LETTER E WITH ACUTE (U+00E9) is canonically equivalent to LATIN SMALL LETTER E (U+0065) followed by COMBINING ACUTE ACCENT (U+0301). Both of these extended grapheme clusters are valid ways to represent the character é, and so they are considered to be canonically equivalent:
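A minimal sketch of that comparison, assuming only the Swift standard library (the variable names are mine and purely illustrative):

```swift
// é as a single precomposed scalar: LATIN SMALL LETTER E WITH ACUTE
let precomposed = "\u{E9}"
// é as two scalars: LATIN SMALL LETTER E followed by COMBINING ACUTE ACCENT
let decomposed = "\u{65}\u{301}"

// The underlying scalar sequences differ in length...
let precomposedScalarCount = precomposed.unicodeScalars.count
let decomposedScalarCount = decomposed.unicodeScalars.count

// ...but each string is still a single Character (one extended grapheme cluster).
let decomposedCharacterCount = decomposed.count

// String's == compares by canonical equivalence, not scalar by scalar.
let canonicallyEqual = (precomposed == decomposed)

print(precomposedScalarCount, decomposedScalarCount, decomposedCharacterCount, canonicallyEqual)
```

Both values here spell the accented é; neither is the plain, unaccented e.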

OK. Somehow e and é look the same and also have the same linguistic meaning. Sure, I'll give them that. I took a Spanish class once, and the professor wasn't too strict about which form of e we used, so I'm guessing this is what they are referring to. Fair enough.

The documentation goes further to show two strings that are not canonically equivalent:

Conversely, LATIN CAPITAL LETTER A (U+0041, or "A"), as used in English, is not equivalent to CYRILLIC CAPITAL LETTER A (U+0410, or "А"), as used in Russian. The characters are visually similar, but do not have the same linguistic meaning:
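A minimal sketch of that second comparison, again assuming only the Swift standard library (the variable names are mine and purely illustrative):

```swift
// LATIN CAPITAL LETTER A, as used in English
let latinCapitalA = "\u{41}"
// CYRILLIC CAPITAL LETTER A, as used in Russian
let cyrillicCapitalA = "\u{410}"

// Each string holds exactly one scalar, and the two scalar values differ.
let latinScalarValue = latinCapitalA.unicodeScalars.first!.value
let cyrillicScalarValue = cyrillicCapitalA.unicodeScalars.first!.value

// No canonical mapping relates these two scalars, so == reports them as different
// even though they are usually rendered with near-identical glyphs.
let lookalikesEqual = (latinCapitalA == cyrillicCapitalA)

print(latinScalarValue, cyrillicScalarValue, lookalikesEqual)
```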

Now here is where the alarm bells go off and I decide to ask this question. It seems that appearance has nothing to do with it because the two strings look exactly the same, and they also admit this in the documentation. So it seems that what the string class is really looking for is linguistic meaning?

This is why I ask what it means for strings to have the same or different linguistic meaning. The plain e is the only form of e I know that is mainly used in English, and I have only seen é used in languages like French or Spanish. So why is it that the fact that А is used in Russian and A is used in English causes the string class to say that they are not equivalent?

I hope I was able to walk you through my thought process. My question is: what does it mean for two strings to have the same linguistic meaning (in code, if possible)?
