Lyubomir Vasilev

Reputation: 3030

Can the keywords (and standard library) of C++ be localised? (Modifiable parser syntax)

This question is conceived out of curiosity and is mostly a mental exercise:

According to the C++ standard (and as described in this question and its answers), a compiler should support Unicode (and, more precisely, UTF-8 in source) in the names of identifiers (variables, functions, etc.). I know that Clang supports that fully (that is, you can use UTF-8 encoded source files) and GCC supports it only if you use \u codes in the identifiers, but let's assume this works directly with UTF-8 on all compilers.

That is great! Now I no longer have to write my code in English and can finally do it in my native Bulgarian, or maybe Esperanto. That's the point of this requirement of the standard, after all. Take that as a joke, of course, but I'm just curious to understand to what extent this could be taken. To illustrate, let's take this (purely example) code:

First using identifiers in English (ASCII):

int i = 0;
while(i < 100)
{
    auto f = static_cast<float>(i);
    std::string currentName = "name_" + toString(f);
    std::cout << getPrettyName(currentName) << ": " << getSalary(currentName) << std::endl;
}

And then using identifiers in Bulgarian (to see what can be achieved):

int и = 0;
while(и < 100)
{
    auto д = static_cast<float>(и);
    std::string текущоИме = "име_" + превърниВНиз(д);
    std::cout << красивоИме(текущоИме) << ": " << заплата(текущоИме) << std::endl;
}

As you can see, the second code is still mainly in English because of keywords and the standard library. If we assume the goal of this feature is to make it easy for non-English speakers to code, then:

  1. Most of it is still in English, so people still have to know English to be proper programmers
  2. Even if we accept that, this is still very annoying to write when using a script other than Latin (like Cyrillic, but also others). It requires constant switching of keyboard layouts, which is slow.

It looks like for this feature to be useful, the C++ language should support localised keywords and standard library classes. That has been done for ALGOL 68 and possibly others. Let's see how that would look in Bulgarian (with my own translations of the keywords):

цяло и = 0;
докато(и < 100)
{
    авт д = статично_преобразуване<дробно>(и);
    стд::низ текущоИме = "име_" + превърниВНиз(д);
    стд::изх << красивоИме(текущоИме) << ": " << заплата(текущоИме) << стд::кред;
}

Still keeping in mind that this is just a mental exercise, it would be interesting to see what people's take is on these questions:

  1. Is this actually allowed/possible according to the standard right now? I may be missing something…
  2. Is there any way to make a workaround in a decent way myself? Macros would work for the keywords but that would be awful. using would work for standard library classes (namespace стд { using низ = std::string; }) but there is no way to deal with methods (std::string::size() -> размер()?) apart from subclassing… or is there?
  3. In case that is not possible or even considered, how should one go about suggesting this idea to the C++ gurus that make the standard?
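To make question 2 concrete, here is a minimal sketch of the alias idea, assuming a compiler that accepts UTF-8 identifiers directly (Clang, or a recent GCC). The namespace стд and the names низ and размер are my own illustrative choices, not anything the standard provides. A type can be aliased with using, but a member function cannot, so the sketch falls back to a free function that forwards to size():

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical localized namespace; all names here are illustrative only.
namespace стд {
    // Type alias: низ is simply another name for std::string.
    using низ = std::string;

    // A member function cannot be aliased, so wrap it in a free function.
    inline std::size_t размер(const низ& н) { return н.size(); }
}

// Small helper that uses only the localized names.
inline std::size_t дължинаНаИме() {
    стд::низ име = "name_42";
    return стд::размер(име);
}
```

The call site then reads стд::размер(име) rather than име.размер(), which is exactly the limitation the question is asking about.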

Upvotes: 1

Views: 181

Answers (2)

Jan Hudec

Reputation: 76316

That is great! Now I no longer have to write my code in English and can finally do it in my native Bulgarian, or maybe Esperanto. That's the point of this requirement of the standard, after all.

I am pretty sure it isn't. The point of the standard seems to be purely compatibility with other programming systems that may generate such symbols. After all, the specification does not require accepting actual UTF-8 anywhere. The only thing it requires is the \u escapes that GCC supports.
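As a rough illustration of what the \u form means, the sketch below declares a variable with the escape and reads it back with the literal character. It assumes a compiler that accepts both spellings in the same file (Clang, or a recent GCC); the function name прочети is made up for the example:

```cpp
#include <cassert>

// \u0438 is the universal character name for the Cyrillic letter и (U+0438).
inline int прочети() {
    int \u0438 = 7;  // declared using the escape
    return и;        // read back using the literal character
}
```

Both spellings name the same identifier, so on a compiler that only takes the escape form you would have to write every identifier the first way.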

  1. Is this actually allowed/possible according to the standard right now? I may be missing something…

No, it isn't. The specification specifies the exact symbol names.

  2. Is there any way to make a workaround in a decent way myself? Macros would work for the keywords but that would be awful. using would work for standard library classes (namespace стд { using низ = std::string; }) but there is no way to deal with methods (std::string::size() -> размер()) apart from subclassing… or is there?

You could cover them with #defines, but obviously a macro would apply to the same name everywhere it appears, regardless of scope, which is rarely appropriate.
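A small sketch of that problem, assuming a compiler that accepts UTF-8 identifiers. The macro размер and the struct Кутия are hypothetical names chosen for illustration. The macro is meant to translate std::string::size(), yet it also silently renames a member of an unrelated struct:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical macro meant to translate std::string::size().
#define размер size

// Unrelated struct: the author wrote "размер", but the preprocessor
// rewrites the token, so the member is really named "size".
struct Кутия {
    int размер;
};

inline int размерНаКутия() {
    Кутия к{3};
    return к.size;  // compiles only because the macro renamed the member
}

inline std::size_t дължина(const std::string& н) {
    return н.размер();  // the intended use: expands to н.size()
}
```

Here the accident is harmless, but the same rewrite happens to any local variable, parameter or third-party header that uses the name.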

  3. In case that is not possible or even considered, how should one go about suggesting this idea to the C++ gurus that make the standard?

Forget it. It is an extremely bad, borderline evil, idea. Remember that most code out there is, or one day will be, maintained, or at least reviewed, by somebody on the other side of the world who has a different native language. English makes that possible. Switching away from it would be very, very bad. At the very least it would be bad for the big software companies, and keep in mind that the key people on the C++ standards committee do represent big software companies.

Upvotes: 5

No, keywords are fixed in the C++ standard (C++11, C++14, etc.). You cannot change them (or else the language won't be C++ anymore).

You might use preprocessor tricks like:

#define стд std

(or, as you commented, an alias; for a namespace the valid spelling is namespace стд = std; rather than using стд = std;. Proper keywords like while can be "replaced" only with the preprocessor.) But I am not sure that works, and I really believe it is a very bad idea.
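For what it is worth, here is a minimal sketch of how that would look, assuming a compiler that accepts UTF-8 identifiers and macro names. The translations цяло and докато are taken from the question; the function звезди is a made-up example. Note that return still has to be written in English unless it gets a macro of its own:

```cpp
#include <cassert>
#include <string>

// Hypothetical keyword translations done with the preprocessor.
#define цяло int
#define докато while

// Namespace alias, which is valid C++ and needs no macro.
namespace стд = std;

// Builds a string of `брой` asterisks using the translated keywords.
inline стд::string звезди(цяло брой) {
    стд::string резултат;
    цяло и = 0;
    докато (и < брой) {
        резултат += '*';
        ++и;
    }
    return резултат;
}
```

It compiles under those assumptions, but every reader now needs the macro table to understand the code.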

A C++ programmer expects the names mentioned in the standard. Don't confuse them.

And programming is not about coding in a near-natural language (that was the ambition of COBOL, which failed completely in that respect). The point is that programming is difficult and takes ten years to learn, so you can expect programmers to be able to use English-looking keywords and read technical documentation in English.

Upvotes: 8
