dargaud

Reputation: 2581

How to find foreign language used in "C comments"

I have a large code base where most of the documentation and source code comments are in English. But one of the minor contributors wrote comments in a different language, scattered in various places.

Is there a simple trick that will let me find them? I imagine first extracting all comments from the code into a single text file (possibly with source file / line number info), then piping this through some language detection app.

If that matters, I'm on Linux and the current compiler on this project is Clang.

Upvotes: 0

Views: 72

Answers (2)

Jákup

Reputation: 64

The only thing that comes to mind is to go through all of the code manually and check it yourself. If it's a similar language that doesn't contain foreign letters, consider using an editor with a spellchecker. That way, the text that isn't recognized gets underlined and is easy to spot.
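If the other language does use letters outside plain ASCII (accents, umlauts and so on), a byte check may already narrow things down. This is only a rough sketch: `has_non_ascii` is a name I made up, and it assumes the files are UTF-8 or a similar 8-bit encoding. It will miss foreign text written without such letters and will also flag things like a stray © sign.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the text contains at least one byte
   outside 7-bit ASCII, 0 otherwise. Assumes an 8-bit encoding such as
   UTF-8 or Latin-1, where non-English letters use bytes >= 0x80. */
int has_non_ascii(const char *text)
{
    const unsigned char *p;

    for (p = (const unsigned char *)text; *p != '\0'; p++) {
        if (*p >= 0x80)
            return 1;
    }
    return 0;
}
```

You would call this on each source line (or each comment) and print the file name and line number for the ones that return 1.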

Other than that, I don't see an easy way to go through with this.

You could write a program that reads the files and prints only the comments to an output file, which you then spell check. That seems like a waste of time, though, as you could easily spot the comments yourself. If you do write such a program, keep in mind that there are three things to check for:

  1. If comment starts with /*, make sure it stops reading when encountering */
  2. If comment starts with //, only read one line - unless:
  3. If line starting with // ends with \, read next line as well
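The three rules above could be sketched roughly like this. It is a minimal sketch, not a full C lexer: `extract_comments` and its output format ("line: text", one comment per output line) are my own invention, and I added string and character literal handling so that a `//` inside a string isn't mistaken for a comment. Trigraphs and other corner cases are ignored.

```c
#include <stdio.h>
#include <stddef.h>

enum state { CODE, LINE_COMMENT, BLOCK_COMMENT, STRING, CHAR_LIT };

/* Append one character to out, always keeping it NUL-terminated. */
static void put(char *out, size_t cap, size_t *len, char c)
{
    if (*len + 1 < cap) {
        out[(*len)++] = c;
        out[*len] = '\0';
    }
}

/* Start a new output record with the line number the comment begins on. */
static void put_header(char *out, size_t cap, size_t *len, unsigned line)
{
    char hdr[32];
    size_t k;

    snprintf(hdr, sizeof hdr, "%u: ", line);
    for (k = 0; hdr[k] != '\0'; k++)
        put(out, cap, len, hdr[k]);
}

/* Hypothetical helper: copies every comment in src into out (capacity cap)
   and returns the number of comments found. */
size_t extract_comments(const char *src, char *out, size_t cap)
{
    enum state st = CODE;
    size_t len = 0, count = 0, i;
    unsigned line = 1;

    if (cap > 0)
        out[0] = '\0';

    for (i = 0; src[i] != '\0'; i++) {
        char c = src[i], next = src[i + 1];

        switch (st) {
        case CODE:
            if (c == '/' && (next == '*' || next == '/')) {
                st = (next == '*') ? BLOCK_COMMENT : LINE_COMMENT;
                count++;
                put_header(out, cap, &len, line);
                i++;                          /* skip the second delimiter char */
            } else if (c == '"') {
                st = STRING;
            } else if (c == '\'') {
                st = CHAR_LIT;
            }
            break;
        case BLOCK_COMMENT:                   /* rule 1: stop at the closing delimiter */
            if (c == '*' && next == '/') {
                st = CODE;
                put(out, cap, &len, '\n');
                i++;
            } else {
                put(out, cap, &len, c == '\n' ? ' ' : c);
            }
            break;
        case LINE_COMMENT:
            if (c == '\\' && next == '\n') {  /* rule 3: continuation line */
                put(out, cap, &len, ' ');
                i++;
                line++;
            } else if (c == '\n') {           /* rule 2: ends with the line */
                st = CODE;
                put(out, cap, &len, '\n');
            } else {
                put(out, cap, &len, c);
            }
            break;
        case STRING:
        case CHAR_LIT:
            if (c == '\\' && next != '\0') {  /* skip the escaped character */
                if (next == '\n')
                    line++;
                i++;
            } else if (c == (st == STRING ? '"' : '\'')) {
                st = CODE;
            }
            break;
        }
        if (c == '\n')
            line++;
    }
    if (st == LINE_COMMENT || st == BLOCK_COMMENT)
        put(out, cap, &len, '\n');            /* comment ran to end of input */
    return count;
}
```

Read each file into a buffer, call this on it, and write the result (with the file name in front) to the text file you then run through the spell checker.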

Upvotes: 1

DrKoch

Reputation: 9772

While it is possible to detect a language from a string automatically, you need way more words than fit in a usual comment to do so.

Solution: Use your own eyes and your own brain...

Upvotes: -1
