Reputation: 2035
I work on a multi-language software codebase (python, JS, java, PHP, C) where my previous colleagues commented everything. However, the vast majority of comments is completely unuseful. Ex :
/**
* Get warning file info
*/
function getWarningFileInfos() {
...
}
/**
* Compute the speed
*/
function computeSpeed() {
...
}
I want to setup linter rules to make sure such comments are not written again. Do you know linters that have such feature, or in which this feature can be added easily ? (the best would be linters compatible with non-english language comments)
Upvotes: 3
Views: 653
Reputation: 2519
There is no such linters on GitHub, So you have to write one by yourself.
The PHP_CodeSniffer is best known and powerful linter in the PHP world, So here is quick and dirty implementation based on PHP_CodeSniffer 3.5.2 for your purpose, It uses similar_text function to compare function name and comment string, If the percentage is more than 60%, It will mark this function as error with message "Contains unusual comment".
File src/Standards/StackOverflow/Sniffs/Commenting/UnusualCommentSniff.php
<?php
namespace PHP_CodeSniffer\Standards\StackOverflow\Sniffs\Commenting;
use PHP_CodeSniffer\Files\File;
use PHP_CodeSniffer\Sniffs\Sniff;
use PHP_CodeSniffer\Util\Tokens;
class UnusualCommentSniff implements Sniff
{
public function register()
{
return [T_FUNCTION];
}
public function process(File $phpcsFile, $stackPtr)
{
$tokens = $phpcsFile->getTokens();
$find = Tokens::$methodPrefixes;
$find[] = T_WHITESPACE;
$commentEnd = $phpcsFile->findPrevious($find, ($stackPtr - 1), null, true);
if ($tokens[$commentEnd]['code'] != T_DOC_COMMENT_CLOSE_TAG) {
return;
}
$commentStart = $tokens[$commentEnd]['comment_opener'];
$shortDesc = $this->findShortDescriptionInComment($phpcsFile, $commentStart, $commentEnd);
$funcName = $phpcsFile->getDeclarationName($stackPtr);
similar_text($funcName, $shortDesc, $percent);
if ($percent > 60) {
$error = 'Contains unusual comment';
$fix = $phpcsFile->addFixableError($error, $stackPtr, 'code');
if ($fix) {
$fixer = $phpcsFile->fixer;
for ($i = $commentStart; $i < $commentEnd + 1; $i++) {
$fixer->replaceToken($i, '');
}
}
}
}
protected function findShortDescriptionInComment($phpcsFile, $commentStart, $commentEnd)
{
$tokens = $phpcsFile->getTokens();
$empty = [
T_DOC_COMMENT_WHITESPACE,
T_DOC_COMMENT_STAR,
];
$short = $phpcsFile->findNext($empty, $commentStart + 1, $commentEnd, true);
if ($short === false) {
return;
}
$shortContent = null;
if ($tokens[$short]['code'] === T_DOC_COMMENT_STRING) {
$shortContent = $tokens[$short]['content'];
$shortEnd = $short;
for ($i = ($short + 1); $i < $commentEnd; $i++) {
if ($tokens[$i]['code'] === T_DOC_COMMENT_STRING) {
if ($tokens[$i]['line'] === ($tokens[$shortEnd]['line'] + 1)) {
$shortContent .= $tokens[$i]['content'];
$shortEnd = $i;
} else {
break;
}
}
}
}
return $shortContent;
}
}
Construct files and directories like this
$ pwd
/home/gasolwu/Code/PHP_CodeSniffer/src/Standards
$ git describe
3.5.2-89-g80ebd4a1a
$ tree -C StackOverflow/
StackOverflow/
├── ruleset.xml
└── Sniffs
└── Commenting
└── UnusualCommentSniff.php
2 directories, 2 files
$ cat StackOverflow/ruleset.xml
<?xml version="1.0"?>
<ruleset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" name="Zend" xsi:noNamespaceSchemaLocation="../../../phpcs.xsd">
</ruleset>
Prepare input file for test
$ cat input.php
<?php
class Foo
{
/**
* Get warning file info
*/
private function getWarningFileInfos() {
}
/**
* Compute the speed
*/
public function computeSpeed() {
}
/**
* This comment contains some useful information
* So that should not be deleted
*/
public function shoot() {
}
}
Run with command phpcs
$ bin/phpcs --standard=Stackoverflow ./input.php
FILE: /data1/home/admin/gasolwu/Code/PHP_CodeSniffer/input.php
----------------------------------------------------------------------
FOUND 2 ERRORS AFFECTING 2 LINES
----------------------------------------------------------------------
9 | ERROR | [x] Contains unusual comment
15 | ERROR | [x] Contains unusual comment
----------------------------------------------------------------------
PHPCBF CAN FIX THE 2 MARKED SNIFF VIOLATIONS AUTOMATICALLY
----------------------------------------------------------------------
Time: 44ms; Memory: 3.75MB
Fixe this kind of errors automatically by command phpcbf
$ bin/phpcbf --standard=Stackoverflow ./input.php
PHPCBF RESULT SUMMARY
-----------------------------------------------------------------------------
FILE FIXED REMAINING
-----------------------------------------------------------------------------
/data1/home/admin/gasolwu/Code/PHP_CodeSniffer/input.php 2 0
-----------------------------------------------------------------------------
A TOTAL OF 2 ERRORS WERE FIXED IN 1 FILE
-----------------------------------------------------------------------------
Time: 52ms; Memory: 3.75MB
$ cat input.php
<?php
class Foo
{
private function getWarningFileInfos() {
}
public function computeSpeed() {
}
/**
* This comment contains some useful information
* So that should not be deleted
*/
public function shoot() {
}
}
Upvotes: 3
Reputation: 3418
If you really need it, you can use something crazy like this:
$c = '/**
* Get warning file info
*/
function getWarningFileInfos() {
...
}
/**
* Compute the speed
*/
function computeSpeed() {
...
}';
preg_match_all('~\*/[ \t]*\r?\n(?:[ \t]|\w)*function[ \t]+(\S+)[ \t]*\(~isu', $c, $matchesAll);
foreach ($matchesAll[1] as $n) {
$words = preg_split('~(?=\p{Lu})~', $n);
$regex = '~/\*(?:\s|\*)*' . implode('', array_map(function($w) {
return '(?:(?:a|the)\s+)?' . preg_quote(preg_replace('~e?s$~is', '', $w), '~') . '(?:e?s)?' . '\s+';
}, $words)) . '(?:\s|\*)*\*+/[ \t]*(?=\r?\n(?:[ \t]|\w)*function[ \t]+' . preg_quote($n, '~') . '[ \t]*\()~isu';
var_dump($regex);
$c = preg_replace($regex, '', $c);
}
var_dump($c);
example output:
string(231) "~/\*(?:\s|\*)*(?:(?:a|the)\s+)?get(?:e?s)?\s+(?:(?:a|the)\s+)?Warning(?:e?s)?\s+(?:(?:a|the)\s+)?File(?:e?s)?\s+(?:(?:a|the)\s+)?Info(?:e?s)?\s+(?:\s|\*)*\*+/[ \t]*(?=\r?\n(?:[ \t]|\w)*function[ \t]+getWarningFileInfos[ \t]*\()~isu"
string(162) "~/\*(?:\s|\*)*(?:(?:a|the)\s+)?compute(?:e?s)?\s+(?:(?:a|the)\s+)?Speed(?:e?s)?\s+(?:\s|\*)*\*+/[ \t]*(?=\r?\n(?:[ \t]|\w)*function[ \t]+computeSpeed[ \t]*\()~isu"
string(88) "
function getWarningFileInfos() {
...
}
function computeSpeed() {
...
}"
There is probably no tools outside educating your programmers, does the code above helped you?
Upvotes: 1
Reputation: 8967
This is an issue that requires teaching to your colleagues what comments are for and what kind of comments should be written.
If you just block comments with the same name as the method automatically, you will end up with slight variations:
/**
* Get warning file info
*/
function getWarningFileInfos() {
...
}
becomes:
/**
* Get the warning file info
*/
function getWarningFileInfos() {
...
}
... and the linter rule will accept it. This isn't really an issue that will get resolved with a linting rule.
If you are in position to demand proper comments from your colleagues, asking them to rewrite the comments properly is a great exercise to teach them what comments should be written.
There is no linter that can transform a useless comment into an useful one.
If you just want to delete all the lousy comments, you can do so with regex:
/\*.{1,50}\*/
will find all comments shorter than 50 characters (the editor must support the regex setting ". matches newline").
Replace with nothing and manually go through the files to check you aren't deleting anything valuable. The assumption is that most of these stupid comments are very short. 50 chars is arbitrary, change it to whatever works best for you.
Upvotes: 5