chocolata

Reputation: 3338

What is wrong with this PHP captcha script?

I've been using this script for a long time and it works perfectly in 99% of cases. It's easy and clear to users and I would like to continue using it.

However, once in a while a user tells me that the system rejects their captcha (wrong code) even though the numbers they entered are correct. Each time I've gone over their cookie settings, cleared the cache etc., but in these cases nothing seems to work.

My question thus is, is there any reason in the code of this script that would explain malfunctioning in exceptional cases?

session_start();

$randomnr = rand(1000, 9999);
$_SESSION['randomnr2'] = md5($randomnr);

$im = imagecreatetruecolor(100, 28);
$white = imagecolorallocate($im, 255, 255, 255);
$grey = imagecolorallocate($im, 128, 128, 128);
$black = imagecolorallocate($im, 0,0,0);

imagefilledrectangle($im, 0, 0, 200, 35, $black);

$font = '/img/captcha/font.ttf';

imagettftext($im, 30, 0, 10, 40, $grey, $font, $randomnr);
imagettftext($im, 20, 3, 18, 25, $white, $font, $randomnr);

// Prevent caching
header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT");
header("Cache-Control: no-cache, must-revalidate"); // HTTP/1.1
header("Expires: Sat, 26 Jul 1997 05:00:00 GMT"); // Date in the past
header("Cache-Control: post-check=0, pre-check=0", false);
header("Pragma: no-cache");

header ("Content-type: image/gif");

imagegif($im);
imagedestroy($im);

In my form, I then call this script as the source of the captcha image. After sending the form, the captcha is checked this way:

if(md5($_POST['norobot']) != $_SESSION['randomnr2']) {
    echo 'Wrong captcha!';
}

Please note that session_start(); is called on the form page and the form result page.

If anyone could pinpoint potential error causes in this script, I would appreciate it!

P.S.: I am aware of the drawbacks of captcha scripts. I am aware that certain bots could still read them out. I do not wish to use Recaptcha, because it is too difficult for my users (different language, and often older users). I am also aware that an md5 hash of a four-digit number is trivial to reverse.


EDIT


Following the remarks of Ugo Méda, I've been doing some experiments. This is what I've created (simplified for your convenience):

The form

// Insert a random number of four digits into database, along with current time
$query   = 'INSERT INTO captcha (number, created_date, posted) VALUES ("'.rand(1000, 9999).'", NOW(),0)';
$result  = mysql_query($query);

// Retrieve the id of the inserted number
$captcha_uid = mysql_insert_id();

$output .= '<label for="norobot"> Enter spam protection code';
// Send id to captcha script
$output .= '<img src="/img/captcha/captcha.php?number='.$captcha_uid.'" />'; 
// Hidden field with id 
$output .= '<input type="hidden" name="captcha_uid" value="'.$captcha_uid.'" />'; 
$output .= '<input type="text" name="norobot" class="norobot" id="norobot" maxlength="4" required  />';
$output .= '</label>';

echo $output;

The captcha script

$font = '/img/captcha/font.ttf';

connect();
// Find the number associated to the captcha id
$query = 'SELECT number FROM captcha WHERE uid = "'.mysql_real_escape_string($_GET['number']).'" LIMIT 1';
$result = mysql_query($query) or trigger_error(__FUNCTION__.'<hr />'.mysql_error().'<hr />'.$query);
if (mysql_num_rows($result) != 0){          
    while($row = mysql_fetch_assoc($result)){
        $number = $row['number'];
    }
} 
disconnect();

$im     = imagecreatetruecolor(100, 28);
$white  = imagecolorallocate($im, 255, 255, 255);
$grey   = imagecolorallocate($im, 128, 128, 128);
$black  = imagecolorallocate($im, 0,0,0);

imagefilledrectangle($im, 0, 0, 200, 35, $black);
imagettftext($im, 30, 0, 10, 40, $grey, $font, $number);
imagettftext($im, 20, 3, 18, 25, $white, $font, $number);

// Generate the image from the number retrieved out of database
header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT");
header("Cache-Control: no-cache, must-revalidate"); // HTTP/1.1
header("Expires: Sat, 26 Jul 1997 05:00:00 GMT"); // Date in the past
header("Cache-Control: post-check=0, pre-check=0", false);
header("Pragma: no-cache");
header ("Content-type: image/gif");

imagegif($im);
imagedestroy($im);

The result of the form

function get_captcha_number($captcha_uid) {
    $query = 'SELECT number FROM captcha WHERE uid = "'.mysql_real_escape_string($captcha_uid).'" LIMIT 1';
    $result = mysql_query($query);
    if (mysql_num_rows($result) != 0){          
        while($row = mysql_fetch_assoc($result)){
            return $row['number'];
        }
    } 
    // Here I would later also enter the DELETE QUERY mentioned above...
}
if($_POST['norobot'] != get_captcha_number($_POST['captcha_uid'])) {
    echo 'Captcha error';
    exit;
}

This works very well, so thanks very much for this solution.

However, I'm seeing some potential drawbacks here. I count at least 4 queries, which feels somewhat resource intensive for what we're doing. Also, if a user reloaded the same page several times (just to be an asshole), the database would quickly fill up. Of course, this would all be deleted upon the next form submit, but nonetheless, could you go over this possible alternative with me?

I'm aware that one should generally not encrypt / decrypt. However, since captchas are flawed by nature (because of image readouts of bots), couldn't we simplify the process by encrypting and decrypting a parameter that is being sent to the captcha.php script?

What if we did this (following the encrypt/decrypt instructions of Alix Axel):

1) Encrypt a random four-digit number like so:

$key = 'encryption-password-only-present-within-the-application';
$string = rand(1000,9999);
$encrypted = base64_encode(mcrypt_encrypt(MCRYPT_RIJNDAEL_256, md5($key), $string, MCRYPT_MODE_CBC, md5(md5($key))));

2) Send the encrypted number with a parameter to the image script and store it in a hidden field

<img src="/img/captcha.php?number='.urlencode($encrypted).'" />
<input type="hidden" name="encrypted_number" value="'.$encrypted.'" />

3) Decrypt the number (that was sent via $_GET) inside the captcha script and generate an image from it

$decrypted = rtrim(mcrypt_decrypt(MCRYPT_RIJNDAEL_256, md5($key), base64_decode($encrypted), MCRYPT_MODE_CBC, md5(md5($key))), "\0"); 

4) Decrypt the number on form submit again to compare to user input

$decrypted = rtrim(mcrypt_decrypt(MCRYPT_RIJNDAEL_256, md5($key), base64_decode($encrypted), MCRYPT_MODE_CBC, md5(md5($key))), "\0");
if($_POST['norobot'] != $decrypted) { echo 'Captcha error!'; exit; }

Agreed, this is a little bit "security-through-obscurity", but it seems to provide some basic security and remains fairly simple. Or would this encrypt/decrypt action be too resource intensive on its own?
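
For readers on current PHP versions: the mcrypt extension was removed from core in PHP 7.2, so the round trip above no longer runs as written. Below is a minimal sketch of the same idea using the openssl extension instead. The function names, the hex token layout, and the added HMAC check are assumptions made for illustration, not part of the original instructions.

```php
<?php
// Hypothetical helpers: names and token layout are illustrative only.

// Derive separate 32-byte keys for encryption and authentication from one secret.
function captcha_keys(string $secret): array
{
    return [
        hash_hmac('sha256', 'enc', $secret, true),
        hash_hmac('sha256', 'mac', $secret, true),
    ];
}

// Encrypt the number and return a hex token that is safe in URLs and hidden fields.
function captcha_encrypt(string $number, string $secret): string
{
    [$encKey, $macKey] = captcha_keys($secret);
    $iv     = random_bytes(16); // fresh IV per token
    $cipher = openssl_encrypt($number, 'aes-256-cbc', $encKey, OPENSSL_RAW_DATA, $iv);
    $mac    = hash_hmac('sha256', $iv . $cipher, $macKey, true);

    return bin2hex($mac . $iv . $cipher);
}

// Return the number, or null if the token is malformed or was tampered with.
function captcha_decrypt(string $token, string $secret): ?string
{
    if ($token === '' || strlen($token) % 2 !== 0 || !ctype_xdigit($token)) {
        return null;
    }
    $raw = hex2bin($token);
    if (strlen($raw) < 32 + 16 + 16) { // MAC + IV + at least one cipher block
        return null;
    }
    [$encKey, $macKey] = captcha_keys($secret);
    $mac    = substr($raw, 0, 32);
    $iv     = substr($raw, 32, 16);
    $cipher = substr($raw, 48);

    if (!hash_equals(hash_hmac('sha256', $iv . $cipher, $macKey, true), $mac)) {
        return null;
    }
    $plain = openssl_decrypt($cipher, 'aes-256-cbc', $encKey, OPENSSL_RAW_DATA, $iv);

    return $plain === false ? null : $plain;
}
```

On the form page this would be used as `$encrypted = captcha_encrypt((string) random_int(1000, 9999), $key);`, and in both captcha.php and the form handler as `captcha_decrypt($encrypted, $key)`. Note that nothing in this sketch stops the same token from being submitted more than once; the database approach handles that by deleting the row.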

Does anyone have any remarks on this?

Upvotes: 1

Views: 1965

Answers (2)

Ugo Méda

Reputation: 1215

Don't rely only on the SESSION value, for two reasons:

  • Your session can expire, so it won't work in some cases
  • If the user opens another tab with the same page, the new captcha overwrites the number stored in the session, so the code shown in the first tab no longer matches

Use some sort of token:

  • Generate a random ID when you output your form, put it in your database with the number expected (and the current date/time)
  • Generate your image using this ID
  • Add a hidden input in your form with the ID
  • When you receive your POST, fetch the expected value from the database and compare it
  • Delete this token and all the old tokens (WHERE token = %token OR datetime < DATE_SUB(NOW(), INTERVAL 1 HOUR) for instance)
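
The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the `captcha` table layout, and the use of PDO with an integer `created_at` timestamp (instead of MySQL's `NOW()`) are all hypothetical choices, and an in-memory SQLite database is used so the example is self-contained.

```php
<?php
// Hypothetical helpers: table, column and function names are assumptions.

// Form page: create a token and the number it stands for.
function captcha_create(PDO $db): array
{
    $token  = bin2hex(random_bytes(16));
    $number = (string) random_int(1000, 9999);
    $stmt   = $db->prepare('INSERT INTO captcha (token, number, created_at) VALUES (?, ?, ?)');
    $stmt->execute([$token, $number, time()]);

    return [$token, $number];
}

// Image script: look up the number to draw for a given token.
function captcha_number(PDO $db, string $token): ?string
{
    $stmt = $db->prepare('SELECT number FROM captcha WHERE token = ?');
    $stmt->execute([$token]);
    $number = $stmt->fetchColumn();

    return $number === false ? null : (string) $number;
}

// Form handler: compare, then delete this token and every expired one.
function captcha_verify(PDO $db, string $token, string $input, int $maxAge = 3600): bool
{
    $cutoff = time() - $maxAge;

    $stmt = $db->prepare('SELECT number FROM captcha WHERE token = ? AND created_at >= ?');
    $stmt->execute([$token, $cutoff]);
    $expected = $stmt->fetchColumn();

    $del = $db->prepare('DELETE FROM captcha WHERE token = ? OR created_at < ?');
    $del->execute([$token, $cutoff]);

    return $expected !== false && hash_equals((string) $expected, $input);
}

// Example setup with an in-memory SQLite database (assumption for the demo).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE captcha (token TEXT PRIMARY KEY, number TEXT NOT NULL, created_at INTEGER NOT NULL)');
```

In this sketch a token is consumed by the first verification attempt, right or wrong, so a failed submission needs a fresh token and image. Because each form carries its own token, two tabs no longer interfere with each other.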

Upvotes: 3

bisko

Reputation: 4078

Some visitors may be behind a proxy, or have a plugin or other software on their computer that requests some files twice. I discovered this while developing a project of mine: the cause was a Chrome plugin I had completely forgotten about.

As it is happening to so few of your visitors, it is possible that this is the case. Here are the steps I followed to debug the problem (keep in mind that this was a development environment and I was able to modify the code directly on the site):

When a visitor reports the problem, I enable 'debugging' for them by adding their IP to a debug array in the config of the captcha generator. With debugging on, the steps are:

  1. Acquire the generation time of the image in microtime format.
  2. Write in a log file somewhere on the filesystem every request to the captcha page in a format similar to: ip|microtime|random_numbers
  3. Check the logs for the requests made by the user's IP address and see if there are any close requests in the ranges of about 10 seconds of each other. If there are, then there is something that is making a second request to your captcha page and it is generating a new code, which the visitor cannot see.
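
A rough sketch of steps 2 and 3, assuming the `ip|microtime|random_numbers` line format described above. The function names and the file handling are hypothetical and only meant to show the shape of the check.

```php
<?php
// Hypothetical helpers for the debugging steps; names are assumptions.

// Append one line per captcha request: ip|microtime|random_numbers
function captcha_log_request(string $logFile, string $ip, string $number, ?float $now = null): void
{
    $now = $now ?? microtime(true);
    file_put_contents($logFile, sprintf("%s|%.4F|%s\n", $ip, $now, $number), FILE_APPEND | LOCK_EX);
}

// Return pairs of consecutive requests from one IP that are at most $window seconds apart.
function captcha_close_requests(string $logFile, string $ip, float $window = 10.0): array
{
    $lines   = is_file($logFile) ? file($logFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) : [];
    $entries = [];
    foreach ($lines ?: [] as $line) {
        $parts = explode('|', $line);
        if (count($parts) === 3 && $parts[0] === $ip) {
            $entries[] = ['time' => (float) $parts[1], 'number' => $parts[2]];
        }
    }
    usort($entries, function (array $a, array $b): int {
        return $a['time'] <=> $b['time'];
    });

    $pairs = [];
    for ($i = 1; $i < count($entries); $i++) {
        if ($entries[$i]['time'] - $entries[$i - 1]['time'] <= $window) {
            $pairs[] = [$entries[$i - 1], $entries[$i]];
        }
    }

    return $pairs;
}
```

In the captcha script this would be called as `captcha_log_request($logFile, $_SERVER['REMOTE_ADDR'], (string) $randomnr);` for the IPs in the debug array. A non-empty result from `captcha_close_requests()` with two different numbers suggests a second request replaced the code the visitor was looking at.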

Also you need to make sure that after clearing the user's cache, the user is seeing different numbers at every refresh of the page. There can be a quirky behavior on the browser's end and it can be showing an old cached copy nevertheless (seen it on Firefox, you have to clear the cache, restart the browser, clear the cache again and then it works fine).

If this is the case you can do a simple time based addition to your script that does the following:

When generating a new captcha image, check whether captcha numbers are already set in the session. If they are, check when they were generated: if that was less than, say, 10 seconds ago, show the same numbers; otherwise generate new ones. The only caveat of this method is that you must unset the captcha variable in the session every time you use it.

An example code would be:

<?php

// begin generating captcha:

session_start();

if (
   empty($_SESSION['randomnr2']) // there is no captcha set
   || empty($_SESSION['randomnr2_time'])  // there is no time set
   || ( time() - $_SESSION['randomnr2_time']  > 10 ) // time is more than 10 secs
) {
   $randomnr = rand(1000, 9999);
   $_SESSION['randomnr2'] = md5($randomnr);
   $_SESSION['randomnr2_time'] = microtime(true); // this is the time it was 
                                                  // generated. You can use it 
                                                  // to write in the log file
}


// ...
?>
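
The caveat about unsetting the session value could look roughly like this on the form result page. The helper name is hypothetical, and taking the session array by reference is only a choice to make the function easy to try out in isolation.

```php
<?php
// Hypothetical helper: checks the submitted code against the stored md5
// and clears the stored values so the same captcha cannot be reused.
function captcha_check(array &$session, string $input): bool
{
    $expected = $session['randomnr2'] ?? null;

    // Always unset, whether the answer was right or wrong.
    unset($session['randomnr2'], $session['randomnr2_time']);

    return is_string($expected) && hash_equals($expected, md5($input));
}
```

After `session_start()`, the form handler would call it as `captcha_check($_SESSION, $_POST['norobot'] ?? '')` in place of the direct md5 comparison.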

Upvotes: 1
