Reputation: 19445
I'm wondering how to set up a clever way to have all my input 'clean', a procedure to run at the beginning of every script. I thought of creating a class to do that, and then adding a two-letter prefix at the beginning of every input name to identify the kind of input, for example:
in-mynumber
tx-name
ph-phone
em-email
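So, at the top of my scripts I just run a function (for example):
function cleanInputs(array $tainted){
    // Example pattern only (an assumption): letters, numbers, spaces and a few symbols
    $regExp = '/^[a-zA-Z0-9 .,\'_-]+$/';
    $clean = array();
    foreach($tainted AS $taintedKey => $taintedValue){
        // Reject array values such as ?tx-name[]=x before they reach preg_match()
        if(!is_string($taintedValue)){
            $clean[$taintedKey] = false;
            continue;
        }
        $prefix = substr($taintedKey, 0, 2);
        switch($prefix){
            case 'in':
                // I assume this input is an integer
                $clean[$taintedKey] = intval($taintedValue);
                break;
            case 'tx':
                // I assume this input is normal text
                // that can contain only letters, numbers and a few symbols
                if(preg_match($regExp, $taintedValue)){
                    $clean[$taintedKey] = $taintedValue;
                }else{
                    $clean[$taintedKey] = false;
                }
                break;
            case 'em':
                // I assume this input is a valid email (the dot before the TLD is now escaped)
                if(preg_match('/^[a-zA-Z0-9_.-]+@[a-zA-Z0-9_.-]+\.[a-zA-Z]{2,4}$/', $taintedValue)){
                    $clean[$taintedKey] = $taintedValue;
                }else{
                    $clean[$taintedKey] = false;
                }
                break;
            // keys with an unknown prefix are dropped
        }
    }
    return $clean;
}
$cGet = cleanInputs($_GET);
$cPost = cleanInputs($_POST);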
So I'll create two more arrays, $cGet and $cPost, holding the clean data from $_GET and $_POST respectively, and in my scripts I'll use only those arrays and completely forget about $_GET/$_POST. I'm even thinking about adding a second prefix to determine the input's max length, for example: tx-25-name, but I'm not sure about that. And if I go this way, maybe an OOP approach would be better.
What do you think about that? Does it seem like a good approach?
The downsides I can see so far (I haven't used this approach yet, it's just an idea from this morning): 1. The prefixes, and therefore the procedures, will have to be numerous if I don't want my application to be too restrictive; 2. My variable names will get a little longer (but we are talking about 3-6 characters, which shouldn't be a problem).
Any suggestion is really appreciated!
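The second-prefix idea (tx-25-name) could be parsed roughly like this. It is only a sketch: parseInputKey() is a hypothetical helper name, and the key layout (two lowercase letters, an optional numeric max length, then the field name) is assumed from the examples above.

```php
<?php
// Hypothetical helper: splits a key such as "tx-25-name" into its parts.
// Assumed layout: two lowercase letters, optional "-<max length>", then "-<field name>".
function parseInputKey($key) {
    if (!preg_match('/^([a-z]{2})-(?:(\d+)-)?(.+)$/', $key, $m)) {
        return null; // the key does not follow the prefix convention
    }
    return array(
        'type'      => $m[1],
        // The length group is optional; PHP reports it as '' when it did not match
        'maxLength' => $m[2] !== '' ? (int) $m[2] : null,
        'name'      => $m[3],
    );
}

$withLength    = parseInputKey('tx-25-name');
$withoutLength = parseInputKey('in-mynumber');
```

With the parts separated, the type decides which check to run and maxLength, when present, can be compared against strlen() of the value.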
EDIT:
I'm not trying to reinvent the wheel. My post wasn't about the system for sanitizing input, but about the procedure for doing it. I use htmlpurifier to clean possible XSS injection in HTML data, and of course I use parameterized queries. I'm just wondering whether it is better to handle input one by one, or to sanitize it all at the beginning and consider it clean for the rest of the script. The method I thought of is not miraculous and nothing new under the sun, but I think that truncating the input if it is not in the format I expect can be useful...
Why check for SQL injection in the 'name' field, which must contain only letters and the apostrophe character? Just remove everything that is not a letter or an apostrophe, add slashes for the latter, and pass it to a parameterized query. Then, if you expect an email, just delete everything that is not an email.
Upvotes: 0
Views: 496
Reputation: 115691
What are you trying to do? If you need to sanitize input to save data to the database, there's nothing better than parameterized queries.
See this for an example.
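A minimal sketch of a parameterized query with PDO, in case it helps. The in-memory SQLite database, the users table and its columns are assumptions made up for the illustration (it also assumes the pdo_sqlite extension is available); swap in your own connection string and schema.

```php
<?php
// Assumption: an in-memory SQLite database stands in for the real one.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)'); // hypothetical table

// A deliberately hostile value, as it might arrive in $_POST
$name = "O'Brien'); DROP TABLE users; --";

// The value is bound to a placeholder, never concatenated into the SQL text
$insert = $pdo->prepare('INSERT INTO users (name) VALUES (:name)');
$insert->execute(array(':name' => $name));

$select = $pdo->prepare('SELECT name FROM users WHERE name = :name');
$select->execute(array(':name' => $name));
$stored = $select->fetchColumn();
```

Because the value travels separately from the statement, it is stored verbatim and no escaping or addslashes() step is needed beforehand.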
Upvotes: 0
Reputation: 106904
The idea is fine in itself, however I wonder if it really will be very useful.
For one thing, SQL injections and HTML injections can (and should) be handled in another way. SQL injections are prevented by parameterized queries (a must-have in this day and age), and HTML injections are prevented by the htmlspecialchars() function, which should be called right before outputting the string to the user. Don't store encoded strings in the DB or, even worse, encode them as soon as you receive them. Working with them will be hell later.
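To illustrate the "encode right before output" point, here is a small sketch. The e() wrapper is a hypothetical convenience name, and UTF-8 is assumed as the page encoding.

```php
<?php
// Hypothetical shorthand for escaping at output time (UTF-8 assumed).
function e($value) {
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// The raw string is what gets stored and passed around in the application...
$comment = '<script>alert("xss")</script>';

// ...and it is encoded only at the moment it is placed into HTML.
$html = '<p>' . e($comment) . '</p>';
```

The stored value stays untouched, so it can still be searched, truncated or written to a non-HTML format without decoding it first.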
Other than these two injection attacks, what will your method do? Well, it can run some regexps for things like numbers, phone numbers, emails, names and dates. But that's about it. Unfortunately, that's just a part of all the validation you will have to do. Other common cases that you cannot validate there are cross-checking of inputs (start date before end date) and checking that a value is in a list of allowed predefined values (say, for a <select> element). And there is an endless number of custom validation steps that you will have in your application as well. Is it worth splitting all validation into "generic type validation" and "custom rule validation"? I don't know. Perhaps. Or perhaps this will just make a bigger mess.
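The two cases mentioned above (cross-checking and allowed values) might look something like this. It is a rough sketch assuming PHP 7 or later; validateBooking(), the field names and the room list are all invented for the example.

```php
<?php
// Hypothetical validator covering rules a per-field prefix cannot express.
function validateBooking(array $input, array $allowedRooms) {
    $errors = array();

    // The leading "!" resets the time part, so only the dates are compared
    $start = DateTime::createFromFormat('!Y-m-d', $input['start'] ?? '');
    $end   = DateTime::createFromFormat('!Y-m-d', $input['end'] ?? '');

    if ($start === false || $end === false) {
        $errors[] = 'Dates must use the YYYY-MM-DD format.';
    } elseif ($start >= $end) {
        // Cross-check: depends on two inputs at once
        $errors[] = 'Start date must be before end date.';
    }

    // Allowed-values check, e.g. the options of a <select>; strict comparison
    if (!in_array($input['room'] ?? null, $allowedRooms, true)) {
        $errors[] = 'Unknown room.';
    }

    return $errors;
}

$rooms = array('single', 'double');
$bad  = validateBooking(array('start' => '2024-05-10', 'end' => '2024-05-01', 'room' => 'attic'), $rooms);
$good = validateBooking(array('start' => '2024-05-01', 'end' => '2024-05-10', 'room' => 'double'), $rooms);
```

Neither rule can be derived from a single key name, which is why this kind of check ends up living next to the business logic anyway.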
Upvotes: 0
Reputation: 42306
There are many well-made, tested PHP classes that already sanitize inputs. Why make another one? Besides, sanitizing input is more than just verifying data types. It also implies checking for SQL injections, XSS attacks, etc...
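One option that ships with PHP itself is the filter extension (not necessarily what is meant by "classes" here, just a built-in example). A small sketch with made-up field names and ranges:

```php
<?php
// Made-up input, as it might arrive in $_GET
$tainted = array('age' => '42', 'email' => 'not-an-email', 'name' => 'Ann');

// One rule per expected field; fields without a rule are left out of the result
$rules = array(
    'age'   => array(
        'filter'  => FILTER_VALIDATE_INT,
        'options' => array('min_range' => 0, 'max_range' => 150),
    ),
    'email' => FILTER_VALIDATE_EMAIL,
);

$clean = filter_var_array($tainted, $rules);
```

Here $clean['age'] comes back as an integer, $clean['email'] is false because validation failed, and 'name' is absent because no rule was declared for it. This covers type checks only; queries and output still need their own protection.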
Upvotes: 2