Zac Brown

Reputation: 6093

Better way to write this? Increase Speed?

I wrote the following PHP script to work with an HTTP proxy for content filtering. The proxy POSTs the URL of the site the user is trying to visit to this script. The script checks the site for keywords that should be blocked, then responds to the proxy. Navigating between pages takes far too long with this in place: currently about 3 minutes per page.

Here is that code:

<?php

$location = $_POST['Location'];
$user = $_POST['User'];
if($location == "") {
  die("Invalid Request! Missing Parameter 1!");
}

if($user == "") {
  die("Invalid Request! Missing Parameter 2!");
}
$con = mysql_connect("MySQL Host", "USER", "PASS") or die(mysql_error());
mysql_select_db("DBName", $con) or die(mysql_error());
$query = "SELECT `Policy` FROM Subscribe WHERE `Username`='$user'";
$result = mysql_query($query) or die(mysql_error());
if(mysql_num_rows($result) == "1") {
  $nothing = "nothing";
} else {
  die("Invalid User!");
}
while($row = mysql_fetch_assoc($result)) {
  $policy = $row['Policy'];
}
if($policy == "0") {
  echo "allow";
  exit;
}
if($policy == "4") {
  $query1 = "SELECT `Address`, `Keyword` FROM Policy WHERE `Owner`='$user'";
  $result2 = mysql_query($query1) or die(mysql_error());
  while($row = mysql_fetch_assoc($result2)) {
    $address = explode(',', $row['Address']);
    $keyword = explode(',', $row['Keyword']);
  }
} else {
  $query2 = "SELECT `Address`, `Keyword` FROM Policies WHERE `Policy`='p".$policy."'";
  $result2 = mysql_query($query2) or die(mysql_error());
  while($row = mysql_fetch_assoc($result2)) {
    $address = explode(',', $row['Address']);
    $keyword = explode(',', $row['Keyword']);
  }
}

if(in_array($location, $address)) {
  echo "deny";
  exit;
} else {
  $meta = get_meta_tags($location);
  $keywords = $meta['keywords'];
  $keywords = preg_replace('/\s+/', ' ', $keywords); 
  $keywords = str_replace(' ', '', $keywords);
  $keywords = explode(',', $keywords);
  while (list($key, $val) = each($keywords)) {
    if(in_array($val, $keyword)) {
      echo "deny";
      exit;
    }
  }
  $urlk = explode('.', $location);
  while (list($key, $val) = each($urlk)) {
    if(in_array($val, $keyword)) {
      echo "deny";
      exit;
    }
  }
}
echo "allow";
?>

Upvotes: 0

Views: 214

Answers (4)

Shoe

Reputation: 76280

You don't need separate $query1/$query2 and $result/$result2 variables. Reuse the same names and overwrite them; that causes no problems at all. MySQL result variables are also very heavy, so there is no reason to keep extra ones around.

To check whether a variable is empty, PHP has a native function that also checks whether the variable is NULL, '', or not set at all: empty($var). I'd use it for the first part of your code instead of $var == '', which isn't very elegant either.
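
As a rough sketch of that check, the two parameter tests could be folded into one loop. The helper name first_missing_param is made up for illustration and is not part of the original script. One thing to keep in mind: empty() also treats the string "0" as empty, so this only fits parameters where "0" is never a valid value.

```php
<?php
// Hypothetical helper (not in the original script): returns the name of the
// first required parameter that is missing or empty, or null if all are present.
function first_missing_param(array $input, array $required): ?string
{
    foreach ($required as $name) {
        // empty() is true for unset keys, null, '' and also the string "0".
        if (empty($input[$name])) {
            return $name;
        }
    }
    return null;
}

// Sample input standing in for $_POST.
$sample = ['Location' => 'http://example.com', 'User' => ''];
echo first_missing_param($sample, ['Location', 'User']), "\n"; // prints "User"
```

In the real script you would pass $_POST instead of $sample and exit with an error message when the result is not null.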

Also, mysql_num_rows() returns an integer, and you are comparing it with the string "1". I'd change that to: mysql_num_rows($result) == 1.

There is also the repeated pair

  echo "deny";
  exit;

which can be replaced with a single exit('deny');

I still doubt that a page takes 3 minutes to load. Maybe 3 seconds?

Upvotes: 1

bcosca

Reputation: 17555

The 3 minutes per page figure is highly doubtful, but the else branch of

if (in_array($location, $address))

is a bottleneck because of the I/O involved (get_meta_tags() has to fetch the remote page) and the keyword matching.

See if this helps (without caching):

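else {
    // Fetch the page's meta tags and split its keyword list (spaces stripped).
    $meta=get_meta_tags($location);
    $keywords=explode(',',str_replace(' ','',$meta['keywords']));
    // Split the URL on dots so each part can be matched as well.
    $urlk=explode('.',$location);
    // Deny if any page keyword or URL part appears in the blocked list.
    if (array_intersect($keywords,$keyword) || array_intersect($urlk,$keyword))
        exit('deny');
}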
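
To try the matching logic without hitting the network or the database, here is a minimal self-contained sketch of the same array_intersect() idea. The function name matches_blocked_keyword and the sample data are assumptions made up for illustration; in the real script the keyword string would come from get_meta_tags() and the blocked list from the policy table.

```php
<?php
// Hypothetical helper, not part of the original script.
// $metaKeywords: raw content of the page's keywords meta tag.
// $blocked: list of blocked keywords for the user's policy.
function matches_blocked_keyword(string $location, string $metaKeywords, array $blocked): bool
{
    // Strip all whitespace, then split the comma-separated keyword list.
    $pageKeywords = explode(',', preg_replace('/\s+/', '', $metaKeywords));
    // Split the URL on dots, as the original script does.
    $urlParts = explode('.', $location);

    // array_intersect() returns the values present in both arrays;
    // a non-empty result means at least one blocked keyword matched.
    return (bool) array_intersect($pageKeywords, $blocked)
        || (bool) array_intersect($urlParts, $blocked);
}

var_dump(matches_blocked_keyword('http://www.example.com', 'news, poker', ['poker', 'casino'])); // bool(true)
```

Note that splitting the whole URL on dots leaves pieces like "http://www", which will never match a keyword. Splitting only the host name would probably be cleaner, but that goes beyond what the original code does.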

Upvotes: 0

Josh

Reputation: 2196

Have you tried using GET instead of POST? Technically they should be the same speed, but the proxy might be doing something odd with POST to prevent multiple requests.

Here's a quick example of how to use GET instead with urllib: http://docs.python.org/library/urllib.html#examples

Exactly how long is "way too long"? You could try timing it compared to accessing the site without the proxy.

Also, you might want to do some other profiling to see where the bottleneck resides. Is it your Python script, your connection to the internet, the PHP script, or the PHP host? Is the PHP site on a shared host? It might be snappier if you had a dedicated server or a VPS.
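
On the PHP side, a crude way to find the slow step is to wrap each one in a timer built on microtime(true). This is only a sketch: the helper name timed is made up, and the usleep() call stands in for a real step such as the database query or get_meta_tags($location).

```php
<?php
// Hypothetical helper: runs $step and returns [result, elapsed seconds].
function timed(callable $step): array
{
    $start = microtime(true);
    $result = $step();
    return [$result, microtime(true) - $start];
}

// Stand-in for a slow call such as get_meta_tags($location).
list($value, $seconds) = timed(function () {
    usleep(20000); // simulate roughly 20 ms of work
    return 'done';
});
printf("step took %.3f s\n", $seconds);
```

Logging one such line per step (database connect, each query, the remote fetch) should show fairly quickly which one dominates.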

Another thought, you could try adding some caching on the PHP side. If the same user keeps hitting the same site(s) over and over, there's no sense in querying the database each time.
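
A minimal sketch of what such caching could look like, using plain files in the system temp directory. Everything here is an assumption for illustration: the function name cached_verdict, the cache key, the file location, and the 300 second lifetime. A shared cache such as memcached would be a more typical choice if one is available on the host.

```php
<?php
// Hypothetical file-based cache; directory, key format and TTL are assumptions.
function cached_verdict(string $user, string $location, callable $compute, int $ttl = 300): string
{
    $file = sys_get_temp_dir() . '/verdict_' . md5($user . '|' . $location);

    // Make sure PHP does not answer from stale stat information.
    clearstatcache(true, $file);

    // Reuse a stored verdict while it is younger than $ttl seconds.
    if (is_file($file) && time() - filemtime($file) < $ttl) {
        $cached = file_get_contents($file);
        if ($cached !== false) {
            return $cached;
        }
    }

    // Otherwise run the expensive check and store its result.
    $verdict = $compute();
    file_put_contents($file, $verdict, LOCK_EX);
    return $verdict;
}

// The closure stands in for the database lookups and keyword checks.
echo cached_verdict('demo-user', 'http://example.com', function () {
    return 'allow';
}), "\n";
```

The trade-off is that a policy change only takes effect after cached entries expire, so the lifetime should be kept short.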

Upvotes: 1

cababunga

Reputation: 3114

The way you've pasted it, it looks like the call to ProxyRequest.process(self) is not done inside your process method.

Upvotes: 0
