fbas
fbas

Reputation: 1674

php, mysql, my memory leaking

I didn't expect this script (throw-away) to be leaking and I haven't figured out what the culprit is. Can you spot anything? Although this is throw-away code, I'm concerned that I'll repeat this in the future. I've never had to manage memory in PHP, but with the number of rows in the db, it's blowing up my php instance (already upped the memory to 1Gb).

The california table is especially larger than the others (currently 2.2m rows, less as I delete duplicate rows). I get a memory error on line 31 ($row = mysql_fetch_assoc($res))

Fatal error: Allowed memory size of 1073741824 bytes exhausted (tried to allocat e 24 bytes) in C:\Documents and Settings\R\My Documents\My Webpages\cdiac\cdiac_ dup.php on line 31

PHP 5.3.0, mysql 5.1.36. part of a wamp install.

here's the entire code. the purpose of this script is to delete duplicate entries (data was acquired into segmented tables, which was far faster at the time, but now I have to merge those tables.)

what's causing it? something I'm overlooking? or do I just need to watch the memory size and call garbage collection manually when it gets big?

<?php

define('DBSERVER', 'localhost');
define('DBNAME', '---');
define('DBUSERNAME', '---');
define('DBPASSWORD', '---');

$dblink = mysql_connect(DBSERVER, DBUSERNAME, DBPASSWORD);
mysql_select_db(DBNAME, $dblink);


$state = "AL";
//if (isset($_GET['state'])) $state=mysql_real_escape_string($_GET['state']); 
if (isset($argv[1]) ) $state = $argv[1];

echo "Scanning $state\n\n";


// interate through listing of a state to check for duplicate entries (same station_id, year, month, day)
$DBTABLE = "cdiac_data_". $state;
$query = "select * from $DBTABLE ";
$query .= " order by station_id, year, month, day ";

$res = mysql_query($query) or die ("could not run query '$query': " . mysql_errno() . " " . mysql_error());

$last = "";
$prev_row;
$i = 1;
$counter = 0;
echo ".\n";
while ($row = mysql_fetch_assoc($res)) {  
  $current = $row["station_id"] . "_" . $row["year"] . "_" . sprintf("%02d",$row["month"]) . "_" . sprintf("%02d",$row["day"]);
  echo str_repeat(chr(8), 80) . "$i  $current ";
  if ($last == $current) {
    //echo implode(', ', $row) . "\n";

    // merge $row and $prev_row
    // data_id  station_id, state_abbrev, year, month,  day,  TMIN, TMIN_flags, TMAX, TMAX_flags, PRCP, PRCP_flags, SNOW, SNOW_flags, SNWD, SNWD_flags

    printf("%-13s %8s %8s\n", "data_id:", $prev_row["data_id"], $row["data_id"]);
    if ($prev_row["data_id"] == $row["data_id"]) echo " + ";

    $set = "";
    if (!$prev_row["TMIN"] && $row["TMIN"])  $set .= "TMIN = " . $row["TMIN"] . ", ";
    if (!$prev_row["TMIN_flags"] && $row["TMIN_flags"])   $set .= "TMIN_flags = '" . $row["TMIN_flags"] . "', ";
    if (!$prev_row["TMAX"] && $row["TMAX"])   $set .= "TMAX = " . $row["TMAX"] . ", ";
    if (!$prev_row["TMAX_flags"] && $row["TMAX_flags"])   $set .= "TMAX_flags = '" . $row["TMAX_flags"] . "', ";
    if (!$prev_row["PRCP"] && $row["PRCP"])   $set .= "PRCP = " . $row["PRCP"] . ", ";
    if (!$prev_row["PRCP_flags"] && $row["PRCP_flags"])   $set .= "PRCP_flags = '" . $row["PRCP_flags"] . "', ";
    if (!$prev_row["SNOW"] && $row["SNOW"])   $set .= "SNOW = " . $row["SNOW"] . ", ";
    if (!$prev_row["SNOW_flags"] && $row["SNOW_flags"])   $set .= "SNOW_flags = '" . $row["SNOW_flags"] . "', ";
    if (!$prev_row["SNWD"] && $row["SNWD"])   $set .= "SNWD = " . $row["SNWD"] . ", ";
    if (!$prev_row["SNWD_flags"] && $row["SNWD_flags"])   $set .= "SNWD_flags = '" . $row["SNWD_flags"] . "', ";

    $delete = "";
    $update = "";
    if ($set = substr_replace( $set, "", -2 )) $update = "UPDATE $DBTABLE SET $set WHERE data_id=".$prev_row["data_id"]." and year=".$row["year"]." and month=".$row["month"]." and day=".$row["day"].";\n";
    if ($row["data_id"] != $prev_row["data_id"]) $delete = "delete from $DBTABLE where data_id=".$row["data_id"]." and year=".$row["year"]." and month=".$row["month"]." and day=".$row["day"].";\n\n";

    if ($update) {
      $r = mysql_query($update) or die ("could not run query '$update' \n".mysql_error());
    }
    if ($delete) {
      $r = mysql_query($delete) or die ("could not run query '$delete' \n".mysql_error());
    }    

    //if ($counter++ > 5) exit(0);
  }
  else {
    $last = $current;
    unset($prev_row);
    //copy $row to $prev_row
    foreach ($row as $key => $val) $prev_row[$key] = $val;
  }

  $i++;
}

    echo "\n\nDONE\n"; 
?>

Upvotes: 2

Views: 1876

Answers (3)

Kickstart
Kickstart

Reputation: 21513

I found this when trying to trace down a memory use problem on a script of mine. Having solved the issue for mine I thought it worth adding a reply here for the next person who comes along with the same issue.

I was using mysqli, but much the same applies for mysql.

The problem I found was the queries not freeing their results. The solution was to use mysqli_free_result() after executing the update and delete queries. But more importantly on the mysqli_query for the loop I used the extra parameter of *MYSQLI_USE_RESULT* . There are side effects of this, so use a separate connection for the update and delete queries.

Upvotes: 0

user149341
user149341

Reputation:

Work smarter, not harder:

SELECT station_id, year, month FROM table
    GROUP BY station_id, year, month
    HAVING COUNT(*) > 1

That'll get you all the station_id/year/month tuples that appear in the table more than once. Assuming that most of your data is not duplicates, that'll save you a lot of memory, since now you just have to go through these tuples and fix up the rows matching them.

Upvotes: 2

J. Bruni
J. Bruni

Reputation: 20492

I would try two things:

1) Instead of running the UPDATE and DELETE queries inside the loop using mysql_query, save them in a text file, to execute later. For example: file_put_contents('queries.sql', $update, FILE_APPEND );

2) Instead of doing everything inside the while ($row = mysql_fetch_assoc($res)) loop, first grab all SELECT query results, then close database connection freeing all database resources, including the query result. Only after this, perform the loop process.

If you run out of memory while storing the database results in one array, you can try saving the results in a temporary file instead (one record per line / FILE_APPEND), and then use this file in the loop (reading one line per record, using fgets function).

Upvotes: 3

Related Questions