Reputation: 5224
I have the following rough code (the full script is 146 lines, 90 of which are string parsing; I can add it if needed):
ini_set('memory_limit', '7G');
$db = new PDO("mysql:host=".$dbhost.";dbname=".$dbname, $dbuser, $dbpass, array(PDO::ATTR_PERSISTENT => true));
$db->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
$db_ub = new PDO("mysql:host=".$dbhost.";dbname=".$dbname, $dbuser, $dbpass, array(PDO::ATTR_PERSISTENT => true));
$db_ub->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, true);
$stmt = $db->prepare('select columns from stats where timestamp between ? and ?');
$stmt->execute(array('2020-04-25', '2020-05-25'));
while($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
echo memory_get_usage() .PHP_EOL;
echo $row['id'] . PHP_EOL;
$stmt2 = $db_ub->prepare('select somedata from users limit 1');
$stmt2->execute();
$row2 = $stmt2->fetch(PDO::FETCH_ASSOC);
$type = !empty($row2['somedata']) ? 5 : 4;
$result = $db_ub->prepare('insert ignore into newtable (old, type) values (?, ?)');
$result->execute(array($row['id'], $type));
}
During $stmt->execute(array('2020-04-25', '2020-05-25'));
my memory consumption stays around 0.34 GB (using
ps aux | grep 'php ' | awk '{$5=int(100 * $5/1024/1024)/100"GB";}{ print;}'
to monitor consumption during the select, and
show full processlist
on the SQL side to verify). Once the script enters the while loop, it jumps to over 5 GB.
Testing the setAttribute call with
var_dump($db->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false));
suggests it has taken effect:
bool(true)
but the behavior doesn't change whether I set buffered or unbuffered:
$db->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false)
and
$db->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, true)
Using echo $db->getAttribute(constant('PDO::MYSQL_ATTR_USE_BUFFERED_QUERY'));
also shows that the setting changes.
Moving the setting to the statement rather than the connection, as https://www.php.net/manual/en/ref.pdo-mysql.php suggests, also didn't work:
$stmt = $db->prepare('select columns from stats where timestamp between ? and ?', array(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY => false));
I've also tried moving the buffer setting to the connection, with no effect:
$db = new PDO("mysql:host=".$dbhost.";dbname=".$dbname, $dbuser, $dbpass, array(PDO::ATTR_PERSISTENT => true, PDO::MYSQL_ATTR_USE_BUFFERED_QUERY => false));
Taking out the second connection seems to allow the unbuffered query to function as intended:
ini_set('memory_limit', '1G');
$db = new PDO("mysql:host=".$dbhost.";dbname=".$dbname, $dbuser, $dbpass, array(PDO::ATTR_PERSISTENT => true, PDO::MYSQL_ATTR_USE_BUFFERED_QUERY => false));
$db->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
//$db_ub = new PDO("mysql:host=".$dbhost.";dbname=".$dbname, $dbuser, $dbpass, array(PDO::ATTR_PERSISTENT => true));
//$db_ub->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, true);
$stmt = $db->prepare('select columns from stats where timestamp between ? and ?');
$stmt->execute(array('2019-01-25', '2019-11-25'));
while($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
echo memory_get_usage() .PHP_EOL;
echo $row['id'] . PHP_EOL;
/*
$stmt2 = $db_ub->prepare('select somedata from users limit 1');
$stmt2->execute();
$row2 = $stmt2->fetch(PDO::FETCH_ASSOC);
$type = !empty($row2['somedata']) ? 5 : 4;
$result = $db_ub->prepare('insert ignore into newtable (old, type) values (?, ?)');
$result->execute(array($row['id'], $type));
*/
}
With this setup, memory_get_usage() never exceeds 379999 bytes (about 0.36 MB).
If I uncomment the second connection and make it unbuffered as well I receive:
Cannot execute queries while other unbuffered queries are active. Consider using PDOStatement::fetchAll(). Alternatively, if your code is only ever going to run against mysql, you may enable query buffering by setting the PDO::MYSQL_ATTR_USE_BUFFERED_QUERY attribute.
With the second connection buffered, it performs as initially described: large memory consumption on execution. If ini_set('memory_limit', ...) is high it works; if low it errors. Using a large memory_limit isn't a feasible solution.
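A side note on that error, separate from the persistent-connection issue: an unbuffered result set must be fully fetched or explicitly closed before the same connection will accept another query. A minimal sketch, reusing the question's table names; PDOStatement::closeCursor() releases the pending result set:

```php
<?php
// Sketch: freeing an unbuffered result set before reusing the connection.
// Table/column names (stats, newtable) come from the question; DSN
// variables are placeholders.
$db = new PDO("mysql:host=$dbhost;dbname=$dbname", $dbuser, $dbpass);
$db->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);

$stmt = $db->prepare('SELECT id FROM stats WHERE timestamp BETWEEN ? AND ?');
$stmt->execute(array('2020-04-25', '2020-05-25'));
$ids = array();
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    $ids[] = $row['id'];   // collect only what the next step needs
}
$stmt->closeCursor();      // result set drained/released; connection is free

$ins = $db->prepare('INSERT IGNORE INTO newtable (old, type) VALUES (?, ?)');
foreach ($ids as $id) {
    $ins->execute(array($id, 4));
}
```

Of course, buffering all the ids in PHP reintroduces memory pressure at this scale; the sketch only illustrates that the unbuffered result must be drained or closed before the connection can run further queries.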
I was originally using (Red Hat Enterprise Linux Server release 7.3 (Maipo)):
php71u-pdo.x86_64 7.1.19-1.ius.centos7
I moved the script to a newer machine (Amazon Linux release 2 (Karoo)):
php73-pdo.x86_64 7.3.17-1.el7.ius
and see the same behavior.
Upvotes: 6
Views: 7188
Reputation: 142278
Can't you get rid of most of the code by simply running a single query:
INSERT IGNORE INTO newtable
SELECT ...,
IF(..., 5, 4)
FROM oldtable WHERE ...;
With that, you could get rid of the 7G memory issue.
If that turns out to be doing too much at once, break it into chunks. See the discussion here: http://mysql.rjweb.org/doc.php/deletebig#deleting_in_chunks (it talks about DELETEs, but the technique can be adapted to other statements, such as your SELECT).
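That chunking technique can be sketched in PHP as keyset pagination on the primary key, so each query is small and fully buffered. This is a sketch under the assumption that stats has an indexed integer id column; the chunk size is arbitrary:

```php
<?php
// Sketch: process stats in chunks of 10,000 rows by walking the primary key.
// Assumes an indexed integer id column; table name comes from the question.
$db = new PDO("mysql:host=$dbhost;dbname=$dbname", $dbuser, $dbpass);
$chunk  = 10000;
$lastId = 0;
do {
    $stmt = $db->prepare(
        'SELECT id FROM stats WHERE id > ? ORDER BY id LIMIT ' . $chunk);
    $stmt->execute(array($lastId));
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
    foreach ($rows as $row) {
        // ... per-row work here ...
        $lastId = $row['id'];
    }
} while (count($rows) === $chunk);  // a partial chunk means we are done
```

Keyset pagination (WHERE id > ?) stays fast as you advance, because MySQL seeks the index rather than scanning past an ever-growing OFFSET.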
On another topic: why is select somedata from users limit 1 executed inside the loop? It seems to fetch the same data every time. Also, without an ORDER BY, you can't predict which LIMIT 1 row you will get.
Upvotes: 0
Reputation: 5224
The PDO::ATTR_PERSISTENT value is not strictly boolean: it can be a string that identifies which persistent connection to reuse. Persistent connections with the same ID share the same underlying link, so use unique values for multiple connections. In my case:
$db = new PDO("mysql:host=".$dbhost.";dbname=".$dbname, $dbuser, $dbpass, array(PDO::ATTR_PERSISTENT => 'unbuff', PDO::MYSQL_ATTR_USE_BUFFERED_QUERY => false));
$db_ub = new PDO("mysql:host=".$dbhost.";dbname=".$dbname, $dbuser, $dbpass, array(PDO::ATTR_PERSISTENT => 'buff', PDO::MYSQL_ATTR_USE_BUFFERED_QUERY => true));
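With distinct persistent-connection IDs, the two handles no longer collide, so the question's original loop works as intended: an unbuffered streaming select on $db alongside buffered queries on $db_ub. A sketch (the per-row type lookup is omitted for brevity):

```php
<?php
// Sketch of the fixed loop, using the two PDO objects defined above:
// $db streams the unbuffered select, $db_ub runs buffered inserts.
$stmt = $db->prepare('SELECT id FROM stats WHERE timestamp BETWEEN ? AND ?');
$stmt->execute(array('2020-04-25', '2020-05-25'));
$ins = $db_ub->prepare('INSERT IGNORE INTO newtable (old, type) VALUES (?, ?)');
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    $ins->execute(array($row['id'], 4));  // type logic simplified here
}
```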
Upvotes: 2
Reputation: 32280
You are actually making 135,000,000 queries instead of iterating over 135,000,000 rows.
Change the code to run a single query, ordering the rows the same way your loop would visit them.
$db = new PDO("mysql:host=".$dbhost.";dbname=".$dbname, $dbuser, $dbpass, array(PDO::ATTR_PERSISTENT => true));
$stmt = $db->prepare('SELECT * FROM stats ORDER BY id ASC');
$stmt->execute();
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
// ...
}
You don't even need this if; that logic can be handled faster by the DB itself:
if(!empty($row['id'])) {
Instead:
SELECT * FROM stats WHERE id IS NOT NULL ORDER BY id ASC
I haven't looked into PDO/MySQL for some time, but I assume unbuffered mode lets you stream through a cursor:
$pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
Take into account that there can be only a single active query per connection; you are essentially reading straight off the connection rather than from a client-side buffer.
The better option would be to load only small chunks in a map reduce kind of way.
SELECT * FROM stats LIMIT 0, 100
use the results, then
SELECT * FROM stats LIMIT 100, 100
and so on.
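That OFFSET/LIMIT pattern might look like the sketch below (placeholder connection, arbitrary chunk size). Note two caveats: paging is only deterministic with an ORDER BY, and large offsets get progressively slower because MySQL still scans past the skipped rows.

```php
<?php
// Sketch: OFFSET/LIMIT pagination as described above. Each chunk is a
// separate, fully buffered query, so memory stays bounded by $size rows.
$size   = 100;
$offset = 0;
do {
    $stmt = $db->prepare(
        "SELECT * FROM stats ORDER BY id LIMIT $offset, $size");
    $stmt->execute();
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
    foreach ($rows as $row) {
        // ... use the row ...
    }
    $offset += $size;
} while (count($rows) === $size);  // a partial chunk means no more rows
```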
Upvotes: -2