Reputation: 2096
I'm using PHP 5.3.6 with PDO to access Postgres 9.0.4. I've been asked to reduce the memory footprint of a report. The current implementation is simple: execute the query, do a fetchAll() and then iterate with foreach() through the resulting array. This obviously doesn't scale with huge result sets: it can temporarily consume 100MB or more.
I have a new implementation which takes the PDO statement handle and then iterates directly on it using foreach(), i.e. no intermediate array via fetchAll(). (From what I've read, iterating a statement handle with foreach calls fetch() under the covers.) This is just as fast and consumes way less memory: about 28kB. Still, I'm not confident I'm doing it right because, although I've done a ton of Googling, it's tough to find answers to basic questions about this:
I've seen articles that suggest solving my original problem using cursors. Does the Postgress PDO driver already use cursors internally? If writing my own SQL to create a cursor is required, I'm willing to but I'd prefer to write the simplest code possible (but no simpler!).
If foreach calls fetch() each iteration, isn't that too network chatty? Or is it smart and fetches many rows at once, e.g. 500, to save bandwidth? (This may imply that it uses cursors internally.)
I've seen an article that wraps the statement handle in a class that implements Iterator interface. Isn't this redundant given that a PDO statement handle already does this? Or am I missing something?
My call to prepare the SQL statement looks like this:
$sth = $dbh->prepare($sql);
I found that it made no memory or speed difference if I did this:
$sth = $dbh->prepare($sql, array( PDO::ATTR_CURSOR => PDO::CURSOR_FWDONLY ) );
Is this because this is the default anyway for the Postgres PDO driver? This would make sense if it is already using cursors internally.
General comments about the approach and other ways to solve this problem are welcome.
Upvotes: 1
Views: 2843
Reputation: 1609
setAttribute()
). Maybe because I focused on CURSOR_FWONLY
while they do not work, as shown by JohnSmith's answer.fetch()
and network:fetch()
is as efficient as possible, it uses exactly the same flow as a fetchAll()
where the server continuously pushes result rows (sometimes limited by the client not reading its socket fast enough). fetch()
then works with the (small) network buffer as a C getline()
would, buffering only until it finds an "end-of-row" marker to return it as a PHP array (and shifting the buffer until that marker); next call to fetch()
will start with the remains of the network buffer (it may even find a full row in them and thus return immediately), and so on.while(($row = $stmt->fetch()))
) until PHP 8's getIterator()
: one could not have a function that interchangeably accepted a PDOStatement
or an array to iterate over, they needed to have a specific loop, or wrap into an homemade Iterator
.fetch()
not a real fetch()
until PHP 8.4Historically, fetch()
did an internal fetchAll()
that waited for the server to have finished sending the whole resulset (and buffering it entirely) before delivering the first row;
and as it was internal to the driver, I'm not even sure it was reported by memory_get_usage()
. The only comfort in that was that maybe this was raw data memory, not yet converted to PHP array, so a little bit thrifty.
But it was thus of absolutely no use, and its name gave a false confidence feeling in its lightness.
PHP 8.4 introduced a PDO::ATTR_PREFETCH attribute which, set to 0 (either on the PDO, or the Statement) allows fetch()
to work on-demand, row-by-row without storing the results in mem; as it always should have.
There are still side effects (that prevents introspecting while popping the results, for example with getColumnMeta()
), but it is fully usable in a dump-to-CSV scenario.
Upvotes: 0
Reputation: 472
Apparently PDO::CURSOR_FWDONLY
does not use cursors. Black box tests:
(0) Preparations:
$con = new \PDO('dsn');
// you'll get "NO ACTIVE TRANSACTION" otherwise
$con->beginTransaction();
$sql = 'select * from largetable';
(1) Default - takes forever:
$stmt = $con->prepare($sql);
$stmt->execute();
print_r($stmt->fetch());
(2) FWDONLY - takes forever:
$stmt = $con->prepare($sql, array(\PDO::ATTR_CURSOR => \PDO::CURSOR_FWDONLY));
$stmt->execute();
print_r($stmt->fetch());
(3) SCROLLABLE - runs in a flash:
$stmt = $con->prepare($sql, array(\PDO::ATTR_CURSOR => \PDO::CURSOR_SCROLL));
$stmt->execute();
print_r($stmt->fetch());
I turned on PG logging just to be sure and it is indeed so - only SCROLL uses cursors.
So, the only way to make use of cursors is to use SCROLL, at least in PHP 5.4.23.
Upvotes: 2