Reputation: 2304
I have a rather large CSV file (17 GB) that I'm trying to sanity check. I've written a little script which looks like this:
#!/usr/bin/php
<?php
$f = fopen($argv[1],'r');
$i=0;
while (!feof($f)) {
    $row = fgetcsv($f);
    $i++;
}
print $i."\n";
?>
This should just count the rows and print the total. The script outputs: 60770881
But if I run wc -l on the file, the result is 60777200.
My csv file was generated from MySQL using:
INTO OUTFILE '/tmp/file.csv' FIELDS TERMINATED BY ',' ENCLOSED BY '"' ESCAPED BY '\\' LINES TERMINATED BY '\n'
So it shouldn't have any unescaped newlines or anything like that. Does anyone have any idea what could be wrong?
Upvotes: 1
Views: 1597
Reputation: 57306
A CSV record can span multiple lines. If any of the values contain line breaks, the record occupies two or more physical lines in the file (as counted by wc), but fgetcsv reads those lines as a single CSV record.
Also, you don't need to check feof($f), because fgetcsv returns FALSE on end-of-file. The loop as written can also count that final failed read as an extra row.
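To see the difference between the two counts, here is a minimal sketch that reports both numbers for the same file. The helper name countCsvRecords is hypothetical (it is not from the question), and the delimiter, enclosure and escape arguments are assumptions chosen to match the INTO OUTFILE options quoted in the question. Note that fgetcsv also returns a one-element array for a blank line, so blank lines are counted as records here.

```php
<?php
// Hypothetical helper (not from the original post): counts logical CSV
// records and physical lines in the same file so the two can be compared.
function countCsvRecords(string $path): array
{
    $f = fopen($path, 'r');
    if ($f === false) {
        throw new RuntimeException("Cannot open $path");
    }

    // Logical records: fgetcsv() returns false at end-of-file,
    // so the loop needs no separate feof() check.
    $records = 0;
    while (($row = fgetcsv($f, 0, ',', '"', '\\')) !== false) {
        $records++;
    }

    // Physical lines: count newline characters, which is what wc -l does.
    rewind($f);
    $lines = 0;
    while (($chunk = fread($f, 1 << 20)) !== false && $chunk !== '') {
        $lines += substr_count($chunk, "\n");
    }

    fclose($f);
    return ['records' => $records, 'lines' => $lines];
}
```

Calling countCsvRecords($argv[1]) and comparing the two values should show how many extra physical lines come from values with embedded line breaks: if 'lines' is larger than 'records', some records span more than one line.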
Upvotes: 4