Danja Garno

Reputation: 101

Ensure fgetcsv() reads the entire line

I am using PHP to import data from a CSV file using fgetcsv(), which yields an array for each row. Initially, I had the character limit set at 1024, like so:

while ($data = fgetcsv($fp, 1024)) {
  // do stuff with the row
}

However, a CSV with 200+ columns surpassed the 1024 limit on many rows. This caused the line read to stop in the middle of a row, and then the next call to fgetcsv() would start where the previous one left off and so on until an EOL was reached.

I have since upped this limit to 4096, which should take care of the majority of cases, but I would like to put in a check to be sure the entire line was read after each line is fetched. How do I go about this?

I was thinking to check the end of the last element of the array for end of line characters (\n, \r, \r\n), but wouldn't these be parsed out by the fgetcsv() call?
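A minimal sketch of that idea: fetch the raw line with fgets() so the EOL characters are still visible, then parse it with str_getcsv(). The file name and limit here are placeholders, and this sketch assumes no quoted field spans multiple lines:

```php
<?php
// Read the raw line first so the trailing EOL is still visible,
// then parse it with str_getcsv(). If fgets() hits its length limit
// before reaching an EOL, the returned string won't end in "\n" or "\r".
$fp = fopen('data.csv', 'r');   // hypothetical file name
$limit = 4096;

while (($raw = fgets($fp, $limit)) !== false) {
    $last = substr($raw, -1);
    $complete = ($last === "\n" || $last === "\r")
        || feof($fp);           // the final line may legitimately lack an EOL
    if (!$complete) {
        // line exceeded $limit - handle it (skip, abort, re-read with a bigger limit...)
    }
    $data = str_getcsv(rtrim($raw, "\r\n"));
    // do stuff with the row
}
fclose($fp);
```

Note the caveat: unlike fgetcsv(), str_getcsv() won't stitch together a quoted field that contains embedded newlines, so this only works for CSVs without multi-line fields.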

Upvotes: 7

Views: 6584

Answers (5)

Osahady

Reputation: 449

By default fgetcsv() reads a CSV file line by line, but when it doesn't behave that way, the line endings in the file probably don't match your OS. Open your php.ini (for example C:\xampp\php\php.ini on XAMPP) and search for:

;auto_detect_line_endings = Off

Uncomment it and turn it on:

auto_detect_line_endings = On

Then restart Apache and try again; it should work.
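If editing php.ini is not an option, the same directive can be toggled at runtime with ini_set() for just the one script. A minimal sketch (note that auto_detect_line_endings is deprecated as of PHP 8.1, where it is no longer needed):

```php
<?php
// Enable line-ending auto-detection for this script only,
// so files with old Mac-style "\r" endings split into lines correctly.
// (Deprecated since PHP 8.1.)
ini_set('auto_detect_line_endings', '1');

$fp = fopen('data.csv', 'r');   // hypothetical file name
while (($data = fgetcsv($fp)) !== false) {
    // do stuff with the row
}
fclose($fp);
```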

Upvotes: 0

Patrick Michaelsen

Reputation: 1045

I would be careful with your final solution. I was able to upload a file named /.;ls -a;.csv to perform command injection. Make sure you validate the file path if you use this approach. Also, it might be a good idea to provide a default_length in case wc fails for any reason.
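One way to harden that call is to quote the path with escapeshellarg() before it ever reaches the shell. A sketch, assuming $file_path and $default_length are defined elsewhere:

```php
<?php
// Quote the user-supplied path so shell metacharacters in a file name
// like "/.;ls -a;.csv" are passed as literal data, not executed as commands.
$safe_path = escapeshellarg($file_path);
$wc = explode(' ', (string)shell_exec('wc -L ' . $safe_path));
$longest_line = (int)$wc[0];

// Fall back to a hardcoded default if wc produced no usable number
$length = ($longest_line > 0) ? $longest_line + 4 : $default_length;
```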

// use wc to find max line length
// uses a hardcoded default if wc fails
// this is relatively safe from command injection
// since the file path is a tmp file
$wc = explode(" ", shell_exec('wc -L ' . $validated_file_path));
$longest_line = (int)$wc[0];
$length = ($longest_line) ? $longest_line + 4 : $default_length;

Upvotes: 0

Danja Garno

Reputation: 101

Thank you for the suggestions, but these solutions didn't really solve the issue of making sure we account for the longest line while still enforcing a limit. I was able to accomplish this by using the wc -L UNIX command via shell_exec() to determine the longest line in the file before beginning the line fetching. The code is below:

// open the CSV file to read lines
$fp = fopen($sListFullPath, 'r');

// use wc to figure out the longest line in the file
$longestArray = explode(" ", shell_exec('wc -L ' . $sListFullPath));
$longest_line = (int)$longestArray[0] + 4; // add a little padding for EOL chars

// check against a user-defined maximum length
if ($longest_line > $line_length_max) {
    // alert user that the length of at least one line in the CSV is too long
}

// read in the data
while ($data = fgetcsv($fp, $longest_line)) {
    // do stuff with the row
}

This approach ensures that every line is read in its entirety and still provides a safety net for really long lines without stepping through the entire file with PHP line by line.

Upvotes: 0

Marc B

Reputation: 360562

Just don't specify a limit, and fgetcsv() will slurp in as much as is necessary to capture a full line. If you do specify a limit, then it's entirely up to YOU to scan the file stream and ensure you're not slicing something down the middle.

However, note that not specifying a limit can be risky if you don't have control over generation of this .csv in the first place. It'd be easy to swamp your server with a malicious CSV that has many terabytes of data on a single line.
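One way to keep the convenience of no limit while still capping memory is a quick pre-scan that rejects files containing any over-long line. A sketch with a hypothetical $max_line cap (it assumes "\n" line endings):

```php
<?php
// Pre-scan: reject the file if any single line exceeds $max_line bytes,
// then parse with no fgetcsv() limit, knowing every line is bounded.
$max_line = 1048576; // 1 MiB cap - hypothetical policy
$fp = fopen('data.csv', 'r'); // hypothetical file name

$ok = true;
while (($chunk = fgets($fp, $max_line)) !== false) {
    // fgets() stopped without finding an EOL and we're not at the end:
    // this line is longer than the cap.
    if (substr($chunk, -1) !== "\n" && !feof($fp)) {
        $ok = false;
        break;
    }
}

if ($ok) {
    rewind($fp);
    while (($data = fgetcsv($fp)) !== false) {
        // safe to process: no line is longer than $max_line
    }
}
fclose($fp);
```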

Upvotes: 3

gen_Eric

Reputation: 227190

Just omit the length parameter. It's optional in PHP5.

while ($data = fgetcsv($fp)) {
  // do stuff with the row
}

Upvotes: 11
