Reputation: 101
I am using PHP to import data from a CSV file using fgetcsv(), which yields an array for each row. Initially, I had the character limit set at 1024, like so:
while ($data = fgetcsv($fp, 1024)) {
    // do stuff with the row
}
However, a CSV with 200+ columns exceeded the 1024-character limit on many rows. This caused the read to stop in the middle of a row, so the next call to fgetcsv() would start where the previous one left off, and so on, until an EOL was reached.
I have since upped the limit to 4096, which should take care of the majority of cases, but I would like to put a check in place to be sure the entire line was read after each fetch. How do I go about this?
I was thinking of checking the end of the last element of the array for end-of-line characters (\n, \r, \r\n), but wouldn't these be parsed out by the fgetcsv() call?
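One direction I've been considering (an untested sketch; it assumes no embedded newlines inside quoted fields, and str_getcsv() needs PHP 5.3) is to fetch the raw line with fgets() so the trailing newline is still visible, then parse it separately:
while (($line = fgets($fp, 4096)) !== false) {
    // if the buffer filled without reaching a newline and we are
    // not at end of file, the row was truncated
    if (substr($line, -1) !== "\n" && !feof($fp)) {
        // flag/handle the truncated row here
    }
    $data = str_getcsv(rtrim($line, "\r\n"));
    // do stuff with the row
}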
Upvotes: 7
Views: 6584
Reputation: 449
fgetcsv() reads a CSV file line by line by default. When it is not behaving that way, the file's line endings probably don't match the PHP_EOL convention on your machine, so enable PHP's line-ending auto-detection.
Simply go to:
C:\xampp\php\php.ini
and search for:
;auto_detect_line_endings = Off
Uncomment it and set it to:
auto_detect_line_endings = On
Then restart Apache and check again; it should work.
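If editing php.ini isn't an option, the same setting can be flipped at runtime before opening the file (note that auto_detect_line_endings is deprecated as of PHP 8.1, where line endings are detected automatically). The file name here is just a placeholder:
// enable line-ending auto-detection at runtime instead of in php.ini
ini_set('auto_detect_line_endings', true);

$fp = fopen('data.csv', 'r');
while (($data = fgetcsv($fp, 4096)) !== false) {
    // do stuff with the row
}
fclose($fp);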
Upvotes: 0
Reputation: 1045
I would be careful with your final solution. I was able to upload a file named /.;ls -a;.csv to perform command injection. Make sure you validate the file path if you use this approach. It is also a good idea to provide a default_length in case your wc call fails for any reason.
// use wc to find the max line length,
// falling back to a hardcoded default if wc fails;
// escapeshellarg() guards the (already validated) path
// against command injection
$wc = explode(" ", shell_exec('wc -L ' . escapeshellarg($validated_file_path)));
$longest_line = (int)$wc[0];
$length = ($longest_line) ? $longest_line + 4 : $default_length;
Upvotes: 0
Reputation: 101
Thank you for the suggestions, but these solutions didn't really solve the problem of accounting for the longest line while still enforcing a limit. I was able to accomplish this by running the UNIX command wc -L via shell_exec() to determine the longest line in the file before fetching any rows. The code is below:
// open the CSV file to read lines
$fp = fopen($sListFullPath, 'r');

// use wc to figure out the longest line in the file
$longestArray = explode(" ", shell_exec('wc -L ' . $sListFullPath));
$longest_line = (int)$longestArray[0] + 4; // add a little padding for EOL chars

// check against a user-defined maximum length
if ($longest_line > $line_length_max) {
    // alert the user that at least one line in the CSV is too long
}

// read in the data
while ($data = fgetcsv($fp, $longest_line)) {
    // do stuff with the row
}
fclose($fp);
This approach ensures that every line is read in its entirety and still provides a safety net for really long lines without stepping through the entire file with PHP line by line.
Upvotes: 0
Reputation: 360562
Just don't specify a limit, and fgetcsv() will slurp in as much as is necessary to capture a full line. If you do specify a limit, then it's entirely up to YOU to scan the file stream and ensure you're not slicing something down the middle.
However, note that not specifying a limit can be risky if you don't control how the .csv is generated in the first place: it would be easy to swamp your server with a malicious CSV that has many terabytes of data on a single line.
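As a rough mitigation sketch (the variable names and the 10 MB cap are arbitrary illustrations, not a vetted policy), you could reject oversized files before handing them to an unlimited fgetcsv():
// refuse to parse files over a size cap before calling
// fgetcsv() with no length limit
$max_bytes = 10 * 1024 * 1024; // example cap; tune to your data
if (filesize($csv_path) > $max_bytes) {
    exit('CSV file is too large to process.');
}

$fp = fopen($csv_path, 'r');
while (($data = fgetcsv($fp)) !== false) {
    // do stuff with the row
}
fclose($fp);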
Upvotes: 3
Reputation: 227190
Just omit the length parameter. It has been optional since PHP 5.1.0, and leaving it out (or passing 0) means no line-length limit.
while ($data = fgetcsv($fp)) {
    // do stuff with the row
}
Upvotes: 11