Reputation: 886
I have a csv file in which one of the columns there are line breaks. The column is a description field so it stores lines of text, bulleted points, and most importantly line breaks--and sometimes two between paragraphs. To complicate matters, the description field also contains quotes.
I have tried everything I know to get the content (and keeping the formatting) into a variable. I have tried file_get_contents, str_getcsv and fgetcsv to no avail. One of the functions let me put the entire description field into a variable, but it stripped all the new lines, so everything is in one giant paragraph. Another one separately parsed each paragraph within the description field, instead of one single unit.
This is the general format of my csv file:
"391","The Great Gatsby","The Great Gatsby, F. Scott Fitzgerald’s third book, stands as the supreme achievement of his career.
This exemplary novel of the Jazz Age has been acclaimed by generations of readers. The story of the fabulously wealthy Jay Gatsby and his love for the beautiful Daisy Buchanan, of lavish parties on Long Island at a time when The New York Times noted “gin was the national drink and sex the national obsession,” it is an exquisitely crafted tale of America in the 1920s.
The Great Gatsby is one of the great classics of twentieth-century literature.","No","Yes",
I'd like to have a variable where $array[0]=391
, $array[1]=The Great Gatsby
and most importantly $array[2]
contains the text in the description with all the line breaks and formatting, including the quotes within the field itself.
Or if there's a better way to parse this using PHP, please let me know.
Upvotes: 8
Views: 7845
Reputation: 1453
I recently had this problem while doing huge database uploads so I added a bit of code inside the loop. It checks if the header field count matches the fields count at each pass, and if not it gets the next array from fgetcsv, and continues until they match the field count. So if you have 100 line breaks across any number of fields they will all be preserved.
$file = fopen($filenam,"r");
$count=0;
while (($fields = fgetcsv($file,0,"|")) !== false)
{
$count++;
if($count == 1) {
$sql_fields=[];$fc=0;
foreach($fields as $field) {
//$sql_fields .= $field.",";
$sql_fields[] = $field;
$fc++;
}
continue;
}
if(count($fields)!=$fc) {
if($gotprev!='') {
$gotprev .= implode("|",$fields);
$fields = explode("|",$gotprev);
if(count($fields)!=$fc)
continue;
$gotprev='';
}else{
$gotprev = implode("|",$fields);
continue;
}
}
$sql_values='';$rc=0;$sql_fields_gen='';
foreach($sql_fields as $key=>$field) {
if($fields[$key]=='False') $fields[$key]=0;
if($fields[$key]=='True') $fields[$key]=1;
$sql_values .= "'".addslashes($fields[$key])."',";
$rc++;
$sql_fields_gen .= $field.",";
}
$sql_fields_gen = "(".rtrim($sql_fields_gen,",").")";
$sql_values = "VALUES (".rtrim($sql_values,",").")";
$sql_insert = "INSERT INTO mytable $sql_fields_gen $sql_values ";
$DB->execute($sql_insert);
}
Upvotes: 0
Reputation: 327
if someone still having this problem in 2019 my solution was fgetcsv($file)
. Automatically respect the linebreaks inside text fields.
Upvotes: 4
Reputation: 10643
I recently wrote a function myself for precisely this use-case.
The PHP function str_getcsv()
is for interpreting a single line of CSV data. Often, you can simply explode the file on new lines, and then parse each line individually, but what do you do when fields themselves may contain new lines?
The trick is to use str_getcsv()
twice: once to split the file into lines, and again to split each individual line.
/**
* Receives CSV string and returns as an array.
*
* *************************************************************************
*
* Based on code by fab at tradermail dot info at http://php.net/manual/en/function.str-getcsv.php#119666
*/
function csv_to_array($csv, $delimiter=',', $header_line=true) {
// CSV from external sources may have Unix or DOS line endings. str_getcsv()
// requires that the "delimiter" be one character only, so we don't want
// to pass the DOS line ending \r\n to that function. So first we ensure
// that we have Unix line endings only.
$csv = str_replace("\r\n", "\n", $csv);
// Read the CSV lines into a numerically indexed array. Use str_getcsv(),
// rather than splitting on all linebreaks, as fields may themselves contain
// linebreaks.
$all_lines = str_getcsv($csv, "\n");
if (!$all_lines) {
return false;
}
$csv = array_map(
function(&$line) use ($delimiter) {
return str_getcsv($line, $delimiter);
},
$all_lines
);
if ($header_line) {
// Use the first row's values as keys for all other rows.
array_walk(
$csv,
function(&$a) use ($csv) {
$a = array_combine($csv[0], $a);
}
);
// Remove column header row.
array_shift($csv);
}
return $csv;
}
Bonus: This function can also (if $header_line
) return an associative array, using the first line as key names.
Upvotes: 1