Reputation: 33
I have an issue regarding converting a CSV-String into an array.
INV;165;1;0;1 Username;0;10000;"Here is multiline-text.
with line-breaks:
";20 Offen;0,00
INV;166;1;0;1 Username2;0;10000;"Here is another multiline-text.
with line-breaks:
";20 Offen;0,00
I tried to split up the fields with str_getcsv, but the problem is that the text delimiter (") occurs in only one field, and the function also splits up the multi-line fields.
My idea was to first replace the line breaks with preg_replace, but I can't get it to work. Here is my regex, which is meant to replace only the line breaks enclosed by ;" and "; :
/(;")(.*)(\n)(.*)(";)/
This pattern actually matches only the first line break. Could anyone give me a hint on how to do this?
Thank you in advance.
Here is the original CSV:
CMXINV;165;1;0;1 Felix Hirschberg;0;10000;Herr;;Max;Muster;Company;;Street;123;City;DE;(0 40) 6 25 6;;(0 40) 6 25 6;[email protected];;;;;;;;0;20121217;20121217;1 Sofort ohne Abzug;EUR;1 Agentur;0 ;0,00;;"Vielen Dank für Ihren Auftrag.
Vereinbarungsgemäß berechnen wir Ihnen:
";"Mit besten Grüßen
Invoice Man";;0;0;0;0;;20 Offen;0,00;;0 ;0,00;0,00;;EXW;;;;;;;;;;;;;;;;2;;Project: Test-Project;;0,000;0,00;1,000;0,00;0,00;0;0;0;0;0
CMXINV;165;2;0;1 Felix Hirschberg;0;10000;Herr;;Max;Muster;Company;;Street;123;City;DE;(0 40) 6 25 6;;(0 40) 6 25 6;[email protected];;;;;;;;0;20121217;20121217;1 Sofort ohne Abzug;EUR;1 Agentur;0 ;0,00;;"Vielen Dank für Ihren Auftrag.
Vereinbarungsgemäß berechnen wir Ihnen:
";"Mit besten Grüßen
Invoice Man";;0;0;0;0;;20 Offen;0,00;;0 ;0,00;0,00;;EXW;;;;;;;;;;;;;;;;0;1;"- job1 (1h)
- job2 (1h)
- job3 (0,75h)
- job4 (1h)
- job5 (0,5h)";HR;3,25;100,00;1,00;0,00;325,00;1;0;0;0;0
MESSAGE;S;210053;INVOICE_GET hat 1 Datensätze zurückgegeben
MESSAGE;S;204020;Datenübertragung erfolgreich. Es wurden 1 Datensätze verarbeitet.
Upvotes: 3
Views: 3239
Reputation: 21007
According to user comments in the PHP manual, both fgetcsv() and str_getcsv() should handle newlines inside quoted fields correctly.
You should probably take advantage of those implementations (they should already have solved any issue you might come across).
Or you could write your own parser (based on a comment there):
// $fp is assumed to be an open file handle, e.g. $fp = fopen($path, 'r');
// Browse the file one character after another
while (false !== ($c = fgetc($fp))) {
    // We are not inside a value here, so newline = new row
    if (($c == "\n") || ($c == "\r")) {
        // Newline: add the current row to the result
        continue;
    }
    // Other whitespace? Continue, do nothing
    if (ctype_space($c)) {
        continue;
    }
    // Okay, now we can use switch
    switch ($c) {
        case ',':
            // Add empty value (use ';' here for semicolon-separated data)
            break;
        // Enclosed (quoted) value
        case '"':
        case "'":
            $escapeChar = $c;
            $prevChar = '';
            $value = '';
            while (false !== ($c = fgetc($fp))) {
                // We just hit the closing quote, unless it is escaped by a preceding \
                if (($c == $escapeChar) && ($prevChar != '\\')) {
                    break;
                }
                // If we got \ and the previous character is \ = "blah blah \\"
                // Prevent an escaped backslash from being treated as an escape character
                if (($c == '\\') && ($prevChar == '\\')) {
                    $prevChar = '';
                } else {
                    $prevChar = $c;
                }
                $value .= $c;
            }
            // $value is your value
            break;
        // Normal, non-enclosed value:
        default:
            // Start with the character that was already read
            $value = $c;
            while (false !== ($c = fgetc($fp))) {
                // Double quotes are required: '\n' would be a literal backslash followed by n
                if (($c == ',') || ($c == "\n") || ($c == "\r")) {
                    // Note: the terminating character is consumed here
                    break;
                }
                $value .= $c;
            }
            // $value = your field value
            break;
    }
}
Upvotes: 1
Reputation: 50378
If you have the CSV input in a file, you can just use fgetcsv(), which will handle multi-line entries just fine.
If the CSV input is in a string, you can use the special php://temp I/O stream to efficiently pass it to fgetcsv():
$fp = fopen( 'php://temp', 'w+' );
fputs( $fp, $csv );
rewind( $fp );
$data = fgetcsv( $fp, 0, ';', '"' );
fclose( $fp );
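A single fgetcsv() call reads one record, so to get every record the call has to be repeated until it returns false. The following is only a minimal sketch of that idea, wrapping the php://temp approach in a helper. The function name csv_string_to_rows and the shortened sample data are assumptions made for illustration, and the escape character is passed explicitly so the call does not rely on its default.

```php
<?php
// Hypothetical helper (name is an assumption): parse a whole CSV string into rows.
function csv_string_to_rows(string $csv, string $delimiter = ';'): array
{
    // Write the string to a temporary stream so fgetcsv() can read it.
    $fp = fopen('php://temp', 'w+');
    fwrite($fp, $csv);
    rewind($fp);

    $rows = [];
    // Each call returns one record; quoted fields may span several lines.
    while (($row = fgetcsv($fp, 0, $delimiter, '"', '\\')) !== false) {
        // fgetcsv() returns [null] for a blank line; skip those.
        if ($row !== [null]) {
            $rows[] = $row;
        }
    }
    fclose($fp);

    return $rows;
}

// Shortened sample modelled on the data in the question.
$csv = "INV;165;\"Here is multiline-text.\nwith line-breaks:\n\";20 Offen;0,00\n"
     . "INV;166;\"Here is another multiline-text.\nwith line-breaks:\n\";20 Offen;0,00\n";

$rows = csv_string_to_rows($csv);
echo count($rows), "\n";   // number of records found
echo $rows[0][2], "\n";    // the multi-line field of the first record
```

Each element of $rows is then one record, with the line breaks of the quoted field preserved inside a single array entry.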
Upvotes: 0
Reputation: 4356
You could try this:
/;"(([^"]*)([\r\n])+([^"]*))+"/im
This will match the text before and after every newline within the ;" delimiters.
The second capture group will be the preceding text, and the fourth capture group will be the following text.
Note that I have left off the final ; so that this still matches if the multi-line value is the last field in the line.
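As a rough illustration of the regex route, here is a sketch that does not use the pattern above directly. Instead, it matches each double-quoted field as a whole with preg_replace_callback() and replaces the line breaks inside it, so that every record ends up on one line. It assumes quoted fields never contain an embedded double quote. The function name flatten_quoted_newlines, the replacement string and the sample data are hypothetical.

```php
<?php
// Hypothetical helper (name is an assumption): replace line breaks inside "..." fields only.
// Assumes quoted fields contain no embedded (escaped) double quotes.
function flatten_quoted_newlines(string $csv, string $replacement = ' '): string
{
    return preg_replace_callback(
        '/"[^"]*"/',
        function (array $m) use ($replacement): string {
            // $m[0] is one complete quoted field, quotes included.
            return preg_replace('/\r\n|\r|\n/', $replacement, $m[0]);
        },
        $csv
    );
}

// Shortened sample: two records, each with one multi-line quoted field.
$csv = "INV;165;\"line one\nline two\n\";20 Offen\nINV;166;\"other\r\ntext\";20 Offen\n";
$flat = flatten_quoted_newlines($csv);

// Every record is now on its own line, so a plain line split is safe.
foreach (explode("\n", trim($flat)) as $line) {
    $fields = str_getcsv($line, ';', '"', '\\');
    echo count($fields), "\n";
}
```

Unlike the fgetcsv() approach, this discards the original line breaks, so it only fits if they do not need to be preserved in the resulting array.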
Upvotes: 2