Reputation: 7438
Before I start: I know this is CSV, and I know PHP has a built-in function for parsing it. I have the following pattern:
preg_match_all("/([^\"]|\"[^\"]*\")*?(r\n|\n\r|\r|\n)/i", $CSV, $Matches);
It should parse something like this:
Country,Region/State,City,"Zip/Postal Code\n From","Zip/Postal Code To","Weight From","Weight To","Shipping Price","Delivery Type"\n\r
CAN,*,,,,0.0000,4999.0000,29.7500,Priority\n\r
CAN,*,,,,10000.0000,19999.0000,35.5000,Express\n\r
CAN,*,,,,0.0000,4999.0000,19.7500,Express\n\r
CAN,*,,,,20000.0000,99999999.9999,59.0000,Priority\n\r
CAN,*,,,,5000.0000,9999.0000,34.7500,Priority\n\r
CAN,*,,,,20000.0000,99999999.9999,41.5000,Express\n\r
CAN,*,,,,5000.0000,9999.0000,24.4500,Express\n\r
CAN,*,,,,10000.0000,19999.0000,48.0000,Priority\n\r
CAN,*,,,,10000.0000,19999.0000,29.7500,Standard\n\r
CAN,*,,,,20000.0000,99999999.9999,36.5000,Standard\n\r
CAN,*,,,,500.0000,9999.0000,20.3500,Standard\n\r
CAN,*,,,,90.0000,499.0000,9.7500,Standard\n\r
CAN,*,,,,50.0000,89.0000,1.8000,Standard\n\r
CAN,*,,,,30.0000,49.0000,1.5000,Standard\n\r
CAN,*,,,,0.0000,29.0000,1.0000,Standard\n\r
USA,*,,,,20000.0000,99999999.9999,160.0000,Express\n\r
USA,*,,,,10000.0000,14999.0000,76.0000,Express\n\r
USA,*,,,,1000.0000,4999.0000,42.0000,Express\n\r
USA,*,,,,15000.0000,19999.0000,155.0000,Priority\n\r
USA,*,,,,5000.0000,9999.0000,94.0000,Priority\n\r
USA,*,,,,0.0000,999.0000,75.5000,Priority\n\r
USA,*,,,,15000.0000,19999.0000,98.0000,Express\n\r
USA,*,,,,5000.0000,9999.0000,61.5000,Express\n\r
USA,*,,,,0.0000,999.0000,40.0000,Express\n\r
USA,*,,,,20000.0000,99999999.9999,230.0000,Priority\n\r
USA,*,,,,10000.0000,14999.0000,120.0000,Priority\n\r
USA,*,,,,1000.0000,4999.0000,61.5000,Priority\n\r
USA,*,,,,500.0000,999.0000,25.5000,Standard\n\r
USA,*,,,,90.0000,499.0000,13.3500,Standard\n\r
USA,*,,,,50.0000,89.0000,3.0000,Standard\n\r
USA,*,,,,30.0000,49.0000,1.8000,Standard\n\r
USA,*,,,,0.0000,29.0000,1.5000,Standard\n\r
The result I get is similar to:
[2] => Array
(
)
[3] => Array
(
[0] => CAN
[1] => *
[2] =>
[3] =>
[4] =>
[5] => 10000.0000
[6] => 19999.0000
[7] => 35.5000
)
[4] => Array
(
)
[5] => Array
(
[0] => CAN
[1] => *
[2] =>
[3] =>
[4] =>
[5] => 0.0000
[6] => 4999.0000
[7] => 19.7500
)
[6] => Array
(
)
Even if I add ?: to the line-break group, it still does this. Can anyone help me? I am stuck here. Thanks.
Upvotes: 1
Views: 120
Reputation: 6959
Not knowing the particulars of PHP's regex matching, I'll take your word that the regex behaves the way you show (in my preferred regex tool it does not capture in the same way).
I'll assume you are trying to remove those blank matches. I'll also assume that those "newlines" are actually encoded into the input as control characters, and not left as literal \'s, \r's, and \n's.
The problem seems to be that the "newlines" are being matched twice. That is, you match just the \n on one pass, and then the \r on the next pass?
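To illustrate how a two-character line ending can be split across two matches, here is a minimal sketch. It assumes (this is a guess, not something the question confirms) that the file really uses "\r\n" endings, and it reuses the pattern exactly as posted, where the first alternative reads r\n rather than \r\n. The sample data and variable names are made up for the demonstration.

```php
<?php
// Assumption: the real file uses "\r\n" (CRLF) line endings.
$csv = "CAN,*,Priority\r\nUSA,*,Express\r\n";

// Pattern copied from the question. Its first alternative is "r\n",
// a literal letter r followed by a line feed, not "\r\n".
$asPosted = "/([^\"]|\"[^\"]*\")*?(r\n|\n\r|\r|\n)/i";
preg_match_all($asPosted, $csv, $m);
// Each record stops at "\r"; the leftover "\n" then matches on its own,
// which produces an empty entry after every real record.
var_dump(array_map('trim', $m[0]));

// With "\r\n" listed first, the pair is consumed as a single terminator.
$fixed = "/(?:[^\"]|\"[^\"]*\")*?(?:\r\n|\n\r|\r|\n)/";
preg_match_all($fixed, $csv, $m2);
var_dump(array_map('trim', $m2[0]));
```

Under that assumption, the first call yields four matches with every second one empty, and the second call yields only the two records.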
The simplest solution would be to restrict the newline to be the type you know the file has: /([^\"]|\"[^\"]*\")*?(\n\r)/
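A rough sketch of that restricted pattern in PHP follows. It assumes every record ends in a real "\n\r" pair, as the sample in the question suggests; if the file turns out to use "\r\n", swap the terminator accordingly. The sample data is shortened and hypothetical, and the groups are made non-capturing so that only whole records are returned.

```php
<?php
// Sample input, assuming every record ends in a real "\n\r" byte pair
// (double-quoted PHP strings turn \n and \r into control characters).
$csv = "Country,\"Zip/Postal Code\n From\",\"Delivery Type\"\n\r"
     . "CAN,,Priority\n\r"
     . "USA,,Express\n\r";

// Same record body as in the question, but only one terminator is allowed.
// Quoted fields are consumed whole, so the "\n" inside the header is kept.
$pattern = "/(?:[^\"]|\"[^\"]*\")*?(?:\n\r)/";

preg_match_all($pattern, $csv, $matches);

// Strip the two-character terminator from each matched record.
$records = array_map(function ($record) {
    return substr($record, 0, -2);
}, $matches[0]);

print_r($records);
```

Each element of $records is then one complete line, including the header whose quoted field contains a line feed.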
Does this help?
Alternatively, I would just use a regex split (delimited by comma) on each line of input.
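The split-based approach might look something like the sketch below. It assumes "\n\r" terminators and no line breaks inside quoted fields other than ones that differ from the terminator; the lookahead that skips commas inside quotes is my addition, since a plain comma split would break quoted fields that contain commas. Doubled quotes inside a field are not handled. The sample data is hypothetical.

```php
<?php
// Hypothetical sample, assuming "\n\r" terminators as in the question.
$csv = "Country,\"Weight From\",\"Delivery Type\"\n\r"
     . "CAN,0.0000,Priority\n\r"
     . "USA,,\"Express, tracked\"\n\r";

$rows = [];
// PREG_SPLIT_NO_EMPTY drops the empty piece after the final terminator.
foreach (preg_split("/\n\r/", $csv, -1, PREG_SPLIT_NO_EMPTY) as $line) {
    // Split only on commas followed by an even number of quotes,
    // i.e. commas that are not inside a quoted field.
    $fields = preg_split('/,(?=(?:[^"]*"[^"]*")*[^"]*$)/', $line);
    // Remove the surrounding quotes from quoted fields.
    $rows[] = array_map(function ($field) {
        return trim($field, '"');
    }, $fields);
}

print_r($rows);
```

Empty fields come back as empty strings, so each row keeps the same number of columns as the header.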
Upvotes: 1