Mark Fruhling

Reputation: 606

Any Elegant Ideas on how to parse this Dataset?

I'm using PHP 5.3 to receive a dataset from a web service call that brings back information on one or more transactions. The fields of each transaction are delimited by a pipe (|), and the transactions themselves are separated by a space.

2109695|49658|25446|4|NSF|2010-11-24 13:34:00Z 2110314|45276|26311|4|NSF|2010-11-24 13:34:00Z 2110311|52117|26308|4|NSF|2010-11-24 13:34:00Z (etc)

Doing a simple split on space doesn't work because of the space in the datetime stamp. I know regex well enough to know that there are always different ways to break this down, so I thought getting a few expert opinions would help me come up with the most airtight regex.

Upvotes: 3

Views: 167

Answers (4)

Napas

Reputation: 2811

Use the explode('|', $data) function.

Upvotes: 1

codaddict

Reputation: 455400

If each timestamp is going to have a Z at the end, you can use a positive lookbehind assertion to split on a space only if it's preceded by a Z:

$transaction = preg_split('/(?<=Z) /',$input);

Once you get the transactions, you can split them on | to get the individual parts.
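
The two steps above could be sketched roughly as follows. This is only an illustration using the sample data from the question; the variable names $input, $transactions and $rows are placeholders, and array() is used to stay compatible with PHP 5.3.

```php
<?php
// Sample input copied from the question (three transactions).
$input = '2109695|49658|25446|4|NSF|2010-11-24 13:34:00Z '
       . '2110314|45276|26311|4|NSF|2010-11-24 13:34:00Z '
       . '2110311|52117|26308|4|NSF|2010-11-24 13:34:00Z';

// Step 1: split on a space only when it is preceded by a Z.
// The lookbehind does not consume the Z, so each timestamp keeps it.
$transactions = preg_split('/(?<=Z) /', $input);

// Step 2: split every transaction into its pipe-delimited fields.
$rows = array();
foreach ($transactions as $transaction) {
    $rows[] = explode('|', $transaction);
}
```

Each entry of $rows is then a plain numeric array of six fields, with the timestamp (including its internal space) in the last position.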

Note that if your data has a Z followed by a space anywhere other than the timestamp, the above logic will fail. To overcome that, you can split on a space only if it's preceded by a timestamp pattern:

$transaction = preg_split('/(?<=\d\d:\d\d:\d\dZ) /',$input);
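
To see the difference between the two patterns, here is a small comparison on made-up data. The 'BIZ ACCT' field is a hypothetical value invented for this example (it is not from the question); it simply contains a Z followed by a space outside the timestamp.

```php
<?php
// Hypothetical input: the fifth field of the first record contains "Z ".
$input = '1|2|3|4|BIZ ACCT|2010-11-24 13:34:00Z '
       . '5|6|7|8|NSF|2010-11-24 13:35:00Z';

// Loose pattern: also splits inside "BIZ ACCT", giving a bogus extra piece.
$loose = preg_split('/(?<=Z) /', $input);

// Strict pattern: only splits after something shaped like hh:mm:ssZ.
$strict = preg_split('/(?<=\d\d:\d\d:\d\dZ) /', $input);
```

With this input the loose pattern yields three pieces while the strict one yields the intended two records.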

Upvotes: 4

Galen

Reputation: 30170

Each timestamp is going to have a Z at the end, so explode it by 'Z '. You don't need a regular expression. There's no chance that the date has a Z after it, only the time.
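
A rough sketch of this approach on the question's sample data is below. One thing to be aware of: explode() consumes the 'Z ' delimiter, so every record except the last loses its trailing Z. The loop that puts it back is an addition made for this sketch, and it assumes every record really does end in a timestamp.

```php
<?php
// Sample input copied from the question (three transactions).
$data = '2109695|49658|25446|4|NSF|2010-11-24 13:34:00Z '
      . '2110314|45276|26311|4|NSF|2010-11-24 13:34:00Z '
      . '2110311|52117|26308|4|NSF|2010-11-24 13:34:00Z';

// Split on the literal two-character delimiter "Z ".
$records = explode('Z ', $data);

// Normalise: strip any Z that survived (the last record) and
// append exactly one Z to every record.
foreach ($records as $i => $record) {
    $records[$i] = rtrim($record, 'Z') . 'Z';
}
```

If you don't need the Z in the timestamp at all, the loop can instead just rtrim() each record.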

Upvotes: 1

ircmaxell

Reputation: 165271

As others have said, if you know for sure that there will be no Z characters anywhere other than in the date, you could just do:

$records = explode('Z ', $data);

But if you have them elsewhere, you'll need to do something a bit fancier.

$regex = '#(?<=\d{2}:\d{2}:\d{2}Z)\s#i';
$records = preg_split($regex, $data, -1, PREG_SPLIT_NO_EMPTY);

Basically, that regex looks for the time portion (00:00:00) followed by a Z, then splits on the white-space character that follows it.
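
As a quick illustration of what \s and PREG_SPLIT_NO_EMPTY buy you, here is a sketch where the records are separated by a newline and the input has a stray trailing space. That input shape is an assumption made for this example; the question only shows single spaces between records.

```php
<?php
// Assumed input: newline between records, plus a trailing space.
$data = "2109695|49658|25446|4|NSF|2010-11-24 13:34:00Z\n"
      . "2110314|45276|26311|4|NSF|2010-11-24 13:34:00Z ";

// \s matches any whitespace character (space, tab, newline), but only
// directly after an hh:mm:ssZ time, so the space inside the timestamp
// itself is left alone.
$regex = '#(?<=\d{2}:\d{2}:\d{2}Z)\s#i';

// PREG_SPLIT_NO_EMPTY drops the empty piece that the trailing space
// would otherwise produce.
$records = preg_split($regex, $data, -1, PREG_SPLIT_NO_EMPTY);
```

Without the flag, the result would contain a third, empty string after the last record.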

Upvotes: 1
