mhopkins321

Reputation: 3073

Parsing line of text into different variables using php

I am very new to PHP, so I apologize for the seemingly simple question. I need to parse a line of text into different variables. More specifically, I need to parse many lines of text into different arrays. Each line of text would resemble the following:

timeStamp UserName* garbage text Number x item*
timeStamp UserName* garbage text Number x item*
timeStamp UserName* garbage text Number x item*

Both userName and item could contain spaces. I would assume the best way to go about this would be 4 different arrays?

Actual data would look like the following:

03:12:34 mhopkins321 has acquired 5 x bottles of water
09:38:01 Nick Smith has acquired 100 x pennies
23:22:59 Fancy Frank has acquired 15684 x artichoke hearts

So I would assume the arrays would be:

$timeStamp         $userName        $amount     $items
03:12:34           mhopkins321      5           bottles of water
09:38:01           Nick Smith       100         pennies
23:22:59           Fancy Frank      15684       artichoke hearts

Upvotes: 1

Views: 229

Answers (2)

cavila

Reputation: 7942

It looks like you are going to need a regular expression to split each line of text. Regular expressions are not easy to understand at first, but they are a tool you will need for other cases like the one you describe. Manual page: https://www.php.net/manual/en/book.pcre.php

You need to find patterns in the text. For example, does the timestamp always start at the very beginning of the line, and is it always 8 characters long?
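
If the answer to that question is yes, the timestamp can be cut off by position alone. A minimal sketch, assuming an 8-character HH:MM:SS prefix followed by exactly one space (the variable names are only illustrative):

```php
<?php
// Assumption: every line starts with an 8-character HH:MM:SS timestamp
// followed by a single space.
$line = '09:38:01 Nick Smith has acquired 100 x pennies';

$timeStamp = substr($line, 0, 8); // first 8 characters
$rest      = substr($line, 9);    // everything after the separating space

// Sanity check that the prefix really looks like a timestamp.
if (preg_match('/^\d{2}:\d{2}:\d{2}$/', $timeStamp) !== 1) {
    throw new UnexpectedValueException("Unexpected line format: $line");
}

echo $timeStamp, "\n"; // 09:38:01
echo $rest, "\n";      // Nick Smith has acquired 100 x pennies
```

This only handles the fixed-width part; the user name and item still need a pattern of their own.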

Upvotes: 2

Francis Avila

Reputation: 31621

This is a very bad format for machine parsing. Especially problematic is that names may have spaces but are not delimited.

The only foolproof way to parse this is to know all the "garbage text" strings that may appear between the name and the amount. Unless you have a complete list, you may mess up your user names.
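
If you do have such a list, one way to use it is to build the alternatives into the pattern. This is only a sketch: "has sold" is a made-up phrase for illustration, since "has acquired" is the only one that appears in your data.

```php
<?php
// Hypothetical phrase list: only "has acquired" appears in the question's data.
$phrases = ['has acquired', 'has sold'];

// Escape each phrase so it is matched literally, then join them with "|".
$alternation = implode('|', array_map(function ($phrase) {
    return preg_quote($phrase, '/');
}, $phrases));

// Assumes single spaces between fields; the lazy .+? stops the user name
// at the first known phrase.
$re = '/^(?<timeStamp>\d{2}:\d{2}:\d{2}) (?<userName>.+?) (?:' . $alternation
    . ') (?<amount>\d+) x (?<items>.+)$/';

$line = '09:38:01 Nick Smith has sold 100 x pennies';
if (preg_match($re, $line, $m) === 1) {
    echo $m['userName'], ' | ', $m['amount'], ' | ', $m['items'], "\n";
}
```

A user name that itself contains one of the phrases would still be split in the wrong place, which is why the format is fragile.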

It's possible to parse this using explode() to split a line into an array and then extracting parts. However, I think you should just use a regular expression.
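
For comparison, the explode() route might look roughly like this. It assumes the literal separators " has acquired " and " x " and a single space after the timestamp, so treat it as a sketch rather than a robust parser:

```php
<?php
// Assumptions: one space after the timestamp, and the literal separators
// " has acquired " and " x " appear in every line.
$line = '23:22:59 Fancy Frank has acquired 15684 x artichoke hearts';

// A limit of 2 keeps any later occurrence of the separator in the tail.
list($timeStamp, $rest) = explode(' ', $line, 2);
list($userName, $rest)  = explode(' has acquired ', $rest, 2);
list($amount, $items)   = explode(' x ', $rest, 2);

var_export([$timeStamp, $userName, (int) $amount, $items]);
```

Note that a line that does not contain a separator leaves explode() with a one-element array, so real code would need to check for that before unpacking.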

$sample = "
03:12:34 mhopkins321 has acquired 5 x bottles of water
09:38:01 Nick Smith has acquired 100 x pennies
23:22:59 Fancy Frank has acquired 15684 x artichoke hearts
";

$re = '/^(?<timeStamp>[0-9]{2}:[0-9]{2}:[0-9]{2}) # timestamp 
         \s+
         (?<userName>[\w\s]+)        # user name
         \s+(?:has\s+acquired)\s+    # garbage text between name and amount
         (?<amount>\d+)              # amount
         \s+x\s+                     # multiplication symbol
         (?<items>.*)\s*$            # item name (to end of line)
       /xmu';

preg_match_all($re, $sample, $matches, PREG_SET_ORDER);

var_export($matches);
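
With PREG_SET_ORDER you get one entry per line rather than the four arrays the question asks for. If you really want separate arrays, array_column() can pull them out afterwards. The sketch below repeats the sample and pattern so it runs on its own; converting the amount with intval is my assumption about what you want.

```php
<?php
$sample = "
03:12:34 mhopkins321 has acquired 5 x bottles of water
09:38:01 Nick Smith has acquired 100 x pennies
23:22:59 Fancy Frank has acquired 15684 x artichoke hearts
";

$re = '/^(?<timeStamp>[0-9]{2}:[0-9]{2}:[0-9]{2}) # timestamp
         \s+
         (?<userName>[\w\s]+)        # user name
         \s+(?:has\s+acquired)\s+    # garbage text between name and amount
         (?<amount>\d+)              # amount
         \s+x\s+                     # multiplication symbol
         (?<items>.*)\s*$            # item name (to end of line)
       /xmu';

preg_match_all($re, $sample, $matches, PREG_SET_ORDER);

// Each element of $matches is one parsed line; pick one named group per array.
$timeStamp = array_column($matches, 'timeStamp');
$userName  = array_column($matches, 'userName');
$amount    = array_map('intval', array_column($matches, 'amount'));
$items     = array_column($matches, 'items');

var_export($userName);
```

Alternatively, dropping PREG_SET_ORDER makes preg_match_all() use its default PREG_PATTERN_ORDER, where $matches['userName'] and the other named groups are already arrays with one element per line.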

Upvotes: 2
