Reputation: 3073
I am very new to PHP, so I apologize for the seemingly simple question. I need to parse a line of text into different variables. More specifically, I need to parse many lines of text into different arrays. Each line of text would resemble the following:
timeStamp UserName* garbage text Number x item*
timeStamp UserName* garbage text Number x item*
timeStamp UserName* garbage text Number x item*
Both userName and item could contain spaces. I assume the best way to go about this would be 4 different arrays?
Actual data would look like the following:
03:12:34 mhopkins321 has acquired 5 x bottles of water
09:38:01 Nick Smith has acquired 100 x pennies
23:22:59 Fancy Frank has acquired 15684 x artichoke hearts
So I would assume the arrays would be:
$timeStamp  $userName    $amount  $items
03:12:34    mhopkins321  5        bottles of water
09:38:01    Nick Smith   100      pennies
23:22:59    Fancy Frank  15684    artichoke hearts
Upvotes: 1
Views: 229
Reputation: 7942
It looks like you are going to need a regular expression to split each line of text. Regular expressions are not easy to understand at first, but they are a tool you will need for other cases like the one you describe. Manual page: https://www.php.net/manual/en/book.pcre.php
You need to find patterns in the text. For example, does the timestamp always start at the very beginning of the line, and is it always 8 characters long?
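As a rough sketch of that idea, assuming the timestamp is always HH:MM:SS at the very start of the line (an assumption based only on the sample data in the question), the fixed-width prefix could be peeled off first. The names $line, $timeStamp, and $rest are illustrative:

```php
<?php
$line = '03:12:34 mhopkins321 has acquired 5 x bottles of water';

// Assumption: the timestamp is exactly 8 characters (HH:MM:SS) at the start of the line.
$timeStamp = null;
$rest = null;
if (preg_match('/^(\d{2}:\d{2}:\d{2})\s+(.*)$/', $line, $m)) {
    $timeStamp = $m[1]; // the 8-character timestamp
    $rest      = $m[2]; // everything after the timestamp, still to be parsed
}
```

The remainder in $rest still needs a second pattern to separate the user name from the amount and the item.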
Upvotes: 2
Reputation: 31621
This is a very bad format for machine parsing. Especially problematic is that names may have spaces but are not delimited.
The only foolproof way to parse this is to know all the "garbage text" strings that may appear between the name and the amount. Unless you have a complete list, you may mess up your user names.
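To illustrate what "knowing the garbage text" could look like, here is a minimal sketch that builds the pattern from a list of known phrases. Only "has acquired" comes from the question; "has sold" and the variable names are hypothetical placeholders:

```php
<?php
// Hypothetical list of connector phrases; only "has acquired" appears in the question.
$phrases = ['has acquired', 'has sold'];

// Escape each phrase so it is matched literally, then join them into an alternation.
$alternation = implode('|', array_map(function ($p) {
    return preg_quote($p, '/');
}, $phrases));

$re = '/^(\d{2}:\d{2}:\d{2})\s+(.+?)\s+(?:' . $alternation . ')\s+(\d+)\s+x\s+(.*)$/u';

$line = '09:38:01 Nick Smith has acquired 100 x pennies';
preg_match($re, $line, $m);
// $m[1] is the timestamp, $m[2] the user name, $m[3] the amount, $m[4] the item.
```

A user whose name itself contains one of the phrases would still be split in the wrong place, which is why the format remains fragile.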
It's possible to parse this by using explode() to split a line into an array and then extracting the parts. However, I think you should just use a regular expression.
$sample = "
03:12:34 mhopkins321 has acquired 5 x bottles of water
09:38:01 Nick Smith has acquired 100 x pennies
23:22:59 Fancy Frank has acquired 15684 x artichoke hearts
";
$re = '/^(?<timeStamp>[0-9]{2}:[0-9]{2}:[0-9]{2}) # timestamp
\s+
(?<userName>[\w\s]+) # user name
\s+(?:has\s+acquired)\s+ # garbage text between name and amount
(?<amount>\d+) # amount
\s+x\s+ # multiplication symbol
(?<items>.*?)\s*$ # item name (to end of line, without trailing whitespace)
/xmu';
preg_match_all($re, $sample, $matches, PREG_SET_ORDER);
var_export($matches);
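If you want the four separate arrays from the question, one possible follow-up (a sketch, assuming PHP 5.5+ for array_column) is to pull each named group out of the PREG_SET_ORDER result. The sample and pattern are repeated here so the snippet runs on its own:

```php
<?php
$sample = "
03:12:34 mhopkins321 has acquired 5 x bottles of water
09:38:01 Nick Smith has acquired 100 x pennies
23:22:59 Fancy Frank has acquired 15684 x artichoke hearts
";

$re = '/^(?<timeStamp>[0-9]{2}:[0-9]{2}:[0-9]{2}) # timestamp
\s+
(?<userName>[\w\s]+) # user name
\s+(?:has\s+acquired)\s+ # garbage text between name and amount
(?<amount>\d+) # amount
\s+x\s+ # multiplication symbol
(?<items>.*?)\s*$ # item name (to end of line)
/xmu';

preg_match_all($re, $sample, $matches, PREG_SET_ORDER);

// Each element of $matches is one line; array_column collects one named group across all lines.
$timeStamp = array_column($matches, 'timeStamp');
$userName  = array_column($matches, 'userName');
$amount    = array_map('intval', array_column($matches, 'amount')); // cast amounts to integers
$items     = array_column($matches, 'items');
```

The same index in each array then refers to the same input line. Lines that do not match the pattern are skipped silently, so you may want to compare count($matches) against the number of input lines.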
Upvotes: 2