Max

Reputation: 1349

Creating a Regular Expression pattern to match space delimited string

I have a file with a lot of rows (over 32k). The rows look like this:

34 Item
5423 11Item
44    Item

The leading digits are IDs. I want to build an associative array: array("34" => "Item", "5423" => "11Item", "44" => "Item")

  1. IDs can be 1 to 5 digits long (1 - 65366)
  2. The item name can start with a digit
  3. At least one space (but possibly more than one) separates the ID from the item name

So the delimiter is a run of one or more spaces. I'm using PHP.

Upvotes: 0

Views: 109

Answers (4)

Amal

Reputation: 76666

Use preg_match with named capturing groups:

preg_match('/^(?<id>\d+)\s+(?<name>[\w ]+)$/', $row, $matches);

$matches['id'] will contain the ID and $matches['name'] will contain the name.

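$arr = array();

foreach ($rows as $row) { // $rows is assumed to hold the lines of the file
    // Skip rows that do not look like "<id> <name>"
    if (!preg_match('/^(?<id>\d+)\s+(?<name>[\w ]+)$/', $row, $matches)) {
        continue;
    }

    $id = (int) $matches['id'];
    $name = $matches['name'];

    // Valid IDs run from 1 to 65366, inclusive
    if ($id >= 1 && $id <= 65366) {
        $arr[$id] = $name;
    }
}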

print_r($arr);

Example output:

Array
(
    [34] => Item
    [5423] => 11Item
    [44] => Item
    [3470] => BLABLA TEF2200
)

Upvotes: 1

Casimir et Hippolyte

Reputation: 89639

You can use this:

$data = <<<'LOD'
34 Item
5423 11Item
44    Item
546
65535 toto le héros
65536 belzebuth
glups  glips
LOD;

$result = array();

$line = strtok($data, "\r\n");

while($line!==false) {
    $tmp = preg_split('~\s+~', $line, 2, PREG_SPLIT_NO_EMPTY);
    if (count($tmp)==2 && $tmp[0]==(string)(int)$tmp[0] && $tmp[0]<65536)
        $result[$tmp[0]] = $tmp[1];
    $line = strtok("\r\n");
}
print_r($result);

Upvotes: 1

Maxime Lorant

Reputation: 36181

Here's a method that doesn't check the validity of the data but should work. It splits every line on whitespace and puts the results in a $res associative array.
For reference, preg_split() splits a string using a regex.

$res = array();
foreach ($lines as $line) {
    // Limit of 2: split only at the first run of whitespace, so a name may contain spaces
    $data = preg_split('/\s+/', trim($line), 2);
    $res[$data[0]] = $data[1];
}

If you do want to enforce your conditions, you can add an if statement with the ID limit:

$res = array();
foreach ($lines as $line) {
    $data = preg_split('/\s+/', trim($line), 2);
    if (count($data) < 2) continue; // skip lines that have no name part
    $idx = intval($data[0]);
    if ($idx >= 1 && $idx <= 65366) // skip lines where the ID seems invalid
        $res[$data[0]] = $data[1];
}
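
The snippets above take $lines as given. As a minimal sketch of one way it might be filled, assuming the rows sit in a plain-text file: file() with FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES returns one array element per non-empty line. The temporary file and the sample rows are stand-ins for the real data, used only so the example runs on its own.

```php
<?php
// Assumption: the rows live in a plain-text file. A temporary file stands in
// for the real one here so that the sketch is self-contained.
$path = tempnam(sys_get_temp_dir(), 'items');
file_put_contents($path, "34 Item\n5423 11Item\n44    Item\n\n3470 BLABLA TEF2200\n");

// file() returns one array element per line; the flags strip the trailing
// newline from each element and drop empty lines.
$lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
unlink($path);

$res = array();
foreach ($lines as $line) {
    // Limit of 2: split at the first run of whitespace only.
    $data = preg_split('/\s+/', $line, 2);
    if (count($data) === 2) {
        $res[$data[0]] = $data[1];
    }
}
print_r($res);
```

Note that PHP converts numeric string keys such as "34" to integer keys, so the resulting array is indexed by 34, 5423, 44 and 3470.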

Upvotes: 1

Ed Heal

Reputation: 60037

Use https://www.php.net/preg_split

For example:

preg_split("/ +/", $line);

It will return an array of strings.
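
To illustrate what that array looks like, here is a small sketch using a made-up row whose name contains a space. The optional third argument of preg_split() (the limit) is not part of the call above; it is added here as an assumption about what the question needs, so that only the first run of spaces splits the row.

```php
<?php
$line = "3470    BLABLA TEF2200"; // sample row, for illustration only

// Without a limit, every run of spaces is a split point.
$parts = preg_split("/ +/", $line);
// $parts is array("3470", "BLABLA", "TEF2200")

// With a limit of 2, only the first run of spaces splits the row,
// so the name keeps its inner space.
list($id, $name) = preg_split("/ +/", $line, 2);

$items = array();
$items[$id] = $name;
print_r($items);
```

With the limit in place, $id holds "3470" and $name holds "BLABLA TEF2200", which matches the ID => name shape asked for in the question.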

Upvotes: 0
