asankasri

Reputation: 486

Parse an ini-formatted string to generate an associative array

I have a string like the one below.

$str = "ENGINE=InnoDB 
        DEFAULT CHARSET=utf8 
        COLLATE=utf8_unicode_ci 
        COMMENT='Table comment'";

I need to parse the key/value pairs from the string into an associative array like the one below:

$arr = array(
    "ENGINE" => "InnoDB",
    "DEFAULT CHARSET" => "utf8",
    "COLLATE" => "utf8_unicode_ci",
    "COMMENT" => "'Table comment'"
);

The order of the parts in the string can vary.

Example:

$str = "ENGINE=InnoDB
        COMMENT='Table comment'
        COLLATE=utf8_unicode_ci
        DEFAULT CHARSET=utf8";

Upvotes: 1

Views: 91

Answers (3)

zedfoxus

Reputation: 37059

Here's a verbose, inelegant way of parsing the data (with plenty of explanatory comments). It relies on knowing the list of keys in advance, and could be adapted for a differently structured string.

<?php

$str = "ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci COMMENT='Table comment'";
$keys = array('ENGINE', 'DEFAULT CHARSET', 'COLLATE', 'COMMENT');

$str_array = explode('=', $str); 
/* result of the above will be
[0] => ENGINE
[1] => InnoDB DEFAULT CHARSET
[2] => utf8 COLLATE
[3] => utf8_unicode_ci COMMENT
[4] => 'Table comment' 
*/

$output = array();
$lastkey = '';

// loop through each split item
foreach ($str_array as $item) {

    // if the item is entirely one of the keys, just remember it as the last key
    if (in_array($item, $keys)) {
        $lastkey = $item;
        continue;
    }

    // check if item like InnoDB DEFAULT CHARSET contains one of the keys
    // if item contains a key, the key will be returned
    // Otherwise, item will be returned
    $result = item_has_a_key($item, $keys);

    if ($result === $item) {
        // if the result is exactly the item, that means no key was found in the item
        // that means, it is the value of the previously found key
        $output[$lastkey] = $item;
    } else {    
        // if the result is not exactly the item, that means it contained one of the keys
        // strip out the key leaving only the value. Assign the value to previously found key
        $output[$lastkey] = trim(str_replace($result, '', $item));

        // remember the key that was found
        $lastkey = $result;
    }
}

print_r($output);
/*
Result:
[ENGINE] => InnoDB
[DEFAULT CHARSET] => utf8
[COLLATE] => utf8_unicode_ci
[COMMENT] => 'Table comment'
*/


// $item can be InnoDB DEFAULT CHARSET
// $keys is the array of keys you have assigned (ENGINE, DEFAULT CHARSET etc.)
// if the item contains one of the keys, the key will be returned
// if the item contains no key, the item will be returned
function item_has_a_key($item, $keys) {
    foreach ($keys as $key) {
        if (strpos($item, $key) !== false) {
            return $key;
        }
    }
    return $item;
}
?>
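
As one possible adaptation of the known-keys idea above, here is a minimal sketch that swaps the explode() loop for preg_split(). The function name parse_with_known_keys is hypothetical, not part of the original answer, and the sketch assumes that no value contains a known key followed by an equals sign.

```php
<?php
// Hypothetical helper: splits the string on "KEY=" for each known key.
// Assumption: no value contains a known key followed by "=".
function parse_with_known_keys(string $str, array $keys): array
{
    // Longest keys first, so a longer key is preferred over a shorter one it contains.
    usort($keys, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    // Escape each key so it is matched literally inside the pattern.
    $alternation = implode('|', array_map(function ($key) {
        return preg_quote($key, '/');
    }, $keys));

    // With PREG_SPLIT_DELIM_CAPTURE the captured keys are kept in the result:
    // [text before first key, key, value, key, value, ...]
    $parts = preg_split('/(' . $alternation . ')=/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);

    $output = [];
    for ($i = 1; $i + 1 < count($parts); $i += 2) {
        $output[$parts[$i]] = trim($parts[$i + 1]);
    }
    return $output;
}

$str = "ENGINE=InnoDB
        COMMENT='Table comment'
        COLLATE=utf8_unicode_ci
        DEFAULT CHARSET=utf8";
print_r(parse_with_known_keys($str, ['ENGINE', 'DEFAULT CHARSET', 'COLLATE', 'COMMENT']));
```

Because the values are trimmed, this works for both the single-line and the multi-line form of the string, in any key order.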

Upvotes: 0

voodoo417

Reputation: 12101

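The string looks like an INI file, so parse_ini_string can handle it. Note that with the line break placed after DEFAULT, as below, the key becomes CHARSET rather than DEFAULT CHARSET, and the quotes around the comment are stripped: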

$str = "ENGINE=InnoDB DEFAULT 
        CHARSET=utf8 
        COLLATE=utf8_unicode_ci 
        COMMENT='Table comment'";

$data = parse_ini_string($str);
var_dump($data);

array(4) {
   ["ENGINE"]=>
   string(14) "InnoDB DEFAULT"
   ["CHARSET"]=>
   string(4) "utf8"
   ["COLLATE"]=>
   string(15) "utf8_unicode_ci"
   ["COMMENT"]=>
   string(13) "Table comment"
}

Upvotes: 1

David Boskovic

Reputation: 1519

You should use preg_match_all() and have PHP build the output in the format you want. Here's a working example in PHP, followed by an explanation of the regex.

<?php
    $str = "ENGINE=InnoDB COMMENT='Table comment' COLLATE=utf8_unicode_ci DEFAULT CHARSET=utf8";
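    // Backslashes are doubled so that, after PHP unescapes the double-quoted
    // string, PCRE receives [^'\\] and \\. (a backslash followed by any character).
    preg_match_all("/([\w ]+)=(\w+|'(?:[^'\\\\]|\\\\.)+')\s*/", $str, $matches, PREG_SET_ORDER);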
    $out = [];
    foreach($matches as $match) {
        $out[$match[1]] = $match[2];
    }
    var_dump($out);
?>

And the result:

array(4) {
  ["ENGINE"]=>
  string(6) "InnoDB"
  ["COMMENT"]=>
  string(15) "'Table comment'"
  ["COLLATE"]=>
  string(15) "utf8_unicode_ci"
  ["DEFAULT CHARSET"]=>
  string(4) "utf8"
}

Explanation of regex

([\w ]+) // match one or more word characters (letters, digits, underscore) or spaces
= // match equals sign
  (
      \w+ // match one or more word characters
   | // or
      ' // match one exact quote character
      (?:[^'\\]|\\.)+ // match any character that is not a quote or backslash, or a backslash-escaped character (such as \')
      ' // match one exact quote character
   )
\s* // match any amount of whitespace until next match
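
The pattern explained above can be wrapped in a small function, sketched below. The function name parse_table_options is hypothetical. The pattern is written in a nowdoc so that the backslashes reach the regex engine exactly as shown in the explanation, without a second layer of PHP string escaping.

```php
<?php
// Hypothetical helper built around the regex explained above.
function parse_table_options(string $str): array
{
    // Nowdoc: no escape processing, so the pattern is passed to PCRE verbatim.
    $pattern = <<<'REGEX'
/([\w ]+)=(\w+|'(?:[^'\\]|\\.)+')\s*/
REGEX;

    $out = [];
    if (preg_match_all($pattern, trim($str), $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            // $match[1] is the key, $match[2] the value (quotes are kept).
            $out[trim($match[1])] = $match[2];
        }
    }
    return $out;
}

// Multi-line input with an escaped quote inside the comment.
$str = "ENGINE=InnoDB
        COMMENT='It\\'s a table'
        DEFAULT CHARSET=utf8";
var_dump(parse_table_options($str));
```

Unquoted values are limited to word characters here, so a value containing a hyphen or a dot would need a wider character class.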

Upvotes: 6
