Alexander Teitelboym

Reputation: 73

Splitting string by fixed length

I am looking for a way to split a Unicode alphanumeric string into fields of fixed lengths. For example:


    992000199821376John Smith          20070603

and the array should look like this:

Array (
 [0] => 99,
 [1] => 2,
 [2] => 00019982,
 [3] => 1376,
 [4] => "John Smith",
 [5] => 20070603
) 

The fields are defined as follows:

    Array[0] - Account type - must be 2 characters long,
    Array[1] - Account status - must be 1 character long,
    Array[2] - Account ID - must be 8 characters long,
    Array[3] - Account settings - must be 4 characters long,
    Array[4] - User Name - must be 20 characters long,
    Array[5] - Join Date - must be 8 characters long.

Upvotes: 7

Views: 2654

Answers (4)

Pavel Radzivilovsky

Reputation: 19114

It is not possible to split a Unicode string the way you ask, at least not without risking invalid parts.

Some user-perceived characters consist of several code points that must stay together. For example, שׁ is 2 code points (4 bytes in both UTF-8 and UTF-16). If you split between them, the second part starts with a combining mark that has no base letter, and the result is not meaningful.

When you work with Unicode, "character" is a very slippery term. There are code points, glyphs, etc. See more at http://www.utf8everywhere.org, the part on "length of a string".
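
To make the distinction concrete, here is a minimal sketch that measures the example above in bytes and in code points. It assumes PHP 7 or later (for the `\u{...}` escape) and the mbstring extension; the variable names are only illustrative.

```php
<?php
// Hebrew letter shin (U+05E9) followed by the combining shin dot (U+05C1).
$s = "\u{05E9}\u{05C1}";

$bytes      = strlen($s);              // length in bytes (UTF-8)
$codePoints = mb_strlen($s, 'UTF-8');  // length in code points

// Cutting after the first code point separates the letter from its mark.
$first = mb_substr($s, 0, 1, 'UTF-8'); // the bare letter
$rest  = mb_substr($s, 1, 1, 'UTF-8'); // the combining mark on its own
```

Here `$bytes` is 4 and `$codePoints` is 2, while a reader sees one character. Both halves are still valid UTF-8, but the second one no longer has a letter to attach to. If the intl extension is available, its grapheme functions are probably a better fit for counting user-perceived characters.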

Upvotes: 0

noj

Reputation: 6759

Or if you want to avoid preg:

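$string = '992000199821376John Smith          20070603';
// Field widths in characters, in the order the fields appear.
$intervals = array(2, 1, 8, 4, 20, 8);

$start = 0;
$parts = array();

foreach ($intervals as $length)
{
   // mb_substr counts characters (code points), not bytes.
   // The encoding is passed explicitly rather than relying on the default.
   $parts[] = mb_substr($string, $start, $length, 'UTF-8');

   $start += $length;
}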
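
The same loop can be wrapped in a small function that returns named, trimmed fields. This is only a sketch: `splitFixedWidth`, the `$layout` keys, and the decision to trim trailing padding are assumptions, not part of the question.

```php
<?php
// Hypothetical helper: $widths maps a field name to its width in characters.
function splitFixedWidth(string $line, array $widths, string $encoding = 'UTF-8'): array
{
    $fields = [];
    $offset = 0;

    foreach ($widths as $name => $width) {
        // Take $width code points, then strip the trailing padding.
        $fields[$name] = rtrim(mb_substr($line, $offset, $width, $encoding));
        $offset += $width;
    }

    return $fields;
}

$layout = [
    'type' => 2, 'status' => 1, 'id' => 8,
    'settings' => 4, 'name' => 20, 'date' => 8,
];

// str_pad() is enough here because the sample name is plain ASCII.
$line = '992000199821376' . str_pad('John Smith', 20) . '20070603';
$record = splitFixedWidth($line, $layout);
```

With the sample record, `$record['name']` is `John Smith` and `$record['id']` keeps its leading zeros because every field stays a string.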

Upvotes: 4

Thomas Moxon

Reputation: 73

The substr function would do this quite easily, as long as the data is single-byte (substr counts bytes, not characters).

$accountDetails = "992000199821376John Smith          20070603";
$accountArray = array(
    substr($accountDetails, 0, 2),
    substr($accountDetails, 2, 1),
    substr($accountDetails, 3, 8),
    substr($accountDetails, 11, 4),
    substr($accountDetails, 15, 20),
    substr($accountDetails, 35, 8)
);

That should do the trick. Other than that, regular expressions (as suggested by akond) are probably the way to go, and they are more flexible. (I figured this was still valid as an alternative option.)
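
For non-ASCII input the byte-based and character-based functions give different results. A short sketch of the difference, assuming the source file is saved as UTF-8 and the mbstring extension is loaded:

```php
<?php
$name = 'Николай';  // each Cyrillic letter takes 2 bytes in UTF-8

$byBytes = substr($name, 0, 3);             // 3 bytes: one letter plus half of the next
$byChars = mb_substr($name, 0, 3, 'UTF-8'); // 3 code points

// Check whether each result is still well-formed UTF-8.
$bytesValid = mb_check_encoding($byBytes, 'UTF-8');
$charsValid = mb_check_encoding($byChars, 'UTF-8');
```

Here `$byChars` is `Ник`, while `$byBytes` ends in the middle of a letter, so `$bytesValid` is false. For data that is guaranteed to be ASCII, such as the sample record in the question, the two approaches behave the same.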

Upvotes: 0

akond

Reputation: 16045

    $s = '992000199821376Николай Шмидт       20070603';

    if (preg_match('~(.{2})(.{1})(.{8})(.{4})(.{20})(.{8})~u', $s, $match))
    {
        list (, $type, $status, $id, $settings, $name, $date) = $match;
    }
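
If the layout changes often, the pattern can be generated from a list of widths instead of being written by hand. The following is a sketch under a few assumptions: `fixedWidthPattern` is a hypothetical helper, the pattern is anchored so that lines of the wrong length are rejected, and trailing padding is trimmed.

```php
<?php
// Hypothetical helper: builds an anchored pattern such as
// ~^(.{2})(.{1})...$~u from a list of field widths.
function fixedWidthPattern(array $widths): string
{
    $groups = array_map(function ($w) {
        return sprintf('(.{%d})', $w);
    }, $widths);

    return '~^' . implode('', $groups) . '$~u';
}

$name = 'Николай Шмидт';
// Pad to 20 characters by code-point count; str_pad() would count bytes.
$padded = $name . str_repeat(' ', 20 - mb_strlen($name, 'UTF-8'));
$line = '992000199821376' . $padded . '20070603';

$fields = [];
if (preg_match(fixedWidthPattern([2, 1, 8, 4, 20, 8]), $line, $match)) {
    // Drop the full match at index 0 and trim the padding.
    $fields = array_map('rtrim', array_slice($match, 1));
}
```

With the `u` modifier, `.` matches one code point rather than one byte, so the Cyrillic name is counted as 13 characters. It still does not treat a base letter plus combining mark as a single unit.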

Upvotes: 0
