Mitch

Reputation: 127

Parse filename without delimiting characters into segments

I'm working on a project where I downloaded thousands of laws from the government's website (legality all in order). I am trying to split the file names so I can sort them better on my website. The files are named like this:

48600003801.html

I am running a foreach loop over the result of scandir(). There are more than 20,000 files named this way. I want to split each name as follows:

Chapter: 486
Section: 380

      CH.   ART.  SEC.
Split 486 | 000 | 0380 | 1

// PHP code
$files = scandir('stathtml/');
foreach ($files as $file) {
    // Change this to a split function and then echo certain parts
    // of it out to test.
    echo $file . "<br>";
}

How would I go about splitting a string like this, given that the names are not all the same length?

Upvotes: 1

Views: 1010

Answers (2)

mickmackusa

Reputation: 47894

sscanf() is an ideal tool for parsing filenames that do not contain delimiters. Specify the length and type of each substring by customizing the placeholders.

%s placeholders greedily match non-whitespace characters. %d greedily matches digits and returns the match as an int (so leading zeros are dropped). A maximum length can be set by inserting a number between the % and the type letter.

$file = '48600003801.html';

sscanf($file, '%03s%03s%04s%d', $chp, $art, $sec, $num);
var_export([$chp, $art, $sec, $num]);

sscanf($file, '%03d%03d%04d%d', $chp, $art, $sec, $num);
var_export([$chp, $art, $sec, $num]);

Outputs:

array (
  0 => '486',
  1 => '000',
  2 => '0380',
  3 => 1,
)

array (
  0 => 486,
  1 => 0,
  2 => 380,
  3 => 1,
)

If you'd rather have a flat array directly, sscanf() returns the same values as an array when you omit the variables from the third parameter onward:

$result = sscanf($file, '%03s%03s%04s%d');
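
To tie this back to the loop in the question, here is a minimal sketch of the array-returning form wrapped in a function. parseStatuteFilename() is a hypothetical helper name, not part of PHP, and the hard-coded list stands in for what scandir('stathtml/') would return. It assumes every statute file follows the 3 | 3 | 4 | rest digit layout shown in the question.

```php
<?php
// Hypothetical helper: returns the four parts as integers, or null when
// the name does not match the assumed 3 | 3 | 4 | rest digit layout.
function parseStatuteFilename(string $file): ?array
{
    $parts = sscanf($file, '%03s%03s%04s%d');

    // sscanf() leaves null in the slots it could not fill.
    if (!is_array($parts) || in_array(null, $parts, true)) {
        return null;
    }

    [$chp, $art, $sec, $num] = $parts;

    // %s accepts any non-whitespace, so confirm the fixed-width parts are digits.
    if (!ctype_digit($chp . $art . $sec)) {
        return null;
    }

    return [
        'chapter' => (int) $chp,
        'article' => (int) $art,
        'section' => (int) $sec,
        'number'  => $num,
    ];
}

// Stand-ins for the entries scandir('stathtml/') would return.
foreach (['.', '..', '48600003801.html'] as $file) {
    $parsed = parseStatuteFilename($file);
    if ($parsed === null) {
        continue; // skips "." and ".." and anything that does not fit
    }
    echo "Chapter: {$parsed['chapter']}, Section: {$parsed['section']}<br>\n";
}
```

For 48600003801.html this prints Chapter: 486, Section: 380. Returning null for names that do not fit keeps the "." and ".." entries from scandir() out of the results.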

Upvotes: 0

bugs2919

Reputation: 358

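Try this:

// PHP code
$files = scandir('stathtml/');
foreach ($files as $file) {
    // Skip anything that is not a statute file (e.g. "." and "..").
    if (substr($file, -5) !== '.html') {
        continue;
    }
    $chapter = substr($file, 0, 3); // first three digits
    $section = substr($file, 6, 4); // four digits after the 3-digit article
    echo $file . "<br>";
    echo "section " . $section . "<br>";
    echo "chapter " . $chapter . "<br>";
}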
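
Since the goal in the question is sorting, here is a minimal sketch of splitting by fixed offsets with substr() and sorting on the numeric parts. splitByOffsets() is a hypothetical helper, the 3 | 3 | 4 | rest offsets are taken from the question, and the second and third sample names are made up for illustration. It needs PHP 7.4+ for the arrow function.

```php
<?php
// Hypothetical helper: split a name by fixed offsets (3 | 3 | 4 | rest).
function splitByOffsets(string $file): array
{
    $digits = basename($file, '.html'); // drop the extension

    return [
        (int) substr($digits, 0, 3), // chapter
        (int) substr($digits, 3, 3), // article
        (int) substr($digits, 6, 4), // section
        (int) substr($digits, 10),   // trailing number, any length
    ];
}

// Sample names; only the first one comes from the question.
$files = ['48600003802.html', '486000038010.html', '48600003801.html'];

// Equal-length arrays compare element by element, so chapter is compared
// first, then article, then section, then the trailing number.
usort($files, fn ($a, $b) => splitByOffsets($a) <=> splitByOffsets($b));

print_r($files);
```

Comparing the parts as integers matters for the trailing number: a plain string sort would place 486000038010.html before 48600003802.html, while this orders them 1, 2, 10.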

Upvotes: 1
