Reputation: 415
Using PHP, I am trying to split a string into chunks of 1800 characters or fewer. Each chunk should end at the last newline within its 1800 characters or, if there is no newline, after the last full word. Below is what I have so far, but it does not work: only $matches[0] has any content, and it holds just the first 1800 characters.
preg_match( "/[\s\S]{1,1800}$/m", $string, $matches );
Upvotes: 1
Views: 320
Reputation: 344
<?php
$text = <<<EOT
“…it has great practical value – you can wrap it around you for warmth as you bound across the
cold moons of Jaglan Beta; you can lie on it on the brilliant marble-sanded beaches of Santraginus V,
inhaling the heady sea vapours; you can sleep under it beneath the stars which shine so redly on the desert world of Kakrafoon; use it to sail a mini
raft down the slow heavy river Moth; wet it for use in
hand-to-hand-combat; wrap it round your head
to ward off noxious fumes or to avoid the gaze of the Ravenous Bugblatter
Beast of Traal (a mindboggingly stupid animal, it assumes
that if you can’t see it,
it can’t see you – daft as a bush, but very, very ravenous); you can wave your towel in emergencies as a
distress signal, and of course dry yourself off with it if it still seems to be clean enough.”
012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
1. There is a theory which states that if ever anyone discovers exactly what the Universe is for and why it is here, it will instantly disappear and be replaced by something even more bizarre and inexplicable. There is another theory which states that this has already happened.
2. Many were increasingly of the opinion that they’d all made a big mistake in coming down from the trees in the first place. And some said that even the trees had been a bad move, and that no one should ever have left the oceans.
3. “My doctor says that I have a malformed public-duty gland and a natural deficiency in moral fibre,” Ford muttered to himself, “and that I am therefore excused from saving Universes.”
4. The ships hung in the sky in much the same way that bricks don’t.
5. “You know,” said Arthur, “it’s at times like this, when I’m trapped in a Vogon airlock with a man from Betelgeuse, and about to die of asphyxiation in deep space that I really wish I’d listened to what my mother told me when I was young.”
“Why, what did she tell you?”
“I don’t know, I didn’t listen.”
6. “Space,” it says, “is big. Really big. You just won’t believe how vastly, hugely, mindbogglingly big it is. I mean, you may think it’s a long way down the road to the chemist’s, but that’s just peanuts to space.”
7. “Funny,” he intoned funereally, “how just when you think life can’t possibly get any worse it suddenly does.”
8. Isn’t it enough to see that a garden is beautiful without having to believe that there are fairies at the bottom of it too?
9. A common mistake that people make when trying to design something completely foolproof is to underestimate the ingenuity of complete fools.
10. Curiously enough, the only thing that went through the mind of the bowl of petunias as it fell was Oh no, not again. Many people have speculated that if we knew exactly why the bowl of petunias had thought that we would know a lot more about the nature of the Universe than we do now.
EOT;
$sp = strlen($text);
$ii = 0;
$chars = 179; // 1799
$split = array();
for ($i = 0; $i < $sp; $i++) {
    if ($ii == $chars) {
        $char0 = substr($text, $i-$ii, 1);
        if (preg_match("/\s/m", $char0)) { continue; }
        $char3 = substr($text, $i-$ii, $ii);
        if (preg_match("/\s/m", $char3)) {
            for ($iii = $i-1; $iii > 0; $iii--) {
                $char = substr($text, $iii, 1);
                if (preg_match("/\s/m", $char)) {
                    if ($iii-1 >= 0) {
                        $char1 = substr($text, $iii-1, 1);
                        if (preg_match("/\s/m", $char1)) { continue; }
                    }
                    $split[] = substr($text, $i-$ii, $ii-($i-$iii));
                    $ii = 0;
                    $i = $iii;
                    break;
                }
            }
        }
        else {
            $split[] = $char3;
            $ii = 0;
        }
    }
    $ii++;
}
if ($ii > 0) {
    while (preg_match("/\s/m", substr($text, $i-$ii, 1))) { $i++; }
    $iiii = 0;
    while (preg_match("/\s/m", substr($text, $sp-1-$iiii, 1))) { $iiii++; }
    $split[] = substr($text, $i-$ii, $ii - ($iiii>0?$iiii+1:0));
}
echo hrtime(true), PHP_EOL.PHP_EOL;
foreach ($split as $index => $part) {
    echo "Length [".$index."]: " . strlen($part) . "\r\n\r\n";
    echo $part . "\r\n\r\n";
}
echo "\n\n\n\n\n\n\n";
$reg = explode("[xxx]", preg_replace("~[\s\S]{1,".$chars."}\b(?!\w)~", "\$0[xxx]", $text));
echo hrtime(true), PHP_EOL.PHP_EOL;
foreach ($reg as $index => $part) {
    echo "Length [".$index."]: " . strlen($part) . "\r\n\r\n";
    echo $part . "\r\n\r\n";
}
?>
The code above compares a preg_replace() approach with my custom loop built on substr(). I'm still not sure I understand your goal exactly. Note that the preg version doesn't handle words of 1801 or more characters that contain no whitespace. 😁
http://sandbox.onlinephpfunctions.com/code/b09a6a3d6a74f7672e82be7f6b785a88ee68f9f0
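For the long-word case mentioned above, here is a minimal sketch of one possible fallback. It is not the code from the sandbox link. The function name chunk_text() is made up for illustration, and it assumes the limit is counted in bytes (strlen/substr), so a hard cut can land inside a multibyte character; the mb_* functions would be needed to count characters instead.

```php
<?php
// Hypothetical helper (not from the answer above): split $text into pieces of
// at most $limit bytes, preferring the last newline, then the last whitespace,
// and only cutting mid-word when a single word is longer than $limit.
function chunk_text(string $text, int $limit = 1800): array
{
    if ($limit < 1) {
        throw new InvalidArgumentException('limit must be at least 1');
    }
    $chunks = [];
    $text = trim($text);
    while (strlen($text) > $limit) {
        // Look at limit + 1 bytes so a separator directly after the chunk counts.
        $window = substr($text, 0, $limit + 1);
        $cut = strrpos($window, "\n");
        if ($cut === false) {
            // No newline: find the last whitespace character in the window.
            if (preg_match('/\s(?=\S*\z)/', $window, $m, PREG_OFFSET_CAPTURE)) {
                $cut = $m[0][1];
            } else {
                // No whitespace at all: hard cut at the limit.
                $cut = $limit;
            }
        }
        $chunks[] = rtrim(substr($text, 0, $cut));
        $text = ltrim(substr($text, $cut));
    }
    if ($text !== '') {
        $chunks[] = $text;
    }
    return $chunks;
}

// Small limit for demonstration only.
print_r(chunk_text("aaa bbb\nccc ddd eee", 10));
// ["aaa bbb", "ccc ddd", "eee"]
```

Because each remaining piece is left-trimmed before the next pass, every cut position is at least 1, so the loop always makes progress.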
Upvotes: 1
Reputation: 3845
You could:
- explode() $string into $parts based on newlines
- add each of the $parts to your $chunk until it would be bigger than 1800
- add that $chunk to an array of $chunks
- use $chunks as you wish
Upvotes: 0
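The steps in this last answer could be sketched roughly as follows. The function name chunk_by_lines() is an assumption made for illustration, lengths are counted in bytes, and a single line longer than the limit is kept as one oversized chunk, since the answer does not say how to split within a line.

```php
<?php
// Hypothetical helper illustrating the explode-and-accumulate idea.
function chunk_by_lines(string $string, int $limit = 1800): array
{
    $chunks = [];
    $chunk = '';
    foreach (explode("\n", $string) as $part) {
        // What the current chunk would look like with this line appended.
        $candidate = $chunk === '' ? $part : $chunk . "\n" . $part;
        if (strlen($candidate) > $limit && $chunk !== '') {
            // Appending would exceed the limit: store the chunk, start a new one.
            $chunks[] = $chunk;
            $chunk = $part;
        } else {
            $chunk = $candidate;
        }
    }
    if ($chunk !== '') {
        $chunks[] = $chunk;
    }
    return $chunks;
}

// Small limit for demonstration only.
print_r(chunk_by_lines("aaa\nbbb\nccc", 7));
// ["aaa\nbbb", "ccc"]
```

To meet the "after the last full word" requirement from the question, any chunk that still exceeds the limit would need a second pass that splits on whitespace.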