Reputation: 495
For educational purposes I'm trying to build a PUG Parser. I have two loops, one loops the lines of the PUG-Code, the other loops unclosed tags I saved last iteration.
In my code it seems like the inner loop is only run once, because when I put echo
statements for debugging I'm not getting enough output.
foreach ($lines as $line) {
#Increment the line number
$lineNum++;
#Save the current indentation for later use
$currentIndentation = strspn($line, " ");
#Go though list of unclosed tags and recursivley close all that need to be closed
foreach ($unclosed_tags as $lineNum => $tag) {
#Assign a variable to the tag indentation
$tagIndentation = $tag[1];
echo $tag[0] . "$tagIndentation:$currentIndentation=" . !($tagIndentation < $currentIndentation) . "<br>";
#Check if the current indentation is not smaller than the tag
if (!($tagIndentation < $currentIndentation)) {
#Close the tag
$output .= "</" . $tag[0] . ">";
#Remove the tag from unclosed list cince we just closed it
$unclosed_tags = array_diff($unclosed_tags, array($unclosed_tags[$key]));
}
}
#Get the current element (== the first string in a string)
$currentElement = preg_replace('/(\s*)([^\s]*)(.*)/', '$2', $line);
#Output as open tag
$output .= "<" . $currentElement . ">" . trim(str_replace_first($currentElement, "", $line));
#Add to list of unclosed tags
$unclosed_tags[$lineNum] = array($currentElement, $currentIndentation);
}
The sample .pug file I'm trying to parse looks like this:
html
head
title Test
body
h1 This page was defined using JADE (PUG)!
h2 This a Header 2
div
p <strong>This text will appear strong</strong>
hr
br
ul
li This
li is a
li test
ol
li <b>Ordered Test</b>
li for the win
The exported html looks like this
<html>
<head>
<title>Test</title>
<body>
<h1>This page was defined using JADE (PUG)!</h1>
<h2>This a Header 2</h2>
<div>
<p><strong>This text will appear strong</strong></p>
<hr/>
</p>
<br/>
</p>
<ul>
<li>This</li>
<li>is a</li>
<li>test</li>
<ol>
<li><b>Ordered Test</b></li>
<li>for the win</li>
Appart from closing the unclosed tags this is my complete algorithm, but it is not working since sometimes tags are not closed because the inner loop can't check if more tags need to be closed this cycle!
I am not looking for a existing library, but rather tips on why my code is not working and how to improve upon the code quality and execution time!
Upvotes: 2
Views: 1759
Reputation: 495
I worked out a solution. Most likely the problem was because of a wrongly declared array accessor.
foreach ($lines as $line) {
#Increment the line number
$lineNum++;
#Save the current indentation for later use
$currentIndentation = strspn($line, " ");
print_r($unclosed_tags);
#Go though list of unclosed tags and recursivley close all that need to be closed
foreach ($unclosed_tags as $tagIndentation => $tag) {
#Check if the current indentation is not smaller than the tag
if (!($tagIndentation < $currentIndentation)) {
#Close the tag
$output .= str_repeat(' ', $tagIndentation) . "</" . $tag . ">\n";
#Remove the tag from unclosed list cince we just closed it
unset($unclosed_tags[$lineNum]);
}
}
#Get the current element (== the first string in a string)
$currentElement = preg_replace('/(\s*)([^\s]*)(.*)/', '$2', $line);
#Output as open tag
$output .= str_repeat(' ', $currentIndentation) . "<" . $currentElement . ">" . trim(str_replace_first($currentElement, "", $line)) . "\n";
#Add to list of unclosed tags
$unclosed_tags[$currentIndentation] = $currentElement;
}
#Close all remaining unclosed tags
Here is the working code.
Feel free to use it in your project (Pay attention to StackOverflows License though)
Upvotes: 0
Reputation: 21
Break breaks you out of the foreach loop, it doesn't just skip you to the next instance in the loop. So in this case you can delete the else statement, this will just simply do nothing if it doesn't match the if.
If your Break is intentional then there would be data you're not expecting/handling correctly in the $unclosed_tags variable.
Upvotes: 0
Reputation: 15311
So I found something that works. Here is the code as an example and I'll explain the changes below:
foreach ($lines as $line) {
#Increment the line number
$lineNum++;
#Save the current indentation for later use
$currentIndentation = strspn($line, " ");
#Go though list of unclosed tags and recursivley close all that need to be closed
for($i=count($unclosed_tags)-1; $i>= 0; $i--){
$tag = $unclosed_tags[$i];
#Assign a variable to the tag indentation
$tagIndentation = $tag[1];
echo $tag[0] . "$tagIndentation:$currentIndentation=" . !($tagIndentation < $currentIndentation) . "<br>";
#Check if the current indentation is not smaller than the tag
if ($tagIndentation >= $currentIndentation) {
#Close the tag
$output .= "</" . $tag[0] . ">";
#Remove the tag from unclosed list cince we just closed it
unset($unclosed_tags[$i]);
} else {
break;
}
}
#Get the current element (== the first string in a string)
$currentElement = preg_replace('/(\s*)([^\s]*)(.*)/', '$2', $line);
#Output as open tag
$output .= "<" . $currentElement . ">" . trim(str_replace_first($currentElement, "", $line));
#Add to list of unclosed tags
$unclosed_tags[] = array($currentElement, $currentIndentation);
}
OK, so changes are:
$unclosed_tags
as just a straight append. You don't need to track what the line number is here and this allows us to use a for
loop as the inner loop.After that, the only thing that is left would be to loop over the remaining $unclosed_tags
in reverse order and close them. I guess you could also keep the line number stuff and foreach(array_reverse($unclosed_tags) as $i=>$line){ ...
with the same result as the for loop.
Upvotes: 1