PostScript execution of specific types (Array)

Question

I'm having a hard times with executing arrays.

PostScript Language Reference Manual page 50:

An executable array or executable packed array (procedure) object is pushed on the operand stack if it is encountered directly by the interpreter. If it is invoked indirectly as a result of executing some other object (a name or an operator), it is called instead. The interpreter calls a procedure by pushing it on the execution stack and then executing the array elements in turn. When the interpreter reaches the end of the procedure, it pops the procedure object off the execution stack. (Actually, it pops the procedure object when there is one element remaining and then pushes that element; this permits unlimited depth of “tail recursion” without overflowing the execution stack.)

The Green Book page 33:

The PostScript interpreter executes array objects one element at a time, leaving the unused part of the array on the execution stack. The interpreter always checks the remainder of the procedure body before it executes the first element. If the procedure body is empty, the interpreter discards it. This permits infinitely deep tail recursion, since the empty procedure bodies will not accumulate on the execution stack. (Tail recursion means that the recursive procedure call is the very last element (the tail) of the procedure body.)

As I read both, I see a slight difference:

Actually, it pops the procedure object when there is one element remaining and then pushes that element;

vs.

The interpreter always checks the remainder of the procedure body before it executes the first element. If the procedure body is empty, the interpreter discards it.

PLRM: before executing the last element you pop the array body from the execution stack and push the last element to the execution (?) stack.

Green book: The PostScript interpreter executes array objects one element at a time, leaving the unused part of the array on the execution stack. The interpreter always checks the remainder of the procedure body before it executes the first element".

For this code:

%!PS-Abode-2.0
/F {
  findfont exch scalefont setfont
} bind def
24 /Helvetica F
0 0 moveto
(Red Book) show
showpage

When the F token is scanned and interpreted you look up the procedure body and have this on the execution stack:

{ findfont --exch-- --scalefont-- --setfont--}
-file-

EXAMPLE 1: Does it work like this (PLRM):

execution stack:

{ findfont --exch-- --scalefont-- --setfont--}
-file-

after executing scalefont, we are at the last element, so pop that last element, pop the procedure and push the last element back:

--setfont--
-file-

EXAMPLE 2: or does it goes like this (green book):

execution stack:

{ findfont --exch-- --scalefont-- --setfont--}
-file-

execution stack:

{ --exch-- --scalefont-- --setfont--}
-file-

execution stack:

{ --scalefont-- --setfont--}
-file-

execution stack:

{ --setfont--}
-file-

execution stack: (check remainder and if empty pop the last element, pop the array and push the last element:

--setfont--
-file-

It's not clear to me as both are differently explained. The Green book suggests that the array body is modified (shrinked) each time.

Update: some pseudocode

Example 1:

array[1, 2, 3, 4]
for (int i = 0; i < array.Length; i++)
{
   if (i == array.Length -1)
   {
      exectionstack.Pop() // pop the array
   }
   Execute(array[i]);     // execute element
}

example 2:

//array[1, 2, 3, 4]
do {

   var obj = executionstack.Pop();

   if (obj is ArrayObject) {
     var arr = executionstack.Pop();
     var item = arr[0];     // first element

     if (arr.Length > 1)
     {
        var newArray = new Array(arr.Length - 1); // create new array size -1
        arr.CopyTo(newArray, 1, arr.Length - 1);  // copy remaining elements
        executionstack.Push(newArray);  // push the remaining array body
     }

     executionstack.Push(item);  // push the item
   } 
   else 
   {
      Execute(obj); // not an array, execute it
   }
} while (executionstack.Count > 0)

luser droog · Accepted Answer

ISTM that your pseudocode doesn't capture the crux of the issue. If you'll pardon a little lecturing, ... Postscript executes like a low-level language, like an assembly language. Let me use your example proc and go into lots of detail.

First, what does it mean "when the F token is scanned and interpreted"? It means that the execution stack has a file object on it.

  op-stack>
exec-stack> --file--

The execution loop handles an executable file by doing something to the effect of:

--file-- exec        % push back on exec stack
--file-- token {
    xcheck  1 index type /arraytype ne  and {
        exec
    } if
}{
    pop
} ifelse

token returns the executable name F, and since it is executable and not an array, it gets pushed back on the exec stack. This leaves the stacks like this:

  op-stack>
exec-stack> --file--  F

The execution loop handles an executable name by doing something to the effect of:

--name-- load
xcheck {
    exec
} if

Look up the name. If the value is executable, execute it. This leaves the stacks like this:

  op-stack>
exec-stack>  --file--  { findfont --exch-- --scalefont-- --setfont--}

The execution loop handles an executable array by doing something to the effect of:

--array--
dup length 1 gt {
    dup 1  1 index 2 sub  getinterval  exec  % push back remainder
} if
dup length 0 gt {
    0 get
    dup xcheck  1 index type /arraytype ne {
        exec
    } if
}{
    % leave on stack
} ifelse

It uses the equivalent of getinterval to push the sub-array of the remainder of the procedure back on the exec stack. But if there's 1 or fewer elements, do not push the remainder but it's a waste of stack space to hold an array of length 0. This leaves the stacks like this:

  op-stack>
exec-stack>  --file--  { --exch-- --scalefont-- --setfont--}  findfont

And this situation is handled by the executable-name case.

Each handler just does the smallest possible job and pushes things on the exec stack to be done later.

For efficiency, it is very important that you can make the subarray very cheaply. Do not copy the array, but design your array type so two different pointers can have different lengths and different starting offsets. This ability is used by many operators like search which returns substrings of the argument string. For Xpost, I pack all objects into an 8-byte representation. Array objects have 4 16bit fields:

tag
length
VM-ref
offset

All objects have a 16bit tag as the first byte. The tag has a bitfield for the type of the object. So, ignoring range checking, getinterval is implemented like this:

object
getinterval ( object array, uint16_t offset, uint16_t length ){
    array.offset = offset;
    array.length = length;
    return array;
}

Super-simple, super-fast. No allocating, no copy.

HTH

To actually answer the question, I think the Green Book and PLRM are describing the same things with different emphasis. One consequence of the "stack-machine" design is:

There is only one loop.

All of the various looping operators just push things on the exec stack and return, handling one little piece of the job each time. This even goes for complex looping operators like image. So just like the array handler above, things like forall have no "loop" in the function that implements it. They just use the one loop, the interpreter's execution loop.

Eg.

/forall {  % arr proc
   dup length 0 eq { pop pop }{
        /forall cvx exec                   % comp proc
        dup 0 get 3 1 roll                 % comp[0] proc comp
        1 1 index length 1 sub getinterval % comp[0] proc comp[1..n-1]
        1 index 2 array astore cvx  exec   % comp[0] proc
        exec                               % comp[0]
   } ifelse
} def

This was modified from my debugger which simulates the inner loop in postscript. And re-implements all of the looping operators to work with it.

Note, in this version of forall it does push the final 0-length tail for a final iteration. This is so the final invocation of the proc can still use exit to exit the loop, and exit must still have something on the exec stack to search for.

PostScript execution of specific types (Array)

Answers (2)

Related Questions