Unrolling tree-recursion into an iterative program

I wrote a library that can generate arbitrary strings given a spec-object (https://github.com/rgrannell1/revexp) and I want to convert the function that reads the spec from a recursive algorithm to an iterative algorithm. I'm running into stackoverflow errors due to the depth of the specs I'm traversing.

I believe I need to move from using the call-stack to an explicit stack, but I've never done this before. I've read through previous posts on StackOverflow but I didn't fully understand how to apply the solutions to this problem.

Here is an example spec object.

const data = {
  every: [
    {
      digit: { zero: false }
    },
    {
      repeat: {
        value: { digit: {} }
      }
    }
  ]
}

and a minimal example of how the algorithm currently traverses the spec and generates a string matching the spec.

const fromSpec = (data) => {
  if (data.every) {
    return fromSpec.every(data)
  } else if (data.digit) {
    return fromSpec.digit()
  } else if (data.repeat) {
    return fromSpec.repeat(data.repeat)
  }
}

fromSpec.digit = () => {
  return Math.floor(Math.random() * 10)
}

fromSpec.every = part => {
  let message = ''

  for (const elem of part.every) {
    message += fromSpec(elem)
  }

  return message
}



fromSpec.repeat = part => {
  let message = ''
  
  // -- just using fixed repeat for the example
  for (let ith = 0; ith < 10; ++ith) {
    message += fromSpec(part.value)
  }

  return message
}

const result = fromSpec(data)
result // 1034856872

I'd appreciate any advice on how to traverse this data-structure and generate an output string in an iterative rather than recursive fashion.

Upvotes: 2

Views: 172

Answers (2)

Mulan
Mulan

Reputation: 135227

It's natural to write RevExp.toString as a recursive program because it is expected to process a recursively structured input. However because recursive programs can lead to deep stacks, it's not uncommon to flatten a recursive process to an iterative one.

Good programmers know that maintaining complexity goes a long way in sustaining our sanity. I want to keep my recursive program and I want the computer to handle flattening the process for me. Can I have my cake and eat it too?

Here's another way to look a the problem -

// example1.js

import { concat, upper, digit, str, toString } from './RevExp.js'

const licensePlate =
  concat(upper(), upper(), upper(), str("-"), digit(), digit(), digit())

console.log(toString(licensePlate))
console.log(toString(licensePlate))
console.log(toString(licensePlate))
// RFX-559
// VKT-794
// KSF-823

Let's begin writing the RevExp module. We'll start by creating constructors for each of our expression types -

// RevExp.js

const str = (value = "") =>
  ({ type: str, value })

const lower = () =>
  ({ type: lower })

const upper = () =>
  ({ type: upper })

const digit = ({ zero = true } = {}) =>
  ({ type: digit, zero })

const concat = (...exprs) =>
  ({ type: concat, exprs })

Now let's work on RevExp.toString -

// RevExp.js (continued)

import { inRange } from './Rand.js'

const toString = (e) =>
{ switch (e.type)
  { case str:
      return String(e.value)
    case lower:
      return String.fromCharCode(inRange(97, 122))
    case upper:
      return String.fromCharCode(inRange(65, 90))
    case digit:
      return e.zero
      ? String(inRange(0, 9))
      : String(inRange(1, 9))
    case concat:
      return e.exprs.reduce((r, v) => r + toString(v), "")
    default: throw Error(`unsupported expression type: ${e.type}`)
  }
}

export { lower, upper, digit, alpha, repeat, concat, str, toString }

It should be possible to make complex expressions by combining several simple expressions. And we imagine some new types like alpha, and repeat -

// example2.js

import { alpha, digit, repeat, concat, str, toString } from './RevExp.js'

const segment =
  concat(alpha(), digit(), alpha(), digit(), alpha())

const serial =
  concat
    ( repeat
        ( concat(segment, str("-"))
        , { count: 4 }
        )
    , segment
    )

console.log(toString(serial))
console.log(toString(serial))
console.log(toString(serial))
// F3Q7U-b6k8Q-R8e3A-a2q3M-j0a9k
// g6G3w-h2O3O-b8O3k-L4p1y-m5I0y
// m6E0M-A4C2y-K3g0M-d7X7j-w8v5G

And add corresponding support in the RevExp module -

// RevExp.js (enhanced)

import { inRange, sample } from './Rand.js'

const str = // ...
const lower = // ...
const upper = // ...
const digit = // ...
const concat = // ...

const alpha = () =>
  oneOf(upper(), lower())

const oneOf = (...exprs) =>
  ({ type: oneOf, exprs })

const repeat = (expr = {}, { count = 10 } = {}) =>
  ({ type: repeat, expr, count })

const toString = (e) =>
{ switch (e.type)
  { case str: // ...
    case lower: // ...
    case upper: // ...
    case digit: // ...
    case concat: // ...
    case oneOf:
      return toString(sample(e.exprs))
    case repeat:
      return toString(concat(...Array(e.count).fill(e.expr)))
    default: // ...
  }
}

export { /* ..., */ alpha, oneOf, repeat }

Now let's transform the recursive program to an iterative one. And without having to think about the stack or mutate state as the program runs, too!

// RevExp.js (stack-safe)

// ...
import * as Str from './Str.js'
import { loop, recur, call } from './TailRec.js'

// ...
const toString = (e = {}) =>
  loop(toStringTailRec, e)

const toStringTailRec = e =>
{ switch (e.type)
  { case str: // ...
    case lower: // ...
    case upper: // ...
    case digit: // ...
    case concat:
      return e.exprs.length
        ? call
            ( Str.concat
            , recur(e.exprs[0])
            , recur(concat(...e.exprs.slice(1)))
            )
        : Str.empty
    case oneOf:
      return recur(sample(e.exprs))
    case repeat:
      return recur(concat(...Array(e.count).fill(e.expr)))
    default: throw Error(`unsupported expression type: ${e.type}`)
  }
}

export { /*...*/, toString } // <-- don't export toStringTailRec helper

And here are the remaining modules, Str, Rand, and TailRec -

// Str.js

const empty =
  ""

const concat = (a = "", b = "") =>
  a + b

export { empty, concat }
// Rand.js

const rand = (n = 2) =>
  Math.floor(Math.random() * n)

const inRange = (min = 0, max = 1) =>
  rand(max - min + 1) + min

const sample = (t = []) =>
  t[rand(t.length)]

export { rand, inRange, sample }

Writing modules is an important factor in creating reusable code. This TailRec module was written in another post and can be reused, without modification1, to meet our program's needs.

Now we can rely on recursively structured programs without having to introduce complexity or requiring a change in how we think every time we encounter a recursive problem. Write the module once, reuse as needed -

// TailRec.js

const identity = x =>
  x

const call = (f, ...values) =>
  ({ type: call, f, values })

const recur = (...values) =>
  ({ type: recur, values })

const loop = (f, ...init) =>
{ const aux1 = (e, k) =>
    e.type === recur
      ? call(aux, e.values, r => call(aux1, f(...r), k))
  : e.type === call
      ? call(aux, e.values, r => call(aux1, e.f(...r), k))
  : call(k, e)

  const aux = (exprs, k) =>
    call
      ( exprs.reduce
          ( (mr, e) =>
              k => call(mr, r => call(aux1, e, x => call(k, [ ...r, x ])))
          , k => call(k, [])
          )
      , k
      )

  return run(aux1(f(...init), identity))
}

const run = r =>
{ while (r && r.type === call)
    r = r.f(...r.values)
  return r
}

export { loop, call, recur }

In the end, the approach here is virtually the same as yours. However instead of representing expressions writing JS objects by hand, we use functions, which can be parameterized and composed, and can handle the tedious and precarious assembly for us. Programmer sanity maintained -

// example2.js

// ...
console.log(serial)
{ type: concat
, exprs:
    [ { type: repeat
      , expr:
          { type: concat
          , exprs:
              [ { type: concat
                , exprs:
                    [ { type: oneOf, exprs: [ { type: lower }, { type: upper } ] }
                    , { type: digit, zero: true }
                    , { type: oneOf, exprs: [ { type: lower }, { type: upper } ] }
                    , { type: digit, zero: true }
                    , { type: oneOf, exprs: [ { type: lower }, { type: upper } ] }
                    ]
                }
              , { type: str, value: "-" }
              ]
          }
      , count: 4
      }
    , { type: concat
      , exprs:
        [ { type: concat
          , exprs:
              [ { type: oneOf, exprs: [ { type: lower }, { type: upper } ] }
              , { type: digit, zero: true }
              , { type: oneOf, exprs: [ { type: lower }, { type: upper } ] }
              , { type: digit, zero: true }
              , { type: oneOf, exprs: [ { type: lower }, { type: upper } ] }
              ]
          }
        ]
      }
    ]
}

I hope this was seen as an exciting way to see the same problem from a different perspective. If you end up using the TailRec module, see the original post for additional explanation. I'm happy to answer any follow-up questions.


1. Minor formatting changes and variable renaming for consistency with this answer

Upvotes: 1

dannyadam
dannyadam

Reputation: 4170

The following example modifies the code to use a stack data structure. Data on the stack is processed incrementally, with new data possibly added on each iteration.

const fromSpec = (data) => {
    const stack = [data];
    let message = '';
    while (stack.length > 0) {
        const item = stack.pop();
        // Assumption based on the code in the question:
        //   'every', 'digit', and 'repeat' keys are mutually exclusive.
        if (item.every) {
            // Add items in reverse order, so that items are popped off the stack
            // in the original order.
            for (let i = item.every.length - 1; i >= 0; --i) {
                stack.push(item.every[i]);
            }
        } else if (item.digit) {
            message += String(Math.floor(Math.random() * 10));
        } else if (item.repeat) {
            for (let i = 0; i < 10; ++i) {
                stack.push(item.repeat.value);
            }
        }
    }
    return message;
}

An alternative approach would be required for a more complicated scenario (e.g., when a node in the tree requires processing both 1) when it's initially encountered in the traversal and 2) after all its children have been traversed).

The following links may be relevant.

Upvotes: 1

Related Questions