Saad Out03
Saad Out03

Reputation: 91

How to Handle Left Associativity for Logical Operators in a Recursive Descent Parser?

I'm implementing a recursive descent parser for a mini-shell program based on bash, where typically, right recursion is applied. However, for logical operators && and ||, this approach doesn't seem convenient for execution.

For instance, when parsing a command sequence such as cmd1 && cmd2 || cmd3, the right recursive parser would generate a tree like this:

     &&
    /   \
cmd1    ||
        /  \
    cmd2   cmd3

However, for the execution to follow the bash logic correctly, it's more suitable to have left associativity for && and || operators, resulting in a tree like this:

    ||
   /  \
&&   cmd3
/ \
cmd1 cmd2

Given this scenario, what would be the recommended approach to modify the parser to handle left associativity specifically for && and || operators while maintaining right associativity for others? Would it involve a change in the parsing algorithm or a post-processing step after parsing? Or there is a way to execute the tree with right associativity according to bash?

Upvotes: 1

Views: 102

Answers (1)

Kaz
Kaz

Reputation: 58627

In the POSIX shell syntax, the && and || operators have the same precedence and associate left to right.

Using Bash-specific curly brace syntax we can understand it as:

A op1 B op2 C      <-->    { A op1 B } op2 C

for op1 and op2 being any combination of the || and && operators.

(We can't use parentheses to show this because they introduce a semantic action of execution in a "subshell" process.)

The shell executes commands separated by these operators from left to right. When the separator is &&, the command so far must be successful in order to continue. When the separator is ||, the command so far must have failed.

Parsing this is the same like handling A + B - C .... Recursive descent parsing identifies as LL(k) (usually k = 1), whereas left association more naturally pairs with LR parsing. A grammar structure like:

E := E && E
  |  E || E

is "left factored" for LL(1) parsing. You don't have to do this formally. There is a code structure which handles the case. The basic structure is:

  1. Parse an expression; call it E.

  2. While the next token is && or ||:

    • consume token, call it operator.
    • parse expression which follows, call it Enext.
    • build tree node by combining existing expression with new one: E := make_node(E, operator, Enext).
    • loop from (2)
  3. Return E

Upvotes: 1

Related Questions