Bite Bytes
Bite Bytes

Reputation: 1513

How to make C language context-free?

I know that C is not a context-free language, a famous example is:

int foo;
typedef int foo;
foo x;

In this case the lexer doesn't know, whether foo in the 3rd line, is an identifier, or typedef.

My question is, is this the only reason that makes C a Context-Sensitive Language?

I mean, if we get rid of typedef, would it become context-free language? Or there are other reasons (examples) that prevent it from being so?

Upvotes: 6

Views: 1732

Answers (3)

Petr Skocik
Petr Skocik

Reputation: 60145

The post-processor C syntax can be parsed with a classical lex + yacc combo. The lexer definition and the yacc grammar are freely available at

http://www.quut.com/c/ANSI-C-grammar-l-2011.html and http://www.quut.com/c/ANSI-C-grammar-y-2011.html

As you can see from the lex file, it's straightforward except for the context-sensitive check_type() (and comment(), but comment processing technically belongs to the preprocessor), which makes typedef the only source of context-sensitivity there. Since the yacc file doesn't contain any context-sensitivity introducing tricks either, a typedef-less C would be a perfectly context-free syntax.

The subsequent typechecking of C (matching declarations with use sites) is context sensitive, so you could say that overall, C is context sensitive.

Upvotes: 4

Tanvi Agarwal
Tanvi Agarwal

Reputation: 156

As per my knowledge and research there are two basic reasons that make C context sensitive language. These are:

  1. Variable is declared before it is used.
  2. Matching the formal and actual parameters of functions or sub-routines.

These two can't be done by PushDown Automata (PDA) but Linear Bounded Automata (LBA) can do thes two.

Upvotes: 1

Luis Colorado
Luis Colorado

Reputation: 12708

No. C cannot be a strict context independent language. For that, you should describe a syntax that doesn't allow to use a nondeclared variable (this is context) in a similar way as what you describe in your question. The language authors always describe syntax using some kind of context free grammar, but just to describe the main syntactic constructs of the language. The case you describe (making a type identifier to fit in a different token class to be able to go in places where it shouldn't) is only an example. If you look for example, the freedom in the order for things like static unsigned long long int variable simplifies the syntax remembering by programmers, but complicates things to the compiler authors.

Upvotes: 3

Related Questions