Can languages with char counts be described by context free grammars?

Question

I am looking at a the German HBCI/FinTS protocol. One peculiarity of this protocol is that it can contain binary blobs, which are prefixed by @NUM_OF_BINARY_CHARS@. Otherwise the protocol is quite simple, a grammar could be described as follows (a bit simplified, terminals are quoted by "):

message  = segment+
segment  = elements "'"
elements = element "+" elements | element
element  = items
items    = item ":" items | item
item     = [a-zA-Z0-9,._-]* | escaped item
escaped  = ?[-@?_-a-zA-Z0-9,.]

The @ is missing here!

A sample message could look something like this

FirstSegment+Elem1+Item1:Item2+@4@:'+@+The_last_four_chars_are_binary+Elem4'SecondSegment+Elem5'

Can this language (with the escaping of binary strings) be described by a context free grammar?

templatetypedef · Accepted Answer

No, this language is not context-free. The format you're describing is essentially equivalent to this language

{ @n@w | n is a natural number and |w| = n }

You can show that this isn't context-free by using the context-free pumping lemma. Let the pumping length be p and consider the string @1^p@x^{1111...1 (p times)}. This is a string encoding of a binary piece of data that show have length 111...1 (p times). Now split the string into u, v, x, y, z where |vy| > 1 and |vxy| ≤ p. If v or y is the @ sign, then uv⁰xy⁰z isn't in the language because it doesn't have enough @ signs. If v and y are purely contained in 1^p, then pumping up the string will end up producing a string not in the language because the binary data string won't have the right size. Similarly, if v and y are purely contained in x^{111...1 (p times)}, pumping up or down will make the payload the wrong size. Finally, if v is in the length field and x is in the payload, pumping up v and x simultaneously will make the payload have the wrong length because v is written in decimal (so each extra character increases the payload size by a factor of ten) while x's length isn't.

Hope this helps!

Can languages with char counts be described by context free grammars?

Answers (1)

Related Questions