Reputation: 8172
I am looking at a the German HBCI/FinTS protocol. One peculiarity of this protocol is that it can contain binary blobs, which are prefixed by @NUM_OF_BINARY_CHARS@
. Otherwise the protocol is quite simple, a grammar could be described as follows (a bit simplified, terminals are quoted by "):
message = segment+
segment = elements "'"
elements = element "+" elements | element
element = items
items = item ":" items | item
item = [a-zA-Z0-9,._-]* | escaped item
escaped = ?[-@?_-a-zA-Z0-9,.]
The @ is missing here!
A sample message could look something like this
FirstSegment+Elem1+Item1:Item2+@4@:'+@+The_last_four_chars_are_binary+Elem4'SecondSegment+Elem5'
Can this language (with the escaping of binary strings) be described by a context free grammar?
Upvotes: 0
Views: 84
Reputation: 372764
No, this language is not context-free. The format you're describing is essentially equivalent to this language
{ @n@w | n is a natural number and |w| = n }
You can show that this isn't context-free by using the context-free pumping lemma. Let the pumping length be p and consider the string @1p@x1111...1 (p times). This is a string encoding of a binary piece of data that show have length 111...1 (p times). Now split the string into u, v, x, y, z where |vy| > 1 and |vxy| ≤ p. If v or y is the @ sign, then uv0xy0z isn't in the language because it doesn't have enough @ signs. If v and y are purely contained in 1p, then pumping up the string will end up producing a string not in the language because the binary data string won't have the right size. Similarly, if v and y are purely contained in x111...1 (p times), pumping up or down will make the payload the wrong size. Finally, if v is in the length field and x is in the payload, pumping up v and x simultaneously will make the payload have the wrong length because v is written in decimal (so each extra character increases the payload size by a factor of ten) while x's length isn't.
Hope this helps!
Upvotes: 2