Meysam
Meysam

Reputation: 18167

How to decode infinite length ending with two zero bytes

I am decoding an array of bytes which are in TLV format (Tag-Length-Value). Sometimes the Length of a Value is infinite. i.e I have to read the succeeding bytes until I get to two consequent zero bytes meaning the end of Value. The problem is when the Value of a TLV is itself constructed of another TLV with infinite Length. I am trying to figure out the best way to decode such TLVs with nested infinite values.

In the following example, each square indicates one byte. After reading L1, I realize that the length of V1 is infinite, which means that I have to read the succeeding bytes until I get to two zero bytes. But the first two zero bytes after L1 indicate the end of V3, not V1:

                                V3                            
                                .                             
                                |             end of V2       
                                \            -----------
+----+----+----+----+----+----+----+----+----+----+----+----+----+
| T1 | L1 | T2 | L2 | T3 | L3 | 13 | 0  | 0  | 0  | 0  | 0  | 0  |
+----+----/----+----/----+----/----+----+----+----+----+----+----+
          |         |         |    -----------         -----------
          |         |         |    end of V3           end of V1
          \         |         |
    V1 Starts here  |         |
                    |         |
                    \         |
              V2 Starts here  |
                              |
                              \
                        V3 starts here 

I need a method/algorithm to detect the end of V1. Any advice would be appreciated!

Upvotes: 2

Views: 239

Answers (3)

UmNyobe
UmNyobe

Reputation: 22910

You might want to use a matching parentheses algorithm with the following :

  • Any two zeros is a closing parenthesis
  • The first 2 bytes or any two non zero bytes after two zeros is an opening parenthesis.

At the end you can construct a tree where each tag enclosed directly (with one level of containment) is its child. Given this meta data you can do whatever you need...

Upvotes: 2

Michael Chinen
Michael Chinen

Reputation: 18717

  1. If the value you are encoding is infinite, increment a counter (that starts at 1) each time you encounter another infinite (nested) length.
  2. When you reach two zeros, decrement that value.
  3. When your counter is zero, you have the last index of the value.

Of course, this only works if inner nested values are end before the outer (as in your example). If this is not true, I don't think you can differentiate which ends first without more encoding info.

Upvotes: 2

R. Martinho Fernandes
R. Martinho Fernandes

Reputation: 234614

A simple approach is to use a recursive function: each time the function reads until it finds two double zeros, and then returns that position, so that the previous invocation knows where to continue looking for its own double zero terminator.

Alternatively, you could track the nesting level with an integer: increase each time you start reading a new TLV, decrease it when it finds a double zero. The reading is over when the nesting level is zero again.

Upvotes: 3

Related Questions