Can parse on binary! in Rebol 2 capture a binary! instead of a string! (like in Rebol 3)?

Consider the following:

>> bin: to-binary {Rebol}
== #{5265626F6C}

>> parse/all bin [s: to end]
== true

I expect s to have captured the head of the binary series, and be of type BINARY!. In Rebol 3 this is the case:

>> type? s
== binary!

>> s == bin
== true

In Rebol 2, it seems that parse must have converted the data to a string (or at least is "imaging" the binary as a string! under the hood, so the captured position does not compare equal to the original binary):

>> type? s
== string!

>> s == bin
== false

Because Rebol 2 is not Unicode-aware, a binary byte string and a character string are basically equivalent. But with Rebol 3's Unicode support, I surmise you could end up with very different behavior if you wrote:

parse/all to-string bin [s: to end]

That's because it would start decoding multi-byte sequences as encoded characters, which doesn't work if what you really wanted was uninterpreted bytes. :-(
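
To make the concern concrete, here is a small sketch for the Rebol 3 console. It assumes that TO-STRING on a BINARY! decodes the bytes as UTF-8, and the two-byte value is just an illustrative choice (the UTF-8 encoding of "é"):

```rebol
;; Rebol 3 only: a two-byte UTF-8 sequence becomes one character
bin2: #{C3A9}               ; hypothetical sample data, not from the question
str2: to-string bin2        ; decoded as UTF-8 rather than taken byte for byte

probe length? bin2          ; 2 (bytes)
probe length? str2          ; 1 (character)
```

Once the lengths differ, parse positions in the string no longer line up with byte offsets in the original binary.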

If one wants to write code that works in either Rebol 2 or Rebol 3 equally well in parsing BINARY!, how would you work around this? (Ideally making Rebol 2 act more like 3, of course.)

Upvotes: 2

Answers (2)

Indeed, Rebol 2 is just "imaging" (aliasing) the data as a STRING! rather than copying it. Notice the following:

>> bin: to-binary {Rebol}
== #{5265626F6C}

>> parse bin [s: (clear s)] 
== true

>> s
== ""

>> bin
== #{}

That's because Rebol 2 had routines available for aliasing string data as binary and vice versa: AS-BINARY and AS-STRING. Unlike their TO-BINARY and TO-STRING counterparts, they do not actually make copies of the data.
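
The difference between aliasing and copying can be seen outside of PARSE as well. This is a minimal sketch for Rebol 2; the variable names are made up for illustration:

```rebol
;; Rebol 2 only: AS-BINARY shares storage, TO-BINARY copies
str: copy "abc"
alias: as-binary str        ; same underlying data, viewed as binary!
dup: to-binary str          ; independent copy of the data

clear alias                 ; modifies the shared data

probe length? str           ; 0, cleared through the alias
probe length? dup           ; 3, the copy is unaffected
```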

Here's one idea that you (ummmm, well, I) could try...make a compatibility function (let's call it bin-pos):

bin-pos: func [pos [binary! string!]] [
    return either string? pos [
        ;; we must be using r2, image the parse position back to binary
        as-binary pos
    ] [
        ;; just a no-op in r3, binary parse input yields binary parse positions
        pos
    ]
]

So in the above example, the right thing happens in Rebol 2 if you substitute bin-pos s anywhere you would have used s:

>> type? (bin-pos s)
== binary!

>> (bin-pos s) == bin
== true

For cases where you use the COPY dialect word and a new string is made, the same technique will work...but perhaps a different wrapper name should be used. bin-capture?
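
A possible shape for that wrapper, as a sketch only: bin-capture is the hypothetical name floated above, and the none! case is an assumption to cover a COPY that captured nothing.

```rebol
;; Hypothetical wrapper for values captured with COPY
bin-capture: func [captured [binary! string! none!]] [
    either string? captured [
        ;; Rebol 2: COPY already made a new string, so aliasing it is safe
        as-binary captured
    ] [
        ;; Rebol 3 (or nothing captured): pass the value through unchanged
        captured
    ]
]

bin: to-binary {Rebol}
parse/all bin [copy s to end]
s: bin-capture s

probe binary? s
probe s = bin
```

Because COPY has already produced a fresh series, aliasing it with AS-BINARY does not need a second copy.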

Upvotes: 2

earl

Reputation: 41755

You could just add a parse action to your rules that ensures captured data is binary!:

>> bin: to binary! {Rebol}

>> parse/all bin [s: to end (s: to binary! s)]

>> type? s
== binary!

You could wrap this conversion in an ensure-binary helper for documentation purposes.
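
One way such a helper might look, as a sketch; only the name ensure-binary comes from the suggestion above, the body is an assumption:

```rebol
;; Hypothetical helper: return the value as binary!, converting if needed
ensure-binary: func [value [binary! string!]] [
    either binary? value [
        value               ; Rebol 3: already binary!, pass it through
    ] [
        to binary! value    ; Rebol 2: convert the string! capture (makes a copy)
    ]
]

bin: to binary! {Rebol}
parse/all bin [s: to end (s: ensure-binary s)]

probe binary? s
```

Note that in Rebol 2 the conversion copies, so s no longer shares storage with bin after the action runs.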

(Note that if I understand the last paragraph of your answer right, this is basically what you suggest there. However, I think you can just use this approach even for captures made without copy.)

Upvotes: 0
