kdt

Reputation: 28489

Traversing the BSON binary representation in Python?

Rather than deserializing a whole BSON document into a Python dict, I would like to traverse it directly, taking advantage of the native traversability of the BSON format [1, 2].

Is that possible with any of the available Python BSON libraries? I can readily find the methods for getting a dict out, but I don't see any methods for traversing the binary format itself.

  1. https://groups.google.com/forum/#!topic/bson/e7aBbwA6bAE
  2. http://bsonspec.org/
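
For illustration, the traversability referred to here comes from the layout described at bsonspec.org: each element is a type byte, a NUL-terminated name, and a value that is either fixed-width or length-prefixed, so values can be skipped without decoding them. The following is only a rough sketch under that assumption. `iter_elements` and `FIXED_SIZES` are hypothetical names, not part of any BSON library, and only the more common element types are handled.

```python
import struct

# Fixed-width value sizes per bsonspec.org (type byte -> byte count).
FIXED_SIZES = {
    0x01: 8,   # double
    0x07: 12,  # ObjectId
    0x08: 1,   # boolean
    0x09: 8,   # UTC datetime
    0x0A: 0,   # null
    0x10: 4,   # int32
    0x11: 8,   # timestamp
    0x12: 8,   # int64
    0x13: 16,  # decimal128
}


def iter_elements(buf, start=0):
    """Yield (type_byte, name, value_start, value_end) for each element of
    the BSON document starting at buf[start], without decoding any value."""
    (doc_len,) = struct.unpack_from("<i", buf, start)
    pos = start + 4
    end = start + doc_len - 1  # index of the document's trailing 0x00
    while pos < end:
        type_byte = buf[pos]
        name_end = buf.index(b"\x00", pos + 1)
        name = buf[pos + 1:name_end].decode("utf-8")
        value_start = name_end + 1
        if type_byte in FIXED_SIZES:
            size = FIXED_SIZES[type_byte]
        elif type_byte in (0x02, 0x0D, 0x0E):
            # string, JavaScript code, symbol: int32 length, then the bytes
            size = 4 + struct.unpack_from("<i", buf, value_start)[0]
        elif type_byte in (0x03, 0x04):
            # embedded document or array: the int32 length includes itself
            size = struct.unpack_from("<i", buf, value_start)[0]
        elif type_byte == 0x05:
            # binary: int32 length, one subtype byte, then the payload
            size = 5 + struct.unpack_from("<i", buf, value_start)[0]
        else:
            raise ValueError("element type 0x%02x not handled in this sketch" % type_byte)
        yield type_byte, name, value_start, value_start + size
        pos = value_start + size


# Hand-built bytes for {"a": 1, "s": "hi", "d": {"x": true}}.
inner = b"\x08x\x00\x01"
inner_doc = struct.pack("<i", 4 + len(inner) + 1) + inner + b"\x00"
body = (
    b"\x10a\x00" + struct.pack("<i", 1)
    + b"\x02s\x00" + struct.pack("<i", 3) + b"hi\x00"
    + b"\x03d\x00" + inner_doc
)
doc = struct.pack("<i", 4 + len(body) + 1) + body + b"\x00"

# Find the embedded document "d" and walk into it; "a" and "s" are skipped
# by offset arithmetic only.
for type_byte, name, value_start, value_end in iter_elements(doc):
    if name == "d":
        nested_names = [n for _, n, _, _ in iter_elements(doc, value_start)]
```

The generator returns offsets rather than values, so a caller decodes only the fields it actually needs.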

Upvotes: 1

Views: 573

Answers (2)

chilladx

Reputation: 2207

This sounds like what you are looking for: https://github.com/bauman/python-bson-streaming

It lets you stream the BSON rather than loading the whole file into memory.

From the documentation:

from sys import argv

from bsonstream import KeyValueBSONInput

for path in argv[1:]:
    # Open in binary mode; the with block closes the file when done.
    with open(path, 'rb') as f:
        # Drop fast_string_prematch if you don't need the fast string match.
        stream = KeyValueBSONInput(fh=f, fast_string_prematch="something")
        for id_, dict_data in stream:
            if id_:
                pass  # process dict_data here
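
Streaming of this kind is possible in principle because, per bsonspec.org, every BSON document begins with its own total length as a little-endian int32. Here is a minimal sketch of that idea using only the standard library. It is not the library's actual code, `iter_raw_documents` is a hypothetical name, and it assumes the file is a plain concatenation of BSON documents.

```python
import io
import struct


def iter_raw_documents(fh):
    """Yield each BSON document from a binary file object as raw bytes,
    reading one document at a time instead of the whole file."""
    while True:
        header = fh.read(4)
        if not header:
            return  # clean end of input
        if len(header) < 4:
            raise ValueError("truncated length prefix")
        (length,) = struct.unpack("<i", header)
        if length < 5:
            raise ValueError("invalid document length: %d" % length)
        rest = fh.read(length - 4)
        if len(rest) != length - 4:
            raise ValueError("truncated document")
        yield header + rest


# An empty BSON document ({}): int32 length 5 plus the terminating 0x00.
empty_doc = struct.pack("<i", 5) + b"\x00"
docs = list(iter_raw_documents(io.BytesIO(empty_doc * 2)))
```

Each yielded item is still undecoded bytes, so it can be passed to a decoder or inspected further only when needed.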

Upvotes: 1

Sammaye

Reputation: 43884

The problem is that to turn the BSON string into an iterator, which is itself an object, you have to convert it into a native language structure, i.e. a dictionary.

Even a BSON library would still have to convert it into a traversable object that Python understands, i.e. a dict.

However, to answer your question: I know of none.

Upvotes: 0
