Reputation: 28489
Rather than deserializing a whole BSON document to a Python dict, I would like to traverse it directly, taking advantage of the native traversability of the BSON format [1,2].
Is that possible with any of the available Python BSON libraries? I can readily see the methods for getting a dict out, but I can't find any for traversing the binary format itself.
Upvotes: 1
Views: 573
Reputation: 2207
This sounds like what you are looking for: https://github.com/bauman/python-bson-streaming
It lets you stream the BSON rather than loading the whole file into memory.
From the documentation:
    from sys import argv

    from bsonstream import KeyValueBSONInput

    for path in argv[1:]:
        with open(path, 'rb') as f:
            # Remove fast_string_prematch if it is not needed
            stream = KeyValueBSONInput(fh=f, fast_string_prematch="somthing")
            for id, dict_data in stream:
                if id:
                    pass  # ...process dict_data...
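For intuition about why streaming is possible at all: every BSON document starts with a little-endian int32 giving its total size in bytes (the prefix itself included), so a reader can pull one document at a time off a file of concatenated documents. The following is only a minimal sketch of that idea, not the code of python-bson-streaming; `iter_raw_documents` is a hypothetical name, and it assumes the file is nothing but back-to-back documents (as in a mongodump `.bson` file).

```python
import io
import struct


def iter_raw_documents(fh):
    """Yield each top-level BSON document from a binary file object as raw bytes."""
    while True:
        prefix = fh.read(4)
        if not prefix:
            return  # clean end of file
        if len(prefix) < 4:
            raise ValueError("truncated length prefix")
        # int32, little-endian: total document size, including these 4 bytes
        (size,) = struct.unpack("<i", prefix)
        if size < 5:  # the smallest valid document is 5 bytes
            raise ValueError("invalid document length")
        rest = fh.read(size - 4)
        if len(rest) != size - 4:
            raise ValueError("truncated BSON document")
        yield prefix + rest


# Demo: two empty documents ({} encodes as 05 00 00 00 00) back to back.
empty = b"\x05\x00\x00\x00\x00"
docs = list(iter_raw_documents(io.BytesIO(empty + empty)))
```

Only one document is held in memory at a time, and nothing is decoded until you choose to decode it.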
Upvotes: 1
Reputation: 43884
The problem is that to turn the BSON string into an iterator, which is itself an object, you must actually convert it into a language structure, i.e. a dictionary.
Even with a BSON library, the data would still have to be converted into a traversable object that Python understands, a.k.a. a dict.
However, to answer your question: I know of none.
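That said, if you are willing to hand-roll it, the traversable object does not have to be a fully decoded dict. The sizes in the BSON spec let you step over the top-level elements and hand back each key with its value still encoded. A rough sketch under those assumptions; `iter_elements` and `FIXED_SIZES` are made-up names, not part of any library, and the less common types (regex, DBPointer) are simply rejected:

```python
import struct

# Fixed-width value sizes in bytes, keyed by BSON element type code.
FIXED_SIZES = {
    0x01: 8,   # double
    0x06: 0,   # undefined
    0x07: 12,  # ObjectId
    0x08: 1,   # boolean
    0x09: 8,   # UTC datetime
    0x0A: 0,   # null
    0x10: 4,   # int32
    0x11: 8,   # timestamp
    0x12: 8,   # int64
    0x13: 16,  # decimal128
    0x7F: 0,   # max key
    0xFF: 0,   # min key
}


def iter_elements(doc):
    """Yield (key, type_code, raw_value) for each top-level element of one
    BSON document given as bytes. Values are memoryview slices, not decoded."""
    view = memoryview(doc)
    (total,) = struct.unpack_from("<i", doc, 0)
    pos, end = 4, total - 1  # skip the length prefix, stop at the trailing NUL
    while pos < end:
        type_code = doc[pos]
        name_end = doc.index(b"\x00", pos + 1)  # key is a NUL-terminated cstring
        key = doc[pos + 1:name_end].decode("utf-8")
        pos = name_end + 1
        if type_code in FIXED_SIZES:
            size = FIXED_SIZES[type_code]
        elif type_code in (0x02, 0x0D, 0x0E):  # string, JS code, symbol
            size = 4 + struct.unpack_from("<i", doc, pos)[0]
        elif type_code in (0x03, 0x04, 0x0F):  # document, array, code with scope
            size = struct.unpack_from("<i", doc, pos)[0]  # length includes itself
        elif type_code == 0x05:  # binary: int32 length + subtype byte + payload
            size = 4 + 1 + struct.unpack_from("<i", doc, pos)[0]
        else:
            raise ValueError(f"unsupported element type 0x{type_code:02x}")
        yield key, type_code, view[pos:pos + size]
        pos += size


# Demo: hand-encoded {"a": 1, "s": "hi"}
body = (b"\x10a\x00" + struct.pack("<i", 1)
        + b"\x02s\x00" + struct.pack("<i", 3) + b"hi\x00")
sample = struct.pack("<i", 4 + len(body) + 1) + body + b"\x00"
elements = [(k, t, bytes(v)) for k, t, v in iter_elements(sample)]
```

You still end up with Python objects (keys and byte slices), but nested documents and large values are skipped by length rather than decoded, and an embedded document's raw value can be passed back into the same function.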
Upvotes: 0