orialz

Reputation: 75

Accessing files in MongoDB

I am using the sacred package in Python, which lets me keep track of the computational experiments I'm running. sacred lets you attach an observer (MongoDB) that stores all sorts of information about the experiment (configuration, source files, etc.). sacred also lets you add artifacts to the DB by calling sacred.Experiment.add_artifact(PATH_TO_FILE).

This command essentially adds the file to the DB.

I'm using MongoDB Compass. I can access the experiment information and see that an artifact has been added. It contains two fields: 'name' and 'file_id', the latter holding an ObjectId (see image).

I am attempting to access the stored file itself. I have noticed that under my DB there is an additional collection called fs.files. In it I can filter to find my ObjectId, but it does not seem to let me access the content of the file itself.

[Image: ObjectId under fs.files]

[Image: file_id under the artifact object]

Upvotes: 1

Views: 2880

Answers (3)

Jarno

Reputation: 7232

I wrote a small library called incense to access data that sacred has stored in MongoDB. It is available on GitHub at https://github.com/JarnoRFB/incense and via pip. With it you can load experiments as Python objects. The artifacts are available as objects that you can save to disk again or display in a Jupyter notebook:

from incense import ExperimentLoader

loader = ExperimentLoader(db_name="my_db")
exp = loader.find_by_id(1)
print(exp.artifacts)
exp.artifacts["my_artifact"].save()  # Save artifact on disk.
exp.artifacts["my_artifact"].render()  # Display artifact in notebook.

Upvotes: 1

M K

Reputation: 416

Code example for GridFS (import gridfs, pymongo)

If you already have the ObjectId:

artifact = gridfs.GridFS(pymongo.MongoClient().sacred).get(objectid)

To find the ObjectId for an artifact named filename, where result is one entry returned by db.runs.find:

objectid = next(a['file_id'] for a in result['artifacts'] if a['name'] == filename)
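The two snippets above can be combined into one helper that copies an artifact to disk. This is only a minimal sketch: the function names find_artifact_id and save_artifact are made up for illustration, and it assumes a local MongoDB server without authentication and that runs live in the runs collection, as in the db.runs.find call above.

```python
from typing import Any, Mapping


def find_artifact_id(run: Mapping[str, Any], name: str) -> Any:
    """Return the GridFS file_id of the artifact called `name` in a run document."""
    for artifact in run.get("artifacts", []):
        if artifact["name"] == name:
            return artifact["file_id"]
    raise KeyError(f"no artifact named {name!r} in run {run.get('_id')!r}")


def save_artifact(db_name: str, run_id: Any, name: str, target_path: str) -> None:
    """Copy one artifact out of MongoDB into a local file (hypothetical helper)."""
    # Imported here so find_artifact_id stays usable without a MongoDB driver.
    import gridfs
    import pymongo

    # Assumption: MongoDB runs locally on the default port, no authentication.
    db = pymongo.MongoClient()[db_name]
    run = db.runs.find_one({"_id": run_id})
    if run is None:
        raise LookupError(f"no run with _id {run_id!r}")

    # GridFS.get returns a file-like object; read() yields the stored bytes.
    grid_out = gridfs.GridFS(db).get(find_artifact_id(run, name))
    with open(target_path, "wb") as fh:
        fh.write(grid_out.read())
```

For example, save_artifact("sacred", 1, "my_artifact", "my_artifact.out") would write the artifact of run 1 to the current directory; the database name, run id, and file names here are placeholders.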

Upvotes: 2

0x126

Reputation: 166

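MongoDB file storage is handled by "GridFS", which splits files into chunks and stores them across two collections: fs.files holds one metadata document per file, and fs.chunks holds the binary content. That is why filtering fs.files finds the ObjectId but not the file content.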

Tutorial to access: http://api.mongodb.com/python/current/examples/gridfs.html
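To illustrate the layout, here is a rough pure-Python simulation of how a file maps onto fs.files and fs.chunks documents and how the content is put back together. It does not talk to MongoDB and is not the driver's actual implementation; the function names are invented, and real documents carry extra fields (such as uploadDate and a per-chunk _id) that are left out here.

```python
CHUNK_SIZE = 255 * 1024  # GridFS default chunk size in bytes


def to_gridfs_docs(file_id, filename, data, chunk_size=CHUNK_SIZE):
    """Build a simplified fs.files document and its fs.chunks documents."""
    files_doc = {
        "_id": file_id,
        "filename": filename,
        "length": len(data),
        "chunkSize": chunk_size,
    }
    # Each chunk points back to its file via files_id and records its position n.
    chunk_docs = [
        {"files_id": file_id, "n": n, "data": data[start:start + chunk_size]}
        for n, start in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunk_docs


def reassemble(files_doc, chunk_docs):
    """Join the chunks that belong to `files_doc`, ordered by their index n."""
    own = [c for c in chunk_docs if c["files_id"] == files_doc["_id"]]
    content = b"".join(c["data"] for c in sorted(own, key=lambda c: c["n"]))
    if len(content) != files_doc["length"]:
        raise ValueError("chunks do not add up to the recorded length")
    return content
```

In practice gridfs.GridFS(db).get(file_id).read() does this lookup and joining for you, so querying fs.chunks by hand is rarely needed.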

Upvotes: 1
