Reputation: 1
I am writing a Java application which runs user-submitted Java code. I run each piece of user-submitted code in its own sandbox. This sandbox involves (among other things) running each code submission in a separate process, in a separate JVM (as I understand it, there is no other way to reliably control the memory and CPU usage of the submitted code, short of bytecode-level checks/analysis).
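For concreteness, each submission is launched roughly like this (the heap cap, class path, and runner class below are placeholders, not my real setup):

```java
import java.io.IOException;

// Sketch only: launch one submission in its own JVM so the overseer
// can cap its heap and destroy() the process on timeout.
public class SandboxLauncher {
    public static Process launch(String submissionClass) throws IOException {
        ProcessBuilder pb = new ProcessBuilder(
                "java",
                "-Xmx256m",                 // hard heap cap for the submission
                "-cp", "/sandbox/classes",  // isolated class path (placeholder)
                submissionClass);
        pb.redirectErrorStream(true);
        return pb.start();
    }
}
```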
I want each sandboxed process to have access to a certain database. The database is large (around 10 GB, could be significantly larger in the future) and user-submitted code might make many billions of more-or-less random accesses to the database. So it is important that user-submitted code be able to access the database efficiently.
This is what I think I should do: load the database into memory from the main overseer process, and then give each sandboxed process read-only access to the loaded database. How can I do this? (Again, I am working in Java.)
Or do I have the wrong idea? Should I try a different approach?
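To illustrate, this is roughly the kind of read-only access I have in mind, using a memory-mapped file so the OS page cache is shared across processes (the file name and record layout here are made up):

```java
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Sketch: each sandboxed process maps the database file read-only.
public class MappedDb {
    public static void main(String[] args) throws Exception {
        try (RandomAccessFile file = new RandomAccessFile("/data/db.bin", "r");
             FileChannel channel = file.getChannel()) {
            // A single MappedByteBuffer is limited to 2 GB, so a 10 GB
            // file would need several mapped regions in practice.
            MappedByteBuffer buf = channel.map(
                    FileChannel.MapMode.READ_ONLY, 0,
                    Math.min(channel.size(), Integer.MAX_VALUE));
            long record = buf.getLong(0); // random access by byte offset
            System.out.println(record);
        }
    }
}
```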
Upvotes: 0
Views: 157
Reputation: 7555
Given the amount of data you are talking about (10 GB, possibly much more), I don't think keeping it all in memory is feasible.
I would recommend going with an SQLite database solution.
From each spawned process, you can open up the database in read-only mode, and access it through standard JDBC calls, or wrap it in some API of your own design.
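A minimal sketch of the read-only JDBC access, assuming the Xerial sqlite-jdbc driver (the file path, table, and column names are placeholders):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import org.sqlite.SQLiteConfig;

public class ReadOnlyDb {
    public static void main(String[] args) throws Exception {
        SQLiteConfig config = new SQLiteConfig();
        config.setReadOnly(true); // open the database file read-only
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:sqlite:/data/database.db", config.toProperties());
             PreparedStatement stmt = conn.prepareStatement(
                     "SELECT value FROM records WHERE id = ?")) {
            stmt.setLong(1, 42L);
            try (ResultSet rs = stmt.executeQuery()) {
                if (rs.next()) {
                    System.out.println(rs.getString("value"));
                }
            }
        }
    }
}
```

SQLite handles many concurrent readers without contention, so all the sandboxed processes can share the same database file.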
This also has the advantage that you can move to a fully-fledged database solution if performance becomes an issue.
If you don't control the format of your data in the first place, you can easily write an importer that updates the SQLite database from the new data file.
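Such an importer could look something like this (assuming a hypothetical tab-separated input file; a single transaction makes the bulk insert much faster):

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class Importer {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:sqlite:/data/database.db")) {
            conn.setAutoCommit(false); // wrap the whole import in one transaction
            try (PreparedStatement stmt = conn.prepareStatement(
                    "INSERT OR REPLACE INTO records (id, value) VALUES (?, ?)")) {
                for (String line : Files.readAllLines(Paths.get("/data/new-data.tsv"))) {
                    String[] fields = line.split("\t");
                    stmt.setLong(1, Long.parseLong(fields[0]));
                    stmt.setString(2, fields[1]);
                    stmt.addBatch();
                }
                stmt.executeBatch();
            }
            conn.commit();
        }
    }
}
```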
Upvotes: 1
Reputation: 48692
Do not give the user-submitted programs direct access to the database at all. Instead, provide an API for them to use, with no methods for altering the contents of the database.
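For example, a read-only interface along these lines (the names are illustrative):

```java
// User code is compiled against this interface, which exposes only
// lookup methods, so it has no way to modify the data.
public interface ReadOnlyDatabase {
    /** Fetch a record by key, or null if absent. */
    String lookup(long key);

    /** Number of records in the database. */
    long size();
}
```

Since the submissions run in separate JVMs, the implementation behind this interface would forward calls to the overseer process over whatever IPC channel you already use.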
Upvotes: 0