Topher Fangio
Topher Fangio

Reputation: 20667

Patterns for storing grid-like data in a database

We are attempting to build something akin to a very basic version of a spreadsheet or a database table. We have rows and columns, and we want the user to be able to add rows or columns, and do some basic sorting/filtering.

One concept we've had was to simply create a new database table for every "document" that we want to store. This way, the SQL is simple and fast. However, this brings about a plethora of problems including managing the different tables per client, having to execute "ALTER TABLE" statements when they want to add a new column, and similar issues.

So, what is the best way to store grid-like data with unknown columns in a single (or multiple) tables in a database such that sorting and filtering doesn't become a nightmare?

Upvotes: 2

Views: 1676

Answers (1)

Branko Dimitrijevic
Branko Dimitrijevic

Reputation: 52107

Until you share more information about what you are trying to accomplish, this simple ER model might be a good place to start:

enter image description here

ROW and COLUMN are integer "coordinates" within the spreadsheet.

Pros:

  • Simple.
  • Potentially very efficient for "sorting" and "filtering".
  • Efficient (storage-wise) for "sparse" data - i.e. where there are big "holes" in the spreadsheet.

Cons:

  • Inefficient (storage-wise) for "dense" data, since a lot of repetition will happen in the CELL's PK.
    • Depending on your DBMS, you might be able to minimize the space overhead using compressed indexes.
    • You could even use compressed clustered table (key-compressed index-organized table in Oracle parlance), with the unfortunate consequence of being able to access the calls efficiently only row-wise or column-wise but not both, and a limited ability to filter by VALUE (secondary indexes are expensive in a clustered table).
  • You have to choose the type of the VALUE in advance, making it awkward to store values of non-uniform types. There are ways to work around this, but they add complexity and can potentially be a performance concern.

Upvotes: 2

Related Questions