Reputation: 9484
We are building a document storage solution. For each document we need to store a lot of extra metadata to comply with local regulations, ranging from basic data such as title or description to dates of relevant events and disposition and classification rules.
I've seen different types of solutions, but none convinces me:
I'm biased towards number 5, providing a parallel full-text index (Lucene.Net? Other?) to search by relevant metadata (not everything has to be "searchable").
Any suggestions? Similar experiences?
Upvotes: 1
Views: 907
Reputation: 141
Maybe take a look at JCR (Java Content Repository). JCR is a standard for content repositories that captures the common requirements of content management, such as versioning, full-text search, and editing. It also provides a level of abstraction over content storage, which means you can use one API to put content into any kind of storage system, such as a database or XML files. You can add metadata to a document by setting properties on the document node through the JCR API. You don't have to worry about how the document and metadata will be stored; JCR takes care of it. Jackrabbit is the reference implementation of JCR. Give it a try.
Upvotes: 1
Reputation: 27234
Why not use CouchDB? It's designed precisely to address this type of requirement.
If that is not an option, consider using Lua or JSON (per your #5 option) as the metadata descriptor.
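To make the JSON descriptor idea concrete, here is a minimal Python sketch. The field names, the `SEARCHABLE` set, and both helper functions are hypothetical and only for illustration; the point is that each document carries one JSON descriptor, and only a chosen subset of its fields is handed to whatever search index you end up using.

```python
import json

# Hypothetical choice of which metadata fields get fed to the search index.
SEARCHABLE = {"title", "classification"}


def describe(document_id, **metadata):
    """Serialize one document's metadata as a JSON descriptor string."""
    return json.dumps(
        {"document_id": document_id, "metadata": metadata},
        sort_keys=True,
    )


def searchable_fields(descriptor):
    """Return only the metadata fields that should be indexed."""
    metadata = json.loads(descriptor)["metadata"]
    return {key: value for key, value in metadata.items() if key in SEARCHABLE}


# Example values are made up for illustration.
descriptor = describe(
    "doc-001",
    title="Supplier contract",
    classification="confidential",
    disposition_date="2030-12-31",
)
indexed = searchable_fields(descriptor)
```

The descriptor can sit next to the document (or in a single column), so adding a new metadata field needs no schema change. The trade-off is that the store itself cannot query or validate the fields unless it understands JSON.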
Upvotes: 1
Reputation: 16583
Table 1: Document information (PK is document ID)
Table 2: Metadata definitions (PK is metadata definition ID)
Table 3: Document ID, Metadata definition ID, metadata value
The biggest drawback to this is that you'd either have to have a single type (varchar, presumably), or you'd have to have n columns (where n is the number of data types you're willing to store), and use a column in the metadata definitions table to identify which column in table 3 to pull the value from.
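The three tables and the "one column per data type" variant could be sketched roughly as follows. This uses Python's built-in `sqlite3` module purely for illustration; the table names, column names, and the two helper functions are my own assumptions, not something from a particular product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE document (                 -- Table 1
    document_id INTEGER PRIMARY KEY,
    file_name   TEXT NOT NULL
);
CREATE TABLE metadata_definition (      -- Table 2
    definition_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL UNIQUE,
    -- names the column of document_metadata that holds the value
    value_column  TEXT NOT NULL
        CHECK (value_column IN ('text_value', 'date_value', 'int_value'))
);
CREATE TABLE document_metadata (        -- Table 3
    document_id   INTEGER NOT NULL REFERENCES document(document_id),
    definition_id INTEGER NOT NULL REFERENCES metadata_definition(definition_id),
    text_value    TEXT,
    date_value    TEXT,                 -- ISO 8601 date string
    int_value     INTEGER,
    PRIMARY KEY (document_id, definition_id)
);
""")

VALUE_COLUMNS = {"text_value", "date_value", "int_value"}


def set_metadata(conn, document_id, name, value):
    """Store a value in the column that the definition points at."""
    row = conn.execute(
        "SELECT definition_id, value_column FROM metadata_definition WHERE name = ?",
        (name,),
    ).fetchone()
    if row is None:
        raise KeyError(name)
    definition_id, column = row
    if column not in VALUE_COLUMNS:  # guard before building the SQL string
        raise ValueError(column)
    conn.execute(
        f"INSERT OR REPLACE INTO document_metadata "
        f"(document_id, definition_id, {column}) VALUES (?, ?, ?)",
        (document_id, definition_id, value),
    )


def get_metadata(conn, document_id):
    """Return {name: value}, reading each value from its designated column."""
    rows = conn.execute(
        """SELECT d.name, d.value_column, m.text_value, m.date_value, m.int_value
           FROM document_metadata AS m
           JOIN metadata_definition AS d ON d.definition_id = m.definition_id
           WHERE m.document_id = ?""",
        (document_id,),
    )
    result = {}
    for name, column, text_value, date_value, int_value in rows:
        by_column = {
            "text_value": text_value,
            "date_value": date_value,
            "int_value": int_value,
        }
        result[name] = by_column[column]
    return result


# Example data, made up for illustration.
conn.execute("INSERT INTO document (document_id, file_name) VALUES (1, 'contract.pdf')")
conn.executemany(
    "INSERT INTO metadata_definition (name, value_column) VALUES (?, ?)",
    [("title", "text_value"),
     ("disposition_date", "date_value"),
     ("retention_years", "int_value")],
)
set_metadata(conn, 1, "title", "Supplier contract")
set_metadata(conn, 1, "disposition_date", "2030-12-31")
set_metadata(conn, 1, "retention_years", 7)
metadata = get_metadata(conn, 1)
```

Every read has to join through the definitions table and pick the right column, which is the cost of the flexibility described above.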
My opinions on the 5 solutions listed:
Those are my thoughts. I've never designed a system like this, but I have dealt with commercial systems that have used several of these schemes.
Upvotes: 1