JD.

Reputation: 15551

Should we move to a database from XML files?

We have an application that persists its data to XML files. Apart from one large XML file, which acts as an index into the other files, all the files are stored in separate folders and are very small; they mostly contain metadata about a document (video, PDF, etc.).

From a relational point of view, there are few relationships between the data objects, apart from the metadata associated with each document, which is physically stored on disk and referenced by a directory path. So all the data is associated with documents.

Apart from searching the XML index file, all other searching is done using "Windows index searching".

Although the system is currently single-user, in the future it will be changed to support multiple users, which means that several users will be updating the index file concurrently. This file may become very large (10,000+ entries, where each entry contains some metadata and a reference to the document on disk).

Another requirement is to have more than one index file, on different machines, each managing its own document repository. This means that to search or browse for content we have to search across multiple machines.

With all this in mind, I can see that a database may resolve some of these issues, but there is a lot of work to do (creating an ORM model, database, repositories, commands, etc.) before we even get to the stage of addressing them.

My question is: can some of these issues be resolved in other ways, without going down the database route?

TIA JD

Upvotes: 0

Views: 102

Answers (2)

Michael Kay

Reputation: 163352

It sounds to me as if your workload is definitely moving in a direction that needs a database. Since the data is all XML already, moving to a native XML database should be the least disruptive route. The popular products these days tend to be MarkLogic if you can afford it and eXist if you can't. (I don't have an interest in either but know contented users of both.)
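
To illustrate why concurrent updates push in that direction, here is a rough sketch of what a file-only approach would have to hand-roll just to let several writers update one XML index safely. It is not taken from the asker's application: the `index_lock` and `add_entry` helpers, the `.lock` file convention, and the `<index>`/`<entry>`/`<title>` element names are all assumptions made for the example.

```python
import os
import tempfile
import time
import xml.etree.ElementTree as ET
from contextlib import contextmanager


@contextmanager
def index_lock(index_path, timeout=5.0, poll=0.05):
    """Hold an exclusive lock by creating '<index_path>.lock' (hypothetical convention)."""
    lock_path = index_path + ".lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the lock file already exists
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not lock {index_path}")
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)


def add_entry(index_path, doc_path, title):
    """Append one <entry> to the index while holding the lock."""
    with index_lock(index_path):
        if os.path.exists(index_path):
            tree = ET.parse(index_path)
            root = tree.getroot()
        else:
            root = ET.Element("index")
            tree = ET.ElementTree(root)
        # Element and attribute names here are assumed, not the real schema
        entry = ET.SubElement(root, "entry", {"path": doc_path})
        ET.SubElement(entry, "title").text = title
        # Write a temp file in the same folder, then rename it over the index
        # so readers never see a half-written file
        folder = os.path.dirname(os.path.abspath(index_path))
        fd, tmp_path = tempfile.mkstemp(dir=folder, suffix=".tmp")
        os.close(fd)
        tree.write(tmp_path, encoding="utf-8", xml_declaration=True)
        os.replace(tmp_path, index_path)
```

Even this much leaves gaps: a crashed writer leaves a stale lock behind, every update rewrites the whole file, and lock-file creation may not be reliably exclusive on some network shares. A database handles those concerns for you.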

Upvotes: 2

Fred Foo

Reputation: 363627

You could try a native XML database to speed up your XML handling. I've used both Berkeley DB XML (embedded, library) and eXist (networked, client-server, REST) with some success. In particular, the former solved the problem of replacing lots of small XML files scattered everywhere by a single, indexed file, so it might replace your XML index file. The latter has full-text search via Lucene.
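
As a rough illustration of the "many small files into one indexed file" idea, the sketch below walks a repository folder and loads each small XML metadata file into a single embedded store. Note the assumptions: it uses Python's built-in `sqlite3` module as a stand-in, not Berkeley DB XML or eXist, and the `<title>` and `<path>` element names, the `docs` table, and the `build_index`/`find_by_title` helpers are hypothetical.

```python
import os
import sqlite3
import xml.etree.ElementTree as ET


def build_index(repo_dir, db_path):
    """Walk repo_dir and store one row per *.xml metadata file found."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs ("
        "meta_file TEXT PRIMARY KEY, title TEXT, doc_path TEXT)"
    )
    for folder, _dirs, files in os.walk(repo_dir):
        for name in files:
            if not name.lower().endswith(".xml"):
                continue
            meta_file = os.path.join(folder, name)
            root = ET.parse(meta_file).getroot()
            # <title> and <path> are assumed element names, not the real schema
            conn.execute(
                "INSERT OR REPLACE INTO docs VALUES (?, ?, ?)",
                (meta_file, root.findtext("title"), root.findtext("path")),
            )
    conn.commit()
    return conn


def find_by_title(conn, fragment):
    """Return (title, doc_path) rows whose title contains fragment."""
    cur = conn.execute(
        "SELECT title, doc_path FROM docs WHERE title LIKE ? ORDER BY title",
        ("%" + fragment + "%",),
    )
    return cur.fetchall()
```

With something like this in place, `find_by_title(build_index("repo", "index.db"), "report")` would query one file instead of opening every folder. A native XML database would let you keep the documents as XML and query them with XPath or XQuery instead of flattening them into columns.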

Upvotes: 2
