Reputation: 2852
I have a huge XML file containing resumes. It is available in two formats: as a single master file containing all the resumes, for example:
<Resumes>
<Resume>
<Name>ABC</Name>
......
......
</Resume>
<Resume>
<Name>PQR</Name>
......
......
</Resume>
......
......
</Resumes>
and as multiple files, for example:
File 1:
<Resumes>
<Resume>
<Name>ABC</Name>
......
......
</Resume>
</Resumes>
File 2:
<Resumes>
<Resume>
<Name>PQR</Name>
......
......
</Resume>
</Resumes>
and so on.
I want to use BaseX or eXist as the XML database for storing the XML. If I want to add more resumes (in XML format) in the future, which one will be better?
Upvotes: 0
Views: 667
Reputation: 5294
For eXist-db, let me quote from a post on exist-open by Wolfgang Meier in response to a similar question:
I don't think there's a theoretical limit, but there are certainly some practical considerations. Storing a very large document can block the db more than storing many small ones. It requires a single transaction (and sufficient disk space for the transaction log).
The dblp bibliography, which I use for some automated performance tests, comes as a single document with more than 600mb. This loads well if you slightly increase the cache size and memory settings. I know other users have to deal with much larger documents (many gigabytes), but if you have a choice, I would definitely recommend to split your data in smaller chunks, which are easier to handle.
Granted, eXist-db has become even more efficient and robust since November 2009 when Wolfgang wrote this post, but I think his advice still applies. Two final notes:
Make sure you use the latest version of eXist, e.g. either 1.4.2 or the 2.0 Tech Preview. These benefit from the advances I spoke about.
To squeeze the most performance out of eXist-db, read the Performance Tuning article in the eXist-db documentation.
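If you only have the single master file, the "smaller chunks" advice can be applied before loading. The following is a minimal sketch, not an eXist-db or BaseX feature: it uses only Python's standard library, and it assumes the `<Resumes>`/`<Resume>` layout shown in the question. The function names `split_resumes` and `write_chunks` and the `resume-NNNNNN.xml` file naming are hypothetical, chosen for illustration.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def split_resumes(source, record_tag="Resume", wrapper_tag="Resumes"):
    """Yield one small XML document (as a string) per <Resume> element.

    `source` is a file name or a binary file object. The tag names are
    assumptions taken from the sample in the question.
    """
    # iterparse streams the input, so the master file is not loaded at once.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != record_tag:
            continue
        elem.tail = None  # drop whitespace that followed the record
        wrapper = ET.Element(wrapper_tag)
        wrapper.append(elem)
        yield ET.tostring(wrapper, encoding="unicode")
        elem.clear()  # free the children of the record just emitted


def write_chunks(source, out_dir):
    """Write each chunk to out_dir as resume-000001.xml, resume-000002.xml, ...

    Returns the number of files written. The naming scheme is hypothetical.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    for count, doc in enumerate(split_resumes(source), start=1):
        (out / f"resume-{count:06d}.xml").write_text(doc, encoding="utf-8")
    return count
```

Each resulting file has the same shape as "file 1" and "file 2" in the question, so it can be stored in a collection as its own document, and adding resumes later means adding new small documents rather than rewriting one large one.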
Upvotes: 2