Reputation: 54457
As part of our application, we are using Jackrabbit (1.6.4) to store documents. Each document that is retrieved by our application is put into a folder structure in Jackrabbit, which is created if not existing.
Our DBA has noticed that the following query is executed a lot against the Oracle (11.2.0.2.0) database holding the Jackrabbit schema - more than 50000 times per hour, causing a lot of IO on the database. In fact, it is one of the top 5 SQL statements in terms of IO over elapsed time (97% IO):
select BUNDLE_DATA from VERSION_BUNDLE where NODE_ID = :1
Taking a look at the database, one notices that this table initially only contains a single record, comprising the node_id
(data type RAW) key with the DEADBEEFFACEBABECAFEBABECAFEBABE
value and then a couple of bytes in the bundle_data
BLOB column. Later on, more records are added with additional data.
The SQL for the table looks like this:
CREATE TABLE "VERSION_BUNDLE"
(
"NODE_ID" RAW(16) NOT NULL ENABLE,
"BUNDLE_DATA" BLOB NOT NULL ENABLE
);
I have the following questions:
BUNDLE_DATA
value changed by Jackrabbit at all or is it just read for every access to the repository?Update: The table only initially contains one record, additional records are added over time as decided internally by Jackrabbit. The access still seems to be read-only for most of the cases, as insert or update statements are not reported as being run with a high number.
Upvotes: 2
Views: 1863
Reputation: 592
Why is Jackrabbit accessing this table so frequently?
We have seen that this table is accessed very often even if you are not asking for versions. Take a look to this thread from Jackrabbit users mailing list
Upvotes: 1
Reputation: 68288
Why is Jackrabbit accessing this table so frequently?
Then it's a sign that you're creating versions in your repository. Is that something which your application is supposed to do?
Any Jackrabbit tuning options to make this faster?
Not that I'm aware of ; one option to investigate is to upgrade to a more recent Jackrabbit version. Version 2.4.2 was just released and 1.6.4 is almost two years old. It's a possibility that there were performance improvements between these releases.
Is the BUNDLE_DATA value changed by Jackrabbit at all or is it just read for every access to the repository?
By the looks of it's the GUID of the root repository node.
Is there any way to tune the database schema to make it deal better with this scenario?
As far as I know the schema is auto-generated by Jackrabbit so the only options are to modify the table definition in a compatible way after it's been created. But that's a topic for a DBA, which I am not.
Upvotes: 1
Reputation: 11
If the JCR-Store is based within a Oracle database you could reorganize the underlying table.
If you're storing binary objects like pictures, documents - just have also a look at VERSION_BINVAL.
Upvotes: 1
Reputation: 52386
Is this physical i/o or logical? With the data being read that often I'd be surprised if the blocks are being aged out of the cache fast enough for physical i/o to be required.
Upvotes: 2