Reputation: 39
I have an SQL query to get the content size from a content URL, but when I execute the SQL and compare the result to the directory size of contentstore it doesn't match. For example first I get the content size from database (PostgreSQL):
SELECT count(*),
SUM(round(u.content_size/1024/1024,2)) AS "Size (MB)",
to_char(n.audit_created:: date,'yyyy') AS yyyy_
FROM alf_node AS n,
alf_node_properties AS p,
alf_node_properties AS p1,
alf_namespace AS ns,
alf_qname AS q,
alf_content_data AS d,
alf_content_url AS u
WHERE n.id=p.node_id
AND ns.id=q.ns_id
AND p.qname_id=q.id
AND p.long_value=d.id
AND d.content_url_id=u.id
AND p1.node_id=n.id
AND p1.qname_id IN (SELECT id FROM alf_qname WHERE local_name='name')
GROUP BY yyyy_
ORDER BY yyyy_ DESC;
It gives me a total size of 600 GB. But when I see the size property for folder content store in alf_data
it shows 2.4 TB. How is that possible?
Upvotes: 4
Views: 2192
Reputation: 3818
It is not so easy to know the space of the contentstore based only on what you currently have in your repository.
When a document is deleted or updated with a new version, Alfresco does not immediately delete the binary from the disk.
The process to actually free the disk space is a bit more "complex". How to configure the first step of this process is described in the following article: Cleaning up orphaned content (purge)
The "purge" by default is configured to be executed every day at 4AM and will purge any orphaned file that has been removed from the repository at least 14 days earlier.
As described in the article, by default, Alfresco does not delete the purged files but it only moves them in a different folder (contentstore.deleted).
It is safe to delete the files in the contentstore.deleted after a backup.
As usual, in alfresco-global-properties, you can modify the default configuration (that you can find in alfresco/repository.properties)
You can end up with a lot of orphaned files in many ways. Just to give some examples:
Upvotes: 6
Reputation: 4156
And last, but not least, you're rounding up file sized in MB, for each file before summing that up, try summing in bytes then rounding the results like this:
SELECT count(*),round(SUM(u.content_size)/1024/1024, 3) AS "Size (MB)", to_char(n.audit_created:: date,'yyyy') as yyyy_
FROM alf_node AS n,
alf_node_properties AS p,
alf_node_properties AS p1,
alf_namespace AS ns,
alf_qname AS q,
alf_content_data AS d,
alf_content_url AS u
WHERE n.id=p.node_id
AND ns.id=q.ns_id
AND p.qname_id=q.id
AND p.long_value=d.id
AND d.content_url_id=u.id
AND p1.node_id=n.id
AND p1.qname_id IN (SELECT id FROM alf_qname WHERE local_name='name')
group by yyyy_
ORDER BY yyyy_ DESC;
Upvotes: 5