gsakthivel

Reputation: 375

Why do deleted documents end up in version2Store, in addition to archiveStore, after deletion in Alfresco 5.2.0?

Pardon me if I do not understand the lifecycle of documents in Alfresco. This client is still using Alfresco 5.2.0 Community Edition. The requirement is to delete all documents that meet a condition, namely a metadata property having a certain value. I wrote a CMIS program to do this.

CmisObject cmisObject = cmisSession.getObject(objectIdString);
cmisObject.delete(true);

I also used the following approach.

Document document = (Document) cmisObject;
document.deleteAllVersions();

Before the deletion, I have two nodes corresponding to the document: one in the active store and one in the version2Store. After the deletion, the document node is removed from the active workspace, but I still find two nodes: one in the archive store and one in the version2Store.

First question: it appears that whenever a document node is in either the active store or the archive store, it is also in the version2Store. Is this how versioning works? Is there anything else affecting the deletion of versions?

Next question: how do I remove documents from the archive store? These deletions are effectively purges, which the client must perform to meet a regulation. Since the Alfresco data directory uses the same amount of storage even after the documents are deleted through CMIS, they are skeptical and think that the documents are still there (and can be restored).

If there is a non-programmatic way to remove documents from the archive store, such as an admin tool, let me know. I would like to search the archive store by date range and remove the matching documents from there.

I do not know if deleting nodes from the archive store is possible through the CMIS API. Please let me know if I am missing anything.

Then I tried to use the REST API, as it has some nice delete calls. The REST API does not seem to be available in this installation (5.2.0). It looks like it was only released in 5.2.1 and above.

I could write a custom web script, if absolutely necessary for my use case.

Do you have any recommendations?

Here is a screenshot of the database lookup: [image: database lookup]

Upvotes: 0

Views: 38

Answers (2)

Heiko Robert

Reputation: 2737

Although you already found a way to bypass the trashcan (archive store), let me add some background about node removal in Alfresco:

  • By default, when you delete a node, Alfresco only moves it to the trashcan (similar to Windows or macOS). As long as the node is in the trashcan, all of its versions persist in the version store (although a node does not necessarily have a version in the version2Store). Only when you also empty the trashcan afterwards are the nodes "prepared" for deletion.
  • You can skip the trashcan by adding the aspect sys:temporary to a node before removing it (via API or UI).
  • Even if you empty the trashcan or delete a node that has the temporary aspect, the system waits for a period of x protected days (14 by default) until the orphan-cleaner job moves (not deletes) the binary to the contentstore.deleted folder and keeps it there forever(!) unless you define your own job/script to delete the files there as well. The "protected days" exist for two reasons: they compensate for the mismatch between a filesystem without transaction support and a database that can always roll back, and they make it possible to restore the database to an earlier date without also restoring the contentstore binary folder, which may hold terabytes of data.
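The "own job/script" mentioned in the last point could look roughly like the following. This is only a minimal sketch using the Java standard library, not anything shipped with Alfresco: the class name DeletedContentPurger, the method purgeOlderThan, and the idea of using the file's last-modified time as the age criterion are all assumptions made for illustration. Check how timestamps behave on your filesystem, and make sure your backups no longer need these binaries before running anything like this.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: permanently removes old binaries from a directory
// such as contentstore.deleted. Not part of Alfresco.
final class DeletedContentPurger {

    private DeletedContentPurger() {}

    /**
     * Deletes every regular file under root whose last-modified time is
     * older than maxAge. Directories are left in place.
     * Returns the number of files removed.
     */
    static int purgeOlderThan(Path root, Duration maxAge) throws IOException {
        Instant cutoff = Instant.now().minus(maxAge);

        // Collect first, then delete, so the directory walk is not
        // disturbed by files disappearing underneath it.
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }

        int removed = 0;
        for (Path file : files) {
            Instant modified = Files.getLastModifiedTime(file).toInstant();
            if (modified.isBefore(cutoff)) {
                Files.delete(file);
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: DeletedContentPurger <contentstore.deleted dir> <days>");
            return;
        }
        int removed = purgeOlderThan(Paths.get(args[0]),
                                     Duration.ofDays(Long.parseLong(args[1])));
        System.out.println("Removed " + removed + " file(s)");
    }
}
```

Run from cron or a scheduler, it takes the path of the contentstore.deleted folder and a number of days as arguments. The retention period is a choice for whoever owns the regulation requirement; the sketch does not assume any particular value.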

If there are already nodes in your trashcan and you would like to get rid of them (including all referenced nodes in the version store), the easiest way is to log in to Share as an admin, click the username in the top navigation bar, select "My Profile", select "Trashcan", and then click the "Clear" button. Unfortunately, this only deletes the nodes shown in the list, not the potentially thousands more that are not displayed. Starting with Alfresco 5.2, the Alfresco/alfresco-trashcan-cleaner-module is preinstalled, so you could configure that module to automatically delete content from the trashcan after a given time.

I recommend reading the following blog posts:

Upvotes: 2

gsakthivel

Reputation: 375

If I add the aspect 'P:sys:temporary' to the document (by updating its cmis:secondaryObjectTypeIds property) before deleting it, then once deleted, the document (node) is gone from the active and version2 stores, and it is not put into the archive store either.

Glad the Alfresco repository that I am working with supports this aspect. It saved a lot of time...

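import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.chemistry.opencmis.client.api.CmisObject;
import org.apache.chemistry.opencmis.client.api.Property;

// Adds the sys:temporary aspect so that a later delete bypasses the archive store.
public void makeTheDocumentTemporary(CmisObject docObject) {
    // Aspects are exposed through CMIS as secondary object type ids.
    Property<Object> secondaryTypes = docObject.getProperty("cmis:secondaryObjectTypeIds");
    List<Object> aspects = (secondaryTypes != null) ? secondaryTypes.getValues() : null;

    if (aspects == null) {
        aspects = new ArrayList<Object>();
    } else {
        aspects = new ArrayList<Object>(aspects); // Defensive copy
    }
    if (!aspects.contains("P:sys:temporary")) {
        aspects.add("P:sys:temporary");
    }
    Map<String, Object> props = new HashMap<String, Object>();
    props.put("cmis:secondaryObjectTypeIds", aspects);

    try {
        docObject.updateProperties(props);
    } catch (Exception e) {
        // If this fails, the aspect is not set and a delete will go to the archive store.
        System.err.println("Error updating aspect properties: " + e.getMessage());
    }
}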

The code to add this aspect is shown above.
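The list handling in that method (copy the existing aspect ids, then append the new one only if it is missing) can also be pulled out into a small pure helper, which makes it easy to check without a repository connection. This is just a sketch; the class name AspectIds and the method withAspect are hypothetical and not part of OpenCMIS or Alfresco.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, independent of any CMIS library.
final class AspectIds {

    private AspectIds() {}

    /**
     * Returns a new list containing all ids from current (which may be null)
     * plus aspectId, appended only if it is not already present.
     * The input list is never modified.
     */
    static List<Object> withAspect(List<?> current, String aspectId) {
        List<Object> result = new ArrayList<>();
        if (current != null) {
            result.addAll(current); // defensive copy
        }
        if (!result.contains(aspectId)) {
            result.add(aspectId);
        }
        return result;
    }
}
```

With this in place, the method body would reduce to reading the current values of cmis:secondaryObjectTypeIds, calling AspectIds.withAspect(values, "P:sys:temporary"), and passing the result to updateProperties.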

Upvotes: 0
