Reputation: 3155
I'm using Alfresco 3.4d. I imported some nodes and created a few others with NodeService. Today I noticed that a Lucene query by ID sometimes returns two rows instead of just one. Not all nodes show this behavior.
For example, the following Lucene query, executed in the Alfresco Node Browser, returns two rows:
ID:"workspace://SpacesStore/96c0cc27-cb8c-49cf-977d-a966e5c5e9ca"
How is it even possible that a query by ID can return more than one row? I tried rebuilding the Lucene index, but it didn't help. When I delete the node, the query returns 0 rows. What can I do to remove those "ghost" nodes from the query result?
Upvotes: 2
Views: 3512
Reputation: 1281
I also ran across this problem and asked Alfresco support for advice. They told me that it is perfectly normal to have duplicate entries in the Lucene ID field, and that this depends on whether an ANCESTOR is present or not. They recommended using the sys:node-uuid field when doing a Lucene search for the node's ID, e.g.:
@sys\:node-uuid:f13a21dd-b020-4c70-aa21-1a0e5c89d42b
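If you only have the full node reference at hand, the UUID is the last path segment, so the query can be derived from it. A minimal sketch, assuming the reference has the usual `workspace://SpacesStore/<uuid>` shape; `NodeUuidQuery` and `byUuid` are hypothetical names for illustration, not part of the Alfresco API:

```java
final class NodeUuidQuery {

    // Hypothetical helper: turns a node reference string into the
    // sys:node-uuid query form recommended above.
    static String byUuid(String nodeRef) {
        int slash = nodeRef.lastIndexOf('/');
        if (slash < 0 || slash == nodeRef.length() - 1) {
            throw new IllegalArgumentException("Not a node reference: " + nodeRef);
        }
        // Everything after the last '/' is the node's UUID.
        String uuid = nodeRef.substring(slash + 1);
        // The colon in the property name must be escaped for Lucene.
        return "@sys\\:node-uuid:" + uuid;
    }

    public static void main(String[] args) {
        System.out.println(
            byUuid("workspace://SpacesStore/f13a21dd-b020-4c70-aa21-1a0e5c89d42b"));
    }
}
```

The resulting string can be pasted into the Node Browser or passed to whatever search call you already use.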
Upvotes: 2
Reputation: 6643
I don't know exactly how this is possible, but in the code where you retrieve the nodes you could always check node.isDocument or node.isContainer, or test whether the type is cm:content or cm:folder, to keep only the real result.
You could also try to re-index, but I doubt that will be of any help.
Upvotes: 0
Reputation: 51
I've seen this problem since Alfresco 3.2r, and it may be even older. I used the Lucene index viewer "Luke" (http://www.getopt.org/luke/) to inspect the index directly and saw that the corrupt index entry contains almost no information. As a workaround, we combined our search with some basic criteria such as node type or aspect. I will ask a colleague whether he has more information about this.
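Roughly, the workaround amounts to adding a second mandatory clause to the ID query. A minimal sketch, assuming the nearly empty index entry carries no type information and therefore fails the TYPE clause; `GhostFreeIdQuery` and `byIdAndType` are hypothetical names, and cm:content is only an example type:

```java
final class GhostFreeIdQuery {

    // Hypothetical helper: requires both the ID and a node type to match,
    // so an index entry without type information should be filtered out.
    static String byIdAndType(String nodeRef, String prefixedType) {
        return "+ID:\"" + nodeRef + "\" +TYPE:\"" + prefixedType + "\"";
    }

    public static void main(String[] args) {
        System.out.println(byIdAndType(
            "workspace://SpacesStore/96c0cc27-cb8c-49cf-977d-a966e5c5e9ca",
            "cm:content"));
    }
}
```

The same pattern should work with an ASPECT clause instead of TYPE if your nodes are better identified by an aspect.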
Upvotes: 2