Reputation: 43
I have some questions about how indexing works with transactional queries in Alfresco One.
Suppose that in my model.xml I add a custom property like this:
<type name="doc:myDoc">
<title>Document</title>
<parent>cm:content</parent>
<properties>
<property name="doc:level">
<title>Level</title>
<type>d:text</type>
<mandatory>true</mandatory>
<index enabled="true">
<atomic>true</atomic>
<stored>false</stored>
<tokenised>both</tokenised>
</index>
</property>
...
</properties>
</type>
And I have these settings in my alfresco-global.properties:
solr.query.cmis.queryConsistency=TRANSACTIONAL_IF_POSSIBLE
solr.query.fts.queryConsistency=TRANSACTIONAL_IF_POSSIBLE
system.metadata-query-indexes.ignored=false
My first question is: how does Alfresco know which properties I want to index in the DB? Does it read my model.xml and index only the properties I mark as indexed there? Does it index all custom properties? Or do I need to create a script to add these new indexes?
I read the script metadata-query-indexes.sql, but I don't understand how to rewrite it to add a new index for my property. If this script is necessary, could you give me an example based on the doc:myDoc definition I wrote above, please?
I read that a query must not use PATH, SITE, ANCESTOR, OR, or any d:content, d:boolean or d:any properties (among others), or it will not be executable against the DB. But I don't understand what d:content is exactly.
For example, is a query (based on the custom property above) like TYPE:whatever AND @doc\:level:"value" considered d:content? Is this query supported by the DB, or does it go to SOLR?
"Any property checks must be expressed in a form that means "identical value check" as querying the DB does not provide the same tokenization / similarity capabilities as the SOLR index. E.g. instead of my:property:"value" you'd have to use =my:property:"value" and "value" must be written in the proper case the value is stored in the DB."
Does this mean that if I use =, for example =@doc\:level:"value", the query isn't accepted by the DB and goes to SOLR? Can't I search for an exact value on the DB?
Upvotes: 0
Views: 1120
Reputation: 2246
A nice explanation can be found here.
https://community.alfresco.com/people/andy1/blog/2017/06/19/explaining-eventual-consistency
When changes are made to the repository they are picked up by SOLR via a polling mechanism. The required updates are made to the Index Engine to keep the two in sync. This takes some time. The Index Engine may well be in a state that reflects some previous version of the repository. It will eventually catch up and be consistent with the repository - assuming it is not forever changing.
Upvotes: 0
Reputation: 415
I've been researching TMQs recently. I'm assuming that you need transactionality, which is why TMQ queries are interesting. Queries via SOLR are eventually consistent, but TMQs will immediately return the change. There are certain applications where eventual consistency is a huge problem, so I'm assuming this is why you are looking into them.
Alfresco says that they use TMQs by default, and in my limited testing (200k documents), I found no appreciable performance difference between a solr and TMQ query. I can't imagine they are horrible for performance if Alfresco set it up to be the default style, but I need to do further testing with millions of documents to be sure. It will of course depend on your database load. If your database is a bottleneck and you don't need the transactionality, you could consider using @ syntax in metadata searches to avoid them, or you could disable them via properties configuration.
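As a rough sketch of the "disable them via properties configuration" option, assuming the same two property names shown in the question and that EVENTUAL is the value that sends queries to the index (check the documentation for your version before relying on it), the alfresco-global.properties entries would look something like this:

```properties
# Assumption: EVENTUAL routes these queries to SOLR instead of the database
solr.query.cmis.queryConsistency=EVENTUAL
solr.query.fts.queryConsistency=EVENTUAL
```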
1) How does Alfresco know which properties I want to index in the DB? Does it read my model.xml and index only the properties I mark as indexed there? Does it index all custom properties? Or do I need to create a script to add these new indexes?
When you execute a query using a syntax that is compatible with a TMQ, Alfresco will run it against the database. The default behavior is "TRANSACTIONAL_IF_POSSIBLE": http://docs.alfresco.com/4.2/concepts/intrans-metadata-configure.html
You do not have to have the field marked as indexable in the model for this to work. This is unclear from the documentation but I've tried disabling indexing for the field in the model and these queries still work. You don't even have to have solr running!
2) Another question is about query syntax that isn't supported by DB and goes directly to SOLR.
Your example of TYPE and an attribute does not go to solr. It's things like PATH that must go to SOLR.
3) "Any property checks must be expressed in a form that means "identical value check" as querying the DB does not provide the same tokenization / similarity capabilities as the SOLR index. E.g. instead of my:property:"value" you'd have to use =my:property:"value" and "value" must be written in the proper case the value is stored in the DB."
What they are saying is that you must use the = operator, not the default or @ operator. The @ operator depends on tokenization, but TMQs go straight to the database. However, you can use * in an attribute if you omit the "", like so:
=cm\:title:Startswith*
Works for me on 5.0.2 via TMQ. You can absolutely search for an exact value as well, however.
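Applied to the property from the question, the exact-value forms would look roughly like the lines below. This is only a sketch that follows the =my:property:"value" pattern from the quoted documentation; doc:level and doc:myDoc come from the question's model, "value" is a placeholder, and I have not run these exact queries myself. Remember that the value has to match the case stored in the DB.

```text
=doc:level:"value"
TYPE:"doc:myDoc" AND =doc:level:"value"
```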
I hope this cleared it up for you. If you still have questions about which syntax is supported, I highly recommend setting solr.query.fts.queryConsistency=TRANSACTIONAL in a test environment to always force TMQs, and then trying different queries.
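For a test environment, the alfresco-global.properties entries might look like the sketch below. The CMIS line is my assumption, mirroring the FTS property with the name used in the question. As far as I understand, with TRANSACTIONAL a query that cannot run against the database should fail instead of silently falling back to SOLR, which is what makes it useful for checking syntax.

```properties
# Test environment only: force queries to run against the database
solr.query.fts.queryConsistency=TRANSACTIONAL
# Assumption: same setting for CMIS queries
solr.query.cmis.queryConsistency=TRANSACTIONAL
```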
Regards
Upvotes: 1