Query Troubleshooting
Slow Queries
The first step in query troubleshooting is often to detect a query is slow, or traverses many nodes. Queries that traverse many nodes are logged as follows:
*WARN* org.apache.jackrabbit.oak.plugins.index.Cursors$TraversingCursor
Traversed 22000 nodes with filter Filter(query=
select * from [nt:base] where isdescendantnode('/etc') and lower([jcr:title]) like '%coat%');
consider creating an index or changing the query
To get good performance, queries should not traverse more than about 1000 nodes (specially for queries that are run often).
Potentially Slow Queries
In addition to avoiding queries that traverse many nodes, it makes sense to avoid queries that don't use an index. Such queries might be fast (and only traverse few nodes) with a small repository, but with a large repository they are typically slow as well. Therefore, it makes sense to detect such queries as soon as possible (in a developer environment), even before the code that runs those queries is tested with a larger repository. Oak will detect such queries and log them as follows (with log level INFO for Oak 1.6.x, and WARN for Oak 1.8.x):
*INFO* org.apache.jackrabbit.oak.query.QueryImpl Traversal query (query without index):
select * from [nt:base] where isdescendantnode('/etc') and lower([jcr:title]) like '%coat%';
consider creating an index
Query Plan
To understand why the query is slow, the first step is commonly to get the
query execution plan. To do this, the query can be executed using explain select ...
.
For the above case, the plan is:
[nt:base] as [nt:base] /* traverse "/etc//*"
where (isdescendantnode([nt:base], [/etc])) and (lower([nt:base].[jcr:title]) like '%coat%') */
That means, all nodes below /etc
are traversed.
Making the Query More Specific
In order to make the query faster, try to add more constraints, or make constraints tighter.
This will usually require some knowledge about the expected results.
For example, if the path restriction is more specific, then less nodes need to be read.
This is also true if an index is used. Also, if possible use a more specific node type.
To understand if a nodetype or mixin is indexed, consult the nodetype index
at /oak:index/nodetype
, property declaringNodeTypes
.
But even if this is not the case, the nodetype should be as specific as possible.
Assuming the query is changed to this:
select * from [acme:Product]
where isdescendantnode('/etc/commerce')
and lower([jcr:title]) like '%coat%')
and [commerceType] = 'product'
The only relevant change was to improve the path restriction. But in this case, it already was enough to make the traversal warning go away.
Queries Without Index
After changing the query, there is still a message in the log file that complains the query doesn't use an index, as described above:
*INFO* org.apache.jackrabbit.oak.query.QueryImpl
Traversal query (query without index):
select * from [acme:Product] where isdescendantnode('/etc/commerce')
and lower([jcr:title]) like '%coat%'
and [commerceType] = 'product'; consider creating an index
The query plan of the index didn't change, so still nodes are traversed.
In this case, there are relatively few nodes because it's
an almost empty development repository, so no traversal warning is logged.
But for production, there might be a lot more nodes under /etc/commerce
,
so it makes sense to continue optimization.
Where Traversal is OK
If it is known from the data model that a query will never traverse many nodes,
then no index is needed. This is a corner case, and only applies to queries that
traverse a fixed number of (for example) configuration nodes, or
if the number of descendant nodes is guaranteed to be very low by using
a certain nodetype that only allows for a fixed number of child nodes.
If this is the case, then the query can be changed to say traversal is fine.
To mark such queries, append option(traversal ok)
to the query.
This feature should only be used for those rare corner cases.
select * from [nt:base]
where isdescendantnode('/etc/commerce')
and lower([jcr:title]) like '%coat%'
and [commerceType] = 'product'
option(traversal ok)
Estimating Node Counts
To find out how many nodes are in a certain path, you can use the JMX bean NodeCounter
,
which can estimate the node count. Example: run
getEstimatedChildNodeCounts
with p1=/
and p2=2
might give you:
/: 2522208,
...
/etc: 1521504,
/etc/commerce: 29216,
/etc/images: 1231232,
...
So in this case, there are still many nodes below /etc/commerce
in the production repository.
Also note that the number of nodes can grow over time.
Prevent Running Traversal Queries
To avoid running queries that don't use an index altogether,
you can change the configuration in the JMX bean QueryEngineSettings
:
if you set FailTraversal
to true
, then queries without index will throw an exception
when trying to execute them, no matter how many nodes are in the repository.
This doesn't mean queries will never traverse over nodes, it just means
that queries that must traverse over nodes will fail.
Using a Different or New Index
There are multiple options:
- Consider creating an index for
jcr:title
. But forlike '%..%'
conditions, this is not of much help, because all nodes with that property will need to be read. Also, usinglower
will make the index less effective. So, this only makes sense if there are very few nodes with this property expected to be in the system. - If there are very few nodes with that nodetype,
consider adding
acme:Product
to the nodetype index. This requires reindexing. The query could then use the nodetype index, and within this nodetype, just traverse below/etc/commerce
. TheNodeCounter
can also help understand how manyacme:Product
nodes are in the repository, if this nodetype is indexed. To find out, rungetEstimatedChildNodeCounts
withp1=/oak:index/nodetype
andp2=2
. - If the query needs to return added nodes immediately (synchronously; that is without delay), consider creating a property index. Note that Lucene indexes are asynchronous, and new nodes may not appear in the result for a few seconds.
- To ensure there is only one node matching the result in the repository, consider creating a unique property index.
- Consider using a fulltext index, that is: change the query from using
lower([jcr:title]) like '%...%'
to usingcontains([jcr:title], '...')
. Possibly combine this with adding the propertycommerceType
to the fulltext index.
The last plan is possibly the best solution for this case.
Index Definition Generator
In case you need to modify or create a Lucene property index, you can use the Oak Index Definition Generator tool.
As the tool doesn't know your index configuration, it will always suggest to create a new index; it might be better to extend an existing index. However, note that:
- Changing an existing index requires reindexing that index.
- If an out-of-the-box index is modified, you will need to merge those modifications when migrating to newer software. It is best to add documentation to the index definition to simplify merging, for example in the form of “info” properties.
Verification
After changing the query, and possibly the index, run the explain select
again,
and verify the right plan is used, in this case that might be, for the query:
select * from [acme:Product]
where isdescendantnode('/etc/commerce')
and contains([jcr:title], 'Coat')
and [commerceType] = 'product'
[nt:unstructured] as [acme:Product] /* lucene:lucene(/oak:index/lucene)
full:jcr:title:coat ft:(jcr:title:"Coat")
So in this case, only the fulltext restriction of the query was used by the index,
but this might already be sufficient. If it is not, then the fulltext index might
be changed to also index commerceType
, or possibly
to use evaluatePathRestrictions
.
Queries With Many OR or UNION Conditions
Queries that contain many “or” conditions, or with many “union” subqueries, can be slow as they have to read a lot of data. Example query:
/jcr:root/content/(a|b|c|d|e)//element(*, cq:Page)[
jcr:contains(@jcr:title, 'some text')
or jcr:contains(jcr:content/@keywords, 'some text')
or jcr:contains(jcr:content/@cq:tags, 'some text')
or jcr:contains(jcr:content/@team, 'some text')
or jcr:contains(jcr:content/@topics, 'some text')
or jcr:contains(jcr:content/@jcr:description, 'some text')]
This query will be internally converted into 5 subqueries, due to the “union” clause (a|b|c|d|e). Then, each of the 5 subqueries will run 6 subqueries: one for each jcr:contains condition. So, the index will be contacted 30 times.
To avoid this overhead, the index could be changed (or a new index created) to do aggregation on the required properties (here: jcr:title, jcr:content/keywords,…). This will simplify the query to:
/jcr:root/content/(a|b|c|d|e)//element(*, cq:Page)[jcr:contains(., 'some text')]
This should resolve most problems. To further speed up the query by avoiding to running 5 subqueries, it might be better to use a less specific path constraint, but instead use a different way to filter results, such as:
/jcr:root/content//element(*, cq:Page)[jcr:contains(., 'some text') and @category='x']
Ordering by Score Combined With OR / UNION Conditions
Queries that expect results to be sorted by score (“order by @jcr:score descending”), and use “union” or “or” conditions, may not return the result in the expected order, depending on the index(es) used. Example:
/jcr:root/conent/products/(indoor|outdoor)//*[jcr:contains(., 'test')]
order by @jcr:score descending
Here, the query is converted to a “union”, and the result of both subqueries is combined. If the score for each subquery is not comparable (which is often the case for Lucene indexes), then the order of the results may not match the expected order. Instead of using path restrictions as above, it is most likely better to use a an additional condition in the query, and index that:
/jcr:root/content/products//*[jcr:contains(., 'test') and
(@productTag='indoor' or @productTag='outdoor')]
order by @jcr:score descending
If this is not possible, then try to avoid using “union”, and use an “or” condition as follows. This will only work for SQL-2 queries however:
select * from [nt:base] as a where contains(*, 'test') and issamenode(a, '/content') and
([jcr:path] like '/content/x800/%' or [jcr:path] like '/content/y900/%')
order by [jcr:score] desc