Fork me on GitHub

Oak SQL-2 Query Grammar


Query

SELECT
 
DISTINCT 
*
column
 
, ...
FROM selector
 
join ...

 
WHERE constraint
 
UNION
 
ALL
query

 
ORDER BY ordering
 
, ...
 
queryOptions

All queries should have a path restriction (even if it's just, for example, “/content”), as this allows to shrink indexes.

“distinct” ensures each row is only returned once.

“union” combines the result of this query with the results of another query, where “union all” does not remove duplicates. Note that for fulltext queries, it is problematic to use union, because scoring is done for each subquery individually. The score is not useful to compare results of different subqueries, so that the union of multiple fulltext queries won't be ordered by score as one might expect.

“order by” may use an index. If there is no index for the given sort order, then the result is fully read in memory and sorted before returning the first row.

Examples:

select * from [sling:Folder] as a where [sling:resourceType] = 'x' and isdescendantnode(a, '/content')
select [jcr:path] from [oak:QueryIndexDefinition] as a where [type] = 'lucene' and isdescendantnode(a, '/') order by [reindexCount] desc
select [jcr:path], [jcr:score], * from [nt:base] as a where [type] = 'report' and isdescendantnode(a, '/etc') option(traversal fail)

Column

 
selectorName .
propertyName
*
EXCERPT (
 
selectorName
)
REP : SPELLCHECK ( )

 
AS columnName

It is recommended to enclose property names in square brackets.

For the “excerpt” property, see Excerpts and Highlighting.

Not listed above are “special” properties such as “[jcr:path]” (the path), “[jcr:score]” (the score), “[rep:suggest()]”.

Examples:

*
[jcr:path]
[jcr:score]
a.*
a.[sling:resourceType]

Selector

nodeTypeName
 
AS selectorName

The nodetype name can be either a primary nodetype or a mixin nodetype. It is recommended to specify the nodetype name in square brackes.

Examples:

[sling:Folder] as a

Join

INNER
LEFT
RIGHT
OUTER
JOIN rightSelector ON

selectorName . propertyName = joinSelectorName . joinPropertyName
ISSAMENODE ( selectorName , joinSelectorName
 
, selectorPathName
)
ISCHILDNODE ( childSelectorName , parentSelectorName )
ISDESCENDANTNODE ( descendantSelectorName , ancestorSelectorName )

An “inner join” only returns entries if nodes are found on both the left and right selector. A “left outer join” will return entries that don't have matching nodes on the right selector. A “right outer join” will return entries that don't have matching nodes on the left selector. For outer joins, all the properties of the selector that doesn't have a matching node are null.

Examples:

All nodes below /oak:index that don't have a child node:

select a.* from [oak:QueryIndexDefinition] as a 
  left outer join [nt:base] as b on ischildnode(b, a)
  where isdescendantnode(a, '/oak:index') 
  and b.[jcr:primaryType] is null 
  order by a.[jcr:path]

Constraint

andCondition
 
OR andCondition
 
...

“or” conditions of the form “[x]=1 or [x]=2” are automatically converted to “[x] in(1, 2)”, and can use the same an index.

“or” conditions of the form “[x]=1 or [y]=2” are more complicated. Oak will try two options: first, what is the expected cost to use a “union” query (one query with x=1, and a second query with y=2). If using “union” results in a lower estimated cost, then “union” is used. This can be the case, for example, if there are two distinct indexes, one on x, and another on y.


And Condition

condition
 
AND condition
 
...

A special case (not found in relational databases) is “and” conditions of the form “[x]=1 and [x]=2”. They will match nodes with multi-valued properties, where the property value contains both 1 and 2.


Condition

comparison
inComparison
NOT constraint
( constraint )
 
selectorName .
propertyName IS
 
NOT
NULL
CONTAINS (
 
selectorName .
propertyName
selectorName . *
, staticOperand )
ISSAMENODE
ISCHILDNODE
ISDESCENDANTNODE
(
 
selectorName ,
pathName )
SIMILAR (
 
selectorName .
propertyName
*
, staticOperand )
NATIVE (
 
selectorName ,
languageName , staticOperand )
SPELLCHECK (
 
selectorName ,
staticOperand )
SUGGEST (
 
selectorName ,
staticOperand )

“not” conditions can not typically use an index.

“contains”: see Full-Text Queries.

“similar”: see Similarity Queries.

“native”: see Native Queries.

“spellcheck”: see Spellchecking.

“suggest”: see Suggestions.

Examples:

select [jcr:path] from [nt:base] where similar(*, '/test/a') 
select [jcr:path] from [nt:base] where native('solr', 'name:(Hello OR World)')
select [rep:suggest()] from [nt:base] where suggest('in ') and issamenode('/')
select [rep:spellcheck()] from [nt:base] as a where spellcheck('helo') and issamenode(a, '/')

Comparison

dynamicOperand
=
< >
<
< =
>
> =
LIKE
staticOperand

“like”: when comparing with LIKE, the wildcards characters are ‘_’ (any one character) and ‘%’ (any characters). An index is used, except if the operand starts with a wildcard. To search for the characters ‘%’ and ‘_’, the characters need to be escaped using '' (backslash).

Comparison using <, >, >=, and <= can use an index if the property in the index is ordered.

Examples:

[name] like '%: 100 \%'

In Comparison

dynamicOperand IN ( staticOperand
 
, ...
)

Examples:

[status] in('active', 'inactive')

Static Operand

literal
$ bindVariableName
CAST ( literal AS type )

A string (text) literal starts and ends with a single quote. Two single quotes can be used to create a single quote inside a string.

Example:

'John''s car'
$uuid
cast('2020-12-01T20:00:00.000' as date)

Ordering

simpleName
 
DESC

Ordering by an indexed property will use that index if possible. If there is no index that can be used for the given sort order, then the result is fully read in memory and sorted there. If an index for ordering is used, then only entries are listed where the given property (or function) is not null.

As a special case, sorting by “jcr:score” in descending order is ignored (removed from the list), as this is what the fulltext index does anyway (and if no fulltext index is used, then the score doesn't apply). If for some reason you want to enforce sorting by “jcr:score”, then you can use the workaround to order by “LOWER([jcr:score]) DESC”. Note that for fulltext queries, it is problematic to use union, because scoring is done for each subquery individually. The score is not useful to compare results of different subqueries, so that the union of multiple fulltext queries won't be ordered by score as one might expect.

Examples:

[lastName]
[price] desc

Dynamic Operand

 
selectorName .
propertyName
LENGTH ( dynamicOperand )
NAME
LOCALNAME
PATH
SCORE
(
 
selectorName
)
LOWER
UPPER
FRIST
( dynamicOperand )
COALESCE ( dynamicOperand , dynamicOperand )
PROPERTY ( propertyName , type )

The selector name is only needed if the query contains multiple selectors.

Property names can be relative, so that for example ‘jcr:content/test’ means the property names ‘test’ in the child node ‘jcr:content’. Relative path fragments can also contain * to represent ‘any’ node at that point. // is not supported as part of relative path. So, a/*/test, */a/test, a/*/*/test etc are valid while a//test, a/*/b//test, etc are not.

The wildcard ‘*’ means any property.

“coalesce”: this returns the first operand if it is not null, and the second operand otherwise. @since Oak 1.8

“property”: This feature is rarely used. It allows to filter for all properties with a given type. Example: the condition property(*, Reference) = $uuid will search for any property of type Reference.

“lower”, “upper”, “length”: Indexes on functions are supported @since Oak 1.6, see OAK-3574.

“first”, “path”: Supported @since Oak 1.42, see OAK-9625.

Examples:

lower([firstName])
coalesce([lastName], name())
length(coalesce([lastName], name()))
first([alias])
name()
path()
[jcr:content/title]

Type

STRING
BINARY
DATE
LONG
DOUBLE
DECIMAL
BOOLEAN
NAME
PATH
REFERENCE
WEAKREFERENCE
URI

This is the list of all JCR property types.


Options

OPTION (
TRAVERSAL
OK
WARN
FAIL
DEFAULT
INDEX TAG tagName
OFFSET number
LIMIT number
 
, ...
)

“traversal”: by default, queries without index will log a warning, except if the configuration option QueryEngineSettings.failTraversal is changed The traversal option can be used to change the behavior of the given query: “ok” to not log a warning, “warn” to log a warning, “fail” to fail the query, and “default” to use the default setting.

“index tag”: by default, queries will use the index with the lowest expected cost (as in relational databases). To only consider some of the indexes, add tags (a multi-valued String property) to the index(es) of choice, and specify this tag in the query. See Query Option Index Tag.

“offset” / “limit”: sets the offset / limit at the time of parsing the query See Query Option Offset / Limit.

Examples:

option(traversal fail)

Explain

EXPLAIN
 
MEASURE
query

Does not run the query, but only computes and returns the query plan. With EXPLAIN MEASURE, the expected cost is calculated as well. In both cases, the query result will only have one column called ‘plan’, and one row that contains the plan.

Examples:

explain measure 
select * from [nt:base] where [jcr:uuid] = 1

Result:

plan = [nt:base] as [nt:base] 
/* property uuid = 1 where [nt:base].[jcr:uuid] = 1 */  
cost: { "nt:base": 2.0 } 

This means the property index named “uuid” is used for this query. The expected cost (roughly the number of uncached I/O operations) is 2.


Measure

MEASURE query

Runs the query, but instead of returning the result, returns the number of rows traversed. The query result has two columns, one called ‘selector’ and one called ‘scanCount’. The result has at least two rows, one that represents the total (selector set to ‘query’), and one per selector used in the query.

Examples:

measure 
select * from [nt:base] where [jcr:uuid] = 1

Result:

selector = query
scanCount = 0
selector = nt:base
scanCount = 0

In this case, the scanCount is zero because the query did not find any nodes.