Optimizing XQueries with eXist
Explanations of each of the following tips can be found at the end of the document.
DON'TS
Here are the things to avoid:
DOS
Here are some recommendations for optimization:
Code Quality
Here are some points about writing good quality code. An example of a correct XQuery is to follow TODO
EXPLANATIONS
Don't use eval()
It's important to realise that eXist caches the XQuery after compiling it. The snag is, the arguments to the eval() function can't be cached. Beyond that, using eval() leads to a style of programming that's hard to read and to debug. And eval() can always be replaced by a standard expression.
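As a rough sketch of replacing eval() with a standard expression, the snippet below assumes that eval() refers to eXist's util:eval() and that the goal is to select an element whose name is only known at run time. The variable $name and the /db/projects layout are illustrative assumptions, not part of the original tips.

```xquery
(: Hypothetical: $name holds an element name chosen at run time. :)
let $name := "b"
(: Rather than building a query string and evaluating it, for example
     util:eval(concat("collection('/db/projects')/a/", $name))
   select the element with an ordinary predicate on its name. :)
return collection("/db/projects")/a/*[local-name() = $name]
```

The query text is now fixed, so the compiled query can be cached, and the selection logic stays visible to the reader.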
Don't evaluate expressions several times over and avoid redundant expressions
eXist doesn't perform any analysis or optimisation of queries akin to what a Java compiler does. So: no refactoring of repeatedly evaluated expressions, no elimination of code that won't be executed, etc. Pay particular attention to repeatedly evaluated expressions: each should be evaluated only once and its result placed in a variable, which also makes for more readable code.
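A minimal sketch of evaluating an expression once and reusing the result might look like this. The collection path and element names are borrowed from the examples in this document; the value of $val and the shape of the output are assumptions made for illustration.

```xquery
(: Hypothetical search value. :)
let $val := "42"
(: Evaluate the expensive lookup once and keep the result in a variable. :)
let $matches := collection("/db/projects")/a/b[id = $val]
return
    <summary>
        <!-- Both uses below read the variable; the lookup is not repeated. -->
        <count>{ count($matches) }</count>
        <first>{ $matches[1]/id/text() }</first>
    </summary>
```

Writing the collection(...) expression twice, once inside count() and once for the first match, would run the same scan twice.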
Don't use //
$a//b causes a complete traversal of all nodes below $a in search of b elements. In most cases the location of b is known fairly precisely, so it is better to specify the path explicitly.
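To illustrate, here is a small sketch that assumes a hypothetical stored document in which b always sits at a/section/b. The document path and the section element are invented for the example.

```xquery
(: Hypothetical stored document; b is assumed to live at a/section/b. :)
let $a := doc("/db/projects/example.xml")/a
return
    (: To avoid: $a//b, which visits every descendant of $a. :)
    (: Preferred: follow only the known path. :)
    $a/section/b
```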
Don't query constructed document fragments
A typical example (to avoid):
let $e := <a><b>content</b></a> (: $e is a constructed document fragment :)
let $result := $e/b/text()
return $result
Minimise the execution of queries based on a given search expression.
A query like
let $res := collection("/db/projects")/a/b[id = $val]
causes a complete scan of an entire collection. Admittedly, queries like this are at the heart of an XQuery (and account for most of its execution time). But once the result $res has been retrieved, it can be efficiently used as a starting point for navigation to its parent, siblings and children:
let $a := $res/parent::a
let $next-sibling := $a/following-sibling::a[1]
Make appropriate use of indexes adapted to your search criteria.
There are currently three types of user-configurable indexes in eXist. All require pre-indexation either of the base collection or of specified node-sets in sub-collections.
Index 2 is slower than index 3, but has two advantages
Index 3 lacks these advantages, but is almost as fast as a relational database. Such an index cannot be constrained by an XPath, but only by a tag name. Both index 2 and index 3 are typed (integers or strings), and allow matching by criteria of equality or inequality (comparison).
Document in XQuery the argument and return types
Don't write:
declare function local:add($n, $m) {
    <result>{ $n + $m }</result>
};
but rather:
declare function local:add($n as xs:integer, $m as xs:integer)
as element(result) {
    <result>{ $n + $m }</result>
};
This is more explicit and self-documenting. And for the same price you get run-time argument checking. If you know for sure which types you are manipulating, declare them!
Keep data retrieval separate from result construction
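One possible way to keep data retrieval separate from result construction is sketched below. The function names local:find-items and local:render, the search value and the output element names are hypothetical; the collection and element names follow the examples used in this document.

```xquery
xquery version "1.0";

(: Data retrieval only: returns the matching b elements. :)
declare function local:find-items($val as xs:string) as element(b)* {
    collection("/db/projects")/a/b[id = $val]
};

(: Result construction only: turns already-retrieved nodes into output. :)
declare function local:render($items as element(b)*) as element(results) {
    <results count="{ count($items) }">{
        for $item in $items
        return <item>{ string($item/id) }</item>
    }</results>
};

(: Hypothetical search value. :)
local:render(local:find-items("42"))
```

With this split, the retrieval function queries only stored nodes, and the constructed output is built once at the end and never queried again.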