How to test eXist (developer manual)

Jean-Marc Vanel

The latest version is now in the eXist SVN repository. It can be viewed in real time, as HTML, on sf.net:
http://svn.sourceforge.net/viewcvs.cgi/exist/trunk/eXist-1.0/webapp/developper.html?view=html&rev=2968&sortby=date



Introduction

I noticed that, due to the complexity of the product (persistent DOM, XQuery engine, XML:DB collections, XUpdate, user and permission management, XML-RPC, SOAP and REST remote interfaces, Cocoon adapter, ...), the number of bug reports is growing. Meanwhile, users don't know how to report bugs efficiently using JUnit, or simply how to provide enough information to reproduce the bug. After all, most of them are not developers. So, to help testers and developers, I wrote this page explaining how to test eXist. Note that it could be adapted for other projects as well.

Basics of Bug reporting

Reporting a bug efficiently essentially means enabling the maintainer to reproduce and understand it easily.

This implies:

  1. providing enough information
  2. eliminating superfluous information, and presenting the bug in its simplest form
  3. automating the process of provoking the bug

Let's examine these three points. The information provided must include:

Now about the second point: it is important to strive to provide the shortest data possible. If your query involves just an XQuery expression and no data, don't provide data. If your query involves, say, a processing instruction (PI), check whether an XML document with just a root and a PI provokes the bug.
If you have the feeling that the bug is related to quite large data and/or a well-defined sequence of operations (load data, update data, etc.), the best way is to create the XML data by program, using SAX, DOM, XQuery or XSLT (see also "Writing stress tests" below).
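As an illustration of creating XML data by program, here is a minimal sketch using only the DOM and JAXP classes shipped with the JDK. The class name TestDataGenerator and the element names test and item are invented for this example; they are not part of eXist.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: generates test documents of any size by program.
public class TestDataGenerator {

    /** Builds a document with a <test> root and itemCount <item> children. */
    public static Document buildDocument(int itemCount) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no JAXP parser available", e);
        }
        Element root = doc.createElement("test");
        doc.appendChild(root);
        for (int i = 0; i < itemCount; i++) {
            Element item = doc.createElement("item");
            item.setAttribute("id", String.valueOf(i));
            item.setTextContent("value " + i);
            root.appendChild(item);
        }
        return doc;
    }

    /** Serializes the document to a string, without XML declaration. */
    public static String serialize(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    public static void main(String[] args) {
        // The number of items is the only parameter of the generated data.
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 3;
        System.out.println(serialize(buildDocument(n)));
    }
}
```

With such a generator, a bug report only needs to state the parameter (here the number of items) instead of attaching a large file, and the same code can be reused to find the smallest size that still provokes the bug.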

The third point is the one least familiar to most readers. Suppose you have provided exactly the 6 items listed above for a bug report. Now think about the maintainer, who you hope will quickly produce a corrective patch. The first thing he or she will do is reproduce the bug in a debug environment. To do this, the maintainer has to manually load the document in the eXist client under the debugger, then manually paste and run the query. And this has to be done again, manually, every time the debugger needs to be restarted because the execution went too far. But there's a better way, and you can help the maintainer help you! You can provide a small piece of Java code that demonstrates the problem. You could use the sample code provided in:
samples/org/exist/examples/*/*.java

But there is something even simpler and more useful to everybody in the project: using JUnit tests (see "Writing JUnit tests" below). A JUnit test is a more manageable piece of code than a main() class, and quite a few JUnit tests are already present in the eXist distribution.

Using the Subversion repository (SVN) at sourceforge

You don't have to wait for the next snapshot by Wolfgang! Read below how to download the latest source updates whenever you need them.

Reference book on SVN :
http://svnbook.red-bean.com/nightly/en/svn-book.html

Browse the Subversion repository on the Web :
http://svn.sourceforge.net/viewcvs.cgi/exist/trunk/eXist-1.0/

Note that there are some improvements compared to CVS:

With the command line

The first time do what is told on the eXist sourceforge page about SVN. The module name is eXist-1.0 , so just do :

svn checkout -r HEAD https://svn.sourceforge.net/svnroot/exist/trunk/eXist-1.0
Then to compile just do :
cd eXist-1.0
./build.sh
or if you have installed ant :
ant

Then (at any later time) to get all the latest source updates just do :

cd $EXIST_HOME
svn update
./build.sh

assuming that the variable $EXIST_HOME contains the absolute path to the directory eXist-1.0 .

To have a Unix-like shell (including svn and much more) on Windows, just install cygwin; it is very convenient and easy to install.

With eclipse

Install the subclipse plugin from

http://subclipse.tigris.org/update_1.0.x

Then the documentation is here (local link):
http://127.0.0.1:65318/help/topic/org.tigris.subversion.subclipse.ui/html/toc.html

Using eclipse

You can create an eclipse project either from scratch through SVN (see above), or from an existing eXist directory that has been downloaded from the command line.

Create a project from an existing eXist directory

Create a new Java project from the file menu. On the second screen, check
"create java project from existing source".

Then navigate to your eXist1.0 directory.
Fill in the project name, for example eXist1.0.
Then you can press the "finish" button.
You will see that everything is recognized. That's because the files .classpath and .project are in the CVS.

Then you can commit files from the package explorer (left). But be careful: if the CVS checkout was not done by eclipse, your developer password may be requested on the shell command line.

Prepare a test database

To test eXist without spoiling the production database, you have to point to a different data directory :

cd $EXIST_HOME
mkdir test
cp conf.xml client.properties test
cd test
ln -s ../samples .
ln -s ../src .
ln -s ../webapp .
mkdir data

Change the location of the data directory in file test/conf.xml :

<db-connection database="native" files="data"
      pageSize="4096" cacheSize="48M" free_mem_min="5">

Of course you can erase all the content of the test/data/ directory as often as necessary.

Run eXist with the test database

Run the client GUI

You can now start eXist with data and configuration in the test/ directory, with an empty content, and no password. For example with the client GUI you can type this :

./bin/client.sh --config $EXIST_HOME/test/conf.xml --local --verbose &

Note that this does not work with a relative path after --config, since the client.sh script starts in another directory.

Start the servers

The following recipe starts all the eXist servers (ports 8080 and 8081). Just replace the normal conf.xml with the customized one :
cp conf.xml conf-orig.xml
cp test/conf.xml conf.xml
./bin/startup.sh &
The next time you start the normal database, or if you do a cvs checkout, don't forget to restore the original configuration this way :
cp conf-orig.xml conf.xml

Then direct your browser to :

http://localhost:8080/exist/index.xml

Running tests

Run a plain main() test

./bin/run.sh -Dexist.home=$EXIST_HOME/test org/exist/xquery/test/SAXStorageTest

Run JUnit test with eclipse

With eclipse 3.X and the eXist1.0 eclipse project from the eXist CVS, just create a JUnit run with the chosen test case (e.g. org.exist.xquery.test.AllTests), and either:

Many of the eXist JUnit tests use the samples/ directory; this is why we made a symbolic link to it.

Run one JUnit test with the JUnit GUI

./bin/run.sh -Dexist.home=$EXIST_HOME/test \
junit.swingui.TestRunner \
org.exist.xquery.test.XPathQueryTest

How to run all tests

There is now an Ant target that runs many tests and produces an HTML report in :

eXist-1.0/test/report/html/index.html

Just type this to run the tests and produce the HTML report :

./build.sh test


To find all JUnit tests in eXist using eclipse, just go to the hierarchy view, right-click "Focus on ...", and choose junit.framework.TestCase. There are currently 3 main programs that run all the tests in their respective packages:

src/org/exist/xupdate/test/AllTests.java
src/org/exist/xquery/test/AllTests.java
src/org/exist/xmldb/test/AllTests.java

On my 2.4 GHz machine, they take a few seconds, so you can run them as often as you wish to reassure yourself that the CVS updates or your own modifications didn't add bugs (this is called non-regression testing).


Debugging in eclipse

GUI client in local mode

How to debug the GUI client in local mode:

Main class: org.exist.start.Main
Program arguments: client -l
VM arguments: -Dexist.home=test

Standalone server

Main class: org.exist.Server
Program arguments: [none]
VM arguments: [none]
Working directory: ${workspace_loc:eXist-1.0}/test

As client, use the GUI client with this URI:
xmldb:exist://localhost:8081

So you launch it this way:
./bin/client.sh --config $EXIST_HOME/test/conf.xml \
-ouri=xmldb:exist://localhost:8081

Controlling logging

$EXIST_HOME/log4j.xml is used both by client.sh and startup.sh ,
and $EXIST_HOME/webapp/WEB-INF/log4j.xml is used solely by eXist as a .war file.

OK, but where do I have to put my log4j.xml ?

For example, in the top directory of your application or of a jar. You can also specify an alternate location via the property
-Dlog4j.configuration=
on the Java command line. eXist relies on the automatic configuration mechanism of log4j.

Profiling XQuery

To activate the XQuery Profiler , add this line at the beginning of the XQuery script :
declare option exist:profiling "enabled=yes verbosity=1";
You can also activate the profiling locally in your XQuery :
let $unused := util:enable-profiling(1)
The profiling starts with this function call and will end with a call to :
util:disable-profiling()


Creating new tests

Introduction

JUnit is the industry standard framework for Java tests. It is open source, light-weight, integrated with eclipse and others, and it was designed by great designers including members of the famous Gang of Four. On the JUnit site I recommend reading the Java Report article: Test Infected - Programmers Love Writing Tests.

Consider the equations:

If they were respected, the task of the maintainers would be much easier! Moreover, the body of existing automated tests guarantees that in the future the bugs will not reappear unnoticed.

What is a good test?

Writing JUnit tests

With eclipse it's particularly easy. Just create a class extending junit.framework.TestCase, and take inspiration from one of the existing test cases in these packages:

org/exist/xupdate/test
org/exist/xquery/test
org/exist/xmldb/test

Or just add a function in one of the existing test cases, whose declaration will be something like:

public void testMyNewFeature() {}

Remember that for every atomic test (a method testXXX() ) the method setUp() will be called by JUnit just before, and tearDown() just after.

In either case, go to the Run/Run... window of eclipse, and right-click on Junit/New to create a Run configuration for the test. As explained before, change the working directory to ${workspace_loc:eXist-1.0}/test to avoid modifying the production database. And just run it! Make sure that the JUnit view is visible, and you'll see the cursor advancing through the tests, leaving green (OK) or blue (KO) behind on the test tree view.
TODO : give more guidelines about Writing JUnit tests, adding to existing tests .

Writing stress tests

A stress test tries to push the program to its limits, with either:

Contrary to JUnit tests, they can take a long time and/or consume a lot of memory. Ideally, a stress test should provide results in the form of a table with columns such as:

JUnit stress test

A stress test can be conveniently located in the main() of a JUnit test, e.g.:

org/exist/xquery/test/SAXStorageTest

Note that in this test the data are created by program (not by a file). This has several advantages:

Another example of a test with data created by program (this time with DOM) :

org/exist/xmldb/test/IndexingTest.java

This test currently fails because of the limitations of the indexing algorithm in eXist. It will soon succeed with the new indexing scheme, DLN, due in April.

One way to convert existing JUnit tests into stress tests is to use the JUnitPerf tool. JUnitPerf is a collection of JUnit test decorators used to measure the performance and scalability of functionality contained within existing JUnit tests.
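To make the idea of a result table concrete, here is a minimal sketch in plain Java, without JUnit or JUnitPerf. The class StressTable, its column names and the stand-in workload are assumptions made for this example; in a real stress test the workload would store or query data in eXist.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

// Hypothetical harness: runs a workload at growing sizes and tabulates the results.
public class StressTable {

    /** One row of the result table. */
    public static final class Row {
        public final int size;      // input size passed to the workload
        public final long millis;   // elapsed wall-clock time
        public final long usedKb;   // heap in use after the run (rough indication only)

        Row(int size, long millis, long usedKb) {
            this.size = size;
            this.millis = millis;
            this.usedKb = usedKb;
        }
    }

    /** Runs the workload once per size and records time and heap usage. */
    public static List<Row> measure(int[] sizes, IntConsumer workload) {
        Runtime rt = Runtime.getRuntime();
        List<Row> rows = new ArrayList<>();
        for (int size : sizes) {
            long start = System.nanoTime();
            workload.accept(size);
            long millis = (System.nanoTime() - start) / 1_000_000;
            long usedKb = (rt.totalMemory() - rt.freeMemory()) / 1024;
            rows.add(new Row(size, millis, usedKb));
        }
        return rows;
    }

    /** Formats the rows as a fixed-width text table with a header line. */
    public static String format(List<Row> rows) {
        StringBuilder sb = new StringBuilder(
                String.format("%10s %12s %16s%n", "elements", "time (ms)", "heap used (KB)"));
        for (Row r : rows) {
            sb.append(String.format("%10d %12d %16d%n", r.size, r.millis, r.usedKb));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in workload: builds an XML string with n elements.
        // Replace it with the database operation under test.
        IntConsumer workload = n -> {
            StringBuilder xml = new StringBuilder("<test>");
            for (int i = 0; i < n; i++) {
                xml.append("<item>").append(i).append("</item>");
            }
            xml.append("</test>");
        };
        System.out.print(format(measure(new int[] {1000, 10000, 100000}, workload)));
    }
}
```

The heap figure is only indicative, since the garbage collector may or may not have run; a profiler gives more reliable memory numbers.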

Client stress tests

Java stress tests are just one way of providing reproducible problems. Users are encouraged to provide a test script (in PHP, Perl, Bash, Python or whatever can be run on Linux or Windows) that reproduces a typical usage pattern of their application and demonstrates the problems:

We could then run it through a memory profiler or debugger to see what happens.
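Such a client script can also be written in Java. The sketch below repeatedly sends the same query over HTTP and prints the status and response time of each request. The base URL http://localhost:8080/exist/rest/db and the _query parameter are assumptions about the REST interface of your installation; check them against your eXist version. The class name RestLoadClient and the query are invented for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

// Hypothetical client: replays one query many times against an HTTP interface.
public class RestLoadClient {

    /** Appends the URL-encoded query to the base URL (the "_query" name is an assumption). */
    public static String queryUrl(String base, String xquery) {
        try {
            return base + "?_query=" + URLEncoder.encode(xquery, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    /** Sends a GET request, drains the response body and returns the HTTP status. */
    public static int get(String url) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(url).openConnection();
        try {
            int status = con.getResponseCode();
            InputStream in = status < 400 ? con.getInputStream() : con.getErrorStream();
            if (in != null) {
                try (InputStream body = in) {
                    byte[] buf = new byte[8192];
                    while (body.read(buf) != -1) {
                        // discard the content; only status and timing matter here
                    }
                }
            }
            return status;
        } finally {
            con.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        String base = args.length > 0 ? args[0] : "http://localhost:8080/exist/rest/db";
        int repeats = args.length > 1 ? Integer.parseInt(args[1]) : 100;
        String url = queryUrl(base, "count(//*)");
        for (int i = 0; i < repeats; i++) {
            long start = System.nanoTime();
            int status = get(url);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println(i + "\t" + status + "\t" + millis + " ms");
        }
    }
}
```

Run it against the test database described earlier, not against production data.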

Using the HTTP request logger and replayer

It logs HTTP requests in a log file suitable for replaying to eXist later. The main goal is to reproduce problems when bugs are encountered during a long-running session. The request logs can also be used as stress tests. Official documentation :
http://wiki.exist-db.org/space/HTTP+requests+logger

The official W3C test suite : XQTS

Read the howto in file :
webapp/xqts/xqts.xql

then direct your browser to :
http://localhost:8080/exist/xqts/xqts.xql
TODO : to complete

Using AspectJ and AJDT for assertions, profiling, logging and rule enforcement

TODO
For now, I have written up my experiences here :
http://jmvanel.free.fr/aspects/getting_started_aspectj.html

Recommended reading :
AspectJ in Action: Practical Aspect-Oriented Programming, by Ramnivas Laddad

Check the quality of code

Introduction

Refactoring: the book; the site ! http://Refactoring.com

eXtreme Programming (XP)

to complete ...............

The PMD code checker

I recommend the use of PMD (Project Meets Delay ;-) ). This wonderful Java project checks the code against rules expressed in XPath or in Java.

PMD in the eXist source distribution

This is PMD made easy! I added PMD code checking, with a customized rule set, to the eXist source distribution. There is an Ant target called pmd that fetches the PMD libraries and starts the code checking :

ANT_OPTS=-Xmx500M ant pmd

As you can see, it takes quite a large amount of memory, and also of CPU: expect it to last 15 minutes.
Then the results are here :

eXist-1.0/test/pmd-report.html

A recent PMD report can be found here :
pmd-report.html
currently 6007 messages !

There should be a batch running at noon and midnight to update this on the eXist site ...

PMD in command line

Just unzip pmd-bin-XXX.zip, and type something like :

$PMD/etc/pmd.sh $EXIST_HOME/src/org/exist/collections/ csv \
rulesets/basic.xml,rulesets/favorites.xml,rulesets/unusedcode.xml,\
rulesets/design.xml,rulesets/codesize.xml,\
rulesets/strings.xml \
> ~/jmvanel.free.fr/exist/collection-test.csv

There is also a plug-in for eclipse, it works well with eclipse 3.2 . It installs through the menu "Help>Software Updates" ; see the procedure here :
http://pmd.sourceforge.net/eclipse/


Here is an updated listing of the (potential) problems and flaws reported by PMD for eXist in the package hierarchy org/exist :

http://jmvanel.free.fr/exist/pmd-report.html


Every night tens of sourceforge projects, including eXist, are checked in batch by PMD, but for a restricted set of rules :

http://pmd.sourceforge.net/scoreboard.html

http://pmd.sourceforge.net/reports/exist_eXist-1.0.html

Detecting Copied and Pasted with PMD - CPD

PMD can also find copied and pasted code (CPD = Copy/Paste Detector) :

java -Xmx512m -cp $PMD/lib/pmd-1.8.jar net.sourceforge.pmd.cpd.GUI &

This is with the stand-alone PMD. You can also do the same with the plug-in for eclipse, but without starting eclipse:

java -Xmx512m \
-cp $ECLIPSE_HOME/plugins/net.sourceforge.pmd.core_3.2.0/lib/pmd-3.2.jar \
net.sourceforge.pmd.cpd.GUI &

It's very quick, less than 10 seconds! Here is an example of what is found in the whole src/ for chunks > 50 tokens : http://jmvanel.free.fr/exist/cpd-50.txt - Last UPDATED 2005-07-18

Here is an example of what is found in the same package collection , about 300 lines duplicated out of 1600: http://jmvanel.free.fr/exist/collection-cpd-50.txt

Here is an updated listing of the (currently 324! ) duplicates of length >= 75 Java tokens found in org/exist/ :

http://jmvanel.free.fr/exist/cpd-75.txt

This listing is more than 9000 lines long, so one can estimate that there are at least 9000 lines to suppress in eXist! The package hierarchy org/exist currently has 129 000 lines of code, so at least 7% of the eXist code is "fat". It is probably much more, since:

After reading the CPD report more carefully, many repeated parts seem to come from the parser for the XQuery language, and not from the developers :-) .

Detecting Copied and Pasted with PMD - CPD , applied to XQuery source code

You can also use CPD to detect copy/paste in XQuery source :
java -cp $PMD/lib/pmd-3.5.jar:$PMD/lib/jaxen-1.1-beta-7.jar:$PMD/lib/jakarta-oro-2.0.8.jar \
net.sourceforge.pmd.cpd.GUI &
or on Windows :
cd %PMD%\bin
cpdgui.bat

cd %PMD%\bin
cpd gui



Check package dependencies and more with JDepend

JDepend is a nice Java open source project. JDepend traverses Java class file directories and generates dependency relations and design quality metrics for each Java package. It analyses the .class files directly, not the source.
Here is how I obtained the dependency graphs:
# Analyse eXist packages
java -classpath ~/usr2/jdepend-2.9/lib/jdepend.jar \
jdepend.xmlui.JDepend \
-file jdepend.report.xml \
$exist/build/classes
# produce the graphviz file by an XSLT transform:
saxon jdepend.report.xml jdepend2grahpviz.xslt > jdepend.report.dot
# produce the final graphic files
dot -Tjpg jdepend.report.dot -o jdepend.report.jpg
dot -Tsvg jdepend.report.dot -o jdepend.report.svg
You can consult the graph by following the links just above. The SVG works very well with amaya 8.5; you can even search for a string in the graph.
There is also a Swing UI for JDepend; call it this way:
java -classpath ~/usr2/jdepend-2.7/lib/jdepend.jar \
jdepend.swingui.JDepend $exist/build/classes &

I put in red the packages involved in a closed cycle of dependencies. Pairs of packages that reference each other directly are connected by a single red double-headed arrow.

There's lots of red in eXist currently! Good design practice recommends avoiding cycles; this leads to a clean layered package structure.
The most heavily interdependent packages are:
In a normal design, dependencies should follow the order above, in a layered package structure.
There are only a few packages that are not in red, like :
It's better than nothing. But clearly all those dependencies contribute to the fragility of the software. They are a clear sign of excessive complexity. Probably the first thing to do is to remove the dependency of storage on xquery and dom. It might be just a matter of adding some interfaces?
Also, xquery probably should not depend on both dom and storage.
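To illustrate what "involved in a closed cycle" means, here is a small sketch that finds the packages able to reach themselves by following dependencies. It is not how JDepend works internally. The graph is a toy containing only the dependencies mentioned above (storage on xquery and dom, xquery on dom and storage); leaving dom without outgoing dependencies is an assumption made to keep the example short.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: marks the packages that take part in a dependency cycle.
public class PackageCycles {

    /** Returns, in alphabetical order, the packages that can reach themselves. */
    public static Set<String> inCycle(Map<String, Set<String>> deps) {
        Set<String> result = new TreeSet<>();
        for (String start : deps.keySet()) {
            if (reaches(deps, start, start)) {
                result.add(start);
            }
        }
        return result;
    }

    /** True if target can be reached from "from" by following at least one dependency. */
    private static boolean reaches(Map<String, Set<String>> deps, String from, String target) {
        Deque<String> todo = new ArrayDeque<>(deps.getOrDefault(from, Collections.emptySet()));
        Set<String> seen = new HashSet<>();
        while (!todo.isEmpty()) {
            String p = todo.pop();
            if (p.equals(target)) {
                return true;
            }
            if (seen.add(p)) {
                todo.addAll(deps.getOrDefault(p, Collections.emptySet()));
            }
        }
        return false;
    }

    /** Toy graph: only the dependencies named in the text; dom's are left empty. */
    public static Map<String, Set<String>> sample() {
        Map<String, Set<String>> deps = new HashMap<>();
        deps.put("storage", new HashSet<>(Arrays.asList("xquery", "dom")));
        deps.put("xquery", new HashSet<>(Arrays.asList("dom", "storage")));
        deps.put("dom", new HashSet<>());
        return deps;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> deps = sample();
        System.out.println("in a cycle: " + inCycle(deps));
        // Apply the suggestion above: storage no longer depends on xquery and dom.
        deps.get("storage").clear();
        System.out.println("after the change: " + inCycle(deps));
    }
}
```

On this toy graph, storage and xquery are reported as cyclic before the change, and no package is reported after it, which is the layered structure the text argues for.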

Profiling

This is about the generic profiling at the Java level of classes and methods . See also Profiling XQuery .

TODO : update! Add paragraph about eclipse TPTP , etc .
Last updated 2005-05-21

How to do :
Typical results :