README What is XML Calabash? XML Calabash is an implementation of XProc: An XML Pipeline Language, http://www.w3.org/TR/xproc/ This is a beta release. It passes all but a few of the tests in the XProc Test Suite and is believed to be otherwise complete. Prerequisites XML Calabash is built with Java 1.5 on top of Saxon. To run it, you'll need: * Java 1.5 or later, * Saxon 9.2.1.1 or later (Any 9.2.x.y release might work, but I'm currently using 9.2.1.1) * Apache HTTP Client (and Apache Commons logging and codec), if you want to use p:http-request * Saxon EE, if you want to use p:validate-with-xml-schema * XQuery API for Java (XQJ), if you want to use p:xquery * Jing if you want to use p:validate-with-relax-ng * TagSoup, if you want to be able to parse text/html content with the p:unescape-markup step * RenderX's XEP, if you want to use the p:xsl-formatter step. (You may possibly need the XEP developers kit in addition to XEP.) Several of these libraries are included in the lib/ directory as a convenience. Whether I should do this, or not, and whether I should provide more libraries this way are all open questions in my mind. Starting with version 0.9.22, I'm using IzPack to build an installer. If you want to build the sources with Ant, you'll need the IzPack jar. (Someday I'll see if I can make that dependency conditional in build.xml) (Note: dependency on commercial applications for core XProc steps is a temporary expediency. Eventually, I'll integrate support for open source tools like Xerces and FOP. Patches welcome.) XML Calabash also implements several extension steps. These are not part of the XProc core language standard and cannot be expected to reliably interoperate with other implementations. * cx:collection-manager provides a mechanism for associating sets of documents with collection() URIs for use in the p:xslt (2.0) and p:xquery steps. * cx:delta-xml provides an integration with DeltaXML's commercial XML-diffing tool suite. Requires the DeltaXML tools, naturally. * cx:message is a debugging aid; it's the identity step with a message option that will be printed on stderr. * cx:nvdl supports NVDL validation of mixed-namespace documents. * cx:unzip provides access to the files in a ZIP archive. * ml:adhoc-query, ml:invoke-module, and ml:insert-document implement Mark Logic Server XCC primitives (incompletely). * cx:metadata-extractor is a thin shell around Drew Noakes metadata extractor library, which you'll need of course. How do I use it? Download the latest release. Inside the archive you'll find calabash.jar. Make sure that jar file and the prerequisites are on your class path.[*] Then you can run it from the command line: java com.xmlcalabash.drivers.Main options pipeline.xpl For example: java com.xmlcalabash.drivers.Main xpl/pipe.xpl Congratulations! You've run your first pipeline! You can use -iport=file to change the inputs and -oport=file to change the output location. For example: $ java com.xmlcalabash.drivers.Main -isource=xpl/pipe.xpl -oresult=/tmp/out.xml xpl/pipe.xpl That will run xpl/pipe.xpl using xpl/pipe.xpl as the input and writing the result to /tmp/out.xml. If you run java com.xmlcalabash.drivers.Main with no options, it will print a short usage summary. What do I do if it all goes wrong? Tell Norm, ndw@nwalsh.com. There is also an XProc developers mailing list, xproc-dev@lists.w3.org which may provide useful advice. ------------------------------------------------------------ [*] Here is the CLASSPATH that Norm uses: CLASSPATH=/projects/src/calabash/out/production/Saxon9.2.0.3\ :/projects/src/calabash/out/production/Isorelax\ :/projects/src/calabash/out/production/Msv\ :/share/java/msv/relaxngDatatype.jar\ :/share/java/msv/xsdlib.jar\ :/share/java/tagsoup/tagsoup-1.2.jar\ :/projects/src/calabash/out/production/Calabash\ :/projects/src/xmlresolver/out/production/Xmlresolver\ :/projects/src/jing-trang/build/jing.jar\ :/usr/local/DeltaXMLCore-5_1/command.jar\ :/usr/local/DeltaXMLCore-5_1/deltawing.jar\ :/usr/local/DeltaXMLCore-5_1/deltaxml.jar\ :/projects/marklogic/Java/xcc.jar\ :/share/java/xepdev-4.13/lib/xep-debug.jar\ :/share/java/xep/lib/xt.jar\ :/share/java/apache-httpclient/commons-httpclient-3.1/commons-httpclient-3.1.jar\ :/share/java/apache-logging/commons-logging-1.1.1/commons-logging-1.1.1.jar\ :/share/java/apache-codec/commons-codec-1.3.jar\ :/share/java/xerces2/xml-apis.jar\ :/share/java/xerces2/xercesImpl.jar\ :/share/java/saxonee-9.2.0.3j/saxon9ee.jar Paths that begin /projects/src/calabash/out/... are versions of libraries that Norm has built from source as an aid to debugging.