The Composer Implementation

31 December 2006

Motive and background:

I had many different kinds of text files marked up as xml, and I wanted to publish some of them, nicely formatted, on my website. In addition to transforming them to some kind of xhtml structure, I wanted to collage them with text and images. My idea was an xhtml tree with only a coarse structure (something like a Christmas tree with no decorations) where new text was attached to one twig, an xml data file transformed into some xhtml element was attached to another twig, and so on, as fig 1 illustrates.

fig 1 A Usage Example

I made up a declarative syntax, which I am quite satisfied with, for describing this collage, and I built a simple program, which I called Composer, that performs these tasks using the XOM library for xml processing.

This was a batch processing system, just as I wanted, producing static files that I could upload to a server. After a while I discovered that it would be useful to know whether the sources used by a template had been modified, in order to avoid reprocessing the template unnecessarily. Thus I built what is now the Inventory class.

Later I discovered it would be useful to employ a web browser, even on my local computer, to select the templates by means of regular expressions. I added a request(URI) method to the Composer, and used a class from the Restlet API to build a simple class I call Mediator.

The Composer class

The drawing below illustrates a Composer object.

fig 2 A Composer object

The object has two sets of IDs: processable() and requestable(). The first set contains the IDs of templates that may be processed independently of any http request. The second is a linked set (order of insertion matters) containing the IDs of templates that may be found by URI requests: the URI is matched against the regular expression of each template, and the first template that matches is used for processing.
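The first-match rule of the linked set can be illustrated with a small sketch. It is not the Composer's own code: the class name RequestableSketch and its methods are hypothetical, and it assumes that a template matches only when its expression covers the whole URI.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical stand-in for the requestable() set; not the Composer's actual code.
class RequestableSketch {
    // A LinkedHashMap iterates in insertion order, so templates added earlier win.
    private final Map<String, Pattern> templates = new LinkedHashMap<>();

    void add(String id, String regex) {
        templates.put(id, Pattern.compile(regex));
    }

    // Returns the ID of the first template whose expression matches the whole URI.
    Optional<String> firstMatch(String uri) {
        for (Map.Entry<String, Pattern> entry : templates.entrySet()) {
            if (entry.getValue().matcher(uri).matches()) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }
}
```

Because iteration follows insertion order, a specific expression added before a general one takes precedence, which is why the order of insertion matters.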

The inventory() holds information about the templates already processed, which makes it possible to serve an existing static file instead of reprocessing the template that corresponds to a requested URI.

The Inventory class

The Composer has a default Inventory without any Recorder object. The Composer may also be constructed with a custom Inventory, such as an Inventory having a Recorder object that maintains an inventory folder.

fig 3 The Inventory and Recorder classes

The long value is a time in milliseconds, used by the Inventory to decide how often the Recorder should check for modified dependencies.

The Mediator class

A Mediator object listens to browser requests. Furthermore, it has access to the Composer's inventory(), to the Composer's request(URI) method, and to the file system. By checking the inventory it decides whether to ask the Composer, by calling request(URI), to process a template, or whether it may simply serve an existing static file to the browser.
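The decision described above might be sketched as follows. This is only an illustration under assumptions: the real Mediator is built on a Restlet class, whereas here the inventory is reduced to a map from URI to file name, and the Composer's request(URI) method is represented by a function passed to the constructor. All names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of the Mediator's decision; not the actual Restlet-based class.
class MediatorDecisionSketch {
    // Simplified inventory: requested URI -> static file already produced.
    private final Map<String, String> inventory = new HashMap<>();
    // Stands in for the Composer's request(URI) method; returns the produced file.
    private final Function<String, String> composerRequest;

    MediatorDecisionSketch(Function<String, String> composerRequest) {
        this.composerRequest = composerRequest;
    }

    // Returns the file to serve, asking the composer only when nothing is recorded.
    String fileFor(String uri) {
        return inventory.computeIfAbsent(uri, composerRequest);
    }
}
```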

fig 4 The Role of the Mediator

Processing a Template

Use a directory, for example c:/tf, and drop the files i.xml and p.xml into this directory. The “Site” and the “Page” may be in separate files because the Composer aggregates all files with the extension xml in the directory, and collects all the elements it understands. Several templates may therefore also be held in a single file if convenient. With the files in place, the following code should produce some output.


 1 Composer co = new Composer("c:/tf");
 2 System.out.println(co.processable()); // [first_page]
 3 for (String id : co.processable())
 4     co.process(id);
				

In this case co.processable() has one member, namely the String "first_page", and line 4 therefore results in an output file located at c:/tf/h/1.htm, as specified in the “first_page” template (the text enclosed within the “specifiedAs” tags).

c:/tf/i.xml


 1 <?xml version="1.0" encoding="utf-8"?>
 2 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 3       xmlns:sv="http://www.solidera.com/vocabulary#">
 4    <sv:Site rdf:about="http://example.org">
 5       <sv:furnishedWith rdf:resource="#first_page"/>
 6    </sv:Site>
 7 </rdf:RDF>
				

c:/tf/p.xml


 1 <?xml version="1.0" encoding="utf-8"?>
 2 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 3       xmlns:sv="http://www.solidera.com/vocabulary#"
 4       xmlns="http://www.w3.org/1999/xhtml">
 5    <sv:Page rdf:ID="first_page">
 6       <sv:location>h/1.htm</sv:location>
 7       <sv:specifiedAs rdf:parseType="Literal">
 8          <html>
 9             <head/>
10             <body>
11                Useful content could be added as figure 1 illustrates.
12             </body>
13          </html>
14       </sv:specifiedAs>
15    </sv:Page>
16 </rdf:RDF>
				

Requesting the Processing of a Template

The following code is a working example for http requests.


 1 Composer co = new Composer("c:/tf");
 2 System.out.println(co.processable()); // [first_page]
 3 for (String id : co.processable())
 4     co.process(id);
 5 Mediator me = new Mediator(co);
 6 org.restlet.Container rc = new org.restlet.Container();
 7 rc.getServers().addProtocol(org.restlet.data.Protocol.HTTP, 8182);
 8 rc.getClients().addProtocol(org.restlet.data.Protocol.FILE);
 9 rc.getDefaultHost().attach("", me);
10 rc.start();
				

Lines 1–4 are the same as in the previous example. Line 5 is the main new thing in this example: a Mediator is created from the Composer object (introduced in line 1). Lines 6–8 have nothing to do with the Composer, but are needed in order to use the Restlet API. A Mediator is a kind of Restlet, and may therefore be attached as line 9 shows. Lines 6–10 are explained in the Restlet tutorial.

When the URI http://localhost:8182/first_page is entered in a browser, the file c:/tf/h/1.htm created at line 4 will open.

We will extend this example by adding one line to the i.xml file and by adding the file a.xml.

c:/tf/i.xml (modified)


 1 <?xml version="1.0" encoding="utf-8"?>
 2 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 3       xmlns:sv="http://www.solidera.com/vocabulary#">
 4    <sv:Site rdf:about="http://example.org">
 5       <sv:furnishedWith rdf:resource="#first_page"/>
 6       <sv:furnishedWith rdf:resource="#first_archetype"/>
 7    </sv:Site>
 8 </rdf:RDF>
				

c:/tf/a.xml


 1 <?xml version="1.0" encoding="utf-8"?>
 2 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 3       xmlns:sv="http://www.solidera.com/vocabulary#"
 4       xmlns="http://www.w3.org/1999/xhtml">
 5    <sv:Archetype rdf:ID="first_archetype">
 6       <sv:request>http://example.org/(\w+)</sv:request>
 7       <sv:location>h/[-@1-].htm</sv:location>
 8       <sv:specifiedAs rdf:parseType="Literal">
 9          <html><head><title>Page [-@1-]</title></head>
10          <body>
11          <sv:Element>
12             <sv:data>doc/[-@1-].xml</sv:data>
13             <sv:xslt>xsl/s.xsl</sv:xslt>
14          </sv:Element>
15          </body></html>
16       </sv:specifiedAs>
17    </sv:Archetype>
18 </rdf:RDF>
				

A browser asking for http://localhost:8182/ab, for example, will start the process that builds the file c:/tf/h/ab.htm with the xhtml:title of “Page ab” by using the sources c:/tf/doc/ab.xml and c:/tf/xsl/s.xsl. The sources must of course be present for the program to work properly.
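How the [-@1-] placeholders could be filled from the request expression is sketched below. The sketch makes two assumptions: that [-@N-] stands for capture group N of the request expression, and that the requested URI has already been rewritten to the site's own base (http://example.org) before matching, a step not described here. The class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of placeholder expansion; not the Composer's actual code.
class ArchetypeSketch {
    // Replaces each [-@N-] placeholder with capture group N of the matched request.
    // Returns an empty Optional when the URI does not match the request expression.
    static Optional<String> expand(String requestRegex, String uri, String template) {
        Matcher request = Pattern.compile(requestRegex).matcher(uri);
        if (!request.matches()) {
            return Optional.empty();
        }
        Matcher placeholder = Pattern.compile("\\[-@(\\d+)-\\]").matcher(template);
        StringBuffer out = new StringBuffer();
        while (placeholder.find()) {
            int group = Integer.parseInt(placeholder.group(1));
            // quoteReplacement keeps '$' and '\' in the captured text literal.
            placeholder.appendReplacement(out, Matcher.quoteReplacement(request.group(group)));
        }
        placeholder.appendTail(out);
        return Optional.of(out.toString());
    }
}
```

With the request expression and location from a.xml, the URI http://example.org/ab expands the location to h/ab.htm and the data element to doc/ab.xml.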

Keeping an Inventory

The inventory used by the Composer is a record of the available static files and their media types, and of any template or source files used to build these static files. An inventory is useful for knowing whether a source has been modified, so that static files can be served when possible and templates reprocessed only when necessary. A Recorder helps the inventory keep this record persistent. This is an example of its use:


 1 Recorder re = new Recorder("c:/tf", "c:/tf/inventory");
 2 Inventory in = new Inventory(re, 60000);
 3 Composer co = new Composer("c:/tf", in);
 4 System.out.println(co.requestable()); // [first_archetype]
 5 System.out.println(co.processable()); // [first_page]
 6 for (String id : co.processable())
 7     co.process(id);
 8 Mediator me = new Mediator(co);
 9 org.restlet.Container rc = new org.restlet.Container();
10 rc.getServers().addProtocol(org.restlet.data.Protocol.HTTP, 8182);
11 rc.getClients().addProtocol(org.restlet.data.Protocol.FILE);
12 rc.getDefaultHost().attach("", me);
13 rc.start();
				

When the Recorder object is created on line 1, it collects and analyzes any saved record from the c:/tf/inventory directory, and it copies any changed or new templates from c:/tf to this inventory directory. On line 2 an Inventory object is constructed to hold this Recorder and a time of 60000 milliseconds. This means that, even when a source has been modified, the available static file, if one exists, will be served on all requests that happen before 60000 milliseconds have elapsed since the last check of the modification date. If a request happens after that, the static file will be rebuilt from the modified sources. On line 3 a Composer object, holding this inventory, is created. Lines 4 and 5 are inessential verifications that the templates have been found. The remaining lines 6–13 are the same as in the previous example.
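The effect of the 60000 millisecond interval can be illustrated with a small sketch. It is not the Inventory class itself: the class name, the fields, and the choice to pass the clock and the source's modification time as arguments (which keeps the sketch easy to test) are all assumptions.

```java
// Hypothetical sketch of the check interval; not the actual Inventory class.
class FreshnessSketch {
    private final long checkIntervalMillis;
    private long lastCheckMillis;          // when the sources were last examined
    private long recordedModifiedMillis;   // modification time seen at the last build

    FreshnessSketch(long checkIntervalMillis, long nowMillis, long sourceModifiedMillis) {
        this.checkIntervalMillis = checkIntervalMillis;
        this.lastCheckMillis = nowMillis;
        this.recordedModifiedMillis = sourceModifiedMillis;
    }

    // Returns true when the static file should be rebuilt before serving.
    boolean needsRebuild(long nowMillis, long sourceModifiedMillis) {
        if (nowMillis - lastCheckMillis < checkIntervalMillis) {
            return false; // too soon: serve the static file without looking at the sources
        }
        lastCheckMillis = nowMillis;
        if (sourceModifiedMillis > recordedModifiedMillis) {
            recordedModifiedMillis = sourceModifiedMillis;
            return true; // the source changed since the last build
        }
        return false;
    }
}
```

A source modified 30 seconds after the last check is therefore not noticed until a request arrives at least 60 seconds after that check.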

When the browser now asks for http://localhost:8182/ab, as in the previous example, in addition to serving the file c:/tf/h/ab.htm, a record of the URI, the location, the media type, the template, and the sources will be written to the inventory folder.


 1 <?xml version="1.0" encoding="utf-8"?>
 2 <Resource xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 3       xmlns="http://www.solidera.com/vocabulary#"
 4       rdf:about="http://example.org/ab">
 5    <template>first_archetype</template>
 6    <media>XHTML</media>
 7    <location>c:/tf/h/ab</location>
 8    <file>c:/tf/doc/ab.xml</file>
 9    <file>c:/tf/xsl/s.xsl</file>
10 </Resource>
        

Injecting Content

As exemplified by lines 11–14 in the file c:/tf/a.xml, content is injected into the xhtml structure by means of an Element enclosing data and xslt content.


   <sv:Element>
     <sv:data>doc/ab.xml</sv:data>
     <sv:xslt>xsl/s.xsl</sv:xslt>
   </sv:Element>
        

The data content may, in addition to a single xml file (lines 1–2 below), be an aggregation of files (lines 3–4), a list of files (lines 5–6), or fragments of a file (lines 7–8). Not exemplified is the method of aggregation by means of enclosing several data elements within any other element.


 1 <sv:data>doc/ab.xml</sv:data> <!-- a file -->
 2 <sv:data>http://example.org/ab.xml</sv:data> <!-- a file -->
 3 <sv:data>doc</sv:data> <!-- directory file aggregation -->
 4 <sv:data>doc/{\w+}.xml</sv:data> <!-- regexp file aggregation -->
 5 <sv:data>list doc</sv:data> <!-- directory file list -->
 6 <sv:data>list doc/{\w+}.xml</sv:data> <!-- regexp file list -->
 7 <sv:data>doc/ab.xml apple orange</sv:data> <!-- picking by ID -->
 8 <sv:data>doc/ab.xml /*/Fruit</sv:data> <!-- picking by element name -->
        

The data content is transformed by means of an xslt file (line 1 below) or an xslt file with parameters (line 2).


 1 <sv:xslt>xsl/s.xsl</sv:xslt>
 2 <sv:xslt>xsl/s.xsl x=a y=b</sv:xslt> <!-- parameters -->
        

Upon completion of the transformation, or of the last transformation if several xslt elements are chained, the resulting document's root element replaces the Element element.
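As an illustration of the parameter form of the xslt element, the sketch below splits a value such as "xsl/s.xsl x=a y=b" into a stylesheet path and name=value parameters, and passes the parameters to a transformation. It uses the standard javax.xml.transform API rather than XOM, takes the stylesheet and the data as text to stay self-contained, and all class and method names are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch; the Composer itself uses XOM and may work differently.
class XsltSpecSketch {
    // The first whitespace-separated token is taken to be the stylesheet path.
    static String path(String spec) {
        return spec.trim().split("\\s+")[0];
    }

    // The remaining tokens of the form name=value become stylesheet parameters.
    static Map<String, String> params(String spec) {
        Map<String, String> result = new LinkedHashMap<>();
        String[] parts = spec.trim().split("\\s+");
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            if (eq > 0) {
                result.put(parts[i].substring(0, eq), parts[i].substring(eq + 1));
            }
        }
        return result;
    }

    // Applies the stylesheet text to the xml text with the given parameters.
    static String transform(String xslt, String xml, Map<String, String> params) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            for (Map.Entry<String, String> entry : params.entrySet()) {
                transformer.setParameter(entry.getKey(), entry.getValue());
            }
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }
}
```

Chaining several xslt elements would then amount to feeding the result of one transform call into the next as its xml argument.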

Data, both xml and plain text files, may also be injected without an xsl transformation. Any tag name, except “data”, may be used. For example, css content may be included by enclosing a css text file within an xhtml:style tag having the desired attributes.


   <sv:Element>
      <style type="text/css">css/c.css</style>
   </sv:Element>
        

Processing Books

A Book template may be used for batch processing many files in a directory. I will demonstrate an example. Modify the i.xml file to include one more item and add the file b.xml into the Composer's directory.

c:/tf/i.xml (modified)


 1 <?xml version="1.0" encoding="utf-8"?>
 2 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 3       xmlns:sv="http://www.solidera.com/vocabulary#">
 4    <sv:Site rdf:about="http://example.org">
 5       <sv:furnishedWith rdf:resource="#first_page"/>
 6       <sv:furnishedWith rdf:resource="#first_archetype"/>
 7       <sv:furnishedWith rdf:resource="#first_book"/>
 8    </sv:Site>
 9 </rdf:RDF>
				

c:/tf/b.xml


 1 <?xml version="1.0" encoding="utf-8"?>
 2 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 3      xmlns:sv="http://www.solidera.com/vocabulary#"
 4      xmlns="http://www.w3.org/1999/xhtml">
 5   <sv:Book rdf:ID="first_book">
 6      <sv:specifiedAs rdf:parseType="Collection">
 7         <sv:Element>
 8            <sv:data>list doc/b</sv:data>
 9            <sv:xslt>xsl/p.xsl</sv:xslt>
10         </sv:Element>
11      </sv:specifiedAs>
12   </sv:Book>
13 </rdf:RDF>
				

As explained before, the list keyword produces an xml document containing a list of all files in the directory doc/b. For each file the document has an element named File, holding data about that file. The p.xsl then transforms the list's File elements into Page elements, where the rdf:ID, the location, and the data are constructed from the File's slug (a child element of File with that name, containing the file name excluding the extension).

c:/tf/xsl/p.xsl


<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:sv="http://www.solidera.com/vocabulary#"
      xmlns="http://www.w3.org/1999/xhtml">
   <xsl:template match="/">
      <sv:specifiedAs rdf:parseType="Collection">
         <xsl:apply-templates select="//sv:File"/>
      </sv:specifiedAs>
   </xsl:template>
   <xsl:template match="sv:File">
      <xsl:variable name="id" select="concat('first_book_page_', sv:slug)"/>
      <xsl:variable name="location" select="concat('h/', sv:slug, '.htm')"/>
      <xsl:variable name="data" select="concat('doc/', sv:slug, '.xml')"/>
      <sv:Page rdf:ID="{$id}">
         <sv:location><xsl:value-of select="$location"/></sv:location>
         <sv:specifiedAs rdf:parseType="Literal">
            <html><head/>
            <body>
               <sv:Element>
                  <sv:data><xsl:value-of select="$data"/></sv:data>
                  <sv:xslt>xsl/s.xsl</sv:xslt>
               </sv:Element>
            </body></html>
         </sv:specifiedAs>
      </sv:Page>
   </xsl:template>
</xsl:stylesheet>
				

The Composer constructs an in-memory xml document as a list (an rdf Collection) of Pages. For example, if there are two files, doc/ab.xml and doc/cd.xml, in the doc directory, two Pages will result: one Page producing h/ab.htm having doc/ab.xml as a source and another Page producing h/cd.htm having doc/cd.xml as a source. Both Pages will in this example use xsl/s.xsl for transforming the doc directory files. The URIs for the produced pages are http://localhost:8182/first_book_page_ab and http://localhost:8182/first_book_page_cd. These are also the URIs to enter in a browser to retrieve the produced pages. The URI may be changed as desired by including a request element as a child of the Page element, similar to the Archetype example.
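The derivation performed by p.xsl can be mirrored in a few lines of Java, which may make the naming scheme easier to follow. The sketch only restates the three concat expressions of the stylesheet; the class names are hypothetical, and the slug is computed as described above, as the file name without its extension.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical restatement of what p.xsl derives from each File element.
class BookPagesSketch {
    // The three values built from a slug: rdf:ID, location, and data.
    static final class PageSpec {
        final String id;
        final String location;
        final String data;

        PageSpec(String id, String location, String data) {
            this.id = id;
            this.location = location;
            this.data = data;
        }
    }

    // The slug is the file name excluding the extension.
    static String slug(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot > 0 ? fileName.substring(0, dot) : fileName;
    }

    // One PageSpec per listed file, using the same prefixes and suffixes as p.xsl.
    static List<PageSpec> pages(List<String> fileNames) {
        List<PageSpec> result = new ArrayList<>();
        for (String name : fileNames) {
            String s = slug(name);
            result.add(new PageSpec("first_book_page_" + s, "h/" + s + ".htm", "doc/" + s + ".xml"));
        }
        return result;
    }
}
```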

Template Vocabulary

The three words Page, Archetype and Book are the names of the templates. In addition to the templates there is Site, an index of templates; there is Section, an element to be reused by any template; and there is Element, a kind of porthole to templates and sections, for data sources. The vocabulary and parent-child relationships are shown in fig 5. The words furnishedWith and specifiedAs are object properties in OWL vocabulary. The words location, resource, data, and xslt are datatype properties.

fig 5 Template Vocabulary

Acknowledgement

The XOM library made it easy for me to write the xml processing code for the Composer. When I extended the Composer for http requests I used a Handler from the Restlet library, and simply added the logic for the information received from the Composer and its Inventory.