SVG Essentials/Generating SVG

Revision as of 22:22, 6 March 2008 by Evanlenz (Talk | contribs)

The previous chapters have described the major features of SVG. All the examples have been relatively modest and have been written in an ordinary text editor. For graphics of any great complexity, though, few people will write the SVG from scratch. Instead, graphic designers and programmers will use some sort of graphic tool that outputs SVG, or they will take existing raw data and convert it to SVG. If you're dealing with a graphic program's output that is already in SVG format, you can sit back and relax; all the heavy lifting has been done for you. If you ever look at the SVG that such a program generates, though, you may find it hard to read. Some programs, for example, may not use groups (the <g> element) efficiently, or they may not optimize paths. When you use these programs, you are trading the absolute control you have when you write the entire file by hand for the ease of generating SVG.

If you're dealing with data that's already in XML format, you may just need to extract the pertinent data and plug it into an SVG framework. In such a case, you can use tools that implement Extensible Stylesheet Language Transformations (XSLT). If the data is in XML but needs a fair amount of processing, you may need to write a program in Java or some other language to do the conversion. Luckily, you can take advantage of freely available XML parsers to do the busy work for you.

Finally, if you are dealing with data that isn't in XML format, you have some work ahead of you. If you have raw data in either ASCII or binary form you may need to write custom code to do the conversion.

In this chapter, we'll start with a custom Perl program to convert geographical mapping data that's not in an XML format to an SVG file. Then we will use Java to convert a representation of a matrix in Mathematical Markup Language (MathML) format to SVG. The last example will use XSLT to convert an XML-formatted aeronautical weather report to SVG.


Using Perl to Convert Custom Data to SVG

If anyone lives a life that revolves around graphics display, it's a mapmaker. Cartographers are finding XML markup in general and SVG in particular to be excellent vehicles for putting data into a portable format. At present, though, much of the available data is in custom or proprietary formats. One of these is the proprietary format developed by Environmental Systems Research Institute, Inc. for use by their ARC/INFO Geographic Information System. Data created in this system can be exported in an ASCII "ungenerate" form. Such a file contains a series of polygon descriptions, followed by a line with the word END on it. Each polygon starts with a line that consists of an integer polygon identification number and the x- and y-coordinates of the polygon's centroid. This line is followed by the x- and y-coordinates of the polygon's vertices, one vertex per line. A line with the word END on it marks the end of the polygon. Here is a sample file:

         1      -0.122432044171565E+03       0.378635608621089E+02
      -0.122418712172884E+03       0.378527169591107E+02
      -0.122434402770255E+03       0.378524342437443E+02
      -0.122443301934511E+03       0.378554484803880E+02
      -0.122446316168374E+03       0.378610463416856E+02
      -0.122438565286068E+03       0.378683666259093E+02
      -0.122418712172884E+03       0.378527169591107E+02
END
         2      -122.36                      37.82
      -122.378                     37.826
      -122.377                     37.831
      -122.370                     37.832
      -122.378                     37.826
END
END

Converting such a file to SVG is a simple task. The only twist is that ARC/INFO stores data in Cartesian coordinates, so we will have to flip the y-coordinates upside-down. The program we'll write in Perl will run from the command line as follows:

perl mapSVG.pl input-file width decimals

Here, width is the width of the resulting SVG graphic in pixels, and decimals is an optional parameter giving the number of digits you wish to keep after the decimal point in coordinate values.

The program will start with a utility subroutine that grabs one token at a time from the input file:

#!/usr/bin/perl

#
#   @line_buffer is a global
#
@line_buffer = ( );

#
#   Input file PFILE is opened in main
#   part of program.
#
sub get_token
{
    my ($data);
    if ((scalar @line_buffer) == 0) # out of data?
    {
        $data = <PFILE>;          # grab a line
        $data =~ s/^\s+//;          # get rid of leading... 
        $data =~ s/\s+$//;          # ...and trailing whitespace
        @line_buffer = split /\s+/, $data;  # place tokens into a buffer
    }
    $data = shift @line_buffer;     # take one token out and return it
    return $data;
}

Here is the remainder of the program:

if (scalar @ARGV < 2)     [1]
{
    print "Usage: $0 polygon_file width (decimals)\n";
    print "polygon_file - file in ARC/INFO ungenerate format\n";
    print "width - desired width of output SVG\n";
    print "decimals - optional # of decimal places to keep\n";
    print "Output SVG goes to standard output.\n";
    exit 0;
}

open PFILE, $ARGV[0] or die("Cannot open polygon file $ARGV[0]"); 

$width = $ARGV[1];
if ($width <= 0)
{
    die("Width must be greater than zero.");
}

$n_decimals =  ((scalar @ARGV) == 3) ? $ARGV[2] : 0;

#
#   Set maxima and minima     [2]
#
$min_x = 1.0e100;
$min_y = 1.0e100;
$max_x = -1.0e100;
$max_y = -1.0e100;

undef @polygon_list; 

#
#   a file consists of a series of polygon numbers followed
#   by pairs of x-y coordinates. Each polygon is finished
#   by an END token, and the file is marked by an END token
#   instead of a polygon number
#
while (1)       [3]
{
    $polygon_number = get_token();
    last if ($polygon_number =~ /END/);
    
    undef   @polygon;   # the storage area for this particular polygon
    
    while (1)
    {
        $x = get_token();
        last if ($x =~ /END/);
        $y = get_token();
        push @polygon, $x, $y;

        #
        # keep track of maximum and minimum coordinates
        #
        if ($x < $min_x) {$min_x = $x;}
        if ($x > $max_x) {$max_x = $x;}
        if ($y < $min_y) {$min_y = $y;}
        if ($y > $max_y) {$max_y = $y;}
    }
    
    push @polygon_list, [ @polygon ];     [4]

}

close PFILE;

print STDERR "max x=$max_x  min x=$min_x width=", $max_x-$min_x, "\n";
print STDERR "max y=$max_y  min y=$min_y height=", $max_y-$min_y, "\n";

#
#   Figure out the scaling factor to make the width equal to
#   the one specified on the command line, then find the
#   corresponding height.
#
$scale = $width / ($max_x - $min_x );
$height = ($max_y - $min_y) * $scale;

#
#   Round it up so viewport and viewBox are integral
#
$height = int ($height + 0.5);

#
#   Insert extra pixels for padding
#
$pad_width = $width + 30;
$pad_height = $height + 30;

#
#   Begin constructing the SVG file
#
print <<"SVG_HEADER";        [5]
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN"
    "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd">

<svg width="$pad_width" height="$pad_height"
    viewBox="0 0 $pad_width $pad_height">
<title>Map constructed from $ARGV[0]</title>
<g transform="translate(15,15)" style="fill: none; stroke: black;">
SVG_HEADER

$poly_num = 1;
foreach $poly (@polygon_list)
{
    $n = 0;
    print qq%<polyline id="poly$poly_num" points="\n\t%;
    
    #
    #   get rid of first coordinate
    #
    shift @$poly;        [6]
    shift @$poly;
    
    foreach $coord (@$poly)
    {
        if ($n % 2 == 0)    # x-coordinate
        {
            $coord = ($coord - $min_x) * $scale;
        }
        else                # flip y-coordinate
        {
            $coord = ($max_y - $coord) * $scale;     [7]
        }
        if ($n_decimals != 0)
        {
            $coord = int($coord * (10**$n_decimals))/(10**$n_decimals);
        }
            
        print $coord, " ";
        
        #
        #   to avoid excessively long text lines, place only
        #   eight coordinates on a line
        #
        $n = ($n+1) % 8;
        print "\n\t" if ($n == 0);
    }
    print qq%" />\n%;    # close off the <polyline> element
    $poly_num++;
}

#
#   Close off open tags to end the file.
#
print "</g>\n</svg>\n";

[1] Obligatory argument retrieval and error checking.

[2] Set up variables. The initial maxima and minima are huge enough to handle the coordinates of a map of anything smaller than a minor galaxy. The variable @polygon_list will be a list of the polygons, each of which will itself be represented as a list.

[3] The outer loop reads polygon ID numbers, and the inner loop stores the coordinates in the @polygon list.

[4] The square brackets are very important. They create an anonymous list to contain the coordinates in @polygon, and that list is pushed onto @polygon_list. Without the brackets, it would simply append all the coordinates, ungrouped, to the end of @polygon_list.

[5] This line prints, verbatim, all the text up to (but not including) a line that begins with the literal SVG_HEADER. Because SVG_HEADER is enclosed in double quotes, we can use variable interpolation in the verbatim text.

[6] The $poly loop variable, set in the outer foreach, is a reference to an entire list, which is why it must be preceded by an @ to be accessed as a list.

[7] We had to keep track of the maximum and minimum x-coordinate to calculate the width scaling factor properly; we kept track of the maximum y-coordinate so we could change from Cartesian coordinates to SVG style coordinates.
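The scale-and-flip arithmetic described in the callouts above can be isolated in a few lines. The following is only a sketch in Java (the language of this chapter's next example); the class name MapTransformSketch and its method names are hypothetical and do not appear in mapSVG.pl.

```java
// Hypothetical helper, not part of mapSVG.pl: the same scale-and-flip
// arithmetic the Perl program applies to each vertex.
class MapTransformSketch {

    /** Scale factor that stretches the data's x-extent to the requested width. */
    static double scale(double minX, double maxX, double width) {
        return width / (maxX - minX);
    }

    /** Map one Cartesian vertex to SVG coordinates, where y grows downward. */
    static double[] toSvg(double x, double y,
                          double minX, double maxY, double scale) {
        double svgX = (x - minX) * scale;   // shift the left edge to 0, then scale
        double svgY = (maxY - y) * scale;   // measure downward from the top edge
        return new double[] { svgX, svgY };
    }

    public static void main(String[] args) {
        // Made-up extents: x runs from -10 to 10, y from -5 to 5, width 200.
        double s = scale(-10.0, 10.0, 200.0);
        double[] topLeft = toSvg(-10.0, 5.0, -10.0, 5.0, s);
        double[] bottomRight = toSvg(10.0, -5.0, -10.0, 5.0, s);
        System.out.println(topLeft[0] + "," + topLeft[1]);
        System.out.println(bottomRight[0] + "," + bottomRight[1]);
    }
}
```

The point with the largest y value lands at the top of the drawing (y = 0), which is exactly the flip that the Perl code performs with $max_y.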

Running this program with the data for the state of Michigan with an output width of 250 pixels and a decimal accuracy of three digits produces Figure 12-1. Michigan was chosen because it requires several polygons to draw, and its outline is more visually interesting than that of, say, Colorado. The data came from the U.S. Census Bureau Cartographic Boundary Files Web Site at http://www.census.gov/geo/www/cob/.

Figure 12-1. Conversion from ARC/INFO ungenerate to SVG


Using Java to Convert XML to SVG

While preparing to write Appendix D I had to decide how to produce the matrix equations. At that time, I didn't have a definitive answer from the production staff at O'Reilly on the format in which to submit the data. I decided on an ad-hoc subset of MathML,[1] an XML application for describing the presentation and content of mathematical information. This choice gave me the maximum flexibility; I would not be tied down to a proprietary equation editor, conversion to the TeX typesetting language (a common format among publishers) could be done with a trivial XSLT file, and using MathML would give me an example for this chapter.

This example is atypical in that the majority of its output will be <text> elements. It is further atypical in that it is not a general tool, nor is it intended to be. It is, however, typical in that it shows how to parse an input XML document, construct a new XML document in memory and output through a serializer.

The subset of MathML in this example is as follows; all the elements are container elements:

MathML element    Contains
<math>            root element of a MathML document
<mrow>            a formula
<mtable>          a matrix (table)
<mtr>             a matrix row
<mtd>             a data cell within a row
<mn>              a numeric value
<mi>              an identifier (variable)
<mo>              a mathematical operator
<msub>            subscripted content


The general outline of the program is as follows:

  • Parse the input document to create a Document Object Model in memory.
  • Find the matrix with the maximum number of rows; this will determine the height of the output SVG document.
  • Go through the formula and output matrices and operators as specified. These become elements in an SVG Document that is also built in memory.
  • If the operator is a parenthesis, make it as large as the largest matrix in the entire formula. This is an assumption that saves a lot of programming, and works well for most of the formulas that the appendix needs.
  • Each matrix is output within an SVG group (<g>) element. Again, for ease of programming, the matrix is drawn with its upper left corner at (0,0) and will be moved to its proper place with a translate transformation. The square brackets that enclose the matrix's values will be drawn with a <path> element.
  • Once the SVG document has been built, use the parser's serialize method to create an output file.

This program uses the Xerces XML parser from the Apache Software Foundation; you may download it and read its documentation at http://xml.apache.org. The program starts with a large number of imports from the standard Java libraries and the Xerces parser.

import java.awt.Dimension;
import java.awt.Font;
import java.awt.FontMetrics;
import java.awt.Graphics;
import java.awt.Image;
import java.awt.image.BufferedImage;

import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UnsupportedEncodingException;
import java.io.IOException;

import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.DocumentType;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;

import org.apache.xerces.parsers.DOMParser;
import org.apache.xml.serialize.XMLSerializer;
import org.apache.xml.serialize.OutputFormat;

The program itself starts with constants that establish the default line height, font height for normal characters, font height for subscripted characters, extra height at the top and bottom of the image, and extra horizontal space between items:

public class MLtoSVG {

    //
    // Constants
    //

    private static final int LINE_HEIGHT = 24;
    private static final int FONT_HEIGHT = 14;
    private static final int SUBSCRIPT_HEIGHT = 12;
    private static final int EXTRA_HEIGHT = 10;
    private static final int X_SPACING = 2;

Although global variables are considered a mortal sin, I was too lazy to pass parameters ad infinitum, so the input document, output document, and root element of the output document became properties of the class:

    /** The "before" document */
    protected Document mlDocument;
    
    /** The "after" document */
    protected Document svgDocument;

    /** Permanent pointer to SVG document's root element */
    protected Element svgRoot;

The main function is entirely straightforward:

    public static void main(String argv[]) {

        // is there anything to do?
        if ( argv.length == 0 ) {
            System.err.println("usage: java MLtoSVG filename");
            System.exit(1);
        }

        // vars
        MLtoSVG converter = null;

        converter = new MLtoSVG();  
        converter.readDocument( argv[0] );
        converter.processDocument( );
        converter.printDocument( );

    } // main(String[])

The readDocument method will set up a DOM parser and read the input file. However, the standard parser does not come with any error handling. Thus, there is the following implementation of the org.xml.sax.ErrorHandler interface. Rather than reinvent the wheel, I stole the code outright from a sample program that came with the Xerces parser, and produced the ParserErrorHandler class.

/*
 * ParserErrorHandler code taken from DOMWriter.java sample code.
 * Copyright (c) 1999 The Apache Software Foundation.  All rights 
 * reserved. For licensing details, see http://www.apache.org/
 */

import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;

public class ParserErrorHandler
    implements ErrorHandler {

    //
    // ErrorHandler methods
    //

    /** Warning. */
    public void warning(SAXParseException ex) {
        System.err.println("[Warning] "+
                           getLocationString(ex)+": "+
                           ex.getMessage());
    }

    /** Error. */
    public void error(SAXParseException ex) {
        System.err.println("[Error] "+
                           getLocationString(ex)+": "+
                           ex.getMessage());
    }

    /** Fatal error. */
    public void fatalError(SAXParseException ex) throws SAXException {
        System.err.println("[Fatal Error] "+
                           getLocationString(ex)+": "+
                           ex.getMessage());
        throw ex;
    }

    //
    // Private methods
    //

    /** Returns a string of the location. */
    private String getLocationString(SAXParseException ex) {
        StringBuffer str = new StringBuffer();

        String systemId = ex.getSystemId();
        if (systemId != null) {
            int index = systemId.lastIndexOf('/');
            if (index != -1) 
                systemId = systemId.substring(index + 1);
            str.append(systemId);
        }
        str.append(':');
        str.append(ex.getLineNumber());
        str.append(':');
        str.append(ex.getColumnNumber());

        return str.toString();

    } // getLocationString(SAXParseException):String

}

This program uses the Xerces parser explicitly; some programs will put the parser name in a string and then instantiate it dynamically:

String parserName = "org.apache.xerces.parsers.DOMParser";
DOMParser parser =
 (DOMParser)Class.forName(parserName).newInstance();

I didn't see the need for this, so I constructed the parser directly, set the error handler, parsed the file whose name is the input to the readDocument function, and saved the document that is returned. Failure at any point is met with a stack trace.

    /** Read the input document and construct a document tree. */
    public void readDocument( String uri ) {
        ParserErrorHandler errHandler = new ParserErrorHandler();
        mlDocument = null;
        try {
            DOMParser parser = new org.apache.xerces.parsers.DOMParser();
            parser.setErrorHandler( errHandler );
            parser.parse( uri );
            mlDocument = parser.getDocument();
        } catch ( Exception e ) {
            e.printStackTrace(System.err);
        }
    } // readDocument(String)
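The readDocument method above depends on Xerces-specific classes. As a point of comparison only, here is a minimal sketch of the same read step written against the standard JAXP classes (javax.xml.parsers) that ship with the JDK. It is an assumption that you would want this substitution; the class name JaxpReadSketch and the choice to parse from a string rather than a URI are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical alternative to readDocument, using JAXP instead of
// org.apache.xerces.parsers.DOMParser.
class JaxpReadSketch {

    /** Parse XML text into a DOM tree; returns null if parsing fails. */
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Same three callbacks as ParserErrorHandler, kept inline here.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException ex) {
                    System.err.println("[Warning] " + ex.getMessage());
                }
                public void error(SAXParseException ex) {
                    System.err.println("[Error] " + ex.getMessage());
                }
                public void fatalError(SAXParseException ex)
                        throws SAXException {
                    System.err.println("[Fatal Error] " + ex.getMessage());
                    throw ex;
                }
            });
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            return null;   // the caller checks for null, as processDocument does
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<math><mrow><mo>+</mo></mrow></math>");
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```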

Similarly, once the SVG document has been built, sending it to a file — in this case the standard output — is no great challenge. Most of this function performs error-checking.

    public void printDocument( ) {
        
        if (svgDocument == null)
        {
            return;
        }
        PrintWriter out = null;
        try{
            out =
            new PrintWriter(new OutputStreamWriter(System.out, "UTF8"));    [1]
        }
        catch (Exception e)
        {
            System.err.println("Error creating output stream");
            System.err.println(e.getMessage());
            System.exit(1);
        }
        OutputFormat oFormat = new OutputFormat( "xml", "UTF8", true );    [2]
        XMLSerializer serial = new XMLSerializer( out, oFormat );    [3]
        try
        {
            serial.serialize( svgDocument );
        }
        catch (java.io.IOException e)
        {
            System.err.println(e.getMessage());
        }
    }

[1] First, construct an output stream with your favorite encoding method.

[2] A serializer requires an output format. This constructor's three parameters are the output method (which is normally one of "xml", "html", or "text"); the character encoding, which should be "UTF8" to keep your international clients happy; and a Boolean that tells whether the output should be indented or not.

[3] The OutputFormat is used when creating the serializer.
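The XMLSerializer and OutputFormat classes belong to Xerces. If they are not available, one possible substitute is the identity Transformer from javax.xml.transform, which accepts the same three choices (output method, encoding, and indentation) as output properties. The sketch below assumes that substitution; the class JaxpSerializeSketch and its tinySvg helper are hypothetical and exist only to give the serializer something to print.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical stand-in for printDocument, using an identity Transformer.
class JaxpSerializeSketch {

    /** Serialize a DOM document to a string of indented XML. */
    static String serialize(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.METHOD, "xml");
            t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            t.setOutputProperty(OutputKeys.INDENT, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /** Build a one-element document so there is something to serialize. */
    static Document tinySvg() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element svg = doc.createElement("svg");
            svg.setAttribute("width", "10");
            doc.appendChild(svg);
            return doc;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(tinySvg()));
    }
}
```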

This leaves the majority of the work to the processDocument function.

    public void processDocument( )
    {
        svgDocument = null;
        
        /* anything to do? */
        if (mlDocument == null)
        {
            return;
        }

        /* Create the output document */
        DOMImplementation dImplement = mlDocument.getImplementation();     [1]
        DocumentType dType = dImplement.createDocumentType(     [2]
            "svg",
            "-//W3C//DTD SVG 1.0//EN",
            "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd"
        );
        svgDocument = dImplement.createDocument( null, "svg", dType);
        svgRoot = (Element) svgDocument.getDocumentElement();

        Element     mrowElement;    // store <mrow> element for ease of access
        NodeList    matrices;       // list of all <mtable> elements
        NodeList    rows;           // list of all <mtr> elements
        NodeList    nodes;          // immediate children of <mrow>
        int         i;              // ubiquitous counter
        
        /* current x  position while creating matrices */
        int currX = 0;

        /* maximum number of rows in any one matrix */
        int maxRows = 0;
        int totalHeight;

            [3]

        /* Find the first <mrow> node */
        nodes = mlDocument.getElementsByTagName("mrow");     [4]
        mrowElement = (Element) nodes.item(0);
        
        /* Find the maximum number of rows among all the matrices */
        matrices = mrowElement.getElementsByTagName("mtable");      [5]    
        for (i = 0; i < matrices.getLength(); i++)
        {
            rows = ((Element) matrices.item(i)).getElementsByTagName("mtr");
            if (rows.getLength() > maxRows)
            {
                maxRows = rows.getLength();
            }
        }
        
        /* Calculate total height */ 
        totalHeight = maxRows * LINE_HEIGHT + EXTRA_HEIGHT;
        
        /* Now create the SVG for the matrices and operators */
        nodes = mrowElement.getChildNodes();     [6]
        for (i=0; i < nodes.getLength(); i++)     [7]
        {
            if (nodes.item(i).getNodeName().equals("mtable"))
            {
                currX += generateMatrix( nodes.item(i), currX, totalHeight );
            }
            else if (nodes.item(i).getNodeName().equals("mo"))
            {
                currX += generateOperator( nodes.item(i), currX, totalHeight );
            }
        }
        
        currX += 2 * X_SPACING; // put some padding at the right

        svgRoot.setAttribute("width", Integer.toString( currX ));     [8]
        svgRoot.setAttribute("height", Integer.toString( totalHeight) );
        svgRoot.setAttribute("viewBox",
            "0 0 " + currX + " " + totalHeight);

    }

[1] The processDocument function starts by creating the SVG document. Since every parser implementation has its own way of storing the objects, you must ask the DOMImplementation class to give you the details.

[2] The implementation is asked to create a document. The createDocumentType function's parameters will be used when producing the <!DOCTYPE> in the output; you provide a root element name, and a public and system identifier for the DTD. The createDocument function's parameters are the namespace URI (in this case, none is needed), the name of the top-level document element, and the document type. The root element is retrieved for future reference by calling getDocumentElement.

[3] Now it's time to extract information from the MathML document. Among the methods you use to access information in the Document Object Model are the following:

NodeList getElementsByTagName (DOMString tagName)
Calling element .getElementsByTagName(" name ") returns a list of all the descendant nodes of the specified element that have the specified tag name. These are not just the child nodes; they are descendants at any depth, which is very advantageous in this program.
Node item (int n)
Calling nodeList .item( n ) retrieves node number n from the given node list.
NodeList getChildNodes ( )
Calling node .getChildNodes( ) retrieves a list of all the immediate children of the given node.
short getNodeType ( )
Calling node .getNodeType( ) returns an integer that tells which kind of node this is (element, text, CDATA, entity, etc.).
String getNodeName ( )
Calling node .getNodeName( ) returns a string based on the node type. If the node is an element, the tag name is returned; if it's a text node, the string "#text" is returned.
Node getNextSibling ( )
Calling node .getNextSibling( ) retrieves the given node's next sibling, or null if this is the last of the siblings. Sibling nodes are nodes that are all children of a common parent, listed in the order in which they appear in the document.
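The difference between descendants and immediate children is easy to see on a small input. This sketch uses the standard JAXP parser rather than Xerces, and the class DomAccessSketch, its SAMPLE string, and its method names are all hypothetical. It assumes the parser's default settings, under which whitespace between elements is kept as text nodes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demonstration of the DOM access methods listed above.
class DomAccessSketch {

    // A tiny formula; the line breaks become whitespace text nodes.
    static final String SAMPLE =
        "<mrow>\n"
        + "  <mtable><mtr><mtd><mn>1</mn></mtd></mtr></mtable>\n"
        + "  <mo>=</mo>\n"
        + "</mrow>";

    /** Parse SAMPLE and return its root <mrow> element. */
    static Element sampleRoot() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(SAMPLE)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /** Count descendants at any depth with the given tag name. */
    static int descendantCount(Element e, String tag) {
        return e.getElementsByTagName(tag).getLength();
    }

    /** Count all immediate children, text nodes included. */
    static int childNodeCount(Element e) {
        return e.getChildNodes().getLength();
    }

    /** Count only the immediate children that are elements. */
    static int elementChildCount(Element e) {
        NodeList kids = e.getChildNodes();
        int count = 0;
        for (int i = 0; i < kids.getLength(); i++) {
            if (kids.item(i).getNodeType() == Node.ELEMENT_NODE) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Element mrow = sampleRoot();
        System.out.println("mtd descendants: " + descendantCount(mrow, "mtd"));
        System.out.println("child nodes:     " + childNodeCount(mrow));
        System.out.println("element children: " + elementChildCount(mrow));
    }
}
```

The <mtd> element is found even though it sits two levels below <mrow>, while getChildNodes reports the whitespace text nodes alongside the two elements.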

[4] The first <mrow> element encloses the entire matrix expression. The easiest way to find it is to grab all the <mrow> elements and store a reference to the first one. The code casts the results of the item call to Element. This is safe, since the node lists were constructed by calls that return only Elements.

[5] In order to center all the matrices vertically, we need to find the matrix with the largest number of rows. Presuming that there are no nested matrices, this is done by walking through each <mtable> element, extracting a list of all its <mtr> descendants, and checking that list's length. This maximum number of rows is used to calculate the totalHeight of the resulting graphic.
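As a worked instance of this step, the sketch below counts the rows in each matrix of a two-matrix formula and derives the total height with the same constants the program declares (LINE_HEIGHT = 24, EXTRA_HEIGHT = 10). It is a hypothetical, stand-alone class (MaxRowsSketch) that parses with JAXP instead of Xerces; the input formula is made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical stand-alone version of the "maximum rows" step.
class MaxRowsSketch {

    static final int LINE_HEIGHT = 24;    // same values as in MLtoSVG
    static final int EXTRA_HEIGHT = 10;

    /** Largest number of <mtr> rows in any <mtable> under the first <mrow>. */
    static int maxRowsIn(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        Element mrow = (Element) doc.getElementsByTagName("mrow").item(0);
        NodeList matrices = mrow.getElementsByTagName("mtable");
        int maxRows = 0;
        for (int i = 0; i < matrices.getLength(); i++) {
            int n = ((Element) matrices.item(i))
                .getElementsByTagName("mtr").getLength();
            maxRows = Math.max(maxRows, n);
        }
        return maxRows;
    }

    /** Height of the finished graphic for the given row count. */
    static int totalHeight(int maxRows) {
        return maxRows * LINE_HEIGHT + EXTRA_HEIGHT;
    }

    public static void main(String[] args) {
        String formula =
            "<math><mrow>"
            + "<mtable><mtr/><mtr/></mtable>"
            + "<mo>+</mo>"
            + "<mtable><mtr/><mtr/><mtr/></mtable>"
            + "</mrow></math>";
        int rows = maxRowsIn(formula);
        System.out.println(rows + " rows, height " + totalHeight(rows));
    }
}
```

For this input the taller matrix has three rows, so the height works out to 3 * 24 + 10 = 82.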

[6] We then call the getChildNodes function to retrieve a NodeList of all the immediate children of the <mrow> element.

[7] The program iterates through these children, generating a matrix when encountering an <mtable>, or an operator symbol when encountering an <mo> element. Everything else is ignored, which conveniently skips over the "hidden" text nodes that come from carriage returns between lines in the source file.

The generateMatrix and generateOperator functions create the output SVG. Once they are done, the currX variable will contain the total width of the graphic.

[8] The processDocument function wraps up by using the setAttribute function to add the width, height, and viewBox attributes to the svgRoot variable.

setAttribute is just one of the functions used to populate the new document. Other functions include:

Element createElement (DOMString elementName)
Calling document .createElement(" name ") returns an element which belongs to the specified document, but is not part of the document hierarchy (it hasn't been placed in the "tree" yet).
Text createTextNode (String data)
Calling document .createTextNode(" value ") returns a text node whose content is the given value. Again, this node belongs to the specified document, but it is not part of the document hierarchy (it hasn't been placed in the "tree" yet).
void setAttribute (DOMString name, DOMString value)
Calling element .setAttribute(" attr ", " value ") sets the particular attribute of the specified element to the given value. If the attribute already exists, any previous value it had is replaced with the new value.
Node appendChild (Node newchild)
Calling parentNode .appendChild( newchild ) appends the given new child node to the child nodes of the parent node. It returns a reference to the new child node. This puts the node into the document hierarchy.
Node insertBefore (Node newchild, Node refchild)
Calling parentNode .insertBefore( newChild , referenceChild ) inserts the new child node into the list of child nodes of the parent node just before the child node specified in the second parameter. The function returns a reference to the new child node. (The sample program doesn't use this function, but it's included here because it's generally useful.)
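The functions listed above can be exercised together in a few lines. This is a minimal sketch that builds a <g> element by hand; the class SvgBuildSketch is hypothetical, and the empty starting document comes from JAXP's DocumentBuilder.newDocument rather than from the DOMImplementation call that the sample program uses.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demonstration of the DOM creation functions.
class SvgBuildSketch {

    /** Build <g> holding a <path> followed by a <text>, and return the <g>. */
    static Element buildGroup() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        Element svg = doc.createElement("svg");
        doc.appendChild(svg);                      // svg is now the root

        Element g = doc.createElement("g");        // created, not yet in the tree
        g.setAttribute("font-family", "sans-serif");
        svg.appendChild(g);                        // now it is in the tree

        Element text = doc.createElement("text");
        text.appendChild(doc.createTextNode("42"));
        g.appendChild(text);

        Element path = doc.createElement("path");
        path.setAttribute("d", "M3 0h-3v24h3");
        g.insertBefore(path, text);                // path goes ahead of text
        return g;
    }

    public static void main(String[] args) {
        Element g = buildGroup();
        System.out.println(g.getFirstChild().getNodeName());
        System.out.println(g.getLastChild().getNodeName());
    }
}
```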

Let us now turn our attention to the generateMatrix function.

public int generateMatrix( Node mtableNode, int currX, int totalHeight )
{
    double      y;                 [1]
    int         x = 0;
    NodeList    rowList;        // list of all <mtr> elements
    NodeList    cellList;       // list of all <mtd> elements
    Element     newElement;     // a catch-all "new element"
    Element     gElement;       // a created <g> element
    Element     startColumn;    // marks beginning of a column
    Element     textElement;    // a created <text> element
        
    int         nRows;          // number of rows in table
    int         nCells;         // number of cells per row
    int         i;              // ubiquitous loop counter
    int         row, col;       // more counter variables
    int         colWidth;       // maximum width of a column
        
    Dimension   textInfo;       // holds text width and height

    [2]
    rowList = ((Element) mtableNode).getElementsByTagName("mtr");
    nRows = rowList.getLength();
        
    /* Check to see that all rows have the same number of cells */
    cellList = ((Element) rowList.item(0)).getElementsByTagName("mtd");
    nCells = cellList.getLength();
        
    for (i = 1; i < nRows; i++)
    {
        cellList = ((Element) rowList.item(i)).getElementsByTagName("mtd");
        if (cellList.getLength() != nCells)
        {
            System.err.println("All rows must have " + nCells + " cells ");
            System.exit(1);
        }
    }
        
    y = (totalHeight - nRows * LINE_HEIGHT) / 2.0;      [3]
        
    newElement = svgDocument.createElement("g");     [4]
    newElement.setAttribute("transform",
        "translate(" + currX + ", " + y + ")" );
    newElement.setAttribute("font-family", "sans-serif");
    newElement.setAttribute("font-size",
        Integer.toString(FONT_HEIGHT));
    gElement = (Element) svgRoot.appendChild( newElement );
        
    newElement = svgDocument.createElement("path");     [5]
    newElement.setAttribute("d",
        "M3 0h-3v" + (nRows*LINE_HEIGHT) +
            "h3");
    newElement.setAttribute("fill", "none");
    newElement.setAttribute("stroke", "black");
        
    /* The next "nRows" siblings of this element     [6]
       will be the first column of the matrix */ 
    startColumn = (Element) gElement.appendChild( newElement );

    /* Now get all the <mtd> cells in order */     [7]
    cellList = ((Element) mtableNode).getElementsByTagName("mtd");

    x = X_SPACING;

    textElement = null;

    for (col = 0; col < nCells; col++)
    {
        Node    currNode;

        colWidth = 0;
        for (row = 0; row < nRows; row++)
        {
            currNode = cellList.item( row * nCells + col );
            newElement = svgDocument.createElement("text");     [8]
            textInfo = constructTextNode( newElement, currNode,
                FONT_HEIGHT );
            textElement = (Element) gElement.appendChild( newElement );
            textElement.setAttribute("y",
                Integer.toString( row * LINE_HEIGHT + textInfo.height ) );
            textElement.setAttribute("text-anchor", "middle");
            textElement.setAttribute("font-size",
                Integer.toString(FONT_HEIGHT));
            if (textInfo.width > colWidth)
            {
                colWidth = textInfo.width;
            }
        }
            
        /* go back and put in the "x" coordinates */     [9]
        startColumn = (Element) startColumn.getNextSibling();
        for (row = 0; row < nRows; row++)
        {
            startColumn.setAttribute("x",
                Double.toString( x + colWidth/2.0 ) );
            startColumn = (Element) startColumn.getNextSibling();
        }
        x += colWidth + X_SPACING;
        startColumn = textElement;

    }
        
    x += X_SPACING;
    /* the closing bracket */     [10]
    newElement = svgDocument.createElement("path");
    newElement.setAttribute("d",
        "M" + (x-3) + " 0h3v" + (nRows*LINE_HEIGHT) +
        "h-3");
    newElement.setAttribute("fill", "none");
    newElement.setAttribute("stroke", "black");
    startColumn = (Element) gElement.appendChild( newElement );

    return x + 2*X_SPACING;
}

[1] We start off with a clump of declarations. I tend to use a lot of temporary variables so that I don't have to constantly call the DOM access functions.

[2] The function starts by extracting the number of rows in this matrix and seeing that all the rows have the same number of cells in them.

[3] After this sanity check, the function calculates where the top y coordinate of the matrix should be when the matrix is vertically centered.

[4] Once the top position of the matrix is known, the program creates a <g> element to enclose the matrix. The transform attribute positions it, and the font-family and font-size will apply to all the text in that matrix. We are using presentation attributes because it's easier to set them individually than to construct a string for the appropriate style attribute.

The resulting element is appended to the root element's children, and, for ease of future access, is stored in the variable gElement.

[5] The first thing inside the <g> is a <path> that draws the left bracket enclosing the matrix.

[6] Following the bracket, generateMatrix draws the matrix's cells column by column. Each cell will become a <text> element. As we insert each cell for a particular column into the output tree, we'll keep track of its text width. At the end of the column, we'll know the column's maximum width, and we will go back and update the x attributes of each of the <text> elements. This requires us to remember the starting point after which we began adding cells. That's why we had to save the position of the <path> element in startColumn.

[7] Here's the code that creates the cells for the entire matrix. Note that we grabbed all the <mtd> descendants of the <mtable> node. This bypasses any intervening elements, but that's OK since we don't have any nested tables.

[8] The actual task of building the cell contents is left to the constructTextNode function; in this segment of the code we set the y, font-size, and text-anchor attributes of the cell.

[9] Here is the code that goes back to fill in the x attribute after each column is complete. The getNextSibling function proceeds through the text elements that were just added. Once this is done, we reset the startColumn variable to point to the very last textElement element we added, for its siblings will become the next column.

[10] Finally, the function creates another <path> element for the right bracket of the matrix, and the matrix has been generated. We return the total width of the matrix, plus a bit of extra padding.
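The two-pass column layout described in notes [6] and [9] can be looked at on its own. The following is only a minimal sketch, separate from the chapter's program: the class name ColumnLayout, the method columnCenters, and the sample widths are all assumptions made for illustration. It takes a table of already-measured cell widths, finds the widest cell in each column, and computes the x coordinate that generateMatrix would write into every cell of that column.

```java
import java.util.Arrays;

class ColumnLayout {
    /**
     * cellWidths[row][col] holds measured text widths (hypothetical input).
     * Returns the x coordinate of the center of each column.
     */
    static double[] columnCenters(int[][] cellWidths, int xSpacing) {
        int nCols = cellWidths[0].length;
        double[] centers = new double[nCols];
        int x = xSpacing;                        // left edge of the current column
        for (int col = 0; col < nCols; col++) {
            int colWidth = 0;
            for (int[] row : cellWidths) {       // first pass: widest cell in this column
                colWidth = Math.max(colWidth, row[col]);
            }
            // second pass in generateMatrix: this value becomes the "x" of each cell
            centers[col] = x + colWidth / 2.0;
            x += colWidth + xSpacing;            // move on to the next column
        }
        return centers;
    }

    public static void main(String[] args) {
        int[][] widths = { {40, 50, 10}, {30, 60, 10} };   // made-up widths
        System.out.println(Arrays.toString(columnCenters(widths, 5)));
    }
}
```

Because every cell uses text-anchor="middle", placing each cell at the column's center is enough to line up cells of different widths.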

Now let's look at the constructTextNode function. The way the program is set up, text may be subscripted within an <msub> element, which this function should handle. If the text is enclosed in <mi> or <mn> elements, constructTextNode ignores the tags and takes only the enclosed text. The parameters to constructTextNode are the destination node to which the text is to be added in the output tree, the parent source node that contains the original text, and the font size for the text.

public Dimension constructTextNode( Node destNode, Node parentNode, 
    int size )
{
    NodeList    children = parentNode.getChildNodes();     [1]
    Node        currNode;
    int         i;
    Dimension   d = new Dimension(0, 0);
    Dimension   subDim;

    for (i=0; i < children.getLength(); i++ )
    {
        subDim = new Dimension(0,0);
        currNode = children.item(i);
        if (currNode.getNodeName().equals("#text"))       [2]
        {
            Text    textNode;
            String  value = currNode.getNodeValue();

            subDim = stringInfo( value, size );
            textNode = svgDocument.createTextNode( value );
            destNode.appendChild( textNode );
        }
        else if (currNode.getNodeName().equals("msub"))     [3]
        {
            Element newElement;
            newElement = svgDocument.createElement("tspan");
            newElement.setAttribute( "baseline-shift",
                "sub");
            newElement.setAttribute("font-size",
                Integer.toString(SUBSCRIPT_HEIGHT));
            newElement = (Element) destNode.appendChild( newElement );
            subDim = constructTextNode( newElement, currNode,
                SUBSCRIPT_HEIGHT );             
        }
        else if (currNode.getNodeType() == Node.ELEMENT_NODE)      [4]
        {
            subDim = constructTextNode( destNode, currNode, size );
        }
            
        d.width += subDim.width;      [5]
        if (subDim.height > d.height)
        {
            d.height = subDim.height;
        }
    }
    return d;
}
    
public Dimension stringInfo( String str, int fontSize )     [6]
{
    BufferedImage buffer = new BufferedImage(
        100, 100, BufferedImage.TYPE_INT_RGB);
    Graphics g = buffer.getGraphics();
    Font f = new Font("SansSerif", Font.PLAIN, fontSize);
    g.setFont( f );
    FontMetrics fm = g.getFontMetrics();

    return new Dimension( fm.stringWidth( str ), fm.getAscent() );
}

[1] The function starts by getting all the immediate children of the source node and setting the total width and height of the text, stored in variable d, to zero.

[2] We look at each of the children of the current source node in turn. If the child is a text node, we create a corresponding text node in the output document, find its width and ascent height by calling stringInfo, and append it to the destination node's children.

[3] However, if the child is an <msub> element, we must create a <tspan> element in the output with a baseline-shift attribute for subscripting. We then recursively call constructTextNode to add all the enclosed text to this new <tspan> node.

[4] If it's not text and not subscripted, it's some other element node that we don't care to handle, so just do a recursive call to gather all its text and append it to the destination node.

[5] No matter what course of action we took, subDim now contains the width and height of the text that was appended. We add the width to the total and keep track of the maximum height.

[6] constructTextNode calls the stringInfo function to determine the width and ascent height of text. Because we don't know how an SVG viewer will render text, and thus can't determine its length exactly, we have to do the best we can. In this instance, we load the generic sans-serif font, set its size, get the font metrics, and use the results from that. Since the output will be displayed in the Batik SVG viewer, which is also Java-based, it's reasonable to presume that the numbers will come out fairly close to the final results. To get at the font metrics, stringInfo creates an in-memory image, which gives it access to a graphics context.
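To see the recursion of constructTextNode without the SVG output and font measurement, here is a small sketch that only gathers the text of one cell. It is not part of the chapter's program: the class CellText and its methods are hypothetical, subscripts are marked with _{...} instead of a <tspan>, and, unlike constructTextNode, it trims whitespace-only text nodes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CellText {
    /** Recursively collects the text below parent, marking <msub> content. */
    static String flatten(Node parent) {
        StringBuilder sb = new StringBuilder();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                sb.append(child.getNodeValue().trim());      // plain text
            } else if (child.getNodeName().equals("msub")) {
                sb.append("_{").append(flatten(child)).append("}");   // subscript
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                sb.append(flatten(child));                   // ignore the tag, keep its text
            }
        }
        return sb.toString();
    }

    /** Parses a fragment and returns its root element (helper for the demo). */
    static Node parseCell(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Node cell = parseCell("<mtd> <mi>x</mi><msub><mn>1</mn></msub> </mtd>");
        System.out.println(flatten(cell));
    }
}
```

The three branches correspond to notes [2], [3], and [4]: text is taken as is, <msub> gets special treatment, and any other element is passed through.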

The last function to consider is generateOperator, which contains no new DOM concepts. It checks to see if the first child of the <mo> element is a text node whose value is a left or right parenthesis. If so, the function produces an elliptical arc as tall as the largest matrix in the entire expression. Otherwise, it's some other mathematical operator, and is treated as plain text by calling the constructTextNode function.

public int generateOperator( Node moNode, int currX, int totalHeight )
{
    double      y;
    int         fontsize;
    Element     newElement, textElement;
    Dimension   textInfo;

    currX += X_SPACING;
    y = (totalHeight) / 2.0;
        
    fontsize = FONT_HEIGHT;
    if (moNode.getFirstChild().getNodeType() == Node.TEXT_NODE)
    {
        String str = moNode.getFirstChild().getNodeValue();
        if (str.equals( "(" ))
        {
            newElement = svgDocument.createElement("path");
            newElement.setAttribute( "d",
                "M" + (currX+12) + " 8 a12 " + y + " 0 0 0 " +
                    "0" + " " + (totalHeight-EXTRA_HEIGHT - 8) );
            newElement.setAttribute( "fill", "none");
            newElement.setAttribute( "stroke", "black" );
            svgRoot.appendChild( newElement );
            return 16 + X_SPACING;
        }
        else if (str.equals( ")" ))
        {
            newElement = svgDocument.createElement("path");
            newElement.setAttribute( "d",
                "M" + currX + " 8 a12 " + y + " 0 0 1 " +
                    "0" + " " + (totalHeight-EXTRA_HEIGHT - 8) );
            newElement.setAttribute( "fill", "none");
            newElement.setAttribute( "stroke", "black" );
            svgRoot.appendChild( newElement );
            return 10 + X_SPACING;
        }
    }

    newElement = svgDocument.createElement("text");
    textInfo = constructTextNode( newElement, moNode, fontsize );
        
    textElement = (Element) svgRoot.appendChild( newElement );
    if (fontsize != FONT_HEIGHT)
    {
        y = 0;
        textElement.setAttribute("stroke-width",
            Double.toString( (FONT_HEIGHT * 1.0 / fontsize )) );
    }
    textElement.setAttribute("transform",
        "translate(" + currX + ", " + (y+textInfo.height/2.0) + ")" );
    textElement.setAttribute("font-size",
        Integer.toString(fontsize));
        
    return textInfo.width + 2 * X_SPACING;
}
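The path data that generateOperator builds for a parenthesis is easier to read when it is pulled out into a helper. This is a sketch only: the class ParenArc and its methods are hypothetical, currX is assumed to already include the spacing that generateOperator adds, and the value 10 used for the extra height in the demo is inferred from the sample output later in this section rather than stated in the listing.

```java
class ParenArc {
    /** Path data for a left parenthesis; mirrors the string built in generateOperator. */
    static String leftParen(int currX, int totalHeight, int extraHeight) {
        double ry = totalHeight / 2.0;            // vertical radius: half the total height
        int dy = totalHeight - extraHeight - 8;   // distance from the arc's start to its end
        // M: start 12 units to the right of currX and 8 units down.
        // a: rx=12, ry, no rotation, small arc, sweep-flag 0 (counterclockwise,
        //    so the arc bulges to the left), relative end point (0, dy).
        return "M" + (currX + 12) + " 8 a12 " + ry + " 0 0 0 0 " + dy;
    }

    /** Path data for a right parenthesis: sweep-flag 1 makes the arc bulge to the right. */
    static String rightParen(int currX, int totalHeight, int extraHeight) {
        double ry = totalHeight / 2.0;
        int dy = totalHeight - extraHeight - 8;
        return "M" + currX + " 8 a12 " + ry + " 0 0 1 0 " + dy;
    }

    public static void main(String[] args) {
        System.out.println(leftParen(2, 82, 10));
        System.out.println(rightParen(262, 82, 10));
    }
}
```

The only difference between the two parentheses is the starting x coordinate and the sweep flag.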

What do we get when we put this all together? The following MathML matrices:

<math>
<mrow>
<mo>(</mo>
<mtable>
<mtr>
    <mtd> <mi>x</mi><msub><mn>1</mn></msub> </mtd>
    <mtd> <mi>y</mi><msub><mn>1</mn></msub> </mtd>
    <mtd> <mi>1</mi> </mtd>
</mtr>
</mtable>
<mo>*</mo>
<mtable>
<mtr>
    <mtd> <mo>cos(</mo><mi>a</mi><mo>)</mo> </mtd>
    <mtd> <mo>-sin(</mo><mi>a</mi><mo>)</mo> </mtd>
    <mtd> <mn>0</mn> </mtd>
</mtr>

<mtr>
    <mtd> <mo>sin(</mo><mi>a</mi><mo>)</mo> </mtd>
    <mtd> <mo>cos(</mo><mi>a</mi><mo>)</mo> </mtd>
    <mtd> <mn>0</mn> </mtd>
</mtr>

<mtr>
    <mtd> <mn>0</mn> </mtd>
    <mtd> <mn>0</mn> </mtd>
    <mtd> <mn>1</mn> </mtd>
</mtr>
</mtable>
<mo>)</mo>
<mo>=</mo>
<mtable>
<mtr>
    <mtd> <mi>x</mi><msub><mn>2</mn></msub> </mtd>
    <mtd> <mi>y</mi><msub><mn>2</mn></msub> </mtd>
    <mtd> <mi>1</mi> </mtd>
</mtr>
</mtable>
</mrow>
</math>

become this SVG:

<?xml version="1.0" encoding="UTF8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 20001102//EN"
    "http://www.w3.org/TR/2000/CR-SVG-20001102/DTD/svg-20001102.dtd">
<svg height="82" viewBox="0 0 378 82" width="378">
    <path d="M14 8 a12 41.0 0 0 0 0 64" fill="none" stroke="black"/>
    <g font-family="sans-serif" font-size="14" 
        transform="translate(18, 29.0)">
        <path d="M3 0h-3v24h3" fill="none" stroke="black"/>
        <text font-size="14" text-anchor="middle" x="15.5"
            y="15">x<tspan baseline-shift="sub" font-size="12">1</tspan>
        </text>
        <text font-size="14" text-anchor="middle" x="44.0"
            y="15">y<tspan baseline-shift="sub" font-size="12">1</tspan>
        </text>
        <text font-size="14" text-anchor="middle" 
            x="68.5" y="15">1</text>
        <path d="M79 0h3v24h-3" fill="none" stroke="black"/>
    </g>
    <text font-size="14" transform="translate(106, 48.5)">*</text>
    <g font-family="sans-serif" font-size="14" 
        transform="translate(115, 5.0)">
        <path d="M3 0h-3v72h3" fill="none" stroke="black"/>
        <text font-size="14" text-anchor="middle" 
            x="28.5" y="15">cos(a)</text>
        <text font-size="14" text-anchor="middle" 
            x="28.5" y="39">sin(a)</text>
        <text font-size="14" text-anchor="middle" 
            x="28.5" y="63">0</text>
        <text font-size="14" text-anchor="middle" 
            x="86.5" y="15">-sin(a)</text>
        <text font-size="14" text-anchor="middle" 
            x="86.5" y="39">cos(a)</text>
        <text font-size="14" text-anchor="middle" 
            x="86.5" y="63">0</text>
        <text font-size="14" text-anchor="middle" 
            x="127.5" y="15">0</text>
        <text font-size="14" text-anchor="middle" 
            x="127.5" y="39">0</text>
        <text font-size="14" text-anchor="middle" 
            x="127.5" y="63">1</text>
        <path d="M138 0h3v72h-3" fill="none" stroke="black"/>
    </g>
    <path d="M262 8 a12 41.0 0 0 1 0 64" fill="none" stroke="black"/>
    <text font-size="14" transform="translate(274, 48.5)">=</text>
    <g font-family="sans-serif" font-size="14" 
        transform="translate(288, 29.0)">
        <path d="M3 0h-3v24h3" fill="none" stroke="black"/>
        <text font-size="14" text-anchor="middle" x="15.5"
            y="15">x<tspan baseline-shift="sub" font-size="12">2</tspan>
        </text>
        <text font-size="14" text-anchor="middle" x="44.0"
            y="15">y<tspan baseline-shift="sub" font-size="12">2</tspan>
        </text>
        <text font-size="14" text-anchor="middle" 
            x="68.5" y="15">1</text>
        <path d="M79 0h3v24h-3" fill="none" stroke="black"/>
    </g>
</svg>

which becomes Figure 12-2.

Figure 12-2. Sample converted from MathML to SVG


You can find out more about manipulating XML with Java in the aptly named book Java & XML by Brett McLaughlin, published by O'Reilly & Associates.

Using XSLT to Convert XML Data to SVG

Defining the Task

The final example in this chapter uses the Extensible Stylesheet Language Transformations (XSLT) to extract information from an XML file and insert it into an SVG file. The source data is in the Weather Observation Markup Format (OMF), defined at http://zowie.metnet.navy.mil/~spawar/JMV-TNG/XML/OMF.html. OMF is, for the most part, a wrapper for several different types of weather reports. The OMF elements add annotation, decoded information, and quantities calculated from the raw reports. Here is a sample report:

<Reports TStamp="997568716">
<SYN Title='AAXX' TStamp='997573600' LatLon='37.567, 126.967' BId='471080'
SName='RKSL, SEOUL' Elev='86'>
<SYID>47108</SYID>
<SYG T='22.5' TD='14.1' P='1004.1' P0='1014.1' Pd='0 0.1' Vis='22000'
Ceiling='INF' Wind='30-70, 1.5' WX='NOSIG' Prec=' ' Clouds='44070'>
32972 40703 10225 20141 30041 40141 50001 84070
</SYG></SYN>
</Reports>

Our objective is to extract the reporting station, the date and time, temperature, wind speed and direction, and visibility from the report. These data will be filled into the graphic template of Figure 12-3.

Figure 12-3. Graphic weather template


This example is atypical, in that all the information is contained in the attributes of the source XML rather than the content of the elements. Paradoxically, this makes our example typical, since real-world markup so often fails to follow the pristine examples found in textbooks or reference manuals. You'll eventually encounter such data, so it may as well be now. The OMF format attributes we're interested in are listed here, along with the plan for displaying them in the final graphic. The first two are required attributes of the <SYN> element; the rest are optional attributes of its child <SYG> element.

TStamp
The timestamp in seconds since midnight, January 1, 1970 UTC. In the final graphic, the date and time will be represented in text, and the time will also be shown on an analog clock. The color of the clock face will be light yellow to indicate hours between 6 A.M. and 6 P.M., and light blue for evening and night hours.
SName
The reporting station's call letters, possibly followed by a comma and the station's full name. The final graphic will represent this as text.
T
The air temperature in degrees Celsius. This will be displayed by coloring in the thermometer to the appropriate level. If the temperature is greater than zero, the coloring will be red; if less than or equal to zero, it will be blue.

Wind
A direction and speed, separated by a comma. The direction is measured in degrees; 0 indicates wind blowing from true north, and 270 indicates wind from the west. This will be represented by a line on the compass.

Wind direction may also be expressed as two numbers separated by a dash, indicating a range of directions. Thus, 0-40 indicates wind from the north to north-east. In this case, two dashed lines will be drawn on the compass.

The wind speed is expressed in meters per second. If two numbers are given, separated by a dash, the second number indicates the speed of gusts of wind. This information will be displayed in text form.

Vis
Surface visibility in meters, or the value INF for unlimited visibility. The final graphic will represent this by filling in a horizontal bar. Any visibility above 40 kilometers will be presumed to be unlimited.
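As an illustration of the TStamp rule in the list above, here is a minimal sketch that decodes the timestamp and picks the clock face color. Several things here are assumptions made for the example: the class and method names are hypothetical, the hour is taken in UTC rather than in the station's local time, 6 P.M. itself is counted as evening, and the SVG color keywords lightyellow and lightblue stand in for "light yellow" and "light blue".

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

class TimestampInfo {
    /** Converts seconds since midnight, January 1, 1970 UTC to a date and time. */
    static ZonedDateTime toUtc(long tStamp) {
        return Instant.ofEpochSecond(tStamp).atZone(ZoneOffset.UTC);
    }

    /** Light yellow for 6 A.M. up to (not including) 6 P.M., light blue otherwise. */
    static String clockFaceColor(ZonedDateTime t) {
        int hour = t.getHour();
        return (hour >= 6 && hour < 18) ? "lightyellow" : "lightblue";
    }

    public static void main(String[] args) {
        ZonedDateTime t = toUtc(997573600L);   // TStamp of the <SYN> element in the sample
        System.out.println(t + " " + clockFaceColor(t));
    }
}
```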

How XSLT Works

To convert an OMF source file to its destination SVG format, we will create a list of specifications that tells which elements and attributes in OMF are of interest to us. These specifications will then detail what SVG elements to generate whenever we encounter an item of interest. If we were asking a human to do the transformation by hand, we could write out an English language description:

  • Begin a new SVG document by typing this:
    <!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN"
     "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd">
    
  • Go through the source document. As you find each element, look for instructions on how to process it.
  • To process the <Reports> element, type the following into a text editor, then process any sub-elements as specified in comments.
    <svg viewBox="0 0 350 200" height="200" width="350">
        <!-- process any SYN elements you find within this element -->
    </svg>
    
  • To process a <SYN> element, type this text and fill in the blanks:
    <text font-size="10pt" x="10" y="20">
    <!-- fill in the value of the SName attribute --> 
    </text>
        
    <!-- process any SYG elements you find within this element -->
    
  • To process a <SYG> element:
    1. Extract the value of the T attribute, and use that value when following the instructions for "how to draw a thermometer."
    2. Extract the value of the Wind attribute, and use that value as you follow the instructions for "how to draw a wind compass."
    3. (etc.)
  • To draw a thermometer, calculate the height of the bar as 50 minus the value you got from the T attribute, and choose a tint of red if that value is greater than zero or blue otherwise. Use those results in place of the placeholders height and tint as you type the following:
    <path
        d = "M 25 height 25 90
        A 10 10 0 1 0 35 90
        L 35 height Z"
        style="stroke: none; fill: tint;"/>
    <path 
        d= "M 25 0 25 90 A 10 10 0 1 0 35 90 L 35 0 Z"
        style="stroke: black; fill: none;"/>
    
  • To draw a wind compass (etc.).

Rather than writing our specifications in English and handing them to a human to perform, we will write the specifications in the XSLT markup format. We'll hand the XSLT file, along with the OMF file, to the Apache Software Foundation's Xalan processor, and it will process elements and fill in the blanks to produce an output SVG file. Here is a quick English-to-XSLT translation guide.

English: Process an element named element
XSLT:
<xsl:template match="element">
    <!-- output to produce -->
</xsl:template>

English: Process any items within the current element
XSLT:
<xsl:apply-templates select="items"/>

English: Fill in the value of an item
XSLT:
<xsl:value-of select="item"/>

English: Use the value of an item as a variable named var
XSLT:
<xsl:variable name="var">
    <!-- instructions to produce item's value -->
</xsl:variable>

Developing an XSL Stylesheet

We'll add details as we proceed, but this gives us more than enough to start. The XSLT file begins like this:

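<xsl:stylesheet version="1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:xlink="http://www.w3.org/1999/xlink">
<!-- the xlink prefix is declared because draw-wind writes xlink:href attributes -->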

<xsl:output method="xml" indent="yes"     [1]
    doctype-public="-//W3C//DTD SVG 1.0//EN"
    doctype-system=
        "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd"/>

<xsl:template match="Reports">     [2]
<svg width="350" height="200" viewBox="0 0 350 200">
    <defs>
        <path id="wind-line" d="M 40 40 h 25"
            style="stroke: black; fill: none;"/>
    </defs>
    <xsl:apply-templates select="SYN"/>     [3]
</svg>
</xsl:template>

[1] The <xsl:output> element specifies that the output will be an XML file and that we want it indented nicely. It also generates the appropriate <!DOCTYPE ...> declaration.

[2] <xsl:template> directs the XSLT processor to generate the specified output whenever it encounters a <Reports> element. This template will be called only once, since there's only one such element in the source document. It creates the outermost <svg> element and a <defs> element for later use.

[3] After outputting the <svg> start tag and the <defs>, <xsl:apply-templates> directs the processor to find any child <SYN> elements and generate whatever the matching <xsl:template> element specifies.

And how is the <SYN> element to be processed? Like this:

<xsl:template match="SYN">
    <!-- output the station name as a text element -->
    <text x="10" y="20" style="font-size: 10pt;">
        <xsl:value-of select="@SName"/>
    </text>
    
    <!-- process any child SYG elements -->
    <xsl:apply-templates select="SYG"/>
</xsl:template>

The <xsl:value-of> inserts the value of the selected item. The preceding @ indicates that we want the value of an attribute; in this case, the SName attribute. We also want the timestamp from this element, but it requires special handling, so we'll come back to it later.

The processor then finds all subordinate <SYG> elements and processes them as our XSLT document specifies.

Note

In this example, we've only used element and attribute names as the values of a match or select. In reality, you can put any XPath expression as a value. XPath is a notation that lets you select parts of an XML document with extreme precision. For example, while processing an XHTML document, you could select only the odd <td> elements that are within <tr> elements that have been set align="right".
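The XHTML selection mentioned in the note can be tried out with the XPath support built into Java. This is a sketch under stated assumptions: the class OddCells and the tiny table are made up, and the table is written without the XHTML namespace so that plain element names match. The expression picks the first, third, fifth (and so on) <td> of every <tr> whose align attribute is right.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class OddCells {
    /** Odd-numbered <td> children of right-aligned <tr> elements. */
    static final String EXPR = "//tr[@align='right']/td[position() mod 2 = 1]";

    /** Returns the text of every cell selected by EXPR. */
    static List<String> select(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                EXPR, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String table = "<table>"
            + "<tr align=\"right\"><td>a</td><td>b</td><td>c</td></tr>"
            + "<tr><td>d</td></tr>"
            + "</table>";
        System.out.println(select(table));
    }
}
```

The same expression could be used as the value of a select or match attribute in a stylesheet.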

The majority of the work needs to be done when we encounter the <SYG> element, since it contains the temperature, wind, and visibility attributes. While it would be possible to output all the relevant SVG within one <xsl:template>, a modular approach is easier to read and maintain. XSLT lets you create templates that act somewhat like functions; they don't correspond to any element in the source document, but you may explicitly call them by name and pass parameters to them. We take advantage of this in the following template:

<xsl:template match="SYG">
    <!-- pass the temperature to the thermometer -->
    <xsl:call-template name="draw-thermometer">
        <xsl:with-param name="t" select="@T"/>
    </xsl:call-template>
    
    <!-- draw-wind needs wind speed and direction -->
    <xsl:call-template name="draw-wind">
        <xsl:with-param name="w" select="@Wind"/>
    </xsl:call-template>

    <!-- draw-visibility needs the value of the Vis attribute -->
    <xsl:call-template name="draw-visibility">
        <xsl:with-param name="v">
            <xsl:value-of select="@Vis"/>
        </xsl:with-param>
    </xsl:call-template>  
</xsl:template>

If the value of a parameter is an attribute value, the easiest way to set it is with a select. Another way to set the value is to put the content between a beginning and ending tag, as is shown in the third call.

Now we can write the template for draw-thermometer. We need to use the passed-in parameter to determine the height to fill the thermometer, and whether the thermometer should be filled with red or blue.

<xsl:template name="draw-thermometer">
    <xsl:param name="t">0</xsl:param>     [1]
    <xsl:variable name="height" select="50-$t"/>      [2]
    <xsl:variable name="tint">     [3]
    <xsl:choose>
        <xsl:when test="$t &gt; 0">red</xsl:when>
        <xsl:otherwise>blue</xsl:otherwise>
    </xsl:choose>
    </xsl:variable>

<g id="thermometer" transform="translate(10, 40)">
    <path
        d = "M 25 {$height} 25 90 A 10 10 0 1 0 35 90 L 35 {$height} Z"     [4]
        style="stroke: none; fill: {$tint};"/>

    <path d= "M 25 0 25 90 A 10 10 0 1 0 35 90 L 35 0 Z"
        style="stroke: black; fill: none;"/>
        
    <g id="thermometer-text"
        style="font-size: 8pt; font-family: sans-serif;">
        <text x="20" y="95" style="text-anchor: end;">-40</text>
        <text x="20" y="55" style="text-anchor: end;">0</text>
        <text x="20" y="5" style="text-anchor: end;">50</text>
        <text x="10" y="110" style="text-anchor: end;">C</text>
        <text x="40" y="95">-40</text>
        <text x="40" y="55">32</text>
        <text x="40" y="5">122</text>
        <text x="50" y="110">F</text>
        <text x="30" y="130" style="text-anchor: middle;">Temp.</text>

        <text x="30" y="145" style="text-anchor: middle;">     [5]
            <xsl:value-of select="$t"/> /
            <xsl:value-of select="round($t div 5 * 9 + 32)"/>
        </text>
    </g>
</g>
</xsl:template>

[1] You can specify the default value for a parameter if none is passed in.

[2] XSLT lets you declare variables for use within a template. These are actually semi-variables; every time the template gets called, the variable will get set to an initial value, but for the duration of the template, it cannot be changed further. Note that XSLT can also do simple calculations such as the one shown in the select attribute, which figures out the height to which the thermometer should be filled. When referring to a variable or parameter in an expression, precede its name with a $.

[3] We must set the tint variable conditionally. This is done with the <xsl:choose> element, which contains one or more <xsl:when> elements. The first one whose test succeeds is the one whose output goes into the final document. The <xsl:otherwise> element is a catch-all in case all the preceding tests fail.

In the test, we used the entity reference &gt; for a greater-than symbol to avoid problems with some XSLT processors. If you ever use a less-than symbol, it must be written as &lt;, because a literal < is not allowed inside an XML attribute value.

[4] When referring to parameters or variables in the values of attributes of the output document, you must enclose them within curly braces. This particular <path> draws the thermometer, filled to the proper height.

[5] The <text> elements preceding this one are all fixed; this one outputs the temperature in both degrees Celsius and degrees Fahrenheit. Note the use of div for division in the formula; this is because the forward slash is already used in XPath to separate levels of element nesting.
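The arithmetic inside the draw-thermometer template is small enough to check by hand. The sketch below restates it in Java purely as a cross-check; the class and method names are hypothetical, and it is not a replacement for the stylesheet. The thermometer scale is one unit per degree, with 0 degrees Celsius at y=50, which is why the fill height is 50 minus the temperature.

```java
class Thermometer {
    /** Same as select="50-$t": the y coordinate of the top of the filled area. */
    static double fillHeight(double t) {
        return 50 - t;
    }

    /** Same as the xsl:choose: red above zero, blue at or below zero. */
    static String tint(double t) {
        return t > 0 ? "red" : "blue";
    }

    /** Same as round($t div 5 * 9 + 32). */
    static long fahrenheit(double t) {
        return Math.round(t / 5 * 9 + 32);
    }

    public static void main(String[] args) {
        double t = 22.5;   // the T attribute of the sample report
        System.out.println(fillHeight(t) + " " + tint(t) + " " + fahrenheit(t));
    }
}
```

Both XPath's round function and Java's Math.round round a value that is exactly halfway between two integers toward positive infinity, so the two versions agree on values such as 72.5.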

This would be a good time to test what we've done so far. Before we can test, we have to add four items to the stylesheet. The first two are empty templates to draw the wind compass and visibility bar; they'll be completed later. The third is an empty template to handle text nodes. XSLT processors are set up with default templates to ensure that they will visit all the elements and text in the source document. The default behavior is to send the text within elements directly to the destination document. In this transformation we want to throw away the text, so we construct an empty template for text nodes; they will not appear in the SVG file. Finally, we need the closing </xsl:stylesheet> tag.

<xsl:template name="draw-wind">
    <!-- watch this space -->
</xsl:template>

<xsl:template name="draw-visibility">
    <!-- to be determined -->
</xsl:template>

<xsl:template match="text()"/>
</xsl:stylesheet>

On a Unix system, we invoke the Xalan processor from the command line with the following shell script. Xalan is the XSLT processor and Xerces is the XML parser. The resulting graphic, Figure 12-4, shows the station name and the thermometer.

java -cp /usr/local/xmljar/xalan.jar:\
/usr/local/xmljar/xerces.jar \
    org.apache.xalan.xslt.Process \
    -IN weather.xml -XSL omf.xsl -OUT weather.svg
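The transformation can also be run from inside a Java program through the standard javax.xml.transform interfaces, which use whatever XSLT processor the Java runtime provides (not necessarily Xalan as installed above). The following is a sketch, not the chapter's method: the class RunStylesheet is hypothetical, and DEMO_XSL and DEMO_XML are cut-down stand-ins for omf.xsl and weather.xml that output only the station name.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class RunStylesheet {
    /** A cut-down stylesheet (assumption: stands in for omf.xsl). */
    static final String DEMO_XSL =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='Reports'>"
        + "<svg><xsl:apply-templates select='SYN'/></svg>"
        + "</xsl:template>"
        + "<xsl:template match='SYN'>"
        + "<text><xsl:value-of select='@SName'/></text>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** A cut-down report (assumption: stands in for weather.xml). */
    static final String DEMO_XML =
        "<Reports><SYN SName='RKSL, SEOUL'><SYG T='22.5'/></SYN></Reports>";

    /** Applies the stylesheet text to the source text and returns the result. */
    static String transform(String xsl, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(DEMO_XSL, DEMO_XML));
    }
}
```

To process real files, the StringReader sources could be replaced with StreamSource objects built from java.io.File instances.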

Figure 12-4. XSL-generated SVG file showing thermometer


You have seen that XSLT can do simple arithmetic; it can also do a reasonable amount of string manipulation. Here is the XSLT to handle the drawing of the wind compass. It uses the substring-before and substring-after functions to split the wind information into the parts that are needed for the drawing.

<xsl:template name="draw-wind">
    <xsl:param name="w" select="0"/>
    
    <xsl:variable name="dir"      [1]
        select="substring-before($w, ',')"/>
    <xsl:variable name="speed"
        select="substring-after($w, ',')"/>
        
    <xsl:variable name="dir1">     [2]
        <xsl:choose>
        <xsl:when test="contains($dir, '-')">
            <xsl:value-of select="number(substring-before($dir, 
                '-' ))-90"/>
        </xsl:when>
        <xsl:otherwise>
            <xsl:value-of select="number($dir) - 90"/>
        </xsl:otherwise>
        </xsl:choose>
    </xsl:variable>

    <xsl:variable name="dir2">     [3]
        <xsl:choose>
        <xsl:when test="contains($dir, '-')">
            <xsl:value-of select="number(substring-after($dir, 
                '-' ))-90"/>
        </xsl:when>
        <xsl:otherwise>
            <xsl:value-of select="number($dir) - 90"/>
        </xsl:otherwise>
        </xsl:choose>
    </xsl:variable>

<g id="compass" font-size="8pt" font-family="sans-serif"
    transform="translate(110, 70)">
    <circle cx="40" cy="40" r="30" style="stroke: black; fill: none;"/>
    <path
        d= "M 40 10 L 40 14
            M 70 40 L 66 40
            M 40 70 L 40 66
            M 10 40 L 14 40"
        style="stroke: black; fill: none;"/>
    <use transform="rotate({$dir1},40,40)" xlink:href="#wind-line">     [4]
        <xsl:if test="$dir1 != $dir2">
            <xsl:attribute name="stroke-dasharray">3 3</xsl:attribute>
        </xsl:if>
    </use>
    <use transform="rotate({$dir2},40,40)" xlink:href="#wind-line">
        <xsl:if test="$dir1 != $dir2">
            <xsl:attribute name="stroke-dasharray">3 3</xsl:attribute>
        </xsl:if>
    </use>
    <text x="40" y="9" text-anchor="middle">N</text>
    <text x="73" y="44">E</text>
    <text x="40" y="80" text-anchor="middle">S</text>
    <text x="8" y="44" text-anchor="end">W</text>
    <text x="40" y="100" text-anchor="middle">Wind (m/sec)</text>
    <text x="40" y="115" text-anchor="middle">
        <xsl:value-of select="$speed"/>     [5]
    </text>
</g>
</xsl:template>

[1] This splits the wind information into a direction and speed by grabbing the substring before and after the separating comma.

[2] If there is a hyphen in the direction, it must be split into two portions. This sets the first number, subtracting 90 degrees, since "north" is -90 degrees in SVG. If there's no hyphen, the first direction is simply offset by -90 degrees. The number function ensures string data is converted to numeric form after stripping leading and trailing whitespace.

[3] Similar code sets the second direction. You may wonder why we didn't use simpler code like this:

<xsl:choose>
<xsl:when test="contains($dir,'-')">
    <xsl:variable name="dir1"> ... </xsl:variable>
    <xsl:variable name="dir2"> ... </xsl:variable>
</xsl:when>
<xsl:otherwise>
    <xsl:variable name="dir1"> ... </xsl:variable>
    <xsl:variable name="dir2"> ... </xsl:variable>
</xsl:otherwise>
</xsl:choose>

Because a variable exists only within its enclosing block, this code won't work: the starting <xsl:when> or <xsl:otherwise> would create the variables, which would disappear immediately upon encountering the ending </xsl:when> or </xsl:otherwise>. This is why we have to repeat all the choice code within the variable declarations.

[4] After the boilerplate that creates the circle and the hash marks, we draw both wind direction lines. If the source file specified a range of wind directions, the variables $dir1 and $dir2 will differ, and we want the direction lines to be dashed. We use an <xsl:if> element to test if the directions are unequal. If so, the <xsl:attribute> will add the named attribute, stroke-dasharray, to the current element, and the <xsl:attribute>'s content will become the value of that attribute. In this case, we'll add a stroke-dasharray to the currently open <use> element and give it a value of 3 3.

If the directions are the same, the <xsl:if> element does nothing, and we get the same solid line drawn twice.

[5] After the boilerplate text, we insert the wind speed.
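The string splitting done by draw-wind can be restated in Java to check the numbers for a given Wind value. This is a sketch with assumptions: the class WindReading is hypothetical, it requires the comma to be present, and it trims the speed, whereas the stylesheet's substring-after keeps any space that follows the comma.

```java
class WindReading {
    final double dir1;    // rotation of the first compass line, in degrees
    final double dir2;    // rotation of the second compass line
    final String speed;   // speed text, for example "1.5" or "3-7"

    WindReading(double dir1, double dir2, String speed) {
        this.dir1 = dir1;
        this.dir2 = dir2;
        this.speed = speed;
    }

    /** Splits a value such as "30-70, 1.5" the way draw-wind does. */
    static WindReading parse(String w) {
        int comma = w.indexOf(',');
        if (comma < 0) {
            throw new IllegalArgumentException("expected direction,speed: " + w);
        }
        String dir = w.substring(0, comma).trim();       // substring-before($w, ',')
        String speed = w.substring(comma + 1).trim();    // substring-after($w, ',')
        int dash = dir.indexOf('-');
        String first = dash < 0 ? dir : dir.substring(0, dash).trim();
        String second = dash < 0 ? dir : dir.substring(dash + 1).trim();
        // North is -90 degrees in SVG, so both directions are offset by 90.
        return new WindReading(Double.parseDouble(first) - 90,
                               Double.parseDouble(second) - 90, speed);
    }

    /** True when a range was given, in which case the lines are drawn dashed. */
    boolean isRange() {
        return dir1 != dir2;
    }

    public static void main(String[] args) {
        WindReading r = parse("30-70, 1.5");   // the Wind attribute of the sample report
        System.out.println(r.dir1 + " " + r.dir2 + " " + r.speed + " " + r.isRange());
    }
}
```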

Now we construct the XSLT commands to draw the visibility bar. We want to treat numbers greater than 40,000 as infinity, and we must also handle the special case of the word INF as a value for the visibility. This requires a three-way choice for an <xsl:choose> element to set the value of the width for the rectangle that will be drawn in green.

<xsl:template name="draw-visibility">
    <xsl:param name="v">0</xsl:param>
    <xsl:variable name="width">     [1]
        <xsl:choose>
        <xsl:when test="$v = 'INF'">100</xsl:when>
        <xsl:when test="$v &gt; 40000">100</xsl:when>
        <xsl:otherwise>
            <xsl:value-of select="$v * 100.0 div 40000.0"/>
        </xsl:otherwise>
        </xsl:choose>
    </xsl:variable>
<g id="visbar" transform="translate(220,110)" 
    font-size="8pt" text-anchor="middle">

    
    <rect fill="green" stroke="none"
        x="0" y="0" width="{$width}" height="20"/>

    
    <rect x="0" y="0" width="100" height="20" stroke="black" fill="none"/>
    
    <path fill="none" stroke="black"
        d="M 25 20 L 25 25 M 50 20 L 50 25 M 75 20 L 75 25"/>

    <text x="0" y="35">0</text>
    <text x="25" y="35">10</text>
    <text x="50" y="35">20</text>
    <text x="75" y="35">30</text>
    <text x="100" y="35">40+</text>
    <text x="50" y="60">
        Visibility (km)
    </text>
    <text x="50" y="75">
        <xsl:value-of select="format-number($v div 1000.0,'0.###')"/>     [2]
    </text>
</g>
</xsl:template>

[1] Setting the width of the area to be filled requires a three-way choice; the visibility could be the literal INF, a number greater than 40,000, or some other number. This <xsl:choose> scales the visibility to a maximum width of 100 for the fill.

[2] The visibility is in meters, but we want to show it in kilometers. To show it cleanly, we use the format-number function, which takes two parameters: the number to format and the formatting string. The formatting string used here says to print the integer part, even if it is zero, followed by at most three decimal places. Trailing zeroes will not be displayed.
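The same three-way choice and number formatting can be sketched in Java, which may help when checking the stylesheet's output by hand. The class and method names here are hypothetical. The pattern string is the one passed to format-number, whose pattern syntax XSLT 1.0 borrows from Java's DecimalFormat; the locale is pinned so that the decimal separator is always a period.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical helper that mirrors the draw-visibility template's calculations.
final class VisibilityBar {
    /** Width of the green fill, from 0 to 100, for a visibility in meters or the word INF. */
    static double width(String v) {
        if ("INF".equals(v)) {
            return 100.0;
        }
        double meters = Double.parseDouble(v);
        if (meters > 40000) {
            return 100.0;
        }
        return meters * 100.0 / 40000.0;
    }

    /** Formats meters as kilometers with at most three decimal places. */
    static String kilometers(double meters) {
        DecimalFormat format =
            new DecimalFormat("0.###", DecimalFormatSymbols.getInstance(Locale.US));
        return format.format(meters / 1000.0);
    }

    public static void main(String[] args) {
        System.out.println(width("20000") + " " + kilometers(12500));
    }
}
```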

Given these specifications, the weather report in Figure 12-5 is taking shape nicely, but it is still missing the date and time.

Figure 12-5. XSLT-generated SVG file without time data


Extending XSLT in Java

While XSLT contains some arithmetic operations and string functions, they are nowhere near powerful enough to handle timestamp conversion into the hour and minute of the day, much less a nicely formatted date string. Luckily, it is possible to write extensions to XSLT to handle such tasks. You may write extensions in Java, JavaScript, NetRexx, JPython (the Java version of Python), PerlScript, or any other language that supports the Bean Scripting Framework. For full details, see http://xml.apache.org/xalan-j/extensions.html. Since Xalan is our tool of choice, and it is written in Java, we'll write this extension in Java.

To use an extension written in Java, you must add the xmlns:java and exclude-result-prefixes attributes to the root element of the XSLT stylesheet, as shown in the following listing. The xmlns:java attribute associates the java namespace with calls to Java extensions. The exclude-result-prefixes attribute says that XSLT doesn't have to attach that namespace to any tags it generates.

<xsl:stylesheet version="1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"  
   xmlns:java="http://xml.apache.org/xslt/java"
   exclude-result-prefixes="java">

We then write a class with static methods that take a timestamp as input and return the information we need. Since the hour and minute will only be used in a numeric context, we return them as Double. The following code is totally unsurprising; the only "gotcha" is that the OMF timestamp is measured in seconds since 1 January 1970, and Java's time and date methods are designed to work with time measured in milliseconds since 1 January 1970.

If you want to put your functions into a package, you may do so. You are not restricted to static methods, either. An extension may create an instance of a Java object and return it to be stored in an XSLT variable.

import java.util.Calendar;
import java.util.Date;
import java.text.DateFormat;

public class TimeStampUtils
{
    // OMF timestamps count seconds since 1 January 1970;
    // java.util.Date expects milliseconds, hence the * 1000.

    // Returns the date formatted in the default locale's style.
    public static String getDate(String timeStampString)
    {
        DateFormat d = DateFormat.getDateInstance();
        long milliseconds = Long.parseLong( timeStampString ) * 1000;
        return 
            d.format(new Date(milliseconds));
    }
    
    // Returns the hour of the day (0-23).
    // Calendar.getInstance() uses the JVM's default time zone.
    public static Double getHour(String timeStampString)
    {
        long milliseconds = Long.parseLong( timeStampString ) * 1000;
        Calendar c = Calendar.getInstance();
        c.setTime( new Date( milliseconds ) );
        return Double.valueOf( c.get( Calendar.HOUR_OF_DAY ) );
    }
    
    // Returns the minute of the hour (0-59).
    public static Double getMinute(String timeStampString)
    {
        long milliseconds = Long.parseLong( timeStampString ) * 1000;
        Calendar c = Calendar.getInstance();
        c.setTime( new Date( milliseconds ) );
        return Double.valueOf( c.get( Calendar.MINUTE ) );
    }
}
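On a newer Java runtime (Java 8 or later), the seconds-versus-milliseconds gotcha can be sidestepped with the java.time package, which accepts epoch seconds directly. The following is a sketch of that alternative, not the class used in this chapter: the name UtcTimeStamp is hypothetical, and it pins the zone to UTC on the assumption that the time is meant to be displayed as GMT. Whether int return values suit a given XSLT processor's extension mechanism would need to be checked.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical java.time-based alternative to the Calendar code.
final class UtcTimeStamp {
    /** Converts a timestamp in seconds since 1 January 1970 to a UTC date-time. */
    static ZonedDateTime toUtc(String timeStampString) {
        long seconds = Long.parseLong(timeStampString.trim());
        // ofEpochSecond takes seconds, so no multiplication by 1000 is needed.
        return Instant.ofEpochSecond(seconds).atZone(ZoneOffset.UTC);
    }

    /** Hour of the day (0-23) in UTC. */
    static int hour(String timeStampString) {
        return toUtc(timeStampString).getHour();
    }

    /** Minute of the hour (0-59) in UTC. */
    static int minute(String timeStampString) {
        return toUtc(timeStampString).getMinute();
    }

    public static void main(String[] args) {
        System.out.println(hour("1000000000") + ":" + minute("1000000000"));
    }
}
```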

To call one of these methods from XSLT, you give the namespace prefix (in this case java:) followed by the fully qualified name of the method you wish to call. To retrieve the date string associated with the timestamp, we add a call to java:TimeStampUtils.getDate to the template that matches the <SYN> element. We also retrieve the hour and minute, and pass them to a template that will display the time as text and also draw an analog clock face.

<xsl:template match="SYN">
    <xsl:variable name="tstamp" select="@TStamp"/>
    <text font-size="10pt" x="10" y="20">
        <xsl:value-of select="@SName"/>
    </text>
    <text font-size="10pt" x="345" y="20" text-anchor="end">
        <xsl:value-of select="java:TimeStampUtils.getDate( $tstamp )"/>
    </text>
    
    <xsl:call-template name="draw-time-and-clock">
        <xsl:with-param name="hour"
            select="java:TimeStampUtils.getHour( $tstamp )"/>
        <xsl:with-param name="minute"
            select="java:TimeStampUtils.getMinute( $tstamp )"/>
    </xsl:call-template>
    
    <xsl:apply-templates select="SYG"/>
</xsl:template>

Finally, here is the template for drawing the time and clock. The only new item here is the <xsl:text> element. Its contents, which must be pure text, are placed into the output document verbatim. We're using it here to avoid problems with whitespace. If the <xsl:text>:</xsl:text> line in the following listing had been simply the colon that separates the hours and minutes, the leading tab on that line and the next one would have made their way into the resultant SVG <text> element, which would have produced extra space around the colon in the final graphic.

<xsl:template name="draw-time-and-clock">
    <xsl:param name="hour">0</xsl:param>
    <xsl:param name="minute">0</xsl:param>
    
    <!-- clock face is light yellow from 6 AM to 6 PM, otherwise light 
        blue -->
    <xsl:variable name="tint">
        <xsl:choose>
        <xsl:when test="$hour &gt;= 6 and $hour &lt; 18">#ffffcc</xsl:when>
        <xsl:otherwise>#ccccff</xsl:otherwise>
        </xsl:choose>
    </xsl:variable>
    
    <!-- calculate angles for hour and minute hand of analog clock -->
    <xsl:variable name="hourAngle"
        select="(30 * ($hour mod 12 + $minute div 60)) - 90"/>
    <xsl:variable name="minuteAngle"
        select="($minute * 6) - 90"/>

<text x="345" y="40" style="font-size: 10pt; text-anchor: end;">
        <xsl:value-of select="$hour"/>
        <xsl:text>:</xsl:text>
        <xsl:value-of select="format-number($minute,'00')"/>
</text>
<text font-size="10pt" x="345" y="60" text-anchor="end">
GMT
</text>
<g id="clock" transform="translate(255, 30)">
    <circle cx="20" cy="20" r="20"
        style="fill: {$tint}; stroke: black;"/>
    <line transform="rotate({$minuteAngle}, 20, 20)"
        x1="20" y1="20" x2="38" y2="20" style="stroke: black;"/>
    <line transform="rotate({$hourAngle}, 20, 20)"
        x1="20" y1="20" x2="33" y2="20" style="stroke: black;"/>
</g>
</xsl:template>
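The angle arithmetic in the hourAngle and minuteAngle variables can be checked with a few lines of Java. This is only a sketch for verifying the formulas; the class name ClockAngles is hypothetical. Both hands are drawn pointing along the positive x-axis (the 3 o'clock position), and a positive rotate() angle turns them clockwise, so 12 o'clock corresponds to -90 degrees. Each hour is worth 30 degrees and each minute 6 degrees.

```java
// Hypothetical helper that repeats the template's angle formulas.
final class ClockAngles {
    /** Rotation of the hour hand in degrees; the hand advances as the minutes pass. */
    static double hourAngle(double hour, double minute) {
        return 30 * (hour % 12 + minute / 60) - 90;
    }

    /** Rotation of the minute hand in degrees. */
    static double minuteAngle(double minute) {
        return minute * 6 - 90;
    }

    public static void main(String[] args) {
        // 18:30 puts the hour hand halfway between 6 and 7.
        System.out.println(hourAngle(18, 30) + " " + minuteAngle(30));
    }
}
```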

When you run this transformation, the classpath that you give to Xalan must include the directory where your class file lives. In this case, it's in the same directory as the OMF file and XSL file, so we changed the script to include . in the classpath:

java -cp /usr/local/xmljar/xalan.jar:\
/usr/local/xmljar/xerces.jar:\
. \
org.apache.xalan.xslt.Process \
  -IN weather.xml -XSL omf.xsl -OUT weather.svg

Putting this all together produces Figure 12-6.

Figure 12-6. XSLT-generated SVG file showing complete data


There is room for improvement in this XSLT file. Few people are interested in the four-letter station code that precedes the station name; it should be eliminated. If the temperatures are outside the range of -40 to 50 degrees Celsius, as frequently happens in desert areas or in Antarctica, the thermometer will be filled improperly. If any of the attributes is missing from the original OMF file, the output will be wrong: numeric operations on the empty string result in a value called "not a number," which displays as NaN in text, and will cause SVG errors if inserted into a path's description. Finally, if there is more than one <SYN> element in the document, the XSLT will generate multiple SVG descriptions of thermometers, compasses, and visibility bars one on top of the other.

These corrections and any improvements to the output are left for the astute reader, who has been sufficiently astute to purchase XSLT by Doug Tidwell, published by O'Reilly & Associates. Chapter 8 of that marvelous book also contains an example of using XSLT to generate SVG from an XML file that is far better-behaved than the one we've used here. If you're serious about XML, you would be well advised to have this book on your shelf.
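As one possible starting point for the missing-attribute and out-of-range problems, an extension class in the style of TimeStampUtils could supply a default for unparseable values and clamp a temperature to the thermometer's range. The following is a minimal sketch under those assumptions; the class SafeNumbers and its methods are hypothetical and are not part of the stylesheet developed in this chapter.

```java
// Hypothetical helpers for defensive number handling.
final class SafeNumbers {
    /** Parses a number, returning the fallback for missing, empty, or non-numeric input. */
    static double parseOrDefault(String raw, double fallback) {
        if (raw == null || raw.trim().isEmpty()) {
            return fallback;
        }
        try {
            double value = Double.parseDouble(raw.trim());
            // Double.parseDouble accepts the literal "NaN", so screen it out too.
            return Double.isNaN(value) ? fallback : value;
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    /** Restricts a value to the closed range from min to max. */
    static double clamp(double value, double min, double max) {
        return Math.max(min, Math.min(max, value));
    }

    public static void main(String[] args) {
        // A reading of 60 degrees is drawn as a full thermometer (50 degrees).
        System.out.println(clamp(parseOrDefault("60", 0), -40, 50));
    }
}
```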

Notes

  1. You can find out about the complete MathML specification at http://www.w3.org/Math/.