PHP Cookbook/XML

Revision as of 13:36, 7 March 2008 by Docbook2Wiki (Talk)

Introduction

Recently, XML has gained popularity as a data-exchange and message-passing format. As web services become more widespread, XML plays an even more important role in a developer's life. With the help of a few extensions, PHP lets you read and write XML for every occasion.

XML provides developers with a structured way to mark up data with tags arranged in a tree-like hierarchy. One perspective on XML is to treat it as CSV on steroids. You can use XML to store records broken into a series of fields. But, instead of merely separating each field with a comma, you can include a field name, type, and attributes alongside the data.

Another view of XML is as a document representation language. For instance, the PHP Cookbook was written using XML. The book is divided into chapters; each chapter into recipes; and each recipe into Problem, Solution, and Discussion sections. Within any individual section, we further subdivide the text into paragraphs, tables, figures, and examples. An article on a web page can similarly be divided into the page title and headline, the authors of the piece, the story itself, and any sidebars, related links, and additional content.

XML text looks similar to HTML. Both use tags bracketed by < and > for marking up text. But XML is both stricter and looser than HTML. It's stricter because all container tags must be properly closed. No opening elements are allowed without a corresponding closing tag. It's looser because you're not forced to use a set list of tags, such as <a>, <img>, and <h1>. Instead, you have the freedom to choose a series of tag names that best describe your data.

Other key differences between XML and HTML are case-sensitivity, attribute quoting, and whitespace. In HTML, <B> and <b> are the same bold tag; in XML, they're two different tags. In HTML, you can often omit quotation marks around attributes; XML, however, requires them. So, you must always write:

<element attribute="value">

Additionally, HTML parsers generally ignore whitespace, so a run of 20 consecutive spaces is treated the same as one space. XML parsers preserve whitespace, unless explicitly instructed otherwise. Because all elements must be closed, empty elements must end with />. For instance in HTML, the line break is <br>, while in XML, it's written as <br />.[1]

There is another restriction on XML documents. Since XML documents can be parsed into a tree of elements, the outermost element is known as the root element. Just as a tree has only one trunk, an XML document must have exactly one root element. In the previous book example, this means chapters must be bundled inside a book tag. If you want to place multiple books inside a document, you need to package them inside a bookcase or another container. This limitation applies only to the document root. Again, just as a tree can have multiple branches off its trunk, it's legal to store multiple books inside a bookcase.
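The rules above (closed tags, quoted attributes, case-sensitive names, a single root element) can be checked mechanically. Here's a minimal sketch that feeds a string to the event-based parser covered in Recipe 12.5 and reports whether it parsed cleanly. The helper name pc_is_well_formed( ) and the sample strings are our own inventions for illustration; they aren't part of PHP.

```php
<?php
// Hypothetical helper (not a PHP built-in): returns true when $xml is
// well-formed, false when the parser reports an error.
function pc_is_well_formed($xml) {
    $parser = xml_parser_create();
    // The third argument marks this string as the final chunk of data.
    $ok = xml_parse($parser, $xml, true);
    return $ok == 1;
}

$samples = array(
    '<book><title>PHP Cookbook</title></book>', // follows all the rules
    '<p>line one<br />line two</p>',            // empty element ends with />
    '<p>line one<br>line two</p>',              // <br> is never closed
    '<a href=index.html>home</a>',              // attribute value isn't quoted
    '<B>bold</b>',                              // <B> and </b> are different tags
    '<book></book><book></book>'                // two root elements
);

foreach ($samples as $xml) {
    print (pc_is_well_formed($xml) ? 'ok:  ' : 'bad: ') . $xml . "\n";
}
```

The first two samples should be reported as ok and the last four as bad. A check like this only tests well-formedness; it says nothing about whether the tags make sense for your data.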

This chapter doesn't aim to teach you XML; for an introduction to XML, see Learning XML, by Erik T. Ray. A solid nuts-and-bolts guide to all aspects of XML is XML in a Nutshell, by Elliotte Rusty Harold and W. Scott Means. Both books are published by O'Reilly & Associates.

Now that we've covered the rules, here's an example: if you are a librarian and want to convert your card catalog to XML, start with this basic set of XML tags:

<book>
    <title>PHP Cookbook</title>
    <author>Sklar, David and Trachtenberg, Adam</author>
    <subject>PHP</subject>
</book>

From there, you can add new elements or modify existing ones. For example, <author> can be divided into first and last name, or you can allow for multiple records so two authors aren't placed in one field.

The first three recipes in this chapter cover writing and reading XML. Recipe 12.2 shows how to write XML without additional tools. To use the DOM XML extension to write XML in a standardized fashion, see Recipe 12.3. Reading XML using DOM is the topic of Recipe 12.4.

But XML isn't an end by itself. Once you've gathered all your XML, the real question is "What do you do with it?" With an event-based parser, as described in Recipe 12.5, you can make element tags trigger actions, such as storing data into easily manipulated structures or reformatting the text.

With XSLT, you can take an XSL stylesheet and turn XML into viewable output. By separating content from presentation, you can make one stylesheet for web browsers, another for PDAs, and a third for cell phones, all without changing the content itself. This is the subject of Recipe 12.6.

You can use a protocol such as XML-RPC or SOAP to exchange XML messages between yourself and a server, or to act as a server yourself. You can thus put your card catalog on the Internet and allow other programmers to query the catalog and retrieve book records in a format that's easy for them to parse and display in their applications. Another use would be to set up an RSS feed that gets updated whenever the library gets a new book in stock. XML-RPC clients and servers are the subjects of Recipe 12.7 and Recipe 12.8, respectively. Recipe 12.9 and Recipe 12.10 cover SOAP clients and servers. WDDX, a data exchange format that originated with the ColdFusion language, is the topic of Recipe 12.11. Reading RSS feeds, a popular XML-based headline syndication format, is covered in Recipe 12.12.

As with many bleeding-edge technologies, some of PHP's XML tools are neither feature-complete nor bug-free. However, XML is an area of active development in the PHP community; new features are added and bugs are fixed on a regular basis. As a result, many XML functions documented here are still experimental. Sometimes, all that means is that the function is 99% complete, but there may be a few small bugs lying around. Other times, it means that the name or the behavior of the function could be completely changed. If a function is in a highly unstable state, we mention it in the recipe.

We've documented the functions as they're currently planned to work in PHP 4.3. Because XML is such an important area, it made no sense to omit these recipes from the book. Also, we wanted to make sure that the latest functions are used in our examples. This can, however, lead to small problems if the function names and prototypes change. If you find that a recipe isn't working as you'd expect it to, please check the online PHP manual or the errata section of the catalog page for the PHP Cookbook, http://www.oreilly.com/catalog/phpckbk.

Generating XML Manually

Problem

You want to generate XML. For instance, you want to provide an XML version of your data for another program to parse.

Solution

Loop through your data and print it out surrounded by the correct XML tags:

header('Content-Type: text/xml');
print '<?xml version="1.0"?>' . "\n";
print "<shows>\n";

$shows = array(array('name'     => 'Simpsons',
                     'channel'  => 'FOX', 
                     'start'    => '8:00 PM',
                     'duration' => '30'),

               array('name'     => 'Law & Order', 
                     'channel'  => 'NBC',
                     'start'    => '8:00 PM',
                     'duration' => '60'));

foreach ($shows as $show) {
    print "    <show>\n";
    foreach($show as $tag => $data) {
        print "        <$tag>" . htmlspecialchars($data) . "</$tag>\n";
    }
    print "    </show>\n";
}

print "</shows>\n";  

Discussion

Printing out XML manually mostly involves lots of foreach loops as you iterate through arrays. However, there are a few tricky details. First, you need to call header( ) to set the correct Content-Type header for the document. Since you're sending XML instead of HTML, it should be text/xml.

Next, depending on your settings for the short_open_tag configuration directive, trying to print the XML declaration may accidentally turn on PHP processing. Since the <? of <?xml version="1.0"?> is the short PHP open tag, to print the declaration to the browser you need to either disable the directive or print the line from within PHP. We do the latter in the Solution.

Last, entities must be escaped. For example, the & in the show Law & Order needs to be &amp;. Call htmlspecialchars( ) to escape your data.

The output from the example in the Solution is:

<?xml version="1.0"?>
<shows>
    <show>
        <name>Simpsons</name>
        <channel>FOX</channel>
        <start>8:00 PM</start>
        <duration>30</duration>
    </show>
    <show>
        <name>Law &amp; Order</name>
        <channel>NBC</channel>
        <start>8:00 PM</start>
        <duration>60</duration>
    </show>
</shows>
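The escaping step from the Discussion can be wrapped in a small function so it's never forgotten. This is only a sketch: pc_xml_element( ) is a hypothetical name of our own, it doesn't validate the tag name, and it passes ENT_QUOTES so that both kinds of quotation marks are escaped regardless of PHP's default.

```php
<?php
// Hypothetical helper: wrap already-trusted tag name $tag around escaped $data.
function pc_xml_element($tag, $data) {
    // ENT_QUOTES escapes single quotes as well as &, <, >, and double quotes.
    return "<$tag>" . htmlspecialchars($data, ENT_QUOTES) . "</$tag>";
}

print pc_xml_element('name', 'Law & Order') . "\n";
// <name>Law &amp; Order</name>

print pc_xml_element('note', 'if a < b, say "hi"') . "\n";
// <note>if a &lt; b, say &quot;hi&quot;</note>

// Data that is already escaped gets escaped a second time.
print pc_xml_element('name', 'Law &amp; Order') . "\n";
// <name>Law &amp;amp; Order</name>
```

The last call shows why you should escape exactly once, at the point where the data is printed.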

See Also

Recipe 12.3 for generating XML using DOM; Recipe 12.4 for reading XML with DOM; documentation on htmlspecialchars( ) at http://www.php.net/htmlspecialchars.

Generating XML with the DOM

Problem

You want to generate XML but want to do it in an organized way instead of using print and loops.

Solution

Use PHP's DOM XML extension to create a DOM object; then, call dump_mem( ) or dump_file( ) to generate a well-formed XML document:

// create a new document
$dom = domxml_new_doc('1.0');

// create the root element, <book>, and append it to the document
$book = $dom->append_child($dom->create_element('book'));

// create the title element and append it to $book
$title = $book->append_child($dom->create_element('title'));

// set the text and the cover attribute for $title
$title->append_child($dom->create_text_node('PHP Cookbook'));
$title->set_attribute('cover', 'soft');

// create and append author elements to $book
$sklar = $book->append_child($dom->create_element('author'));
// create and append the text for each element
$sklar->append_child($dom->create_text_node('Sklar'));

$trachtenberg = $book->append_child($dom->create_element('author'));
$trachtenberg->append_child($dom->create_text_node('Trachtenberg'));

// print a nicely formatted version of the DOM document as XML
echo $dom->dump_mem(true);

This prints:

<?xml version="1.0"?>
<book>
  <title cover="soft">PHP Cookbook</title>
  <author>Sklar</author>
  <author>Trachtenberg</author>
</book>

Discussion

A single element is known as a node. Nodes can be of a dozen different types, but the three most popular are elements, attributes, and text. Given this:

<book cover="soft">PHP Cookbook</book>

PHP's DOM XML functions refer to book as type XML_ELEMENT_NODE, cover="soft" maps to an XML_ATTRIBUTE_NODE, and PHP Cookbook is an XML_TEXT_NODE.

For DOM parsing, PHP uses libxml, developed for the Gnome project. You can download it from http://www.xmlsoft.org. To activate it, configure PHP with --with-dom.

The revamped PHP 4.3 DOM XML functions follow a pattern. You create an object as either an element or a text node, add and set any attributes you want, and then append it to the tree in the spot it belongs.

Before creating elements, create a new document, passing the XML version as the sole argument:

$dom = domxml_new_doc('1.0');

Now create new elements belonging to the document. Despite being associated with a specific document, nodes don't join the document tree until appended:

$book_element = $dom->create_element('book');
$book = $dom->append_child($book_element);

Here a new book element is created and assigned to the object $book_element. To create the document root, append $book_element as a child of the $dom document. The result, $book, refers to the specific element and its location within the DOM object.

All nodes are created by calling a method on $dom. Once a node is created, it can be appended to any element in the tree. The element from which we call the append_child( ) method determines the location in the tree where the node is placed. In the previous case, $book_element is appended to $dom. The element appended to $dom is the top-level node, or the root node.

You can also append a new child element to $book. Since $book is a child of $dom, the new element is, by extension, a grandchild of $dom:

$title_element = $dom->create_element('title');
$title = $book->append_child($title_element);

By calling $book->append_child( ), this code places the $title_element element under the $book element.

To add the text inside the <title></title> tags, create a text node using create_text_node( ) and append it to $title:

$text_node = $dom->create_text_node('PHP Cookbook');
$title->append_child($text_node);

Since $title is already added to the document, there's no need to reappend it to $book.

The order in which you append children to nodes isn't important. The following four lines, which first append the text node to $title_element and then append $title_element to $book, are equivalent to the previous code:

$title_element = $dom->create_element('title');
$text_node = $dom->create_text_node('PHP Cookbook');

$title_element->append_child($text_node);
$book->append_child($title_element);

To add an attribute, call set_attribute( ) upon a node, passing the attribute name and value as arguments:

$title->set_attribute('cover', 'soft');

If you print the title element now, it looks like this:

<title cover="soft">PHP Cookbook</title>

Once you're finished, you can output the document as a string or to a file:

// put the string representation of the XML document in $books
$books = $dom->dump_mem( );

// write the XML document to books.xml
$dom->dump_file('books.xml', false, true);

The only parameter dump_mem( ) takes is an optional boolean value. An empty value or false means "return the string as one long line." A true value causes the XML to be nicely formatted with child nodes indented, like this:

<?xml version="1.0"?>
<book>
  <title cover="soft">PHP Cookbook</title>
</book>

You can pass up to three values to dump_file( ). The first one, which is mandatory, is the filename. The second is whether the file should be compressed with gzip. The final value is the same pretty formatting option as dump_mem( ).
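The DOM XML functions in this recipe belong to PHP 4.3. In PHP 5 and later they were replaced by a different DOM extension built around the DOMDocument class. If that's what you have installed, the same create, set, and append pattern can be sketched as follows. Note that this swaps in the newer API rather than the functions described above, and pc_book_xml( ) is a hypothetical helper name of our own.

```php
<?php
// Assumption: PHP 5 or later with the bundled DOM extension (DOMDocument),
// not the PHP 4.3 DOM XML functions used elsewhere in this recipe.
function pc_book_xml($title, $cover, $authors) {
    $dom = new DOMDocument('1.0');
    $dom->formatOutput = true;   // indent child nodes, like dump_mem(true)

    // create the root element and append it to the document
    $book = $dom->appendChild($dom->createElement('book'));

    // create the title, give it text and an attribute
    $title_element = $book->appendChild($dom->createElement('title'));
    $title_element->appendChild($dom->createTextNode($title));
    $title_element->setAttribute('cover', $cover);

    // one <author> element per name
    foreach ($authors as $name) {
        $author = $book->appendChild($dom->createElement('author'));
        $author->appendChild($dom->createTextNode($name));
    }

    // return the document as a string
    return $dom->saveXML();
}

print pc_book_xml('PHP Cookbook', 'soft', array('Sklar', 'Trachtenberg'));
```

Because the text is added with createTextNode( ), characters such as & are escaped for you when the document is serialized.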

See Also

Recipe 12.2 for writing XML without DOM; Recipe 12.4 for parsing XML with DOM; documentation on domxml_new_doc( ) at http://www.php.net/domxml-new-doc and the DOM functions in general at http://www.php.net/domxml; more information about the underlying DOM C library at http://xmlsoft.org/.

Parsing XML with the DOM

Problem

You want to parse an XML file using the DOM API. This puts the file into a tree, which you can process using DOM functions. With the DOM, it's easy to search for and retrieve elements that fit a certain set of criteria.

Solution

Use PHP's DOM XML extension. Here's how to read XML from a file:

$dom = domxml_open_file('books.xml');

Here's how to read XML from a variable:

$dom = domxml_open_mem($books);

You can also get just a single node. Here's how to get the root node:

$root = $dom->document_element( );

Here's how to do a depth-first recursion to process all the nodes in a document:

function process_node($node) {
    if ($node->has_child_nodes( )) {
        foreach($node->child_nodes( ) as $n) {
            process_node($n);
        }
    }

    // process leaves
    if ($node->node_type( ) == XML_TEXT_NODE) {
        $content = rtrim($node->node_value( ));
        if (!empty($content)) {
            print "$content\n";
        }
    }

}
process_node($root);

Discussion

The W3C's DOM provides a platform- and language-neutral method that specifies the structure and content of a document. Using the DOM, you can read an XML document into a tree of nodes and then maneuver through the tree to locate information about a particular element or elements that match your criteria. This is called tree-based parsing. In contrast, the non-DOM XML functions allow you to do event-based parsing.

Additionally, you can modify the structure by creating, editing, and deleting nodes. In fact, you can use the DOM XML functions to author a new XML document from scratch; see Recipe 12.3.

One of the major advantages of the DOM is that by following the W3C's specification, many languages implement DOM functions in a similar manner. Therefore, the work of translating logic and instructions from one application to another is considerably simplified. PHP 4.3 comes with an updated series of DOM functions that are in stricter compliance with the DOM standard than previous versions of PHP. However, the functions are not yet 100% compliant. Future PHP versions should bring a closer alignment, but this may break some applications, which will then need minor updates. Check the DOM XML material in the online PHP Manual at http://www.php.net/domxml for changes. Functions from earlier versions of PHP are still available, but deprecated.

The DOM is large and complex. For more information, read the specification at http://www.w3.org/DOM/ or pick up a copy of XML in a Nutshell; Chapter 18 discusses the DOM.

For DOM parsing, PHP uses libxml, developed for the Gnome project. You can download it from http://www.xmlsoft.org. To activate it, configure PHP with --with-dom.

DOM functions in PHP are object-oriented. To move from one node to another, call methods such as $node->child_nodes( ), which returns an array of node objects, and $node->parent_node( ), which returns the parent node object. Therefore, to process a node, check its type and call a corresponding method:

// $node is the DOM parsed node <book cover="soft">PHP Cookbook</book>
$type = $node->node_type();

switch($type) { 
case XML_ELEMENT_NODE:
    // I'm a tag. I have a tagname property.
    print $node->node_name();  // prints the tagname property: "book" 
    print $node->node_value(); // null
    break;
case XML_ATTRIBUTE_NODE:
    // I'm an attribute. I have a name and a value property.
    print $node->node_name();  // prints the name property: "cover"
    print $node->node_value(); // prints the value property: "soft"
    break;
case XML_TEXT_NODE:
    // I'm a piece of text inside an element.
    // I have a name and a content property.
    print $node->node_name();  // prints the name property: "#text"
    print $node->node_value(); // prints the content property: "PHP Cookbook"
    break;
default:
    // another type
    break;
}

To automatically search through a DOM tree for specific elements, use get_elements_by_tagname( ). Here's how to do so with multiple book records:

<books>
    <book>
        <title>PHP Cookbook</title>
        <author>Sklar</author>
        <author>Trachtenberg</author>
        <subject>PHP</subject>
    </book>
    <book>
        <title>Perl Cookbook</title>
        <author>Christiansen</author>
        <author>Torkington</author>
        <subject>Perl</subject>
    </book>
</books>

Here's how to find all authors:

// find and print all authors
$authors = $dom->get_elements_by_tagname('author');

// loop through author elements
foreach ($authors as $author) { 
    // child_nodes( ) returns the text nodes that hold the author's name
    $text_nodes = $author->child_nodes( );

    foreach ($text_nodes as $text) {    
         print $text->node_value( );
    }
    print "\n";
}

The get_elements_by_tagname( ) function returns an array of element node objects. By looping through each element's children, you can get to the text node associated with that element. From there, you can pull out the node values, which in this case are the names of the book authors, such as Sklar and Trachtenberg.
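For readers on PHP 5 or later, where the DOM XML functions are no longer available, here's a rough equivalent of the author search using the newer DOMDocument class. This is a sketch with a swapped-in API, not the functions this recipe documents; pc_authors( ) is a hypothetical helper name and the XML string is a trimmed copy of the sample above.

```php
<?php
// Assumption: PHP 5 or later with the bundled DOM extension (DOMDocument).
function pc_authors($xml) {
    $dom = new DOMDocument();
    $dom->loadXML($xml);

    $names = array();
    // getElementsByTagName() finds matching elements anywhere in the tree
    foreach ($dom->getElementsByTagName('author') as $author) {
        // nodeValue of an element is the text of its child text nodes
        $names[] = $author->nodeValue;
    }
    return $names;
}

$books = '<books>
    <book>
        <title>PHP Cookbook</title>
        <author>Sklar</author>
        <author>Trachtenberg</author>
    </book>
    <book>
        <title>Perl Cookbook</title>
        <author>Christiansen</author>
        <author>Torkington</author>
    </book>
</books>';

foreach (pc_authors($books) as $name) {
    print "$name\n";
}
```

The names come back in document order: Sklar, Trachtenberg, Christiansen, Torkington.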

See Also

Recipe 12.2 for writing XML without DOM; Recipe 12.3 for writing XML with DOM; Recipe 12.5 for event-based XML parsing; documentation on domxml_open_file( ) at http://www.php.net/domxml-open-file, domxml_open_mem( ) at http://www.php.net/domxml-open-mem, and the DOM functions in general at http://www.php.net/domxml; more information about the underlying DOM C library at http://xmlsoft.org/.

Parsing XML with SAX

Problem

You want to parse an XML document and format it on an event basis, such as when the parser encounters a new opening or closing element tag. For instance, you want to turn an RSS feed into HTML.

Solution

Use the parsing functions in PHP's XML extension:

$xml = xml_parser_create();
$obj = new Parser_Object;  // a class to assist with parsing

xml_set_object($xml,$obj);
xml_set_element_handler($xml, 'start_element', 'end_element');
xml_set_character_data_handler($xml, 'character_data');
xml_parser_set_option($xml, XML_OPTION_CASE_FOLDING, false);

$fp = fopen('data.xml', 'r') or die("Can't read XML data.");
while ($data = fread($fp, 4096)) {
  xml_parse($xml, $data, feof($fp)) or die("Can't parse XML data");
}       
fclose($fp);

xml_parser_free($xml);

Discussion

These XML parsing functions require the expat library. However, because Apache 1.3.7 and later is bundled with expat, this library is already installed on most machines. Therefore, PHP enables these functions by default, and you don't need to explicitly configure PHP to support XML.

expat parses XML documents and allows you to configure the parser to call functions when it encounters different parts of the file, such as an opening or closing element tag or character data (the text between tags). Based on the tag name, you can then choose whether to format or ignore the data. This is known as event-based parsing and contrasts with DOM XML, which uses a tree-based parser.

A popular API for event-based XML parsing is SAX: Simple API for XML. Originally developed only for Java, SAX has spread to other languages. PHP's XML functions follow SAX conventions. For more on the latest version of SAX — SAX2 — see SAX2 by David Brownell (O'Reilly).

PHP supports two interfaces to expat: a procedural one and an object-oriented one. Since the procedural interface practically forces you to use global variables to accomplish any meaningful task, we prefer the object-oriented version. With the object-oriented interface, you can bind an object to the parser and interact with the object while processing XML. This allows you to use object properties instead of global variables.

Here's an example application of expat that shows how to process an RSS feed and transform it into HTML. For more on RSS, see Recipe 12.12. The script starts with the standard XML processing code, followed by the objects created to parse RSS specifically:

$xml = xml_parser_create( );
$rss = new pc_RSS_parser;

xml_set_object($xml, $rss);
xml_set_element_handler($xml, 'start_element', 'end_element');
xml_set_character_data_handler($xml, 'character_data');
xml_parser_set_option($xml, XML_OPTION_CASE_FOLDING, false);

$feed = 'http://pear.php.net/rss.php';
$fp = fopen($feed, 'r') or die("Can't read RSS data.");
while ($data = fread($fp, 4096)) {
  xml_parse($xml, $data, feof($fp)) or die("Can't parse RSS data");
}       
fclose($fp);

xml_parser_free($xml);

After creating a new XML parser and an instance of the pc_RSS_parser class, configure the parser. First, bind the object to the parser; this tells the parser to call the object's methods instead of global functions. Then call xml_set_element_handler( ) and xml_set_character_data_handler( ) to specify the method names the parser should call when it encounters elements and character data. The first argument to both functions is the parser instance; the other arguments are the function names. With xml_set_element_handler( ), the middle and last arguments are the functions to call when a tag opens and closes, respectively. The xml_set_character_data_handler( ) function takes only one additional argument — the function to call when it processes character data.

Because an object has been associated with our parser, when that parser finds the string <tag>data</tag>, it calls $rss->start_element( ) when it reaches <tag>; $rss->character_data( ) when it reaches data; and $rss->end_element( ) when it reaches </tag>. The parser can't be configured to automatically call individual methods for each specific tag; instead, you must handle this yourself. However, the PEAR package XML_Transform provides an easy way to assign handlers on a tag-by-tag basis.

The last XML parser configuration option tells the parser not to automatically convert all tags to uppercase. By default, the parser folds tags into capital letters, so <tag> and <TAG> both become the same element. Since XML is case-sensitive, and most feeds use lowercase element names, this feature should be disabled.
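To see what case folding does to the names your handlers receive, here's a small sketch that parses the same string with the option on and off. It uses plain functions and a global array to stay short; the names pc_start( ), pc_end( ), pc_tag_names( ), and $pc_seen are made up for this illustration.

```php
<?php
$pc_seen = array();

// Start handler: remember each tag name exactly as the parser reports it.
function pc_start($parser, $tag, $attributes) {
    global $pc_seen;
    $pc_seen[] = $tag;
}

// End handler: nothing to do in this sketch.
function pc_end($parser, $tag) {
}

// Hypothetical helper: parse $xml and return the tag names seen,
// with case folding switched on or off.
function pc_tag_names($xml, $fold) {
    global $pc_seen;
    $pc_seen = array();

    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $fold ? 1 : 0);
    xml_set_element_handler($parser, 'pc_start', 'pc_end');
    xml_parse($parser, $xml, true);

    return $pc_seen;
}

print join(' ', pc_tag_names('<rss><item/></rss>', true)) . "\n";  // RSS ITEM
print join(' ', pc_tag_names('<rss><item/></rss>', false)) . "\n"; // rss item
```

With folding left on, a comparison such as 'item' == $tag in a start handler would never match, because the handler is handed ITEM instead.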

With the parser configured, feed the data to the parser:

$feed = 'http://pear.php.net/rss.php';
$fp = fopen($feed, 'r') or die("Can't read RSS data.");
while ($data = fread($fp, 4096)) {
  xml_parse($xml, $data, feof($fp)) or die("Can't parse RSS data");
}       
fclose($fp);

To curb memory usage, load the file in 4096-byte chunks, and feed each piece to the parser one at a time. This requires you to write handler functions that accommodate text arriving in multiple calls rather than assuming the entire string comes in all at once.
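Here's a minimal sketch of a character data handler that copes with split input. It feeds the parser two hand-made chunks that break in the middle of a word and appends each piece it receives. The names pc_collect( ), pc_parse_chunks( ), and $pc_text are hypothetical, and plain functions are used instead of an object only to keep the example short.

```php
<?php
$pc_text = '';

// Character data handler: append, never assign, because the text between
// two tags may be delivered in more than one call.
function pc_collect($parser, $data) {
    global $pc_text;
    $pc_text .= $data;
}

// Hypothetical helper: feed each chunk to a fresh parser and return the
// accumulated character data, or false if a chunk fails to parse.
function pc_parse_chunks($chunks) {
    global $pc_text;
    $pc_text = '';

    $parser = xml_parser_create();
    xml_set_character_data_handler($parser, 'pc_collect');

    $last = count($chunks) - 1;
    foreach ($chunks as $i => $chunk) {
        // the third argument is true only for the final chunk
        if (!xml_parse($parser, $chunk, $i == $last)) {
            return false;
        }
    }
    return $pc_text;
}

print pc_parse_chunks(array('<title>PHP Cook', 'book</title>')) . "\n";
// PHP Cookbook
```

How many times the handler is called for a given piece of text is up to the parser, which is why appending is the safe choice. Be careful about trimming each piece before appending it: if a split happens to fall next to a space, that space is lost.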

Last, while PHP cleans up any open parsers when the request ends, you can also manually close the parser by calling xml_parser_free( ).

Now that the generic parsing is properly set up, add the pc_RSS_item and pc_RSS_parser classes, as shown in Example 12-1 and Example 12-2, to handle an RSS document.

Example 12-1. pc_RSS_item

class pc_RSS_item {

  var $title = '';
  var $description = '';
  var $link = '';

  function display() {
    printf('<p><a href="%s">%s</a><br />%s</p>',
            htmlspecialchars($this->link),htmlspecialchars($this->title),
            htmlspecialchars($this->description));
  }
}

Example 12-2. pc_RSS_parser

class pc_RSS_parser {
  
  var $tag;
  var $item;
  
  function start_element($parser, $tag, $attributes) {
    if ('item' == $tag) {
      $this->item = new pc_RSS_item;
    } elseif (!empty($this->item)) {
      $this->tag = $tag;
    }
  }
  
  function end_element($parser, $tag) {
    if ('item' == $tag) {
      $this->item->display();
      unset($this->item); 
    }
  }
  
  function character_data($parser, $data) {
    if (!empty($this->item)) {
      if (isset($this->item->{$this->tag})) {
        $this->item->{$this->tag} .= trim($data);
      }
    }
  }
}  

The pc_RSS_item class provides an interface to an individual feed item. This removes the details of displaying each item from the general parsing code and makes it easy to reset the data for a new item by calling unset( ).

The pc_RSS_item::display( ) method prints out an HTML-formatted RSS item. It calls htmlspecialchars( ) to reencode any necessary entities, because expat decodes them into regular characters while parsing the document. This reencoding, however, breaks on feeds that place HTML in the title and description instead of plaintext.

Within the pc_RSS_parser class, the start_element( ) method takes three parameters: the XML parser, the name of the tag, and an array of attribute/value pairs (if any) from the element. PHP automatically supplies these values to the handler as part of the parsing process.

The start_element( ) method checks the value of $tag. If it's item, the parser's found a new RSS item, and a new pc_RSS_item object is instantiated. Otherwise, it checks to see if $this->item is empty( ); if it isn't, the parser is inside an item element. It's then necessary to record the tag's name, so that the character_data( ) method knows which property to assign its value to. If it is empty, this part of the RSS feed isn't necessary for our application, and it's ignored.

When the parser finds a closing item tag, the corresponding end_element( ) method first prints the RSS item, then cleans up by deleting the object.

Finally, the character_data( ) method is responsible for assigning the values of title, description, and link to the RSS item. After making sure it's inside an item element, it checks that the current tag is one of the properties of pc_RSS_item. Without this check, if the parser encountered an element other than those three, its value would also be assigned to the object. The curly braces ({ }) are needed to set the object property dereferencing order. Notice how trim($data) is appended to the property instead of a direct assignment. This is done to handle cases in which the character data is split across the 4096-byte chunks retrieved by fread( ); it also removes the surrounding whitespace found in the RSS feed.
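The property check is easier to see outside the parser. In this sketch, a variable holds the property name, and isset( ) decides whether the class declares it. The class pc_item and the function pc_append( ) are stand-ins invented for this example; they aren't the classes from Example 12-1 and Example 12-2.

```php
<?php
class pc_item {
    var $title = '';
    var $link = '';
}

// Hypothetical helper: append $data to the property named by $tag, but only
// if the object already has that property. Taking $item by reference makes
// the change visible to the caller on PHP 4 as well.
function pc_append(&$item, $tag, $data) {
    // {$tag} means "the property whose name is stored in $tag"
    if (isset($item->{$tag})) {
        $item->{$tag} .= $data;
        return true;
    }
    return false;
}

$item = new pc_item;
pc_append($item, 'title', 'PHP ');
pc_append($item, 'title', 'Cookbook');
pc_append($item, 'author', 'Sklar');  // ignored: pc_item has no $author

print $item->title . "\n";            // PHP Cookbook
```

The check works because the properties are initialized to empty strings. A property declared without a value is NULL, and isset( ) returns false for it.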

If you run the code on this sample RSS feed:

<?xml version="1.0"?>
<rss version="0.93">
<channel>
  <title>PHP Announcements</title>
  <link>http://www.php.net/</link>
  <description>All the latest information on PHP.</description>

  <item>
    <title>PHP 5.0 Released!</title>
    <link>http://www.php.net/downloads.php</link>
    <description>The newest version of PHP is now available.</description>
  </item>
</channel>
</rss>

It produces this HTML:

<p><a href="http://www.php.net/downloads.php">PHP 5.0 Released!</a><br />
The newest version of PHP is now available.</p>

See Also

Recipe 12.4 for tree-based XML parsing with DOM; Recipe 12.12 for more on parsing RSS; documentation on xml_parser_create( ) at http://www.php.net/xml-parser-create, xml_set_element_handler( ) at http://www.php.net/xml-set-element-handler, xml_set_character_data_handler( ) at http://www.php.net/xml-set-character-data-handler, xml_parse( ) at http://www.php.net/xml-parse, and the XML functions in general at http://www.php.net/xml; the official SAX site at http://www.saxproject.org/.

Transforming XML with XSLT

Problem

You have an XML document and an XSL stylesheet. You want to transform the document using XSLT and capture the results. This lets you apply stylesheets to your data and create different versions of your content for different media.

Solution

Use PHP's XSLT extension:

$xml = 'data.xml';
$xsl = 'stylesheet.xsl';

$xslt = xslt_create( );
$results = xslt_process($xslt, $xml, $xsl);

if (!$results) {
    error_log("XSLT Error: #".xslt_errno($xslt).": ".xslt_error($xslt));
}

xslt_free($xslt);

The transformed text is stored in $results.

Discussion

XML documents describe the content of data, but they don't contain any information about how those data should be displayed. However, when XML content is coupled with a stylesheet described using XSL (eXtensible Stylesheet Language), the content is displayed according to specific visual rules.

The glue between XML and XSL is XSLT, which stands for eXtensible Stylesheet Language Transformations. These transformations apply the series of rules enumerated in the stylesheet to your XML data. So, just as PHP parses your code and combines it with user input to create a dynamic page, an XSLT program uses XSL and XML to output a new page that contains more XML, HTML, or any other format you can describe.

There are a few XSLT programs available, each with different features and limitations. PHP currently supports only the Sablotron XSLT processor, but in the future you'll be able to use other programs, such as Xalan and Libxslt. You can download Sablotron from http://www.gingerall.com. To enable Sablotron for XSLT processing, configure PHP with both --enable-xslt and --with-xslt-sablot.

Processing documents takes a few steps. First, you need to grab a handle to a new instance of an XSLT processor with xslt_create( ) . Then, to transform the files, use xslt_process( ) to make the transformation and check the results:

$xml = 'data.xml';
$xsl = 'stylesheet.xsl';

$xslt = xslt_create( );
$results = xslt_process($xslt, $xml, $xsl);

You start by defining variables to store the filenames for the XML data and the XSL stylesheet. They're the second and third parameters to the transforming function, xslt_process( ), after the processor handle. If the fourth argument is missing, as it is here, or set to NULL, the function returns the results. Otherwise, it writes the resulting data to the filename passed:

xslt_process($xslt, $xml, $xsl, 'data.html');

If you want to provide your XML and XSL data from variables instead of files, call xslt_process( ) with a fifth parameter, which allows you to substitute string placeholders for your files:

// grab data from database
$r = mysql_query("SELECT pages.page AS xml, templates.template AS xsl
                  FROM pages, templates
                  WHERE pages.id=$id AND templates.id=pages.template") 
     or die(mysql_error());

$obj = mysql_fetch_object($r);
$xml = $obj->xml;
$xsl = $obj->xsl;

// map the strings to args
$args = array('/_xml' => $xml,
              '/_xsl' => $xsl);

$results = xslt_process($xslt, 'arg:/_xml', 'arg:/_xsl', NULL, $args);

When reading and writing files, Sablotron supports two types of URIs. The PHP default is file:, so Sablotron looks for the data on the filesystem. Sablotron also uses a custom URI of arg:, which allows users to alternatively pass in data using arguments. That's the feature used here.

In the previous example, the data for the XML and XSL comes from a database, but it can arrive from anywhere, such as a remote URL or POSTed data. Once you've obtained the data, create the $args array. This sets up mappings between the argument names and the data. The keys of the associative array are the argument names passed to xslt_process( ); the values are the variables holding the data. By convention, /_xml and /_xsl are the argument names; however, you can use others.

Then call xslt_process( ) and in place of data.xml, use arg:/_xml, with arg: being the string that lets the extension know to look in the $args array. Because you're passing in $args as the fifth parameter, you need to pass NULL as the fourth argument; this makes sure the function returns the results.
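
The lookup that arg: triggers can be pictured with a small sketch. This is not how the XSLT extension or Sablotron is implemented; resolve_xslt_source( ) is a hypothetical helper written only to illustrate how a URI such as arg:/_xml selects an entry in $args, while anything else is treated as a filename:

```php
<?php
// Hypothetical helper: illustrates the arg: convention only,
// not how the XSLT extension is actually implemented.
function resolve_xslt_source($uri, $args = array()) {
    $prefix = 'arg:';
    if (strncmp($uri, $prefix, strlen($prefix)) === 0) {
        $name = substr($uri, strlen($prefix));   // e.g. "/_xml"
        if (!array_key_exists($name, $args)) {
            return false;                        // no such argument
        }
        return $args[$name];                     // data passed in memory
    }
    // anything else is treated as a path on the filesystem
    return is_readable($uri) ? file_get_contents($uri) : false;
}

$args = array('/_xml' => '<doc>Hello</doc>',
              '/_xsl' => '<xsl:stylesheet/>');

print resolve_xslt_source('arg:/_xml', $args) . "\n";
```

The keys in $args are matched exactly, which is why the leading slash in /_xml has to appear both in the array key and after arg: in the URI.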

Error checking is done using the xslt_error( ) and xslt_errno( ) functions:

if (!$results) {
    error_log('XSLT Error: #' . xslt_errno($xslt) . ': ' . xslt_error($xslt));
}

The xslt_error( ) function returns a formatted message describing the error, while xslt_errno( ) provides a numeric error code.

To set up your own custom error handling code, register a function using xslt_set_error_handler( ). If there are errors, that function is automatically called instead of any built-in error handler.

function xslt_error_handler($processor, $level, $number, $messages) {
    error_log("XSLT Error #$number (level $level)");
}

xslt_set_error_handler($xslt, 'xslt_error_handler');

Finally, PHP cleans up any open XSLT processors when the request ends, but here's how to manually close the processor and free its memory:

xslt_free($xslt);

See Also

Documentation on xslt_create( ) at http://www.php.net/xslt-create, xslt_process( ) at http://www.php.net/xslt-process, xslt_errno( ) at http://www.php.net/xslt-errno, xslt_error( ) at http://www.php.net/xslt-error, xslt_set_error_handler( ) at http://www.php.net/xslt-set-error-handler, and xslt_free( ) at http://www.php.net/xslt-free; XSLT, by Doug Tidwell (O'Reilly).

Sending XML-RPC Requests

Problem

You want to be an XML-RPC client and make requests of a server. XML-RPC lets PHP make function calls to web servers, even if they don't use PHP. The retrieved data is then automatically converted to PHP variables for use in your application.

Solution

Use PHP's built-in XML-RPC extension with some helper functions. As of PHP 4.1, PHP bundles the xmlrpc-epi extension. Unfortunately, xmlrpc-epi does not have any native C functions for taking an XML-RPC-formatted string and making a request. However, the folks behind xmlrpc-epi have a series of helper functions written in PHP available for download at http://xmlrpc-epi.sourceforge.net/xmlrpc_php/index.php. The only file used here is the one named utils.php. To install it, just copy that file to a location where PHP can find it in its include_path.

Here's some client code that calls a function on an XML-RPC server that returns state names:

// this is the default file name from the package
// kept here to avoid confusion over the file name
require 'utils.php';

// server settings
$host = 'betty.userland.com';
$port = 80;
$uri = '/RPC2';

// request settings
// pass in a number from 1-50; get the nth state in alphabetical order
// 1 is Alabama, 50 is Wyoming
$method = 'examples.getStateName';
$args = array(32); // data to be passed

// make associative array out of these variables
$request = compact('host', 'port', 'uri', 'method', 'args');

// this function makes the XML-RPC request
$result = xu_rpc_http_concise($request);

print "I love $result!\n";

Discussion

XML-RPC, a format created by Userland Software, allows you to make a request to a web server using HTTP. The request itself is a specially formatted XML document. As a client, you build up an XML request to send that fits with the XML-RPC specification. You then send it to the server, and the server replies with an XML document. You then parse the XML to find the results. In the Solution, the XML-RPC server returns a state name, so the code prints:

I love New York!

Unlike earlier implementations of XML-RPC, which were coded in PHP, the current bundled extension is written in C, so there is a significant speed increase in processing time. To enable this extension while configuring PHP, add --with-xmlrpc.

The server settings tell PHP which web site to contact to make the request. The $host is the hostname of the machine; $port is the port the web server is running on, which is usually port 80; and $uri is the pathname to the XML-RPC server you wish to contact. This request is equivalent to http://betty.userland.com:80/RPC2. If no port is given, the function defaults to port 80, and the default URI is the web server root, /.
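
To make the defaults concrete, here is a minimal sketch of how the three server settings combine into a URL. The build_rpc_url( ) function is a hypothetical helper for illustration and is not part of utils.php; it only applies the defaults described above:

```php
<?php
// Hypothetical helper, not part of utils.php: shows how the
// settings combine into a URL, using the defaults described above.
function build_rpc_url($request) {
    $host = $request['host'];                                   // required
    $port = isset($request['port']) ? $request['port'] : 80;    // default port
    $uri  = isset($request['uri'])  ? $request['uri']  : '/';   // default URI
    return "http://$host:$port$uri";
}

print build_rpc_url(array('host' => 'betty.userland.com',
                          'port' => 80,
                          'uri'  => '/RPC2')) . "\n";
print build_rpc_url(array('host' => 'betty.userland.com')) . "\n";
```

The first call prints http://betty.userland.com:80/RPC2; the second, which supplies only the host, prints http://betty.userland.com:80/.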

The request settings are the function to call and the data to pass to the function. The method examples.getStateName takes an integer from 1 to 50 and returns a string with the name of the U.S. state, in alphabetical order. In XML-RPC, method names can have periods, while in PHP, they cannot. If they could, the PHP equivalent of passing 32 as the argument to the XML-RPC call to examples.getStateName would be calling a function named examples.getStateName( ):

examples.getStateName(32);

In XML-RPC, it looks like this:

<?xml version='1.0' encoding="iso-8859-1" ?>
<methodCall>
<methodName>examples.getStateName</methodName>
<params><param><value>
   <int>32</int>
  </value>
 </param>
</params>
</methodCall>
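
You never need to write this XML yourself, but seeing it assembled can help when debugging. The following is a minimal sketch, assuming only integer and string arguments; build_method_call( ) is a hypothetical helper and not a function from the extension or from utils.php:

```php
<?php
// Hypothetical helper for illustration; in practice the helper
// functions or the extension build this document for you.
// Only integers and strings are handled here.
function build_method_call($method, $args) {
    $xml  = "<?xml version=\"1.0\"?>\n";
    $xml .= "<methodCall>\n";
    $xml .= '<methodName>' . htmlspecialchars($method) . "</methodName>\n";
    $xml .= "<params>\n";
    foreach ($args as $arg) {
        if (is_int($arg)) {
            $value = "<int>$arg</int>";
        } else {
            // escape markup characters so the document stays well-formed
            $value = '<string>' . htmlspecialchars((string) $arg) . '</string>';
        }
        $xml .= "<param><value>$value</value></param>\n";
    }
    $xml .= "</params>\n";
    $xml .= "</methodCall>\n";
    return $xml;
}

print build_method_call('examples.getStateName', array(32));
```

Each argument becomes one param element, and the element inside value carries the argument's type.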

The server settings and request information go into a single associative array that is passed to xu_rpc_http_concise( ). As a shortcut, call compact( ), which is identical to:

$request = array('host'   => $host,
                 'port'   => $port,
                 'uri'    => $uri,
                 'method' => $method,
                 'args'   => $args);
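
If you'd like to confirm that the two forms build the same array, you can compare them directly. This short check uses only the variables from the Solution:

```php
<?php
$host   = 'betty.userland.com';
$port   = 80;
$uri    = '/RPC2';
$method = 'examples.getStateName';
$args   = array(32);

// compact( ) looks up each named variable in the current scope
$short = compact('host', 'port', 'uri', 'method', 'args');

$long = array('host'   => $host,
              'port'   => $port,
              'uri'    => $uri,
              'method' => $method,
              'args'   => $args);

// === is true only when keys, values, order, and types all match
var_dump($short === $long);
```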

The xu_rpc_http_concise( ) function makes the XML-RPC call and returns the results. Since the return value is a string, you can print $result directly. If the XML-RPC call returns multiple values, xu_rpc_http_concise( ) returns an array.

There are 10 different parameters that can be passed in the array to xu_rpc_http_concise( ), but the only one that's required is host. The parameters are shown in Table 12-1.

Table 12-1. Parameters for xu_rpc_http_concise( )

Name      Description
host      Server hostname
uri       Server URI (default /)
port      Server port (default 80)
method    Name of method to call
args      Arguments to pass to method
debug     Debug level (0 to 2: 0 is none, 2 is lots)
timeout   Number of seconds before timing out the request; a value of 0 means never time out
user      Username for Basic HTTP Authentication, if necessary
pass      Password for Basic HTTP Authentication, if necessary
secure    Use SSL for encrypted transmissions; requires PHP to be built with SSL support (pass any true value)

See Also

Recipe 12.8 for more on XML-RPC servers; PHP helper functions for use with the xmlrpc-epi extension at http://xmlrpc-epi.sourceforge.net/; Programming Web Services with XML-RPC, by Simon St. Laurent, Joe Johnston, and Edd Dumbill (O'Reilly); more on XML-RPC at http://www.xml-rpc.com.

Receiving XML-RPC Requests

Problem

You want to create an XML-RPC server and respond to XML-RPC requests. This allows any XML-RPC-enabled client to ask your server questions and you to reply with data.

Solution

Use PHP's XML-RPC extension. Here is a PHP version of the Userland XML-RPC demonstration application that returns an ISO 8601 string with the current date and time:

// this is the function exposed as "get_time( )"
function return_time($method, $args) {
   return date('Ymd\THis');
}
  
$server = xmlrpc_server_create( ) or die("Can't create server");
xmlrpc_server_register_method($server, 'get_time', 'return_time')
    or die("Can't register method.");
  
$request = $GLOBALS['HTTP_RAW_POST_DATA'];
$options = array('output_type' => 'xml', 'version' => 'xmlrpc');

print xmlrpc_server_call_method($server, $request, NULL, $options)
    or die("Can't call method");
  
xmlrpc_server_destroy($server);

Discussion

Since the bundled XML-RPC extension, xmlrpc-epi, is written in C, it processes XML-RPC requests in a speedy and efficient fashion. Add --with-xmlrpc to your configure string to enable this extension during compile time. For more on XML-RPC, see Recipe 12.7.

The Solution begins with a definition of the PHP function to associate with the XML-RPC method. The name of the function is return_time( ). This is later linked with the get_time( ) XML-RPC method:

function return_time($method, $args) {
   return date('Ymd\THis');
}

The function returns an ISO 8601-formatted string with the current date and time. We escape the T inside the call to date( ) because the specification requires a literal T to divide the date part and the time part. For August 21, 2002 at 3:03:51 P.M., the return value is 20020821T150351.
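
To see the format string at work on a known moment rather than the current time, pass date( ) a timestamp. This sketch builds one with mktime( ), which interprets its arguments in the script's own time zone, so the result doesn't depend on where the code runs:

```php
<?php
// Build a timestamp for August 21, 2002 at 3:03:51 P.M. in the
// script's own time zone, then format it the way return_time( ) does.
$when = mktime(15, 3, 51, 8, 21, 2002);

// Without the backslash, T is a format character of its own
// (the time zone abbreviation), so it must be escaped.
print date('Ymd\THis', $when) . "\n";
```

This prints 20020821T150351.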

The function is automatically called with two parameters: the name of the XML-RPC method the server is responding to and an array of method arguments passed by the XML-RPC client to the server. In this example, the server ignores both variables.

Next, create the XML-RPC server and register the get_time( ) method:

$server = xmlrpc_server_create( ) or die("Can't create server");
xmlrpc_server_register_method($server, 'get_time', 'return_time');

We create a new server and assign it to $server, then call xmlrpc_server_register_method( ) with three parameters. The first is the newly created server, the second is the name of the method to register, and the third is the name of the PHP function to handle the request.

Now that everything is configured, tell the XML-RPC server to dispatch the method for processing and print the results to the client:

$request = $GLOBALS['HTTP_RAW_POST_DATA'];
$options = array('output_type' => 'xml', 'version' => 'xmlrpc');

print xmlrpc_server_call_method($server, $request, NULL, $options);

The client request comes in as POST data. PHP converts HTTP POST data to variables, but this is XML-RPC data, so the server needs to access the unparsed data, which is stored in $GLOBALS['HTTP_RAW_POST_DATA']. In this example, the request XML looks like this:

<?xml version="1.0" encoding="iso-8859-1"?>
<methodCall>
<methodName>get_time</methodName>
<params/></methodCall>

Thus, the server is responding to the get_time( ) method, and it expects no parameters.

We also configure the response options to output the results in XML and interpret the request as XML-RPC. These two variables are then passed to xmlrpc_server_call_method( ) along with the XML-RPC server, $server. The third parameter to this function is for any user data you wish to provide; in this case, there is none, so we pass NULL.

The xmlrpc_server_call_method( ) function decodes the variables, calls the correct function to handle the method, and encodes the response into XML-RPC. To reply to the client, all you need to do is print out what xmlrpc_server_call_method( ) returns.
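
The middle step, calling the correct function for the method, amounts to a table lookup. Here is a rough sketch of that step alone, with no XML decoding or encoding; dispatch( ) and the $methods array are hypothetical and are not how the extension is written:

```php
<?php
// Hypothetical dispatcher: only the lookup-and-call step, with no
// XML decoding or encoding. Not the extension's implementation.
function return_time($method, $args) {
    return date('Ymd\THis');
}

// XML-RPC method name => PHP function name
$methods = array('get_time' => 'return_time');

function dispatch($methods, $method, $args) {
    if (!isset($methods[$method])) {
        return false;                      // unknown method
    }
    // handlers receive the method name and the argument array
    return call_user_func($methods[$method], $method, $args);
}

print dispatch($methods, 'get_time', array()) . "\n";
```

Because the handler is passed the method name, one PHP function can serve several XML-RPC methods and tell them apart.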

Finally, clean up with a call to xmlrpc_server_destroy( ):

xmlrpc_server_destroy($server);

Using the XML-RPC client code from Recipe 12.7, you can make a request and find the time, as follows:

require 'utils.php';

$output = array('output_type' => 'xml', 'version' => 'xmlrpc');
$result = xu_rpc_http_concise(array(
                             'method'  => 'get_time',
                             'host'    => 'clock.example.com',
                             'port'    => 80,
                             'uri'     => '/time-xmlrpc.php',
                             'output' => $output));
  
print "The local time is $result.\n";

This prints:

The local time is 20020821T162615.

It is legal to associate multiple methods with a single XML-RPC server. You can also associate multiple methods with the same PHP function. For example, we can create a server that replies to two methods: get_gmtime( ) and get_time( ). The first method, get_gmtime( ), is similar to get_time( ), but it replies with the current time in GMT. To handle this, you can extend get_time( ) to take an optional parameter, which is the name of a time zone to use when computing the current time.

Here's how to change the return_time( ) function to handle both methods:

function return_time($method, $args) {
    if ('get_gmtime' == $method) {
        $tz = 'GMT';
    } elseif (!empty($args[0])) {
        $tz = $args[0];
    } else {
        // use local time zone
        $tz = '';
    }

    if ($tz) { putenv("TZ=$tz"); }
    $date = date('Ymd\THis');
    if ($tz) { putenv('TZ=EST5EDT'); } // change EST5EDT to your server's zone

    return $date;
}

This function uses both the $method and $args parameters. At the top of the function, we check if the request is for get_gmtime. If so, the time zone is set to GMT. If it isn't, see if an alternate time zone is specified as an argument by checking $args[0]. If neither check is true, we keep the current time zone.
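
Changing TZ with putenv( ) affects the whole process, and the code has to remember to restore it. On PHP 5.2 and later, the DateTime and DateTimeZone classes can compute a time in another zone without touching the environment. This is a minimal sketch of that alternative, not a drop-in replacement: time_in_zone( ) is a hypothetical function, the timestamp is fixed so the output is repeatable, and the zone names are ones we assume DateTimeZone recognizes on your system:

```php
<?php
// Hypothetical variant of the time lookup that leaves TZ untouched.
function time_in_zone($timestamp, $tz) {
    $date = new DateTime('@' . $timestamp);     // "@" means a Unix timestamp (UTC)
    $date->setTimezone(new DateTimeZone($tz));  // convert for display
    return $date->format('Ymd\THis');
}

// 20:26:15 GMT on August 21, 2002
$when = gmmktime(20, 26, 15, 8, 21, 2002);

print time_in_zone($when, 'America/New_York') . "\n";
print time_in_zone($when, 'America/Los_Angeles') . "\n";
print time_in_zone($when, 'UTC') . "\n";
```

For this timestamp, the three lines printed are 20020821T162615, 20020821T132615, and 20020821T202615.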

To configure the server to handle the new method, add only one new line:

xmlrpc_server_register_method($server, 'get_gmtime', 'return_time');

This maps get_gmtime( ) to return_time( ).

Here's an example of a client in action. The first request is for get_time( ) with no parameters; the second calls get_time( ) with a time zone of PST8PDT, which is three hours behind the server; the last request is for the new get_gmtime( ) method, which is four hours ahead of the server's time zone.

require 'utils.php';

$output = array('output_type' => 'xml', 'version' => 'xmlrpc');

// get_time( )
$result = xu_rpc_http_concise(array(
                             'method'  => 'get_time',
                             'host'    => 'clock.example.com',
                             'port'    => 80,
                             'uri'     => '/time.php',
                             'output' => $output));
  
print "The local time is $result.\n";

// get_time('PST8PDT')
$result = xu_rpc_http_concise(array(
                             'method'  => 'get_time',
                             'args'    => array('PST8PDT'),
                             'host'    => 'clock.example.com',
                             'port'    => 80,
                             'uri'     => '/time.php',
                             'output' => $output));
  
print "The time in PST8PDT is $result.\n";

// get_gmtime( )
$result = xu_rpc_http_concise(array(
                             'method'  => 'get_gmtime',
                             'host'    => 'clock.example.com',
                             'port'    => 80,
                             'uri'     => '/time.php',
                             'output' => $output));
  
print "The time in GMT is $result.\n";

This prints:

The local time is 20020821T162615.
The time in PST8PDT is 20020821T132615.
The time in GMT is 20020821T202615.

See Also

Recipe 12.7 for more information about XML-RPC clients; documentation on xmlrpc_server_create( ) at http://www.php.net/xmlrpc-server-create, xmlrpc_server_register_method( ) at http://www.php.net/xmlrpc-server-register-method, xmlrpc_server_call_method( ) at http://www.php.net/xmlrpc-server-call-method, and xmlrpc_server_destroy( ) at http://www.php.net/xmlrpc-server-destroy; Programming Web Services with XML-RPC by Simon St. Laurent, Joe Johnston, and Edd Dumbill (O'Reilly); more on XML-RPC at http://www.xml-rpc.com; the original current time XML-RPC server at http://www.xmlrpc.com/currentTime.

Sending SOAP Requests

Problem

You want to send a SOAP request. Creating a SOAP client allows you to gather information from SOAP servers, regardless of their operating system and middleware software.

Solution

Use PEAR's SOAP classes. Here's some client code that uses the GoogleSearch SOAP service:

require 'SOAP/Client.php';

$query = 'php'; // your Google search terms

$soap = new SOAP_Client('http://api.google.com/search/beta2');

$params = array(
            new SOAP_Value('key',        'string',  'your google key'),
            new SOAP_Value('q',          'string',  $query),
            new SOAP_Value('start',      'int',     0),
            new SOAP_Value('maxResults', 'int',     10),
            new SOAP_Value('filter',     'boolean', false),
            new SOAP_Value('restrict',   'string',  ''),
            new SOAP_Value('safeSearch', 'boolean', false),
            new SOAP_Value('lr',         'string',  'lang_en'),
            new SOAP_Value('ie',         'string',  ''),
            new SOAP_Value('oe',         'string',  ''));

$hits = $soap->call('doGoogleSearch', $params, 'urn:GoogleSearch');

foreach ($hits->resultElements as $hit) {
    printf('<a href="%s">%s</a><br />', $hit->URL, $hit->title);
}

Discussion

The Simple Object Access Protocol (SOAP) is, like XML-RPC, a method for exchanging information over HTTP. It uses XML as its message format, which makes messages easy to create and parse. Because it's platform- and language-independent, SOAP is available on many platforms and in many languages, including PHP. To make a SOAP request, you instantiate a new SOAP_Client object and pass the constructor the location of the page to which the request is made:

$soap = new SOAP_Client('http://api.google.com/search/beta2');

Currently, two different types of communications methods are supported: HTTP and SMTP. Secure HTTP is also allowed, if SSL is built into your version of PHP. To choose one of these methods, begin your URL with http, https, or mailto.

After creating a SOAP_Client object, you use its call( ) method to call a remote function:

$query = 'php';

$params = array(
            new SOAP_Value('key',        'string',  'your google key'),
            new SOAP_Value('q',          'string',  $query),
            new SOAP_Value('start',      'int',     0),
            new SOAP_Value('maxResults', 'int',     10),
            new SOAP_Value('filter',     'boolean', false),
            new SOAP_Value('restrict',   'string',  ''),
            new SOAP_Value('safeSearch', 'boolean', false),
            new SOAP_Value('lr',         'string',  'lang_en'),
            new SOAP_Value('ie',         'string',  ''),
            new SOAP_Value('oe',         'string',  ''));

$hits = $soap->call('doGoogleSearch', $params, 'urn:GoogleSearch');

The $params array holds a collection of SOAP_Value objects. A SOAP_Value object is instantiated with three arguments: the name, type, and value of the parameter you're passing to the SOAP server. These vary from message to message, depending upon the SOAP functions available on the server.

The real action happens with the SOAP_Client::call( ) method, which takes a few arguments. The first is the method you want the server to execute; here, it's doGoogleSearch. The second argument is an array of parameters that gets passed to the function on the SOAP server. The third argument, urn:GoogleSearch, is the SOAP namespace; it allows the server to know that doGoogleSearch belongs in the GoogleSearch namespace. With namespaces, a more generally named search method doesn't cause a conflict with another more specific search method.
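
To picture where the method name, the namespace, and the parameters end up, here is a rough sketch that assembles a simplified SOAP 1.1 envelope by hand. It is not what PEAR SOAP generates: build_soap_envelope( ) is a hypothetical helper, the ns prefix is arbitrary, and the type attributes a real client adds are left out:

```php
<?php
// Hypothetical helper: shows where the method name, namespace, and
// parameters end up. Real clients also add type information.
function build_soap_envelope($method, $params, $namespace) {
    $env  = 'http://schemas.xmlsoap.org/soap/envelope/';  // SOAP 1.1 envelope namespace
    $xml  = "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"$env\">\n";
    $xml .= "<SOAP-ENV:Body>\n";
    // the method element is qualified by the service's namespace
    $xml .= "<ns:$method xmlns:ns=\"" . htmlspecialchars($namespace) . "\">\n";
    foreach ($params as $name => $value) {
        $xml .= "<$name>" . htmlspecialchars((string) $value) . "</$name>\n";
    }
    $xml .= "</ns:$method>\n";
    $xml .= "</SOAP-ENV:Body>\n";
    $xml .= "</SOAP-ENV:Envelope>\n";
    return $xml;
}

print build_soap_envelope('doGoogleSearch',
                          array('q' => 'php', 'maxResults' => 10),
                          'urn:GoogleSearch');
```

The namespace travels with the method element itself, which is how two services can each offer a method with the same name without conflict.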

There's a fourth parameter that's unused here: soapAction. If you want to provide the SOAP server with a URI indicating the intent of the request, you can add one here. Unfortunately, the definition of the word "intent" varies from implementation to implementation. The current consensus is that soapAction shouldn't be used until its meaning is further clarified. The PEAR SOAP server doesn't use this field, but other vendors may assign their own meanings.

Upon successful execution, the function returns an object containing the server's response. If an error occurs, the function returns a PEAR_Error object. Google returns all sorts of information, but here we just iterate through the $hits->resultElements array and pull out the URL and title of each hit for display:

foreach ($hits->resultElements as $hit) {
    printf('<a href="%s">%s</a><br />', $hit->URL, $hit->title);
}

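This results in output like the following (shown here without the <br /> tags and with one link per line for readability):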

<a href="http://www.php.net/"><b>PHP</b>: Hypertext Preprocessor</a>
<a href="http://www.php.net/downloads.php"><b>PHP</b>: Downloads</a>
<a href="http://phpnuke.org/"><b>PHP</b>-Nuke</a>
<a href="http://www.phpbuilder.com/">PHPBuilder.com</a>
<a href="http://php.resourceindex.com/">The <b>PHP</b> Resource Index</a>
<a href="http://www.php.com/"><b>PHP</b>.com: Home</a>
<a href="http://www.php.org/"><b>PHP</b>.org</a>
<a href="http://php.weblogs.com/"><b>PHP</b> Everywhere:</a>
<a href="http://www.php3.org/"></a>
<a href="http://gtk.php.net/"><b>PHP</b>-GTK</a>

You can also use Web Services Definition Language (WSDL) to implement the request. With WSDL, you don't need to explicitly enumerate the parameter keys or the SOAP namespace:

require 'SOAP/Client.php';

$wsdl_url = 'http://api.google.com/GoogleSearch.wsdl';
$WSDL = new SOAP_WSDL($wsdl_url);
$soap = $WSDL->getProxy( );

$query = 'php'; // your Google search terms

$hits = $soap->doGoogleSearch('your google key',$query,0,10,
                               false,'',false,'lang_en','','');

This code is equivalent to the longer previous example. The SOAP_WSDL object takes a URL for the GoogleSearch WSDL file and automatically loads the specification from that URL. Instead of making $soap a SOAP_Client, call SOAP_WSDL::getProxy( ) to create a GoogleSearch object.

This new object has methods with the same name as the GoogleSearch SOAP methods. So, instead of passing doGoogleSearch as the first parameter to SOAP_Client::call( ), you call $soap->doGoogleSearch( ). The $params array becomes the arguments for the method, without any array encapsulation or SOAP_Value instantiations necessary. Also, because it's set in the WSDL file, the namespace doesn't need to be specified.
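
One way a proxy like this can turn ordinary method calls into call( ) requests is PHP 5's __call( ) method, which runs whenever an undefined method is invoked. The sketch below shows only the idea; Demo_Client and Demo_Proxy are hypothetical stand-ins, nothing is sent over the network, and this is not a description of how SOAP_WSDL builds its proxy:

```php
<?php
// Hypothetical stand-ins: neither class is part of PEAR SOAP.
class Demo_Client {
    // pretend to call the server; just report what would be sent
    function call($method, $params, $namespace) {
        return "$method(" . implode(', ', $params) . ") in $namespace";
    }
}

class Demo_Proxy {
    private $client;
    private $namespace;

    function __construct($client, $namespace) {
        $this->client    = $client;
        $this->namespace = $namespace;
    }

    // any undefined method becomes a call( ) on the client
    function __call($method, $arguments) {
        return $this->client->call($method, $arguments, $this->namespace);
    }
}

$soap = new Demo_Proxy(new Demo_Client, 'urn:GoogleSearch');
print $soap->doGoogleSearch('php', 0, 10) . "\n";
```

This prints doGoogleSearch(php, 0, 10) in urn:GoogleSearch, showing that the method name and arguments reached call( ) along with the stored namespace.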

See Also

Recipe 12.10 for more on SOAP servers; Recipe 20.11 for an example of a SOAP client in a PHP-GTK application; PEAR's SOAP classes at http://pear.php.net/package-info.php?package=SOAP; Programming Web Services with SOAP, by Doug Tidwell, James Snell, and Pavel Kulchenko (O'Reilly); information on the Google SOAP service at http://www.google.com/apis/.

Receiving SOAP Requests

Problem

You want to create a SOAP server and respond to SOAP requests. If your server responds to SOAP requests, anyone on the Internet who has a SOAP client can make requests of your server.

Solution

Use PEAR's SOAP_Server class. Here's a server that returns the current date and time:

require 'SOAP/Server.php';

class pc_SOAP_return_time {
    var $method_namespace = 'urn:pc_SOAP_return_time';

    function return_time( ) {
        return date('Ymd\THis');
    }
}

$rt = new pc_SOAP_return_time( );

$server = new SOAP_Server;
$server->addObjectMap($rt);
$server->service($GLOBALS['HTTP_RAW_POST_DATA']);

Discussion

There are three steps to creating a SOAP server with PEAR's SOAP_Server class:

  1. Create a class to process SOAP methods and instantiate it
  2. Create an instance of a SOAP server and associate the processing object with the server
  3. Instruct the SOAP server to process the request and reply to the SOAP client

The PEAR SOAP_Server class uses objects to handle SOAP requests. A request-handling class needs a $method_namespace property that specifies the SOAP namespace for the class. In this case, it's urn:pc_SOAP_return_time. Object methods then map to SOAP procedure names within the namespace. The actual PHP class name isn't exposed via SOAP, so the fact that both the name of the class and its $method_namespace are identical is a matter of convenience, not of necessity:

class pc_SOAP_return_time {
    var $method_namespace = 'urn:pc_SOAP_return_time';

    function return_time( ) {
        return date('Ymd\THis');
    }
}

$rt = new pc_SOAP_return_time( );

Once the class is defined, you create an instance of the class to link methods with the SOAP server object. Before mapping the procedures to the class methods, however, you first must instantiate a SOAP_Server object:

$server = new SOAP_Server;
$server->addObjectMap($rt);
$server->service($GLOBALS['HTTP_RAW_POST_DATA']);

Once that's done, call SOAP_Server::addObjectMap( ) with the object to tell the SOAP server about the methods the object provides. Now the server is ready to reply to all SOAP requests within the namespace for which you've defined methods.

To tell the server to respond to the request, call SOAP_Server::service( ) and pass the SOAP envelope. Because the envelope arrives via POST, you pass $GLOBALS['HTTP_RAW_POST_DATA']. This provides the server with the complete request, because the class takes care of the necessary parsing.
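
The link between a namespace, a procedure name, and an object method can be sketched in a few lines. This is an illustration of the idea only, not SOAP_Server's code; Demo_Time_Service and dispatch_to_object( ) are hypothetical names, and no SOAP envelope is parsed:

```php
<?php
// Hypothetical sketch of an object map; not SOAP_Server's code.
class Demo_Time_Service {
    var $method_namespace = 'urn:demo_time';

    function return_time() {
        return date('Ymd\THis');
    }
}

function dispatch_to_object($objects, $namespace, $method, $args) {
    foreach ($objects as $object) {
        // the namespace picks the object; the procedure name picks the method
        if ($object->method_namespace == $namespace
            && method_exists($object, $method)) {
            return call_user_func_array(array($object, $method), $args);
        }
    }
    return false;   // nothing registered for this namespace and method
}

$objects = array(new Demo_Time_Service);
print dispatch_to_object($objects, 'urn:demo_time', 'return_time', array()) . "\n";
```

A request for the right procedure in the wrong namespace finds no match, which is why the client must send the same namespace the class declares.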

To call this procedure using a PEAR SOAP client, use this code:

require 'SOAP/Client.php';
$soapclient = new SOAP_Client('http://clock.example.com/time-soap.php');
$result = $soapclient->call('return_time', array( ),
                             array('namespace' => 'urn:pc_SOAP_return_time'));
print "The local time is $result.\n";

This prints:

The local time is 20020821T132615.

To extend the method to read in parameters, you need to alter the method prototype to include parameter names and then modify the client request to include data for the additional arguments. This example modifies the SOAP procedure to accept an optional time zone argument:

class pc_SOAP_return_time {
    var $method_namespace = 'urn:pc_SOAP_return_time';

    function return_time($tz='') {
        if ($tz) { putenv("TZ=$tz"); }
        $date = date('Ymd\THis');
        if ($tz) { putenv('TZ=EST5EDT'); } // change EST5EDT to your server's zone 
        return $date;
    }
}

The second parameter in the client's call now takes a tz option:

$result = $soapclient->call('return_time', array('tz' => 'PST8PDT'),
                             array('namespace' => 'urn:pc_SOAP_return_time'));

With the new settings, the server returns a time three hours behind the previous one:

20020821T102615

See Also

Recipe 12.9 for more on SOAP clients; PEAR's SOAP classes at http://pear.php.net/package-info.php?package=SOAP; Programming Web Services with SOAP (O'Reilly); the original SOAP current time application at http://www.soapware.org/currentTime.

Exchanging Data with WDDX

Problem

You want to serialize data in WDDX format for transmission or unserialize WDDX data you've received. This allows you to communicate with anyone who speaks WDDX.

Solution

Use PHP's WDDX extension. Serialize multiple variables using wddx_serialize_vars( ):

$a = 'string data';
$b = 123;
$c = 'rye';
$d = 'pastrami';
$array = array('c', 'd');

$wddx = wddx_serialize_vars('a', 'b', $array);

You can also start the WDDX packet with wddx_packet_start( ) and add data as it arrives with wddx_add_vars( ):

$wddx = wddx_packet_start('Some of my favorite things');

// loop through data
while ($array = mysql_fetch_array($r)) {
    $thing = $array['thing'];
    wddx_add_vars($wddx, 'thing');
}

$wddx = wddx_packet_end($wddx);

Use wddx_deserialize( ) to deserialize data:

// $wddx holds a WDDX packet
$vars = wddx_deserialize($wddx);

Discussion

WDDX stands for Web Distributed Data eXchange and was one of the first XML formats to share information in a language-neutral fashion. Invented by the company behind ColdFusion, WDDX gained a lot of popularity in 1999, but doesn't have much momentum at present.

Instead, many people have begun to use SOAP as a replacement for WDDX. But WDDX does have the advantage of simplicity, so if the information you're exchanging is basic, WDDX may be a good choice. Also, due to its origins, it's very easy to read and write WDDX packets in ColdFusion, so if you need to communicate with a ColdFusion application, WDDX is helpful.

WDDX requires the expat library, available with Apache 1.3.7 and higher or from http://www.jclark.com/xml/expat.html. Configure PHP with --with-xml and --enable-wddx.

The example in the Solution produces the following XML (formatted to be easier to read):

<wddxPacket version='1.0'>
<header/>
<data>
    <struct>
        <var name='a'><string>string data</string></var>
        <var name='b'><number>123</number></var>
        <var name='c'><string>rye</string></var>
        <var name='d'><string>pastrami</string></var>
    </struct>
</data>
</wddxPacket>

Variables are wrapped inside <var> tags with the variable name assigned as the value for the name attribute. Inside there is another set of tags that indicate the variable type: string, number, dateTime, boolean, array, binary, or recordSet. Finally, you have the data itself.
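
The nesting is simple enough to reproduce by hand, which can be useful for understanding a packet. The following is a minimal sketch covering strings, numbers, and booleans only; wddx_sketch_vars( ) is a hypothetical function and is no substitute for the WDDX extension, which handles the remaining types:

```php
<?php
// Hypothetical helper covering strings, numbers, and booleans only;
// the WDDX extension handles the remaining types.
function wddx_sketch_vars($vars) {
    $xml = "<wddxPacket version='1.0'><header/><data><struct>";
    foreach ($vars as $name => $value) {
        if (is_bool($value)) {
            $body = "<boolean value='" . ($value ? 'true' : 'false') . "'/>";
        } elseif (is_int($value) || is_float($value)) {
            $body = "<number>$value</number>";
        } else {
            $body = '<string>' . htmlspecialchars((string) $value) . '</string>';
        }
        // the variable name goes in the name attribute of <var>
        $name = htmlspecialchars((string) $name, ENT_QUOTES);
        $xml .= "<var name='$name'>$body</var>";
    }
    return $xml . "</struct></data></wddxPacket>";
}

print wddx_sketch_vars(array('a' => 'string data', 'b' => 123)) . "\n";
```

The output is a single line with the same structure as the packet shown earlier: a struct holding one var element per variable, each wrapping a type element and the data.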

You can also serialize one variable at a time using wddx_serialize_value( ):

// one variable
$s = wddx_serialize_value('Serialized', 'An optional comment');

This results in the following XML:

<wddxPacket version='1.0'>
<header>
    <comment>An optional comment</comment>
</header>
<data>
    <string>Serialized</string>
</data>
</wddxPacket>

See Also

Documentation on WDDX at http://www.php.net/wddx; more information at http://www.openwddx.org; Chapter 20, "Sharing Data with WDDX," from Programming ColdFusion, by Rob Brooks-Bilson (O'Reilly).

Reading RSS Feeds

Problem

You want to retrieve an RSS feed and look at the items. This allows you to incorporate newsfeeds from multiple web sites into your application.

Solution

Use the PEAR XML_RSS class. Here's an example that reads the RSS feed for the php.announce mailing list:

require 'XML/RSS.php';

$feed = 'http://news.php.net/group.php?group=php.announce&format=rss';

$rss =& new XML_RSS($feed);
$rss->parse();

print "<ul>\n";
foreach ($rss->getItems() as $item) {
    print '<li><a href="' . $item['link'] . '">' . $item['title'] . "</a></li>\n";
}
print "</ul>\n";

Discussion

RSS, which stands for RDF Site Summary, is an easy-to-use headline or article syndication format written in XML.[2] Many news web sites, such as Slashdot and O'Reilly's Meerkat, provide RSS feeds that update whenever new stories are published. Weblogs have also embraced RSS, and having an RSS feed for your blog is a standard feature. The PHP web site also publishes RSS feeds for most PHP mailing lists.

Retrieving and parsing an RSS feed is simple:

$feed = 'http://news.php.net/group.php?group=php.announce&format=rss';

$rss =& new XML_RSS($feed);
$rss->parse();

This example makes $rss a new XML_RSS object and sets the feed to the RSS feed for the php.announce mailing list. The feed is then parsed by XML_RSS::parse( ) and stored internally within $rss.

RSS items are then retrieved as an array of associative arrays, one per item, using XML_RSS::getItems( ):

print "<ul>\n";

foreach ($rss->getItems() as $item) {
    print '<li><a href="' . $item['link'] . '">' . $item['title'] . "</a></li>\n";
}

print "</ul>\n";

This foreach loop creates an unordered list of items with the item title linking back to the URL associated with the complete article, as shown in Figure 12-1. Besides the required title and link fields, an item can have an optional description field that contains a brief write-up about the item.
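
Feed content comes from someone else's server, so it's prudent to escape it before placing it in your page. The sketch below shows one way to do that and to include the optional description when it's present. The render_items( ) function and the sample data are made up for the example; each item is assumed to be an associative array with title and link keys:

```php
<?php
// Hypothetical helper; $items has one associative array per item,
// with title and link keys and an optional description key.
function render_items($items) {
    $html = "<ul>\n";
    foreach ($items as $item) {
        $html .= '<li><a href="' . htmlspecialchars($item['link']) . '">'
               . htmlspecialchars($item['title']) . '</a>';
        if (!empty($item['description'])) {          // optional field
            $html .= ': ' . htmlspecialchars($item['description']);
        }
        $html .= "</li>\n";
    }
    return $html . "</ul>\n";
}

// sample data standing in for the parsed feed
$items = array(
    array('link'  => 'http://example.com/?a=1&b=2',
          'title' => 'PHP <5> released'),
);

print render_items($items);
```

If a feed deliberately carries HTML in its titles or descriptions, escaping displays that markup literally; whether to allow it is a decision to make per feed.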

Figure 12-1. php.announce RSS feed

Each channel also has an entry with information about the feed, as shown in Figure 12-2. To retrieve that data, call XML_RSS::getChannelInfo( ):

$feed = 'http://news.php.net/group.php?group=php.announce&format=rss';
$rss =& new XML_RSS($feed);

$rss->parse();

print "<ul>\n";

foreach ($rss->getChannelInfo() as $key => $value) {
    print "<li>$key: $value</li>\n";
}

print "</ul>\n";

Figure 12-2. php.announce RSS channel information

See Also

Recipe 12.5 for how to process an RSS feed and transform it to HTML; PEAR's XML_RSS class at http://pear.php.net/package-info.php?package=XML_RSS; more information on RSS at http://groups.yahoo.com/group/rss-dev/files/specification.html; O'Reilly Network's Meerkat at http://www.oreillynet.com/meerkat/.

Notes

  1. This is why nl2br( ) outputs <br />; its output is XML-compatible.
  2. RDF stands for Resource Description Framework. RSS also stands for Rich Site Summary.