PHP Cookbook/Files

Revision as of 13:36, 7 March 2008 by Docbook2Wiki (Talk)

Introduction

The input and output in a web application usually flow between browser, server, and database, but there are many circumstances in which files are involved too. Files are useful for retrieving remote web pages for local processing, storing data without a database, and saving information that other programs need access to. Plus, as PHP becomes a tool for more than just pumping out web pages, the file I/O functions are even more useful.

PHP's interface for file I/O is similar to C's, although less complicated. The fundamental unit of identifying a file to read from or write to is a file handle. This handle identifies your connection to a specific file, and you use it for operations on the file. This chapter focuses on opening and closing files and manipulating file handles in PHP, as well as what you can do with the file contents once you've opened a file. Chapter 19 deals with directories and file metadata such as permissions.

Opening /tmp/cookie-data and writing the contents of a specific cookie to the file looks like this:

$fh = fopen('/tmp/cookie-data','w')      or die("can't open file");
if (false === fwrite($fh,$_COOKIE['flavor'])) { die("can't write data"); }
fclose($fh)                              or die("can't close file");

The function fopen( ) returns a file handle if its attempt to open the file is successful. If it can't open the file (because of incorrect permissions, for example), it returns false. Section 18.2 and Section 18.4 cover ways to open files.

The function fwrite( ) writes the value of the flavor cookie to the file handle. It returns the number of bytes written. If it can't write the string (not enough disk space, for example), it returns false.

Last, fclose( ) closes the file handle. This is done automatically at the end of a request, but it's a good idea to explicitly close all files you open anyway. It prevents problems using the code in a command-line context and frees up system resources. It also allows you to check the return code from fclose( ). Buffered data might not be actually written to disk until fclose( ) is called, so it's here that "disk full" errors are sometimes reported.

As with other processes, PHP must have the correct permissions to read from and write to a file. This is usually straightforward in a command-line context but can cause confusion when running scripts within a web server. Your web server (and consequently your PHP scripts) probably runs as a specific user dedicated to web serving (or perhaps as user nobody). For good security reasons, this user often has restricted permissions on what files it can access. If your script is having trouble with a file operation, make sure the web server's user or group — not yours — has permission to perform that file operation. Some web serving setups may run your script as you, though, in which case you need to make sure that your scripts can't accidentally read or write personal files that aren't part of your web site.

Because most file-handling functions just return false on error, you have to do some additional work to find more details about that error. When the track_errors configuration directive is on, each error message is put in the global variable $php_errormsg. Including this variable as part of your error output makes debugging easier:

$fh = fopen('/tmp/cookie-data','w')      or die("can't open: $php_errormsg");
if (false === fwrite($fh,$_COOKIE['flavor'])) { die("can't write: $php_errormsg"); }
fclose($fh)                              or die("can't close: $php_errormsg");

If you don't have permission to write to /tmp/cookie-data, the example dies with this error output:

can't open: fopen("/tmp/cookie-data", "w") - Permission denied
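
The track_errors directive and $php_errormsg were deprecated in PHP 7.2 and removed in PHP 8.0. On those versions, error_get_last( ) reports the same message. The following is a minimal sketch rather than part of the original recipe; the helper name pc_last_error( ) and the unwritable path are assumptions made for illustration:

```php
<?php
// Hypothetical helper: return the most recent PHP error message, or a fallback
function pc_last_error($fallback = 'unknown error') {
    $e = error_get_last();
    return ($e !== null) ? $e['message'] : $fallback;
}

// @ hides the warning from the output; error_get_last() still records it.
// The directory below is assumed not to exist, so fopen() fails.
$fh = @fopen('/nonexistent-dir/cookie-data', 'w');
if ($fh === false) {
    print "can't open: " . pc_last_error() . "\n";
}
```

The exact wording of the message varies between PHP versions, so treat it as text for a log or a developer, not something to parse.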

There are differences in how files are treated by Windows and by Unix. To ensure your file access code works appropriately on Unix and Windows, take care to handle line-delimiter characters and pathnames correctly.

A line delimiter on Windows is two characters: ASCII 13 (carriage return) followed by ASCII 10 (linefeed or newline). On Unix, it's just ASCII 10. The typewriter-era names for these characters explain why you can get "stair-stepped" text when printing out a Unix-delimited file. Imagine these character names as commands to the platen in a typewriter or character-at-a-time printer. A carriage return sends the platen back to the beginning of the line it's on, and a linefeed advances the paper by one line. A misconfigured printer encountering a Unix-delimited file dutifully follows instructions and does a linefeed at the end of each line. This advances to the next line but doesn't move the horizontal printing position back to the left margin. The next stair-stepped line of text begins (horizontally) where the previous line left off.

PHP functions that use a newline as a line-ending delimiter (for example, fgets( )) work on both Windows and Unix because a newline is the character at the end of the line on either platform.

To remove any line-delimiter characters, use the PHP function rtrim( ):

$fh = fopen('/tmp/lines-of-data.txt','r') or die($php_errormsg);
while($s = fgets($fh,1024)) {
    $s = rtrim($s);
    // do something with $s ... 
}
fclose($fh)                               or die($php_errormsg);

This function removes any trailing whitespace in the line, including ASCII 13 and ASCII 10 (as well as tab and space). If there's whitespace at the end of a line that you want to preserve, but you still want to remove carriage returns and line feeds, use an appropriate regular expression:

$fh = fopen('/tmp/lines-of-data.txt','r') or die($php_errormsg);
while($s = fgets($fh,1024)) {
    $s = preg_replace('/\r?\n$/','',$s);
    // do something with $s ... 
}
fclose($fh)                               or die($php_errormsg);

Unix and Windows also differ on the character used to separate directories in pathnames. Unix uses a slash (/), and Windows uses a backslash (\). PHP makes sorting this out easy, however, because the Windows version of PHP also understands / as a directory separator. For example, this code successfully prints the contents of C:\Alligator\Crocodile Menu.txt:

$fh = fopen('c:/alligator/crocodile menu.txt','r') or die($php_errormsg);
while($s = fgets($fh,1024)) {
    print $s;
}
fclose($fh)                                        or die($php_errormsg);

This piece of code also takes advantage of the fact that Windows filenames aren't case-sensitive. However, Unix filenames are.

Sorting out linebreak confusion isn't only a problem in your code that reads and writes files but in your source code as well. If you have multiple people working on a project, make sure all developers configure their editors to use the same kind of linebreaks.

Once you've opened a file, PHP gives you many tools to process its data. In keeping with PHP's C-like I/O interface, the two basic functions to read data from a file are fread( ), which reads a specified number of bytes, and fgets( ), which reads a line at a time (up to a specified number of bytes). Because fgets( ) reads at most one byte less than the length you pass, this code returns at most 255 bytes per call:

$fh = fopen('orders.txt','r') or die($php_errormsg);
while (false !== ($s = fgets($fh,256))) {
    process_order($s);
}
fclose($fh) or die($php_errormsg);

If orders.txt has a 300-byte line (counting its newline), fgets( ) returns only the first 255 bytes. The next fgets( ) returns the remaining 45 bytes and stops when it finds the newline. The next fgets( ) moves to the next line of the file. Examples in this chapter generally give fgets( ) a second argument of 1048576: 1 MB. This is longer than lines in most text files, but the presence of such an outlandish number should serve as a reminder to consider your maximum expected line length when using fgets( ).
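
To see how the length argument splits a long line, here is a small sketch that writes a throwaway file and records the size of each piece fgets( ) returns. It relies on the documented rule that fgets( ) stops after length - 1 bytes; the temporary file and its contents are assumptions made for the demonstration:

```php
<?php
// Build a file whose first line is 300 bytes including its newline
$path = tempnam(sys_get_temp_dir(), 'fgets-');
file_put_contents($path, str_repeat('x', 299) . "\n" . "short\n");

$lengths = array();
$fh = fopen($path, 'r') or die("can't open $path");
while (false !== ($s = fgets($fh, 256))) {
    $lengths[] = strlen($s);   // size of each piece returned
}
fclose($fh);
unlink($path);

print implode(',', $lengths);  // prints 255,45,6
```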

Many operations on file contents, such as picking a line at random (see Section 18.11) are conceptually simpler (and require less code) if the entire file is read into a string or array. Section 18.6 provides a method for reading a file into a string, and the file( ) function puts each line of a file into an array. The tradeoff for simplicity, however, is memory consumption. This can be especially harmful when you are using PHP as a server module. Generally, when a process (such as a web server process with PHP embedded in it) allocates memory (as PHP does to read an entire file into a string or array), it can't return that memory to the operating system until it dies. This means that calling file( ) on a 1 MB file from PHP running as an Apache module increases the size of that Apache process by 1 MB until the process dies. Repeated a few times, this decreases server efficiency. There are certainly good reasons for processing an entire file at once, but be conscious of the memory-use implications when you do.

Section 18.21 through Section 18.24 deal with running other programs from within a PHP program. Some program-execution operators or functions offer ways to run a program and read its output all at once (backticks) or read its last line of output (system( )). PHP can use pipes to run a program, pass it input, or read its output. Because a pipe is read with standard I/O functions (fgets( ) and fread( )), you decide how you want the input and you can do other tasks between reading chunks of input. Similarly, writing to a pipe is done with fputs( ) and fwrite( ), so you can pass input to a program in arbitrary increments.

Pipes have the same permission issues as regular files. The PHP process must have execute permission on the program being opened as a pipe. If you have trouble opening a pipe, especially if PHP is running as a special web server user, make sure the user is allowed to execute the program you are opening a pipe to.
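
As a rough illustration of reading a program's output through a pipe, the sketch below uses popen( ) and pclose( ). It assumes an echo command is available to the shell, which is an assumption about the environment and not something the recipes require:

```php
<?php
// Open a pipe for reading from the (assumed available) echo command
$ph = popen('echo hello', 'r') or die("can't open pipe");

$out = '';
while (false !== ($s = fgets($ph, 1024))) {
    $out .= $s;      // other work could happen between reads
}
pclose($ph);

print $out;
```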

Creating or Opening a Local File

Problem

You want to open a local file to read data from it or write data to it.

Solution

Use fopen( ):

$fh = fopen('file.txt','r') or die("can't open file.txt: $php_errormsg");

Discussion

The first argument to fopen( ) is the file to open; the second argument is the mode to open the file in. The mode specifies what operations can be performed on the file (reading and/or writing), where the file pointer is placed after the file is opened (at the beginning or end of the file), whether the file is truncated to zero length after opening, and whether the file is created if it doesn't exist, as shown in Table 18-1.

Table 18-1. fopen( ) file modes

Mode  Readable?  Writeable?  File pointer  Truncate?  Create?
r     Yes        No          Beginning     No         No
r+    Yes        Yes         Beginning     No         No
w     No         Yes         Beginning     Yes        Yes
w+    Yes        Yes         Beginning     Yes        Yes
a     No         Yes         End           No         Yes
a+    Yes        Yes         End           No         Yes
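
The difference between the w and a rows can be checked with a short sketch like the one below. The temporary file and the strings written to it are assumptions made for the demonstration:

```php
<?php
$path = tempnam(sys_get_temp_dir(), 'mode-');

// 'w' creates or truncates the file, then writes from the beginning
$fh = fopen($path, 'w') or die("can't open $path");
fwrite($fh, "first\n");
fclose($fh);

// 'a' keeps the existing contents and writes at the end
$fh = fopen($path, 'a') or die("can't open $path");
fwrite($fh, "second\n");
fclose($fh);
$after_append = file_get_contents($path);

// opening with 'w' again truncates the file to zero length
$fh = fopen($path, 'w') or die("can't open $path");
fclose($fh);
$after_truncate = file_get_contents($path);

unlink($path);
```

After the append, $after_append holds both lines; after the second w open, $after_truncate is an empty string.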


On non-POSIX systems, such as Windows, you need to add a b to the mode when opening a binary file; otherwise, text-mode translation of line endings can corrupt the data as it is read or written:

$fh = fopen('c:/images/logo.gif','rb');

To operate on a file, pass the file handle returned from fopen( ) to other I/O functions such as fgets( ), fputs( ), and fclose( ).

If the file given to fopen( ) doesn't have a pathname, the file is opened in the directory of the running script (web context) or in the current directory (command-line context).

You can also tell fopen( ) to search for the file to open in the include_path specified in your php.ini file by passing 1 as a third argument. For example, this searches for file.inc in the include_path:

$fh = fopen('file.inc','r',1) or die("can't open file.inc: $php_errormsg");

See Also

Documentation on fopen( ) at http://www.php.net/fopen.

Creating a Temporary File

Problem

You need a file to temporarily hold some data.

Solution

Use tmpfile( ) if the file needs to last only the duration of the running script:

$temp_fh = tmpfile();
// write some data to the temp file
fputs($temp_fh,"The current time is ".strftime('%c'));
// the file goes away when the script ends
exit(1);

If the file needs to last longer, generate a filename with tempnam( ), and then use fopen( ):

$tempfilename = tempnam('/tmp','data-');
$temp_fh = fopen($tempfilename,'w') or die($php_errormsg);
fputs($temp_fh,"The current time is ".strftime('%c'));
fclose($temp_fh) or die($php_errormsg);

Discussion

The function tmpfile( ) creates a file with a unique name and returns a file handle. The file is removed when fclose( ) is called on that file handle, or the script ends.

Alternatively, tempnam( ) generates a filename. It takes two arguments: the first is a directory, and the second is a prefix for the filename. If the directory doesn't exist or isn't writeable, tempnam( ) uses the system temporary directory — the TMPDIR environment variable in Unix or the TMP environment variable in Windows. For example:

$tempfilename = tempnam('/tmp','data-');
print "Temporary data will be stored in $tempfilename";

This prints something like:

Temporary data will be stored in /tmp/data-GawVoL

Because of the way PHP generates temporary filenames, the filename tempnam( ) returns is actually created but left empty, even if your script never explicitly opens the file. This ensures another program won't create a file with the same name between the time that you call tempnam( ) and the time you call fopen( ) with the filename.
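
A quick way to confirm that tempnam( ) creates the empty file is sketched below. The use of sys_get_temp_dir( ) to pick the directory is an assumption made so the example runs on any platform:

```php
<?php
$name = tempnam(sys_get_temp_dir(), 'data-');

$exists = file_exists($name);   // true: the file is already on disk
$size   = filesize($name);      // 0: nothing has been written yet

unlink($name);                  // tempnam() files are not removed automatically
```

Unlike a tmpfile( ) handle, a file made this way stays on disk until you delete it.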

See Also

Documentation on tmpfile( ) at http://www.php.net/tmpfile and on tempnam( ) at http://www.php.net/tempnam.

Opening a Remote File

Problem

You want to open a file that's accessible to you via HTTP or FTP.

Solution

Pass the file's URL to fopen( ):

$fh = fopen('http://www.example.com/robots.txt','r') or die($php_errormsg);

Discussion

When fopen( ) is passed a filename that begins with http://, it retrieves the given page with an HTTP/1.0 GET request (although a Host: header is also passed along to deal with virtual hosts). Only the body of the reply can be accessed using the file handle, not the headers. Files can be read, not written, via HTTP.

When fopen( ) is passed a filename that begins with ftp://, it returns a pointer to the specified file, obtained via passive mode FTP. You can open files via FTP for either reading or writing, but not both.

To open URLs that require a username and a password with fopen( ), embed the authentication information in the URL like this:

$fh = fopen('ftp://username:password@ftp.example.com/pub/Index','r');
$fh = fopen('http://username:password@www.example.com/robots.txt','r');

Opening remote files with fopen( ) is implemented via a PHP feature called the URL fopen wrapper. It's enabled by default but is disabled by setting allow_url_fopen to off in your php.ini or web server configuration file. If you can't open remote files with fopen( ), check your server configuration.
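
One way to check the setting from inside a script is sketched below. It only inspects the configuration and makes no network request; the messages it prints are illustrative:

```php
<?php
// ini_get() returns the raw setting as a string; normalize it to a boolean
$enabled = filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);

if ($enabled) {
    print "allow_url_fopen is on\n";
} else {
    print "allow_url_fopen is off; fopen() cannot open http:// or ftp:// URLs\n";
}
```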

See Also

Section 11.2 through Section 11.6, which discuss retrieving URLs; documentation on fopen( ) at http://www.php.net/fopen and on the URL fopen wrapper feature at http://www.php.net/features.remote-files.

Reading from Standard Input

Problem

You want to read from standard input.

Solution

Use fopen( ) to open php://stdin:

$fh = fopen('php://stdin','r') or die($php_errormsg);
while($s = fgets($fh,1024)) {
    print "You typed: $s";
}

Discussion

Section 20.4 discusses reading data from the keyboard in a command-line context. Reading data from standard input isn't very useful in a web context, because information doesn't arrive via standard input. The bodies of HTTP POST and file-upload requests are parsed by PHP and put into special variables. They can't be read on standard input, as they can in some web server and CGI implementations.

See Also

Section 20.4 for reading from the keyboard in a command-line context; documentation on fopen( ) at http://www.php.net/fopen.

Reading a File into a String

Problem

You want to load the entire contents of a file into a variable. For example, you want to determine if the text in a file matches a regular expression.

Solution

Use filesize( ) to get the size of the file, and then tell fread( ) to read that many bytes:

$fh = fopen('people.txt','r') or die($php_errormsg);
$people = fread($fh,filesize('people.txt'));
if (preg_match('/Names:.*(David|Susannah)/i',$people)) {
    print "people.txt matches.";
}
fclose($fh) or die($php_errormsg);

Discussion

To read a binary file (e.g., an image) on Windows, a b must be appended to the file mode:

$fh = fopen('people.jpg','rb') or die($php_errormsg);
$people = fread($fh,filesize('people.jpg'));
fclose($fh);
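
Since PHP 4.3.0, file_get_contents( ) reads a whole file into a string in one call, without a separate fopen( ), filesize( ), and fclose( ). A minimal sketch, using a throwaway file whose name and contents are assumptions made for the demonstration:

```php
<?php
// Create sample data so the sketch runs standalone
$path = tempnam(sys_get_temp_dir(), 'people-');
file_put_contents($path, "Names: Adam, David, Susannah\n");

// Read the entire file into a string
$people = file_get_contents($path);
$matched = (bool) preg_match('/Names:.*(David|Susannah)/i', $people);
unlink($path);

print $matched ? "file matches." : "no match.";
```

The same memory caveat applies: the entire file is held in $people at once.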

There are easier ways to print the entire contents of a file than by reading it into a string and then printing the string. PHP provides two functions for this. The first is fpassthru($fh), which prints everything left on the file handle $fh and then closes it. The second, readfile($filename), prints the entire contents of $filename.

You can use readfile( ) to implement a wrapper around images that shouldn't always be displayed. This program makes sure a requested image is less than a week old:

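$image_directory = '/usr/local/images';
// assumption: the requested filename arrives as the "image" query parameter
$image = isset($_GET['image']) ? $_GET['image'] : '';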

if (preg_match('/^[a-zA-Z0-9]+\.(gif|jpeg)$/',$image,$matches) &&
    is_readable($image_directory."/$image") &&
    (filemtime($image_directory."/$image") >= (time() - 86400 * 7))) {

  header('Content-Type: image/'.$matches[1]);
  header('Content-Length: '.filesize($image_directory."/$image"));

  readfile($image_directory."/$image");

} else {
  error_log("Can't serve image: $image");
}

The directory in which the images are stored, $image_directory, needs to be outside the web server's document root for the wrapper to be effective. Otherwise, users can just access the image files directly. You test the image for three things. First, that the filename passed in $image is just alphanumeric with an ending of either .gif or .jpeg. You need to ensure that characters such as .. or / are not in the filename; this prevents malicious users from retrieving files outside the specified directory. Second, use is_readable( ) to make sure you can read the file. Finally, get the file's modification time with filemtime( ) and make sure that time is after 86400 × 7 seconds ago. There are 86,400 seconds in a day, so 86400 × 7 is a week.[1] If all of these conditions are met, you're ready to send the image. First, send two headers to tell the browser the image's MIME type and file size. Then use readfile( ) to send the entire contents of the file to the user.
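
The filename test can be pulled into its own function and exercised separately, as in this sketch. The function name pc_is_safe_image_name( ) is hypothetical, and the D modifier is an addition not present in the recipe: without it, $ also matches just before a trailing newline.

```php
<?php
// Hypothetical helper: accept only names like "logo.gif" or "photo1.jpeg"
function pc_is_safe_image_name($image) {
    // D makes $ match only at the very end of the string
    return 1 === preg_match('/^[a-zA-Z0-9]+\.(gif|jpeg)$/D', $image);
}

$checks = array(
    'logo.gif'      => pc_is_safe_image_name('logo.gif'),
    '../secret.gif' => pc_is_safe_image_name('../secret.gif'),
    'logo.php'      => pc_is_safe_image_name('logo.php'),
);
```

Only the first name passes; the other two are rejected because of the path characters and the extension.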

See Also

Documentation on filesize( ) at http://www.php.net/filesize, fread( ) at http://www.php.net/fread, fpassthru( ) at http://www.php.net/fpassthru, and readfile( ) at http://www.php.net/readfile.

Counting Lines, Paragraphs, or Records in a File

Problem

You want to count the number of lines, paragraphs, or records in a file.

Solution

To count lines, use fgets( ). Because it reads a line at a time, you can count the number of times it's called before reaching the end of a file:

$lines = 0;

if ($fh = fopen('orders.txt','r')) {
  while (! feof($fh)) {
    if (fgets($fh,1048576)) {
      $lines++;
    }
  }
}
print $lines;

To count paragraphs, increment the counter only when you read a blank line:

$paragraphs = 0;

if ($fh = fopen('great-american-novel.txt','r')) {
  while (! feof($fh)) {
    $s = fgets($fh,1048576);
    if (("\n" == $s) || ("\r\n" == $s)) {
      $paragraphs++;
    }
  }
}
print $paragraphs;

To count records, increment the counter only when the line read contains just the record separator and whitespace:

$records = 0;
$record_separator = '--end--';

if ($fh = fopen('great-american-novel.txt','r')) {
  while (! feof($fh)) {
    $s = rtrim(fgets($fh,1048576));
    if ($s == $record_separator) {
      $records++;
    }
  }
}
print $records;

Discussion

In the line counter, $lines is incremented only if fgets( ) returns a true value. As fgets( ) moves through the file, it returns each line it retrieves. Once there are no more lines to read, it returns false, so $lines doesn't get incorrectly incremented. Because EOF has been reached on the file, feof( ) returns true, and the while loop ends.

This paragraph counter works fine on simple text but may produce unexpected results when presented with a long string of blank lines or a file without two consecutive linebreaks. These problems can be remedied with functions based on preg_split( ). If the file is small and can be read into memory, use the pc_split_paragraphs( ) function shown in Example 18-1. This function returns an array containing each paragraph in the file.

Example 18-1. pc_split_paragraphs( )

function pc_split_paragraphs($file,$rs="\r?\n") {
    $text = join('',file($file));
    $matches = preg_split("/(.*?$rs)(?:$rs)+/s",$text,-1,
                          PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY);
    return $matches;
}
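
As a quick check of what the function returns, the sketch below runs it on a small throwaway file. The function body is repeated only so the example runs standalone, and the sample text is an assumption made for the demonstration:

```php
<?php
// Copy of pc_split_paragraphs() from Example 18-1
function pc_split_paragraphs($file,$rs="\r?\n") {
    $text = join('',file($file));
    $matches = preg_split("/(.*?$rs)(?:$rs)+/s",$text,-1,
                          PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY);
    return $matches;
}

// Three paragraphs: "a b", "c", and "d", separated by one or two blank lines
$path = tempnam(sys_get_temp_dir(), 'para-');
file_put_contents($path, "a\nb\n\nc\n\n\nd\n");

$paragraphs = pc_split_paragraphs($path);
unlink($path);

print count($paragraphs);   // prints 3
```

Each returned paragraph keeps its own final line ending, and the run of blank lines between the second and third paragraphs does not produce an empty element.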

The contents of the file are broken on two or more consecutive newlines and returned in the $matches array. The default record-separation regular expression, \r?\n, matches both Windows and Unix linebreaks. If the file is too big to read into memory at once, use the pc_split_paragraphs_largefile( ) function shown in Example 18-2, which reads the file in 4K chunks.

Example 18-2. pc_split_paragraphs_largefile( )

function pc_split_paragraphs_largefile($file,$rs="\r?\n") {
    global $php_errormsg;

    $unmatched_text = '';
    $paragraphs = array();

    $fh = fopen($file,'r') or die($php_errormsg);

    while(! feof($fh)) {
        $s = fread($fh,4096) or die($php_errormsg);
        $text_to_split = $unmatched_text . $s;

        $matches = preg_split("/(.*?$rs)(?:$rs)+/s",$text_to_split,-1,
                              PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY);

        // if the last chunk doesn't end with two record separators, save it
        // to prepend to the next section that gets read
        $last_match = $matches[count($matches)-1];
        if (! preg_match("/$rs$rs\$/",$last_match)) {
            $unmatched_text = $last_match;
            array_pop($matches);
        } else {
            $unmatched_text = '';
        }
        
        $paragraphs = array_merge($paragraphs,$matches);
    }
    
    // after reading all sections, if there is a final chunk that doesn't
    // end with the record separator, count it as a paragraph
    if ($unmatched_text) {
        $paragraphs[] = $unmatched_text;
    }
    return $paragraphs;
}

This function uses the same regular expression as pc_split_paragraphs( ) to split the file into paragraphs. When it finds a paragraph end in a chunk read from the file, it saves the rest of the text in the chunk in $unmatched_text and prepends it to the next chunk read. This includes the unmatched text as the beginning of the next paragraph in the file.

See Also

Documentation on fgets( ) at http://www.php.net/fgets, on feof( ) at http://www.php.net/feof, and on preg_split( ) at http://www.php.net/preg-split.

Processing Every Word in a File

Problem

You want to do something with every word in a file.

Solution

Read in each line with fgets( ), separate the line into words, and process each word:

$fh = fopen('great-american-novel.txt','r') or die($php_errormsg);
while (! feof($fh)) {
    if ($s = fgets($fh,1048576)) {
        $words = preg_split('/\s+/',$s,-1,PREG_SPLIT_NO_EMPTY);
        // process words
    }
}
fclose($fh) or die($php_errormsg);

Discussion

Here's how to calculate average word length in a file:

$word_count = $word_length = 0;

if ($fh = fopen('great-american-novel.txt','r')) {
  while (! feof($fh)) {
    if ($s = fgets($fh,1048576)) {
      $words = preg_split('/\s+/',$s,-1,PREG_SPLIT_NO_EMPTY);
      foreach ($words as $word) {
        $word_count++;
        $word_length += strlen($word);
      }
    }
  }
}

print sprintf("The average word length over %d words is %.02f characters.",
              $word_count,
              $word_length/$word_count);

Processing every word proceeds differently depending on how "word" is defined. The code in this recipe uses the Perl-compatible regular-expression engine's \s whitespace metacharacter, which includes space, tab, newline, carriage return, and formfeed. Section 2.6 breaks apart a line into words by splitting on a space, which is useful in that recipe because the words have to be rejoined with spaces. The Perl-compatible engine also has a word-boundary assertion (\b) that matches between a word character (alphanumeric) and a nonword character (anything else). Using \b instead of \s to delimit words most noticeably changes how words with embedded punctuation are treated. The term 6 o'clock is two words when split by whitespace (6 and o'clock); it's four words when split by word boundaries (6, o, ', and clock).
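
The two ways of splitting 6 o'clock can be compared with a sketch like this one. Splitting on \b also yields the space between the words as its own piece, so the example filters out whitespace-only pieces; that filtering step is an assumption about what should count as a word:

```php
<?php
$term = "6 o'clock";

// Split on runs of whitespace
$by_space = preg_split('/\s+/', $term, -1, PREG_SPLIT_NO_EMPTY);

// Split at word boundaries, then drop whitespace-only pieces
$pieces = preg_split('/\b/', $term, -1, PREG_SPLIT_NO_EMPTY);
$by_boundary = array();
foreach ($pieces as $piece) {
    if (trim($piece) !== '') {
        $by_boundary[] = $piece;
    }
}

print count($by_space) . ' vs ' . count($by_boundary);   // prints 2 vs 4
```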

See Also

Section 13.3 discusses regular expressions to match words; Section 1.5 for breaking apart a line by words; documentation on fgets( ) at http://www.php.net/fgets, on preg_split( ) at http://www.php.net/preg-split, and on the Perl-compatible regular expression extension at http://www.php.net/pcre.

Reading a Particular Line in a File

Problem

You want to read a specific line in a file; for example, you want to read the most recent guestbook entry that's been added on to the end of a guestbook file.

Solution

If the file fits into memory, read the file into an array and then select the appropriate array element:

$lines = file('vacation-hotspots.txt');
print $lines[2];

Discussion

Because array indexes start at 0, $lines[2] refers to the third line of the file.

If the file is too big to read into an array, read it line by line and keep track of which line you're on:

$line_counter = 0;
$desired_line = 29;

$fh = fopen('vacation-hotspots.txt','r') or die($php_errormsg);
while ((! feof($fh)) && ($line_counter <= $desired_line)) {
    if ($s = fgets($fh,1048576)) {
        $line_counter++;
    }
}
fclose($fh) or die($php_errormsg);

print $s;

Setting $desired_line = 29 prints the 30th line of the file, to be consistent with the code in the Solution. To print the 29th line of the file, change the while loop line to:

while ((! feof($fh)) && ($line_counter < $desired_line)) {

See Also

Documentation on fgets( ) at http://www.php.net/fgets and feof( ) at http://www.php.net/feof.

Processing a File Backward by Line or Paragraph

Problem

You want to do something with each line of a file, starting at the end. For example, it's easy to add new guestbook entries to the end of a file by opening in append mode, but you want to display the entries with the most recent first, so you need to process the file starting at the end.

Solution

If the file fits in memory, use file( ) to read each line in the file into an array and then reverse the array:

$lines = file('guestbook.txt');
$lines = array_reverse($lines);

Discussion

You can also iterate through an unreversed array of lines starting at the end. Here's how to print out the last 10 lines in a file, last line first:

$lines = file('guestbook.txt');
for ($i = 1, $j = count($lines); $i <= 10 && $i <= $j; $i++) {
    print $lines[$j - $i];
}
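
Another way to get the last 10 lines, newest first, is to combine array_slice( ) with array_reverse( ). This is a sketch under the same fits-in-memory assumption; the generated guestbook file is made up for the demonstration:

```php
<?php
// Build a 12-line sample file so the sketch runs standalone
$path = tempnam(sys_get_temp_dir(), 'guestbook-');
$entries = array();
for ($i = 1; $i <= 12; $i++) {
    $entries[] = "entry $i\n";
}
file_put_contents($path, implode('', $entries));

$lines = file($path);
// A negative offset counts from the end; shorter files simply return every line
$last_ten = array_reverse(array_slice($lines, -10));
unlink($path);

print $last_ten[0];   // prints "entry 12"
```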

See Also

Documentation on file( ) at http://www.php.net/file and array_reverse( ) at http://www.php.net/array-reverse.

Picking a Random Line from a File

Problem

You want to pick a line at random from a file; for example, you want to display a selection from a file of sayings.

Solution

Use the pc_randomint( ) function shown in Example 18-3, which spreads the selection odds evenly over all lines in a file.

Example 18-3. pc_randomint( )

function pc_randomint($max = 1) {
  $m = 1000000;
  return ((mt_rand(1,$m * $max)-1)/$m);
}

Here's an example that uses the pc_randomint( ) function:

$line_number = 0;

$fh = fopen('sayings.txt','r') or die($php_errormsg);
while (! feof($fh)) {
    if ($s = fgets($fh,1048576)) {
        $line_number++;
        if (pc_randomint($line_number) < 1) {
            $line = $s;
        }
    }
}
fclose($fh) or die($php_errormsg);

Discussion

The pc_randomint( ) function computes a random decimal number between 0 and $max, including 0 but excluding $max. As each line is read, a line counter is incremented, and pc_randomint( ) generates a random number between 0 and $line_number. If the number is less than 1, the current line is selected as the randomly chosen line. After all lines have been read, the last line that was selected as the randomly chosen line is left in $line.

This algorithm neatly ensures that each line in an n-line file has a 1/n chance of being chosen without having to store all n lines into memory.
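
The 1/n figure can be worked out directly. Line k is selected when it is read with probability 1/k, and it stays selected only if every later line j is not selected, which happens with probability 1 - 1/j for each of them. Assuming the random draws are independent:

```latex
\[
P(\text{line } k \text{ is the final choice})
  = \frac{1}{k}\prod_{j=k+1}^{n}\left(1-\frac{1}{j}\right)
  = \frac{1}{k}\prod_{j=k+1}^{n}\frac{j-1}{j}
  = \frac{1}{k}\cdot\frac{k}{n}
  = \frac{1}{n}
\]
```

The product telescopes to k/n, so the result does not depend on k.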

See Also

Documentation on mt_rand( ) at http://www.php.net/mt-rand.

Randomizing All Lines in a File

Problem

You want to randomly reorder all lines in a file. You have a file of funny quotes, for example, and you want to pick out one at random.

Solution

Read all the lines in the file into an array with file( ), and then shuffle the elements of the array:

$lines = file('quotes-of-the-day.txt');
$lines = pc_array_shuffle($lines);

Discussion

The pc_array_shuffle( ) function from Section 4.21 is more random than PHP's built-in shuffle( ) function, because it uses the Fisher-Yates shuffle, which equally distributes the elements throughout the array.
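
For readers without that recipe at hand, a Fisher-Yates shuffle can be sketched as follows. The function name fisher_yates_shuffle( ) is hypothetical and this is not the book's pc_array_shuffle( ) implementation, only an illustration of the technique:

```php
<?php
// Hypothetical stand-in illustrating a Fisher-Yates shuffle
function fisher_yates_shuffle($array) {
    $array = array_values($array);            // reindex from 0
    for ($i = count($array) - 1; $i > 0; $i--) {
        $j = mt_rand(0, $i);                  // pick from the unshuffled part
        $tmp = $array[$i];                    // swap positions $i and $j
        $array[$i] = $array[$j];
        $array[$j] = $tmp;
    }
    return $array;
}

$lines = array("one\n", "two\n", "three\n", "four\n");
$shuffled = fisher_yates_shuffle($lines);
```

Each pass fixes one element at the end of the array, so every element is moved at most once into its final position.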

See Also

Section 4.20 for pc_array_shuffle( ); documentation on shuffle( ) at http://www.php.net/shuffle.

Processing Variable Length Text Fields

Problem

You want to read delimited text fields from a file. You might, for example, have a database program that prints records one per line, with tabs between each field in the record, and you want to parse this data into an array.

Solution

Read in each line and then split the fields based on their delimiter:

$delim = '|';

$fh = fopen('books.txt','r') or die("can't open: $php_errormsg");
while (! feof($fh)) {
    $s = rtrim(fgets($fh,1024));
    $fields = explode($delim,$s);
    // ... do something with the data ... 
}
fclose($fh) or die("can't close: $php_errormsg");

Discussion

To parse the following data in books.txt:

Elmer Gantry|Sinclair Lewis|1927
The Scarlatti Inheritance|Robert Ludlum|1971
The Parsifal Mosaic|Robert Ludlum|1982
Sophie's Choice|William Styron|1979

Process each record like this:

$fh = fopen('books.txt','r') or die("can't open: $php_errormsg");
while (! feof($fh)) {
    $s = rtrim(fgets($fh,1024));
    list($title,$author,$publication_year) = explode('|',$s);
    // ... do something with the data ... 
}
fclose($fh) or die("can't close: $php_errormsg");

The line length argument to fgets( ) needs to be at least as long as the longest record, so that a record doesn't get truncated.

Calling rtrim( ) is necessary because fgets( ) includes the trailing newline in the line it reads. Without rtrim( ), each $publication_year would have a newline at its end.
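
A loop driven by feof( ) can call fgets( ) one extra time at the end of the file, which hands explode( ) an empty string. The variation below is a sketch that loops on the return value of fgets( ) and skips blank lines instead; the temporary file stands in for books.txt and is an assumption made so the example runs standalone:

```php
<?php
$path = tempnam(sys_get_temp_dir(), 'books-');
file_put_contents($path,
    "Elmer Gantry|Sinclair Lewis|1927\nSophie's Choice|William Styron|1979\n");

$books = array();
$fh = fopen($path,'r') or die("can't open $path");
while (false !== ($s = fgets($fh,1024))) {
    $s = rtrim($s);
    if ('' === $s) {
        continue;                       // skip blank lines
    }
    list($title,$author,$publication_year) = explode('|',$s);
    $books[] = array($title,$author,$publication_year);
}
fclose($fh);
unlink($path);

print count($books);   // prints 2
```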

See Also

Section 1.12 discusses ways to break apart strings into pieces; Section 1.10 and Section 1.11 cover parsing comma-separated and fixed-width data; documentation on explode( ) at http://www.php.net/explode and rtrim( ) at http://www.php.net/rtrim.

Reading Configuration Files

Problem

You want to use configuration files to initialize settings in your programs.

Solution

Use parse_ini_file( ):

$config = parse_ini_file('/etc/myapp.ini');

Discussion

The function parse_ini_file( ) reads configuration files structured like PHP's main php.ini file. Instead of applying the settings in the configuration file to PHP's configuration, however, parse_ini_file( ) returns the values from the file in an array.

For example, when parse_ini_file( ) is given a file with these contents:

; physical features
eyes=brown
hair=brown
glasses=yes

; other features
name=Susannah
likes=monkeys,ice cream,reading

The array it returns is:

Array
(
    [eyes] => brown
    [hair] => brown
    [glasses] => 1
    [name] => Susannah
    [likes] => monkeys,ice cream,reading
)

Blank lines and lines that begin with ; in the configuration file are ignored. Other lines with name=value pairs are put into an array with the name as the key and the value, appropriately, as the value. Words such as on and yes as values are returned as 1, and words such as off and no are returned as the empty string.

To parse sections from the configuration file, pass 1 as a second argument to parse_ini_file( ). Sections are set off by words in square brackets in the file:

[physical]
eyes=brown
hair=brown
glasses=yes

[other]
name=Susannah
likes=monkeys,ice cream,reading

If this file is in /etc/myapp.ini, then:

$conf = parse_ini_file('/etc/myapp.ini',1);

Puts this array in $conf:

Array
(
    [physical] => Array
        (
            [eyes] => brown
            [hair] => brown
            [glasses] => 1
        )

    [other] => Array
        (
            [name] => Susannah
            [likes] => monkeys,ice cream,reading
        )

)

Your configuration file can also be a valid PHP file that you load with require instead of parse_ini_file( ). If the file config.php contains:

<?php

// physical features
$eyes = 'brown';
$hair = 'brown';
$glasses = 'yes';

// other features
$name = 'Susannah';
$likes = array('monkeys','ice cream','reading');
?>

You can set the variables $eyes, $hair, $glasses, $name, and $likes with:

require 'config.php';

The configuration file loaded by require needs to be valid PHP — including the <?php start tag and the ?> end tag. The variables named in config.php are set explicitly, not inside an array, as in parse_ini_file( ). For simple configuration files, this technique may not be worth the extra attention to syntax, but it is useful for embedding logic in the configuration file:

<?php

$time_of_day = (date('a') == 'am') ? 'early' : 'late';

?>

The ability to embed logic in configuration files is a good reason to make the files PHP code, but it is also helpful to have all the variables set in the configuration file inside an array, so they don't interfere with other variables in your program. Namespaces, which arrived in PHP 5.3, don't solve this problem: they group classes, functions, and constants, but not variables. You can have a function called hair( ) in two different namespaces, but a variable called $hair set by a required file still lands in the scope that called require.
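
One way to keep configuration values in an array while still allowing logic is to have the configuration file return the array. This is a sketch, not the only approach; the temporary file stands in for a real config.php, and the key names are illustrative:

```php
<?php
// Write a hypothetical config file to a temporary location so the sketch
// is self-contained; in practice config.php would already exist.
$path = tempnam(sys_get_temp_dir(), 'cfg');
file_put_contents($path, <<<'_CONFIG_'
<?php
return array(
    'eyes'        => 'brown',
    'hair'        => 'brown',
    'likes'       => array('monkeys', 'ice cream', 'reading'),
    'time_of_day' => (date('a') == 'am') ? 'early' : 'late',
);
_CONFIG_
);

$hair = 'blond';            // an unrelated variable in the calling script
$config = require $path;    // require evaluates to the file's return value
unlink($path);

print $config['hair'] . "\n";   // brown
print $hair . "\n";             // blond: the config file did not overwrite it
```

Since the file sets no variables of its own, nothing it contains can collide with variables in the script that loads it.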

See Also

Documentation on parse_ini_file( ) at http://www.php.net/parse-ini-file; information about namespaces and other upcoming PHP language features is available at http://www.php.net/ZEND_CHANGES.txt.

Reading from or Writing to a Specific Location in a File

Problem

You want to read from (or write to) a specific place in a file. For example, you want to replace the third record in a file of 80-byte records, so you have to write starting at the 161st byte.

Solution

Use fseek( ) to move to a specific number of bytes after the beginning of the file, before the end of the file, or from the current position in the file:

fseek($fh,26);           // 26 bytes after the beginning of the file
fseek($fh,26,SEEK_SET);  // 26 bytes after the beginning of the file
fseek($fh,-39,SEEK_END); // 39 bytes before the end of the file
fseek($fh,10,SEEK_CUR);  // 10 bytes ahead of the current position
fseek($fh,0);            // beginning of the file

The rewind( ) function moves to the beginning of a file:

rewind($fh);             // the same as fseek($fh,0)

Discussion

The function fseek( ) returns 0 if it can move to the specified position; otherwise, it returns -1. Seeking beyond the end of the file isn't an error for fseek( ). In contrast, rewind( ) returns true on success and false on failure, so a 0 from fseek( ) means success while a false value from rewind( ) means an error.

You can use fseek( ) only with local files, not HTTP or FTP files opened with fopen( ). If you pass a file handle of a remote file to fseek( ), it throws an E_NOTICE error.

To get the current file position, use ftell( ):

if (0 === ftell($fh)) {
  print "At the beginning of the file.";
}

Because ftell( ) returns false on error, you need to use the === operator to make sure that its return value is really the integer 0.
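
The record-replacement case from the Problem can be sketched as follows. The five-record file and its contents are invented for the example:

```php
<?php
$record_length = 80;
$path = tempnam(sys_get_temp_dir(), 'rec');

// build a hypothetical file of five 80-byte records
$data = '';
for ($i = 1; $i <= 5; $i++) {
    $data .= str_pad("record $i", $record_length, '.');
}
file_put_contents($path, $data);

$fh = fopen($path, 'r+') or die("can't open $path");

// the third record starts 2 * 80 = 160 bytes in, that is, at the 161st byte
$n = 3;
if (-1 == fseek($fh, ($n - 1) * $record_length)) { die("can't seek"); }
fwrite($fh, str_pad('replacement', $record_length, '.'));

// after the write, the position is the start of the fourth record
$position_after_write = ftell($fh);   // 240

// seek back and read the third record to check it
fseek($fh, ($n - 1) * $record_length);
$third = fread($fh, $record_length);
fclose($fh);

clearstatcache();
$size_after = filesize($path);        // still 400: the write replaced bytes in place
unlink($path);
```

Because the replacement is exactly one record long, the records after it are untouched and the file size doesn't change.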

See Also

Documentation on fseek( ) at http://www.php.net/fseek, ftell( ) at http://www.php.net/ftell, and rewind( ) at http://www.php.net/rewind.

Removing the Last Line of a File

Problem

You want to remove the last line of a file; for example, someone's added a comment to the end of your guestbook. You don't like it, so you want to get rid of it.

Solution

If the file is small, you can read it into an array with file( ) and then remove the last element of the array:

$lines = file('employees.txt');
array_pop($lines);
$file = join('',$lines);
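
To save the result, write the joined string back to the file. A minimal sketch, using file_put_contents( ) and made-up sample data in a temporary file:

```php
<?php
$path = tempnam(sys_get_temp_dir(), 'emp');
file_put_contents($path, "Alice\nBob\nMallory was here\n");  // hypothetical contents

$lines = file($path);          // each element keeps its trailing newline
array_pop($lines);             // drop the last line
file_put_contents($path, join('', $lines));

$result = file_get_contents($path);
unlink($path);
print $result;                 // the file now holds only the first two lines
```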

Discussion

If the file is large, reading it into an array requires too much memory. Instead, use this code, which seeks to the end of the file and works backwards, stopping when it finds a newline:

$fh = fopen('employees.txt','r') or die("can't open: $php_errormsg");
$linebreak = $beginning_of_file = 0;

$gap = 80;
$filesize = filesize('employees.txt');
fseek($fh,0,SEEK_END);

while (! ($linebreak || $beginning_of_file)) {
    // save where we are in the file 

    $pos = ftell($fh);

    /* move back $gap chars, use rewind() to go to the beginning if
     * we're less than $gap characters into the file */ 
    if ($pos < $gap) {
        rewind($fh);
    } else {
        fseek($fh,-$gap,SEEK_CUR);
    }

    // read the $gap chars we just seeked back over 
    $s = fread($fh,$gap) or die($php_errormsg);

    /* if this chunk ends at the end of the file, remove its last
     * character since if it's a newline, we should ignore it */
    if ($pos == $filesize) {
        $s = substr_replace($s,'',-1);
    }

    // move back to where we were before we read $gap chars into $s 
    if ($pos < $gap) {
        rewind($fh);
    } else {
        fseek($fh,-$gap,SEEK_CUR);
    }
    
    // is there a linebreak in $s ? 
    if (is_integer($lb = strrpos($s,"\n"))) {
        $linebreak = 1;
        // the last line of the file begins right after the linebreak 
        $line_end = ftell($fh) + $lb + 1;
    } 

    // break out of the loop if we're at the beginning of the file 
    if (ftell($fh) == 0) { $beginning_of_file = 1; }

}
if ($linebreak) {
    rewind($fh);
    $file_without_last_line = fread($fh,$line_end) or die($php_errormsg);
}
fclose($fh) or die("can't close: $php_errormsg");

This code starts at the end of the file and moves backwards in $gap character chunks looking for a newline. If it finds one, it knows the last line of the file starts right after that newline. This position is saved in $line_end. After the while loop, if $linebreak is set, the contents of the file from the beginning to $line_end are read into $file_without_last_line.

The last character of the file is ignored because if it's a newline, it doesn't indicate the start of the last line of the file. Consider the 10-character file whose contents are asparagus\n. It has only one line, consisting of the word asparagus and a newline character. This file without its last line is empty, and the previous code handles that case: it finds no line break, so $file_without_last_line is never set. If the code instead started scanning with the last character, it would see the newline, exit its scanning loop, and incorrectly treat the whole file, asparagus\n, as the file without its last line.

See Also

Section 18.15 discusses fseek( ) and rewind( ) in more detail; documentation on array_pop( ) at http://www.php.net/array-pop, fseek( ) at http://www.php.net/fseek, and rewind( ) at http://www.php.net/rewind.

Modifying a File in Place Without a Temporary File

Problem

You want to change a file without using a temporary file to hold the changes.

Solution

Read the file into memory, make the changes, and rewrite the file. Open the file with mode r+ (rb+, if necessary, on Windows) and adjust its length with ftruncate( ) after writing out changes:

// open the file for reading and writing 
$fh = fopen('pickles.txt','r+')         or die($php_errormsg);

// read the entire file into $s
$s = fread($fh,filesize('pickles.txt')) or die($php_errormsg);

// ... modify $s ...

// seek back to the beginning of the file and write the new $s
rewind($fh);
if (false === fwrite($fh,$s))           { die($php_errormsg); }

// adjust the file's length to just what's been written
ftruncate($fh,ftell($fh))               or die($php_errormsg);

// close the file
fclose($fh)                             or die($php_errormsg);

Discussion

The following code turns text emphasized with asterisks or slashes into text with HTML <b> or <i> tags:

$fh = fopen('message.txt','r+')         or die($php_errormsg);

// read the entire file into $s
$s = fread($fh,filesize('message.txt')) or die($php_errormsg);

// convert *word* to <b>word</b>
$s = preg_replace('@\*(.*?)\*@i','<b>$1</b>',$s);
// convert /word/ to <i>word</i>
$s = preg_replace('@/(.*?)/@i','<i>$1</i>',$s);

rewind($fh);
if (false === fwrite($fh,$s))           { die($php_errormsg); }
ftruncate($fh,ftell($fh))               or die($php_errormsg);
fclose($fh)                             or die($php_errormsg);

Because adding HTML tags makes the file grow, the entire file has to be read into memory and then processed. If the changes to a file make each line shrink (or stay the same size), the file can be processed line by line, saving memory. This example converts text marked with <b> and <i> to text marked with asterisks and slashes:

$fh = fopen('message.txt','r+')         or die($php_errormsg);

// figure out how many bytes to read
$bytes_to_read = filesize('message.txt');

// initialize variables that hold file positions
$next_read = $last_write = 0;

// keep going while there are still bytes to read
while ($next_read < $bytes_to_read) {
    
    /* move to the position of the next read, read a line, and save
     * the position of the next read */
    fseek($fh,$next_read);
    $s = fgets($fh,1048576)             or die($php_errormsg);
    $next_read = ftell($fh);

    // convert <b>word</b> to *word*
    $s = preg_replace('@<b[^>]*>(.*?)</b>@i','*$1*',$s);
    // convert <i>word</i> to /word/ 
    $s = preg_replace('@<i[^>]*>(.*?)</i>@i','/$1/',$s);

    /* move to the position where the last write ended, write the
     * converted line, and save the position for the next write */
    fseek($fh,$last_write);
    if (false === fwrite($fh,$s))       { die($php_errormsg); }
    $last_write = ftell($fh);
}
  
// truncate the file length to what we've already written 
ftruncate($fh,$last_write)              or die($php_errormsg);

// close the file
fclose($fh)                             or die($php_errormsg);
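
The ftruncate( ) call matters whenever the rewritten content is shorter than the original. This sketch, using an invented one-line message in a temporary file, captures the file's contents just before and just after truncation:

```php
<?php
$path = tempnam(sys_get_temp_dir(), 'msg');
file_put_contents($path, 'some <b>bold</b> text');   // 21 bytes

$fh = fopen($path, 'r+') or die("can't open $path");
$s = fread($fh, filesize($path));
$s = preg_replace('@<b[^>]*>(.*?)</b>@i', '*$1*', $s);  // now 16 bytes

rewind($fh);
if (false === fwrite($fh, $s)) { die("can't write"); }
fflush($fh);

// before truncation, the last 5 bytes of the old contents are still there
$before = file_get_contents($path);   // "some *bold* text text"

ftruncate($fh, ftell($fh)) or die("can't truncate");
fclose($fh);

$after = file_get_contents($path);    // "some *bold* text"
unlink($path);
```

Without the truncation step, the leftover tail of the old file would silently become part of the new one.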

See Also

Section 11.10 and Section 11.11 for additional information on converting between ASCII and HTML; Section 18.15 discusses fseek( ) and rewind( ) in more detail; documentation on fseek( ) at http://www.php.net/fseek, rewind( ) at http://www.php.net/rewind, and ftruncate( ) at http://www.php.net/ftruncate.

Flushing Output to a File

Problem

You want to force all buffered data to be written to a filehandle.

Solution

Use fflush( ):

fwrite($fh,'There are twelve pumpkins in my house.');
fflush($fh);

This ensures that "There are twelve pumpkins in my house." is written to $fh.

Discussion

To be more efficient, system I/O libraries generally don't write something to a file when you tell them to. Instead, they batch the writes together in a buffer and save all of them to disk at the same time. Using fflush( ) forces anything pending in PHP's write buffer to be handed to the operating system right away, so other processes reading the file can see it.

Flushing output can be particularly helpful when generating an access or activity log. Calling fflush( ) after each message to the log file makes sure that any person or program monitoring the log file sees the message as soon as possible.
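
A logging helper along these lines might look like the following sketch. The function name pc_log_line( ) and the timestamp format are assumptions made for the example, not part of PHP:

```php
<?php
// pc_log_line() is a hypothetical helper: write one message, then flush
function pc_log_line($fh, $message) {
    if (false === fwrite($fh, date('c') . ' ' . $message . "\n")) {
        return false;
    }
    return fflush($fh);   // true on success, false on failure
}

$path = tempnam(sys_get_temp_dir(), 'log');
$fh = fopen($path, 'a') or die("can't open $path");
$ok = pc_log_line($fh, 'There are twelve pumpkins in my house.');

// a separate reader sees the line even though $fh is still open
$seen = file_get_contents($path);

fclose($fh);
unlink($path);
```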

See Also

Documentation on fflush( ) at http://www.php.net/fflush.

Writing to Standard Output

Problem

You want to write to standard output.

Solution

Use echo or print:

print "Where did my pastrami sandwich go?";
echo  "It went into my stomach.";

Discussion

Both print and echo are language constructs rather than true functions, but print behaves like a function in one respect: it returns a value (always 1), while echo doesn't. As a result, you can include print but not echo in larger expressions:

// this is OK
(12 == $status) ? print 'Status is good' : error_log('Problem with status!');

// this gives a parse error
(12 == $status) ? echo 'Status is good' : error_log('Problem with status!');

Use php://stdout as the filename if you're using the file functions:

$fh = fopen('php://stdout','w') or die($php_errormsg);

Writing to standard output via a file handle instead of simply with print( ) or echo is useful if you need to abstract where your output goes, or if you need to print to standard output at the same time as writing to a file. See Section 18.20 for details.

You can also write to standard error by opening php://stderr:

$fh = fopen('php://stderr','w');

See Also

Section 18.20 for writing to many filehandles simultaneously; documentation on echo at http://www.php.net/echo and on print( ) at http://www.php.net/print.

Writing to Many Filehandles Simultaneously

Problem

You want to send output to more than one file handle; for example, you want to log messages to the screen and to a file.

Solution

Wrap your output with a loop that iterates through your filehandles, as shown in Example 18-4.

Example 18-4. pc_multi_fwrite( )

function pc_multi_fwrite($fhs,$s,$length=NULL) {
  if (is_array($fhs)) {
    if (is_null($length)) {
      foreach($fhs as $fh) {
        fwrite($fh,$s);
      }
    } else {
      foreach($fhs as $fh) {
        fwrite($fh,$s,$length);
      }
    }
  }
}

Here's an example:

$fhs['file'] = fopen('log.txt','w') or die($php_errormsg);
$fhs['screen'] = fopen('php://stdout','w') or die($php_errormsg);

pc_multi_fwrite($fhs,'The space shuttle has landed.');

Discussion

If you don't want to pass a length argument to fwrite( ) (or you always want to), you can eliminate that check from your pc_multi_fwrite( ). This version doesn't accept a $length argument:

function pc_multi_fwrite($fhs,$s) {
  if (is_array($fhs)) {
    foreach($fhs as $fh) {
      fwrite($fh,$s);
    }
  }
}
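
To try the function without touching a real log file or the screen, you can hand it in-memory streams. In this sketch, the php://memory handles are stand-ins chosen for the example:

```php
<?php
function pc_multi_fwrite($fhs, $s) {
    if (is_array($fhs)) {
        foreach ($fhs as $fh) {
            fwrite($fh, $s);
        }
    }
}

// php://memory streams stand in for the log file and the screen
$fhs = array();
$fhs['file']   = fopen('php://memory', 'w+');
$fhs['screen'] = fopen('php://memory', 'w+');

pc_multi_fwrite($fhs, 'The space shuttle has landed.');

// read back what each handle received
$copies = array();
foreach ($fhs as $name => $fh) {
    rewind($fh);
    $copies[$name] = stream_get_contents($fh);
    fclose($fh);
}
```

Each handle ends up with its own copy of the message, which is what you want when logging to several destinations.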

See Also

Documentation on fwrite( ) at http://www.php.net/fwrite.

Escaping Shell Metacharacters

Problem

You need to incorporate external data in a command line, but you want to escape out special characters so nothing unexpected happens; for example, you want to pass user input as an argument to a program.

Solution

Use escapeshellarg( ) to handle arguments:

system('ls -al '.escapeshellarg($directory));

Use escapeshellcmd( ) to handle program names:

system(escapeshellcmd($ls_program).' -al');

Discussion

The command line is a dangerous place for unescaped characters. Never pass unmodified user input to one of PHP's shell-execution functions. Always escape the appropriate characters in the command and the arguments. This is crucial. It is unusual to execute command lines that are coming from web forms and not something we recommend lightly. However, sometimes you need to run an external program, so escaping commands and arguments is useful.

escapeshellarg( ) surrounds arguments with single quotes (and escapes any existing single quotes). To print the process status for a particular process:

system('/bin/ps '.escapeshellarg($process_id));

Using escapeshellarg( ) ensures that the right process is displayed even if it has an unexpected character (e.g., a space) in it. It also prevents unintended commands from being run. If $process_id contains:

1; rm -rf /

then:

system("/bin/ps $process_id")

not only displays the status of process 1, but it also executes the command rm -rf / . However:

system('/bin/ps '.escapeshellarg($process_id)) 

runs the command /bin/ps '1; rm -rf /', which produces an error because "1-semicolon-space-rm-space-hyphen-rf-space-slash" isn't a valid process ID.

Similarly, escapeshellcmd( ) prevents unintended command lines from execution. This code runs a different program depending on the value of $which_program:

system("/usr/local/bin/formatter-$which_program");

For example, if $which_program is pdf 12, the script runs /usr/local/bin/formatter-pdf with an argument of 12. But, if $which_program is pdf 12; 56, the script runs /usr/local/bin/formatter-pdf with an argument of 12, but then also runs the program 56, which is an error. To successfully pass the arguments to formatter-pdf, you need escapeshellcmd( ):

system(escapeshellcmd("/usr/local/bin/formatter-$which_program"));

This runs /usr/local/bin/formatter-pdf and passes it two arguments: 12; and 56.
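
Printing the escaped strings instead of running them is a safe way to see what each function does. This sketch assumes a Unix-like system; on Windows, both functions quote and escape differently:

```php
<?php
$process_id    = '1; rm -rf /';
$which_program = 'pdf 12; 56';

// build the command lines but only print them; nothing is executed
$arg = escapeshellarg($process_id);
$cmd = escapeshellcmd("/usr/local/bin/formatter-$which_program");

print "/bin/ps $arg\n";
print "$cmd\n";
```

On a Unix-like system, the first line printed is /bin/ps '1; rm -rf /' and the second is /usr/local/bin/formatter-pdf 12\; 56. In both cases the semicolon has lost its special meaning to the shell.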

See Also

Documentation on system( ) at http://www.php.net/system, escapeshellarg( ) at http://www.php.net/escapeshellarg, and escapeshellcmd( ) at http://www.php.net/escapeshellcmd.

Passing Input to a Program

Problem

You want to pass input to an external program run from inside a PHP script. You might, for example, use a database that requires you to run an external program to index text and want to pass text to that program.

Solution

Open a pipe to the program with popen( ), write to the pipe with fputs( ) or fwrite( ), then close the pipe with pclose( ):

$ph = popen('program arg1 arg2','w')               or die($php_errormsg);
if (false === fputs($ph,"first line of input\n"))  { die($php_errormsg); }
if (false === fputs($ph,"second line of input\n")) { die($php_errormsg); }
// pclose( ) returns the program's exit status (0 on success), or -1 on error
if (-1 == pclose($ph))                             { die($php_errormsg); }

Discussion

This example uses popen( ) to call the nsupdate command, which submits Dynamic DNS Update requests to name servers:

$ph = popen('/usr/bin/nsupdate -k keyfile','w')                or die($php_errormsg);
if (false === fputs($ph,"update delete test.example.com A\n")) { die($php_errormsg); }
if (false === fputs($ph,"update add test.example.com 5 A 192.168.1.1\n"))
                                                               { die($php_errormsg); }
if (-1 == pclose($ph))                                         { die($php_errormsg); }

Two commands are sent to nsupdate via popen( ). The first deletes the test.example.com A record, and the second adds a new A record for test.example.com with the address 192.168.1.1.
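
If you don't have nsupdate handy, the same pattern can be tried with any program that reads standard input. This sketch assumes a Unix-like system with cat available; the temporary file exists only so the example can check what the program received:

```php
<?php
// Assumes a Unix-like system where popen() runs the command through sh
// and a cat program is on the PATH.
$path = tempnam(sys_get_temp_dir(), 'pipe');

$ph = popen('cat > ' . escapeshellarg($path), 'w') or die("can't open pipe");
if (false === fwrite($ph, "first line of input\n"))  { die("can't write"); }
if (false === fwrite($ph, "second line of input\n")) { die("can't write"); }

// pclose() waits for the program to finish and returns its exit status
$status = pclose($ph);
if (-1 == $status) { die("can't close pipe"); }

$written = file_get_contents($path);
unlink($path);
```

Note that a successful program exits with status 0, so the return value of pclose( ) can't be tested with "or die".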

See Also

Documentation on popen( ) at http://www.php.net/popen and pclose( ) at http://www.php.net/pclose; Dynamic DNS is described in RFC 2136 at http://www.faqs.org/rfcs/rfc2136.html.

Reading Standard Output from a Program

Problem

You want to read the output from a program; for example, you want the output of a system utility such as route(8) that provides network information.

Solution

To read the entire contents of a program's output, use the backtick (`) operator:

$routing_table = `/sbin/route`;

To read the output incrementally, open a pipe with popen( ):

$ph = popen('/sbin/route','r') or die($php_errormsg);
// fgets( ) returns false when the program's output is exhausted
while (false !== ($s = fgets($ph,1048576))) {
    // ... do something with the line in $s ...
}
if (-1 == pclose($ph))         { die($php_errormsg); }

Discussion

The backtick operator (which is not available in safe mode) executes a program and returns all its output as a single string. On a Linux system with 448 MB of RAM, this command:

$s = `/usr/bin/free`;

puts this multiline string in $s:

             total       used       free     shared    buffers     cached
Mem:        448620     446384       2236          0      68568     163040
-/+ buffers/cache:     214776     233844
Swap:       136512          0     136512

If a program generates a lot of output, it is more memory-efficient to read from a pipe one line at a time. If you're printing formatted data to the browser based on the output of the pipe, you can print it as you get it. This example prints information about recent Unix system logins formatted as an HTML table. It uses the /usr/bin/last command:

// print table header
print<<<_HTML_
<table>
<tr>
 <td>user</td><td>login port</td><td>login from</td><td>login time</td>
 <td>time spent logged in</td>
</tr>
_HTML_;

// open the pipe to /usr/bin/last
$ph = popen('/usr/bin/last','r') or die($php_errormsg);
while (false !== ($line = fgets($ph,80))) {

    // don't process blank lines or the info line at the end
    if (trim($line) && (! preg_match('/^wtmp begins/',$line))) {
        $user = trim(substr($line,0,8));
        $port = trim(substr($line,9,12));
        $host = trim(substr($line,22,16));
        $date = trim(substr($line,38,25));
        $elapsed = trim(substr($line,63,10),' ()');
        
        if ('logged in' == $elapsed) {
            $elapsed = 'still logged in';
            $date = substr_replace($date,'',-5);
        }
        
        print "<tr><td>$user</td><td>$port</td><td>$host</td>";
        print "<td>$date</td><td>$elapsed</td></tr>\n";
    }
}
if (-1 == pclose($ph)) { die($php_errormsg); }

print '</table>';
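
The read loop can be tried on its own with a trivial command. This sketch assumes a Unix-like shell; the two echo commands stand in for a real program such as last:

```php
<?php
// Assumes a Unix-like shell; the echo commands stand in for a real program
$ph = popen('echo alpha; echo beta', 'r') or die("can't open pipe");

$lines = array();
// fgets() returns false once the program has no more output
while (false !== ($line = fgets($ph, 1024))) {
    $lines[] = rtrim($line);
}

// pclose() returns the exit status: 0 here, or -1 if closing failed
$status = pclose($ph);
```

Testing the return value of fgets( ) directly avoids a failed read on the last pass through the loop, which is what happens when feof( ) is checked before each read.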

See Also

Documentation on popen( ) at http://www.php.net/popen, pclose( ) at http://www.php.net/pclose, and the backtick operator at http://www.php.net/language.operators.execution; safe mode is documented at http://www.php.net/features.safe-mode.

Reading Standard Error from a Program

Problem

You want to read the error output from a program; for example, you want to capture the system calls displayed by strace(1).

Solution

Redirect standard error to standard output by adding 2>&1 to the command line passed to popen( ). Read standard output by opening the pipe in r mode:

$ph = popen('strace ls 2>&1','r') or die($php_errormsg);
while (false !== ($s = fgets($ph,1048576))) {
    // ... do something with the line in $s ...
}
if (-1 == pclose($ph))            { die($php_errormsg); }

Discussion

In both the Unix sh and the Windows cmd.exe shells, standard error is file descriptor 2, and standard output is file descriptor 1. Appending 2>&1 to a command tells the shell to redirect what's normally sent to file descriptor 2 (standard error) over to file descriptor 1 (standard output). fgets( ) then reads both standard error and standard output.

This technique reads in standard error but doesn't provide a way to distinguish it from standard output. To read just standard error, you need to prevent standard output from being returned through the pipe. This is done by redirecting it to /dev/null on Unix and NUL on Windows:

// Unix: just read standard error
$ph = popen('strace ls 2>&1 1>/dev/null','r') or die($php_errormsg);

// Windows: just read standard error
$ph = popen('ipxroute.exe 2>&1 1>NUL','r') or die($php_errormsg);
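
If you need both streams and want to keep them apart, a different function, proc_open( ), lets you attach a separate pipe to each file descriptor. This is an alternative to the popen( ) technique above, sketched here under the assumption of a Unix-like sh; the command is a stand-in that writes one line to each stream:

```php
<?php
// Assumes a Unix-like sh; the command writes one line to each stream
$command = 'echo to-stdout; echo to-stderr 1>&2';

$descriptors = array(
    1 => array('pipe', 'w'),   // the child writes standard output here
    2 => array('pipe', 'w'),   // the child writes standard error here
);

$process = proc_open($command, $descriptors, $pipes) or die("can't run command");

// fine for small outputs; a program that writes a lot to both pipes
// needs stream_select() to avoid blocking
$stdout = stream_get_contents($pipes[1]);
$stderr = stream_get_contents($pipes[2]);
fclose($pipes[1]);
fclose($pipes[2]);

$exit_status = proc_close($process);
```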

See Also

Documentation on popen( ) at http://www.php.net/popen; see your popen(3) manpage for details about the shell your system uses with popen( ); for information about shell redirection, see the Redirection section of the sh(1) manpage on Unix systems; on Windows, see the entry on redirection in the Command Reference section of your system help.

Locking a File

Problem

You want to have exclusive access to a file to prevent it from being changed while you read or update it. If, for example, you are saving guestbook information in a file, two users should be able to add guestbook entries at the same time without clobbering each other's entries.

Solution

Use flock( ) to provide advisory locking:

$fh = fopen('guestbook.txt','a')         or die($php_errormsg);
flock($fh,LOCK_EX)                       or die($php_errormsg);
fwrite($fh,$_REQUEST['guestbook_entry']) or die($php_errormsg);
fflush($fh)                              or die($php_errormsg);
flock($fh,LOCK_UN)                       or die($php_errormsg);
fclose($fh)                              or die($php_errormsg);

Discussion

The file locking flock( ) provides is called advisory file locking because flock( ) doesn't actually prevent other processes from opening a locked file, it just provides a way for processes to voluntarily cooperate on file access. All programs that need to access files being locked with flock( ) need to set and release locks to make the file locking effective.

There are two kinds of locks you can set with flock( ): exclusive locks and shared locks. An exclusive lock, specified by LOCK_EX as the second argument to flock( ), can be held only by one process at one time for a particular file. A shared lock, specified by LOCK_SH, can be held by more than one process at one time for a particular file. Before writing to a file, you should get an exclusive lock. Before reading from a file, you should get a shared lock.

To unlock a file, call flock( ) with LOCK_UN as the second argument. It's important to flush any buffered data to be written to the file with fflush( ) before you unlock the file. Other processes shouldn't be able to get a lock until that data is written.
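
Here is a sketch of both halves: a writer that takes an exclusive lock and a reader that takes a shared one. The temporary file and the guestbook entry are made up for the example:

```php
<?php
$path = tempnam(sys_get_temp_dir(), 'gb');

// writer: exclusive lock, write, flush, then unlock
$fh = fopen($path, 'a') or die("can't open $path");
flock($fh, LOCK_EX)     or die("can't get exclusive lock");
fwrite($fh, "Nice site!\n");
fflush($fh);
flock($fh, LOCK_UN);
fclose($fh);

// reader: shared lock, so other readers can hold the lock at the same time
$fh = fopen($path, 'r') or die("can't open $path");
flock($fh, LOCK_SH)     or die("can't get shared lock");
$entries = stream_get_contents($fh);
flock($fh, LOCK_UN);
fclose($fh);

unlink($path);
```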

By default, flock( ) blocks until it can obtain a lock. To tell it not to block, add LOCK_NB to the second argument:

$fh = fopen('guestbook.txt','a')         or die($php_errormsg);
$tries = 3;
while ($tries > 0) {
    $locked = flock($fh,LOCK_EX | LOCK_NB);
    if (! $locked) {
        sleep(5);
        $tries--;
    } else {
        // don't go through the loop again 
        $tries = 0;
    }
}
if ($locked) {
    fwrite($fh,$_REQUEST['guestbook_entry']) or die($php_errormsg);
    fflush($fh)                              or die($php_errormsg);
    flock($fh,LOCK_UN)                       or die($php_errormsg);
    fclose($fh)                              or die($php_errormsg);
} else {
    print "Can't get lock.";
}  

When the lock is nonblocking, flock( ) returns right away even if it couldn't get a lock. The previous example tries three times to get a lock on guestbook.txt, sleeping five seconds between each try.

Locking with flock( ) doesn't work in all circumstances, such as on some NFS implementations. Also, flock( ) isn't supported on Windows 95, 98, or ME. To simulate file locking in these cases, use a directory as an exclusive lock indicator. This is a separate empty directory whose presence indicates that the data file is locked. Before opening a data file, create a lock directory and then delete the lock directory when you're finished working with the data file. Otherwise, the file access code is the same, as shown here:

$fh = fopen('guestbook.txt','a')         or die($php_errormsg);

// loop until we can successfully make the lock directory 
$locked = 0;
while (! $locked) {
    if (@mkdir('guestbook.txt.lock',0777)) {
        $locked = 1;
    } else {
        sleep(1);
    }
}

if (false === fwrite($fh,$_REQUEST['guestbook_entry'])) {
    rmdir('guestbook.txt.lock');
    die($php_errormsg);
}
if (! fclose($fh)) {
    rmdir('guestbook.txt.lock');
    die($php_errormsg);
}
rmdir('guestbook.txt.lock')              or die($php_errormsg);

A directory is used instead of a file to indicate a lock because the mkdir( ) function fails to create a directory if it already exists. This gives you a way, in one operation, to check if the lock indicator exists and create it if it doesn't. Any error trapping after the directory is created, however, needs to clean up by removing the directory before exiting. If the directory is left in place, no future processes can get a lock by creating the directory.

If you use a file as a lock indicator, the code to create it looks like:

$locked = 0;
while (! $locked) {
    if (! file_exists('guestbook.txt.lock')) {
        touch('guestbook.txt.lock');
        $locked = 1;
    } else {
        sleep(1);
    }
}

This might fail under heavy load because you check for the lock's existence with file_exists( ) and then create the lock with touch( ). After one process calls file_exists( ), another might call touch( ) before the first calls touch( ). Both processes would then think they've got exclusive access to the file when neither does. With mkdir( ) there's no gap between the checking for existence and creation, so the process that makes the directory is assured of exclusive access.
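
If you do want a file rather than a directory as the indicator, opening it with fopen( ) in x mode is one way to close that gap: x mode creates the file and fails if it already exists, in a single step. This is an alternative to the recipe's mkdir( ) approach, and the lock path below is hypothetical:

```php
<?php
// hypothetical lock path; the process ID just keeps this demo from
// colliding with other runs
$lock = sys_get_temp_dir() . '/guestbook.txt.lock.' . getmypid();

$first  = @fopen($lock, 'x');   // creates the file, so this succeeds
$second = @fopen($lock, 'x');   // the file exists now, so this returns false

$got_first  = (false !== $first);
$got_second = (false !== $second);

if ($got_first) {
    // ... work with the data file here ...
    fclose($first);
    unlink($lock);              // release the lock
}
```

As with the lock directory, any error handling after the lock is taken has to remove the lock file before exiting.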

See Also

Documentation on flock( ) at http://www.php.net/flock.

Reading and Writing Compressed Files

Problem

You want to read or write compressed files.

Solution

Use PHP's zlib extension to read or write gzip'ed files. To read a compressed file:

$zh = gzopen('file.gz','r') or die("can't open: $php_errormsg");
while (false !== ($line = gzgets($zh,1024))) {
    // $line is the next line of uncompressed data, up to 1024 bytes 
}
gzclose($zh) or die("can't close: $php_errormsg");

Here's how to write a compressed file:

$zh = gzopen('file.gz','w') or die("can't open: $php_errormsg");
if (false === gzwrite($zh,$s)) { die("can't write: $php_errormsg"); }
gzclose($zh)                or die("can't close: $php_errormsg");

Discussion

The zlib extension contains versions of many file-access functions, such as fopen( ), fread( ), and fwrite( ) (called gzopen( ), gzread( ), gzwrite( ), etc.) that transparently compress data when writing and uncompress data when reading. The compression algorithm that zlib uses is compatible with the gzip and gunzip utilities.

For example, gzgets($zh,1024) works like fgets($fh,1024). It reads up to 1023 bytes, stopping earlier if it reaches EOF or a newline. For gzgets( ), this means 1023 uncompressed bytes.

However, gzseek( ) works differently than fseek( ). It can't seek relative to the end of the file (SEEK_END isn't supported). In files opened for writing, only forward seeks are supported, and the file is padded with a sequence of compressed zeroes up to the new position. In files opened for reading, seeking is supported but emulated, so it can be very slow.

The zlib extension also has some functions to create compressed strings. The function gzencode( ) compresses a string and gives it the correct headers and formatting to be compatible with gunzip. Here's a simple gzip program:

$in_file = $_SERVER['argv'][1];
$out_file = $_SERVER['argv'][1].'.gz';

$ifh = fopen($in_file,'rb')  or die("can't open $in_file: $php_errormsg");
$ofh = fopen($out_file,'wb') or die("can't open $out_file: $php_errormsg");

$encoded = gzencode(fread($ifh,filesize($in_file)))
                             or die("can't encode data: $php_errormsg");

if (false === fwrite($ofh,$encoded)) { die("can't write: $php_errormsg"); }
fclose($ofh)                 or die("can't close $out_file: $php_errormsg");
fclose($ifh)                 or die("can't close $in_file: $php_errormsg");

The guts of this program are the lines:

$encoded = gzencode(fread($ifh,filesize($in_file)))
                             or die("can't encode data: $php_errormsg");
if (false === fwrite($ofh,$encoded)) { die("can't write: $php_errormsg"); }

The compressed contents of $in_file are stored in $encoded and then written to $out_file with fwrite( ).

You can pass a second argument to gzencode( ) that indicates compression level. Set no compression with 0 and maximum compression with 9. The default is -1, which tells zlib to use its own default level (6 in current zlib versions). To adjust the simple gzip program for maximum compression, the encoding line becomes:

$encoded = gzencode(fread($ifh,filesize($in_file)),9)
                             or die("can't encode data: $php_errormsg");

You can also compress and uncompress strings without the gzip-compatibility headers by using gzcompress( ) and gzuncompress( ).
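
The two string formats can be compared with a short round trip. This sketch requires the zlib extension and uses gzdecode( ), which needs PHP 5.4 or later; the sample text is arbitrary:

```php
<?php
// Requires the zlib extension; gzdecode() needs PHP 5.4 or later
$original = str_repeat('monkeys, ice cream, reading. ', 100);

$gz   = gzencode($original, 9);    // gzip format: readable by gunzip
$zlib = gzcompress($original, 9);  // zlib format: no gzip header

// gzip data always starts with the two magic bytes 0x1f 0x8b
$is_gzip = (substr($gz, 0, 2) === "\x1f\x8b");

$round_trip_gz   = gzdecode($gz);
$round_trip_zlib = gzuncompress($zlib);

printf("%d bytes before, %d bytes gzip'ed\n", strlen($original), strlen($gz));
```

Each decoding function only understands its own format, so data made with gzcompress( ) has to be read with gzuncompress( ), not gzdecode( ).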

See Also

Section 18.27 for a program that extracts files from a ZIP archive; documentation on the zlib extension at http://www.php.net/zlib; you can download zlib at http://www.gzip.org/zlib/; the zlib algorithm is detailed in RFCs 1950 (http://www.faqs.org/rfcs/rfc1950.html) and 1951 (http://www.faqs.org/rfcs/rfc1951.html).

Program: Unzip

The unzip.php program, shown in Example 18-5, extracts files from a ZIP archive. It uses the pc_mkdir_parents( ) function which is defined in Section 19.11. The program also requires PHP's zip extension to be installed. You can find documentation on the zip extension at http://www.php.net/zip.

This program takes a few arguments on the command line. The first is the name of the ZIP archive it should unzip. By default, it unzips all files in the archive. If additional command-line arguments are supplied, it only unzips files whose name matches any of those arguments. The full path of the file inside the ZIP archive must be given. If turtles.html is in the ZIP archive inside the animals directory, unzip.php must be passed animals/turtles.html, not just turtles.html, to unzip the file.

Directories are stored as 0-byte files inside ZIP archives, so unzip.php doesn't try to create them. Instead, before it creates any other file, it uses pc_mkdir_parents( ) to create all directories that are parents of that file, if necessary. For example, say unzip.php sees these entries in the ZIP archive:

animals (0 bytes)
animals/frogs/ribbit.html (2123 bytes)
animals/turtles.html   (1232 bytes)

It ignores animals because it is 0 bytes long. Then it calls pc_mkdir_parents( ) on animals/frogs, creating both animals and animals/frogs, and writes ribbit.html into animals/frogs. Since animals already exists when it reaches animals/turtles.html, it writes out turtles.html without creating any additional directories.
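The walk-through above can be sketched without a real archive. This example is only an approximation: it uses the recursive flag of mkdir( ) (available since PHP 5) as a stand-in for pc_mkdir_parents( ), writes placeholder bytes instead of real file contents, and works in a hypothetical scratch directory under the system temp directory.

```php
<?php
// Entries as (name, size) pairs, mirroring the listing above.
$entries = array(
    array('animals', 0),
    array('animals/frogs/ribbit.html', 2123),
    array('animals/turtles.html', 1232),
);

// Hypothetical scratch directory so the sketch does not touch real files.
$base = sys_get_temp_dir() . '/unzip-sketch-' . uniqid();
$created = array();   // directories that had to be created

foreach ($entries as $e) {
    list($name, $size) = $e;
    if (0 == $size) { continue; }          // skip 0-byte entries
    $dir = $base . '/' . dirname($name);
    if (! is_dir($dir)) {
        mkdir($dir, 0777, true);           // stand-in for pc_mkdir_parents()
        $created[] = dirname($name);
    }
    // placeholder contents of the right length
    file_put_contents($base . '/' . $name, str_repeat('x', $size));
}
```

After the loop, $created holds only animals/frogs: creating that directory also created animals, so nothing more was needed for animals/turtles.html.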

Example 18-5. unzip.php

// the first argument is the zip file
$in_file = $_SERVER['argv'][1];

// any other arguments are specific files in the archive to unzip
if ($_SERVER['argc'] > 2) {
    $all_files = false;
    $out_files = array();
    for ($i = 2; $i < $_SERVER['argc']; $i++) {
        $out_files[$_SERVER['argv'][$i]] = true;
    }
} else {
    // if no other files are specified, unzip all files
    $all_files = true;
}

// zip_open() can return an integer error code on failure, so test for a resource
$z = zip_open($in_file);
if (! is_resource($z)) { die("can't open $in_file"); }
while ($entry = zip_read($z)) {
    
    $entry_name = zip_entry_name($entry);

    // check if all files should be unzipped, or the name of
    // this file is on the list of specific files to unzip
    if ($all_files || isset($out_files[$entry_name])) {

        // only proceed if the file is not 0 bytes long
        if (zip_entry_filesize($entry)) {
            $dir = dirname($entry_name);

            // make all necessary directories in the file's path
            if (! is_dir($dir)) { pc_mkdir_parents($dir); }

            $file = basename($entry_name);

            if (zip_entry_open($z,$entry)) {
                if ($fh = fopen($dir.'/'.$file,'w')) {
                    // write the entire file
                    fwrite($fh,
                           zip_entry_read($entry,zip_entry_filesize($entry)))
                        or error_log("can't write: $php_errormsg");
                    fclose($fh) or error_log("can't close: $php_errormsg");
                } else {
                    error_log("can't open $dir/$file: $php_errormsg");
                }
                zip_entry_close($entry);
            } else {
                error_log("can't open entry $entry_name: $php_errormsg");
            }
        }
    }
}
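The zip_open( ) family of functions used above was deprecated in PHP 8.0. On newer installations, the same idea can be sketched with the ZipArchive class from the zip extension. The function name sketch_unzip( ) and its parameters are hypothetical and not part of the original program, and recursive mkdir( ) again stands in for pc_mkdir_parents( ).

```php
<?php
// Hypothetical helper: extract entries from $zip_path into $dest.
// If $only is non-empty, only entries with those exact names are written.
// Returns the list of entry names written, or false if the archive can't be opened.
function sketch_unzip($zip_path, $dest, array $only = array()) {
    $zip = new ZipArchive();
    if ($zip->open($zip_path) !== true) {   // open() returns true or an error code
        return false;
    }
    $written = array();
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $name = $zip->getNameIndex($i);
        if ($only && ! in_array($name, $only, true)) { continue; }
        if (substr($name, -1) === '/') { continue; }   // directory entry
        // A real tool should also reject names containing ".." or absolute paths.
        $target = $dest . '/' . $name;
        if (! is_dir(dirname($target))) {
            mkdir(dirname($target), 0777, true);       // create parent directories
        }
        file_put_contents($target, $zip->getFromIndex($i));
        $written[] = $name;
    }
    $zip->close();
    return $written;
}
```

Calling sketch_unzip('archive.zip', 'out', array('animals/turtles.html')) would write only that one entry, matching the full-path rule described earlier; with an empty third argument it writes every file in the archive.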

See Also

Section 18.26 for reading and writing zlib compressed files; Section 19.11 for the pc_mkdir_parents( ) function; documentation on the zip extension at http://www.php.net/zip.

Notes

  1. When switching between standard time and daylight saving time, there are not 86,400 seconds in a day. See Section 3.11 for details.