Chapter 9: Perl API Reference Guide
The Apache Request Object
Other Core Perl API Classes
Configuration Classes
- The Apache::ModuleConfig Class
- The Apache::CmdParms Class
The Apache::File Class
Special Global Variables, Subroutines and Literals

This chapter is the definitive list of all the Perl API classes and method calls. They are organized functionally by class starting with the Apache request object, and moving onward through Apache::SubRequest, Apache::Server, Apache::Connection, Apache::URI, Apache::Util, Apache::Log, and other classes.

At the end of this chapter we discuss the Apache::File class, which provides advanced functionality for HTTP/1.1 requests, and the various ``magic'' globals, subroutines and literals that mod_perl recognizes.

The Apache Request Object

The Apache request object implements a huge number of methods. To help you find the method you're looking for, we've broken them down into eight broad categories:

Client Request Methods: These methods have to do with retrieving information about the current request, such as fetching the requested URI, learning the requested document's filename, or reading incoming HTTP headers.
Server Response Methods: These methods are concerned with setting outgoing information, such as setting outgoing headers and controlling the document language and compression.
Sending Data to the Client: These are methods for sending document content data to the client.
Server Core Functions: These are methods that control key aspects of transaction processing but are not directly related to processing browser data input or output. For example, the subrequest API is covered in this section.
Server Configuration Methods: These are methods for retrieving configuration information about the server.
Logging: These are methods for logging error messages and warnings to the server error log.
Access Control Methods: These are methods for controlling access to restricted documents and for authenticating remote users.
mod_perl-Specific Methods: These are methods that use special features of mod_perl which have no counterpart in the C API. They include such things as the gensym() method for generating anonymous filehandles, and set_handlers() for altering the list of subroutines that will handle the current request.

Should you wish to subclass the Apache object in order to add application-specific features, you'll be pleased to find that it's easy to do so. Please see Chapter 7, Subclassing the Apache Class for instructions.

Client Request Methods

This section covers the request object methods that are used to query or modify the incoming client request. These methods allow you to retrieve such information as the URI the client has requested, the request method in use, the content of any submitted HTML forms, and various items of information about the remote host.

args()

The args() method returns the contents of the URI query string (that part of the request URI that follows the ``?'' mark, if any). When called in a scalar context, args() returns the entire string. When called in a list context, the method returns a list of parsed key/value pairs.

 my $query = $r->args;
 my %in = $r->args;

One trap to be wary of: if the same argument name is present several times (as can happen with a multi-selection list in a fill-out form), assignment of args() to a hash will discard all but the last argument. To avoid this, you'll need to use the more complex argument processing scheme described in the next chapter.
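
The trap can be seen without a running server. The following is a minimal sketch in plain Perl, not the argument processing scheme from the next chapter: the @pairs list is hypothetical sample data standing in for what args() returns in a list context, and pairs_to_multi() is an illustrative helper name rather than part of the API.

```perl
use strict;
use warnings;

# Collect a flat (key, value, key, value, ...) list into a hash of
# array references so that repeated keys keep every value.
sub pairs_to_multi {
    my @pairs = @_;
    my %multi;
    while (my ($key, $value) = splice(@pairs, 0, 2)) {
        push @{ $multi{$key} }, $value;
    }
    return \%multi;
}

# Hypothetical stand-in for "my @pairs = $r->args;" given the query
# string "color=red&color=blue&size=L".
my @pairs = (color => 'red', color => 'blue', size => 'L');

my %flat  = @pairs;                  # plain hash: only the last "color" survives
my $multi = pairs_to_multi(@pairs);  # hash of arrays: both values survive

print "flat color:  $flat{color}\n";
print "multi color: @{ $multi->{color} }\n";
```

Storing every value in an array reference costs a little convenience on single-valued keys, but it means no submitted value is silently dropped.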

connection()

This method returns an object blessed into the Apache::Connection class. See The Apache::Connection Class for information on what you can do with this object once you get it.

 my $c = $r->connection;

content()

When the client request method is POST, which generally occurs when the remote client is submitting the contents of a fill-out form, the $r->content method returns the submitted information, but only if the request content type is application/x-www-form-urlencoded. When called in a scalar context, the entire string is returned. When called in a list context, a list of parsed name=value pairs is returned.

To handle other types of PUT or POSTed content, you'll need to use a module such as CGI.pm or Apache::Request, or use the read() method and parse the data yourself. Ways of doing this, as well as a module that simplifies the task, are described in the next chapter.

NOTE: you can only call content() once. If you call the method more than once, it will return undef (or an empty list) after the first try.

filename()

The filename() method sets or returns the result of the URI translation phase. During the URI translation phase, your handler will call this method with the physical path to a file in order to set the filename. During later phases of the transaction, calling this method with no arguments returns its current value.

Examples:

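 # In later phases: fetch the translated filename and open it
 my $fname = $r->filename;
 unless (open(FH, $fname)) {
     die "can't open $fname: $!";
 }

 # In a translation handler: set the filename
 # (do_translation() stands for your own mapping routine)
 my $fname = do_translation($r->uri);
 $r->filename($fname);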

finfo()

During the default translation phase, Apache walks along the components of the requested URI trying to determine where the physical file path ends and the additional path information begins (this is described at greater length at the beginning of Chapter 4). In the course of this walk, Apache makes the system stat() call one or more times to read the directory information along the path. When the translation phase is finished, the stat() information for the translated filename is cached in the request record, where it can be recovered using the finfo() method. If you need to stat() the file, you can take advantage of this cached stat structure rather than repeating this expensive system call.

When finfo() is called, it places the cached stat information into Perl's special filehandle _, which Perl uses to cache its own stat operations. You can then perform file test operations directly on this filehandle rather than on the file itself, which would incur the penalty of another stat() system call. For convenience, finfo() returns a reference to the _ filehandle, so file tests can be done directly on the return value of finfo().

The following three examples all result in the same value for $size. However, the first two avoid the overhead of the implicit stat() performed by the last.

 my $size = -s $r->finfo;

 $r->finfo;
 my $size = -s _;

 my $size = -s $r->filename;
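
The _ filehandle is a general Perl feature, so its behaviour can be tried outside of Apache. The sketch below uses an explicit stat() on a temporary file in place of finfo(); the temporary file and its contents are assumptions made purely for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small scratch file to examine (removed automatically at exit).
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "hello";
close $fh;

# One stat() system call; the result is cached in the special filehandle _.
stat($path) or die "stat failed: $!";

# These tests all reuse the cached structure instead of calling stat() again.
my $size   = -s _;
my $is_dir = -d _;
my $mtime  = (stat _)[9];

printf "%d bytes, %s, modified at %d\n",
    $size, ($is_dir ? "directory" : "not a directory"), $mtime;
```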

It is possible for a module to be called upon to process a URL that does not correspond to a physical file. In this case, the stat() structure will contain the result of testing for a nonexistent file, and Perl's various file test operations will all return false.

The Apache::Util package contains a number of routines that are useful for manipulating the contents of the stat structure. For example, the ht_time() routine turns Unix timestamps into HTTP-compatible human readable strings. See the Apache::Util manpage and The Apache::Util Class section later in this chapter for more details.

Example:

 use Apache::Util qw(ht_time);

 if (-d $r->finfo) {
     printf "%s is a directory\n", $r->filename;
 }
 else {
     printf "Last Modified: %s\n", ht_time((stat _)[9]);
 }

get_client_block()

setup_client_block()

should_client_block()

The get_, setup_, and should_client_block methods are lower-level ways to read the data sent by the client in POST and PUT requests. This protocol exactly mirrors the C language API described in Chapter 10 and provides for timeouts and other niceties. Although the Perl API supports them, Perl programmers should generally use the simpler read() method instead.

get_remote_host()

This method can be used to look up the remote client's DNS hostname or simply return its IP address. When a DNS lookup is successful, its result is cached and returned on subsequent calls to get_remote_host() to avoid costly multiple lookups. This cached value can also be retrieved with the Apache::Connection object's remote_host() method.

This method takes an optional argument. The type of lookup performed by this method is affected by this argument as well as the value of the HostNameLookups directive. Possible arguments to this method, whose symbolic names can be imported from the Apache::Constants module using the :remotehost import tag, are one of:

REMOTE_HOST

If this argument is specified, Apache will try to look up the DNS name of the remote host. This lookup may fail if the Apache configuration directive HostNameLookups is set to Off or the hostname cannot be determined by a DNS lookup, in which case the function will return undef.

REMOTE_NAME

When called with this argument, the method will return the DNS name of the remote host if possible, or the dotted decimal representation of the client's IP address otherwise. This is the default lookup type when no argument is specified.

REMOTE_NOLOOKUP

When this argument is specified, get_remote_host() will not perform a new DNS lookup (even if the HostNameLookups directive says so). If a successful lookup was done earlier in the request, the cached hostname will be returned. Otherwise, the method returns the dotted decimal representation of the client's IP address. This is really the same as REMOTE_NAME, but the call returns instantly with the information that is available.

REMOTE_DOUBLE_REV

This argument will trigger a double reverse DNS lookup regardless of the setting of the HostNameLookups directive. Apache will first call the DNS to return the hostname that maps to the IP number of the remote host. It will then make another call to map the returned hostname back to an IP address. If the new IP address that is returned matches the original one, then the method returns the hostname. Otherwise it returns undef. The reason for this baroque procedure is that standard DNS lookups are susceptible to DNS spoofing in which a remote machine temporarily assumes the apparent identity of a trusted host. Double reverse DNS lookups make spoofing much harder, and are recommended if you are using the hostname to distinguish between trusted clients and untrusted ones. However double reverse DNS lookups are also twice as expensive.

In recent versions of Apache, double-reverse name lookups are always performed for the name-based access checking implemented by mod_access.

Examples:

 my $remote_host = $r->get_remote_host;

 # same as above
 use Apache::Constants qw(:remotehost);
 my $remote_host = $r->get_remote_host(REMOTE_NAME);

 # double-reverse DNS lookup
 use Apache::Constants qw(:remotehost);
 my $remote_host = $r->get_remote_host(REMOTE_DOUBLE_REV) || "nohost";
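
The logic of the double-reverse check described for REMOTE_DOUBLE_REV can be sketched in plain Perl. This is only an illustration of the idea, not Apache's implementation: the %reverse and %forward tables are hypothetical stand-ins for DNS queries, and double_reverse() is an invented helper name.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for DNS. Real code would query a resolver.
my %reverse = (                      # IP address -> hostname
    '192.0.2.10' => 'trusted.example.com',
    '192.0.2.66' => 'trusted.example.com',   # a spoofed reverse record
);
my %forward = (                      # hostname -> list of IP addresses
    'trusted.example.com' => ['192.0.2.10'],
);

# Return the hostname only if it maps back to the original address.
sub double_reverse {
    my ($ip) = @_;
    my $name = $reverse{$ip};
    return unless defined $name;
    my $addrs   = $forward{$name} || [];
    my $matches = grep { $_ eq $ip } @$addrs;
    return $matches ? $name : undef;
}

my $good    = double_reverse('192.0.2.10');   # forward lookup confirms it
my $spoofed = double_reverse('192.0.2.66');   # forward lookup disagrees
```

In this sketch the spoofed address fails because the forward lookup is controlled by the owner of the trusted domain, not by the owner of the address block.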

get_remote_logname()

This method returns the login name of the remote user, or undef if that information could not be determined. This generally only works if the remote user is logged into a Unix or VMS host, and that machine is running the identd daemon (which implements a protocol known as RFC 1413).

The success of the call also depends on the status of the IdentityCheck configuration directive. Since identity checks can adversely impact Apache's performance, this directive is off by default.

Example:

 my $remote_logname = $r->get_remote_logname;

headers_in()

When called in a list context, the headers_in() method returns a list of key/value pairs corresponding to the client request headers. When called in a scalar context, it returns a hash reference tied to the Apache::Table class. This class provides methods for manipulating several of Apache's internal key/value table structures, and to all intents and purposes acts just like an ordinary hash table. However, it also provides object methods for dealing correctly with multi-valued entries. See The Apache::Table Class for details.

Examples:

 my %headers_in = $r->headers_in;
 my $headers_in = $r->headers_in;

Once you have copied the headers to a hash, you can refer to them by name. See Table 9.1 for a list of incoming headers that you may need to use. For example, you can view the length of the data that the client is sending by retrieving the key ``Content-length'':

 %headers_in = $r->headers_in;
 my $cl = $headers_in{'Content-length'};

You'll need to be aware that browsers are not required to be consistent in their capitalization of header field names. For example, some may refer to ``Content-Type'' and others to ``Content-type''. The Perl API copies the field names into the hash as is, and like any other Perl hash, the keys are case-sensitive. This is a potential trap.

For these reasons it's better to call headers_in() in a scalar context and use the returned tied hash. Since Apache::Table sits on top of the C table API, lookup comparisons are performed in a case-insensitive manner. The tied interface also allows you to add or change the value of a header field, in case you want to modify the request headers seen by handlers downstream. This code fragment shows the tied hash being used to get and set fields:

 my $headers_in = $r->headers_in;
 my $cl = $headers_in->{'Content-Length'};
 $headers_in->{'User-Agent'} = 'Block this robot';
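
The capitalization trap is easy to reproduce with an ordinary hash. The sketch below does not use Apache::Table; it only imitates the case-insensitive lookup in plain Perl so the difference is visible. The header values are hypothetical and header_lookup() is an illustrative helper name.

```perl
use strict;
use warnings;

# Hypothetical headers, capitalized the way one particular browser sent them.
my %headers_in = (
    'Content-type'   => 'text/html',
    'content-length' => '42',
);

# A plain hash lookup is case-sensitive, so this finds nothing.
my $miss = $headers_in{'Content-Type'};

# Compare keys in lower case to get a case-insensitive lookup.
sub header_lookup {
    my ($headers, $name) = @_;
    for my $key (keys %$headers) {
        return $headers->{$key} if lc($key) eq lc($name);
    }
    return;
}

my $type   = header_lookup(\%headers_in, 'Content-Type');
my $length = header_lookup(\%headers_in, 'CONTENT-LENGTH');
```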

It is often convenient to refer to header fields without creating an intermediate hash or assigning a variable to the Apache::Table reference. This is the usual idiom:

 my $cl = $r->headers_in->{'Content-Length'};

Certain request header fields, such as ``Accept'' and ``Cookie'', are multivalued. When you retrieve their values, they will be packed together into one long string separated by commas. You will need to parse the individual values out yourself. Individual values can include parameters, which will be separated by semicolons. Cookies are common examples of this:

 Set-Cookie: SESSION=1A91933A; domain=acme.com; expires=Wed, 21-Oct-1998 20:46:07 GMT
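
Parsing such a string yourself might look like the following sketch. It assumes a simple ``Accept''-style value; the sample string and the parse_multivalued() name are illustrative. It deliberately ignores quoted strings and values that contain literal commas, such as the date in the cookie line above.

```perl
use strict;
use warnings;

# Split a comma-separated header value into entries, each with its
# semicolon-separated parameters. Naive: quoted strings are not handled.
sub parse_multivalued {
    my ($value) = @_;
    my @entries;
    for my $item (split /\s*,\s*/, $value) {
        my ($main, @params) = split /\s*;\s*/, $item;
        my %param;
        for my $p (@params) {
            my ($k, $v) = split /=/, $p, 2;
            $param{$k} = $v;
        }
        push @entries, { value => $main, params => \%param };
    }
    return @entries;
}

# Hypothetical Accept header value.
my @accept = parse_multivalued('text/html, image/gif;q=0.8, */*;q=0.1');

for my $entry (@accept) {
    my $q = defined $entry->{params}{q} ? $entry->{params}{q} : 1;
    print "$entry->{value} (q=$q)\n";
}
```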

A few clients send headers with the same key on multiple lines. In this case you can use the Apache::Table::get() method to retrieve all of the values at once.

For full details on the various incoming headers, see the documents at http://www.w3.org/Protocols. Non-standard headers, such as those that experimental browsers transmit, can also be retrieved with this method call.

Table 9.1: Incoming HTTP Request Headers

 Field             | Description
 -----------------------------------------------------------------------------
 Accept            | MIME types that client accepts
 Accept-encoding   | Compression methods that client accepts
 Accept-language   | Language(s) that client accepts
 Authorization     | Used by various authorization/authentication schemes
 Connection        | Connection options, such as Keep-alive
 Content-length    | Length, in bytes, of data to follow
 Content-type      | MIME type of data to follow
 Cookie            | Client-side data
 From              | E-mail address of the requesting user (deprecated)
 Host              | Virtual host to retrieve data from
 If-modified-since | Return document only if modified since specified
 If-none-match     | Return document if it has changed
 Referer           | URL of document that linked to the requested one
 User-agent        | Name and version of the client software

header_in()

The header_in() method (singular, not plural) is used to get or set the value of a client incoming request field. If the given value is undef, the header will be removed from the list of header fields:

 my $cl = $r->header_in('Content-length');
 $r->header_in($key, $val); #set the value of header '$key' 
 $r->header_in('Content-length' => undef); #remove the header

The key lookup is done in a case insensitive manner. The header_in() method predates the Apache::Table class, but remains for backwards compatibility and as a bit of a shortcut to using the headers_in method.

header_only()

If the client issues a HEAD request it wants to receive the HTTP response headers only. Content handlers should check for this by calling header_only() before generating the document body. The method will return true in the case of a HEAD request, and false in the case of other requests. Alternatively, you could examine the string value returned by method() directly, although this would be less portable if the HTTP protocol were some day expanded to support more than one header-only request method.

Example:

 # generate the header & send it
 $r->send_http_header;
 return OK if $r->header_only;

 # now generate the document...

Do not try to check the numeric value returned by method_number() to identify a header request. Internally, Apache uses the M_GET number for both HEAD and GET methods.

method()

This method will return the string version of the request method, such as GET, HEAD or POST. Passing an argument will change the method, which is occasionally useful for internal redirects (Chapter 4) and for testing authorization restriction masks (Chapter 6).

Examples:

 my $method = $r->method;
 $r->method('GET');

If you update the method, you probably want to update the method_number accordingly as well.

method_number()

This method will return the request method number, which is one of the internal constants defined by the Apache API. The method numbers are available to Perl programmers from the Apache::Constants module by importing the :methods set. The relevant constants include M_GET, M_POST, M_PUT and M_DELETE. Passing an argument will set this value, which is mainly of use for internal redirects and for testing authorization restriction masks. If you update the method number, you probably want to update the method() accordingly as well.

Note that there isn't an M_HEAD constant. This is because Apache sets the method number to M_GET when it receives a HEAD request and sets header_only() to return true.

Example:

 use Apache::Constants qw(:methods);

 if ($r->method_number == M_POST) {
     # change the request method
     $r->method_number(M_GET);
     $r->method("GET");
     $r->internal_redirect('/new/place');
 }

There is no particular advantage to using method_number() over method() for Perl programmers, other than its being very slightly more efficient.

parsed_uri()

When Apache parses the incoming request, it will turn the request URI into a predigested uri_components structure. The parsed_uri() method will return an object blessed into the Apache::URI class, which provides methods for fetching and setting various parts of the URI. See The Apache::URI Class for details.

Example:

 use Apache::URI ();
 my $uri = $r->parsed_uri;
 my $host = $uri->hostname;

path_info()

The path_info() method will return what is left in the path after the URI translation phase. Apache's default translation method, described at the beginning of the next chapter, uses a simple directory-walking algorithm to decide what part of the URI is the file, and what part is the additional path information.

If you provide an argument to path_info(), you can change the value of the additional path information.

Examples:

 my $path_info = $r->path_info;
 $r->path_info("/some/additional/information");

Note that in most cases, changing the path_info() requires you to sync the uri() with the update. In this example, we calculate the original uri minus any path info, change the existing path info, then properly update the uri:

 my $path_info = $r->path_info;
 my $uri = $r->uri;
 my $orig_uri = substr $uri, 0, length($uri) - length($path_info);
 $r->path_info($new_path_info);
 $r->uri($orig_uri . $r->path_info);
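
The string arithmetic in that fragment can be checked on its own. Below is a minimal sketch with plain strings in place of the request object; the URIs are hypothetical and replace_path_info() is an illustrative helper, not an API method.

```perl
use strict;
use warnings;

# Strip the old path info from the end of the URI and append the new one.
sub replace_path_info {
    my ($uri, $old_path_info, $new_path_info) = @_;
    my $base = substr $uri, 0, length($uri) - length($old_path_info);
    return $base . $new_path_info;
}

# Hypothetical request for /cgi-bin/search with path info /a/b.
my $updated = replace_path_info('/cgi-bin/search/a/b', '/a/b', '/x');

# With no existing path info, the new value is simply appended.
my $appended = replace_path_info('/index.html', '', '/extra');

print "$updated\n$appended\n";
```

Note that this relies on the path info being a literal suffix of the URI, as the fragment above also assumes.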

protocol()

The $r->protocol method will return a string identifying the protocol that the client speaks. Typical values will be ``HTTP/1.0'' or ``HTTP/1.1''.

 my $protocol = $r->protocol;

This method is read-only.

proxyreq()

The proxyreq() method returns true if the current HTTP request is for a proxy URI; that is, the actual document resides on a foreign server somewhere and the client wishes Apache to fetch the document on its behalf. This method is mainly intended for use during the filename translation phase of the request. See Chapter 7 for examples.

Example:

 sub handler {
     my $r = shift;
     return DECLINED unless $r->proxyreq;
     # do something interesting...
 }

read()

The read() method provides Perl API programmers with a simple way to get at the data submitted by the browser in POST and PUT requests. It should be used when the information submitted by the browser is not in the application/x-www-form-urlencoded format that the content() method knows how to handle.

Call read() with a scalar variable to receive the data, and the number of bytes to read. Generally you will want to ask for the entire data sent by the client, the length of which can be recovered from the incoming Content-length field:*

 my $buff;
 $r->read($buff, $r->header_in('Content-length'));

Internally, Perl sets up a timeout in case the client breaks the connection prematurely. The exact value of the timeout is set by the Timeout directive in the server configuration file. If a timeout does occur, the script will be aborted.

Within a handler you may also recover client data by simply reading from STDIN using Perl's read(), getc() and readline (<>) functions. This works because the Perl API ties STDIN to Apache::read() before entering handlers.

footnote: *At the time of this writing, HTTP/1.1 requests which do not have a Content-Length header, such as those that use chunked encoding, are not properly handled by this API.

server()

This method returns a reference to an Apache::Server object, from which you can retrieve all sorts of information about low-level aspects of the server's configuration. See The Apache::Server Class for details.

Example:

 my $s = $r->server;

the_request()

This method returns the unparsed request line sent by the client. the_request() is primarily used by log handlers, since other handlers will find it more convenient to use methods that return the information in preparsed form.

This method is read-only.

Example:

 my $request_line = $r->the_request;
 print LOGFILE $request_line;

Note that the_request() is roughly equivalent to this code fragment, except that the unparsed request line also includes any query string:

 my $request_line = join ' ', $r->method, $r->uri, $r->protocol;

uri()

The uri() method returns the URI requested by the browser. You may also pass this method a string argument in order to set the URI seen by handlers further down the line, something that a translation handler might want to do.

Examples:

 my $uri = $r->uri;
 $r->uri("/something/else");

Server Response Methods

This section covers the API methods used to build and query the outgoing server response message. These methods allow you to set the type and length of the outgoing document, set HTTP cookies, assign the document a language or compression method, and set up authorization and authentication schemes.

Most of the methods in this section are concerned with setting the values of the outgoing HTTP response header fields. We give a list of all of the fields you are likely to use in Table 9.2. For a comprehensive list, see the HTTP/1.0 and HTTP/1.1 specifications found at http://www.w3.org/Protocols.

Table 9.2: Response Header Fields

 Field             | Description
 -----------------------------------------------------------------------------
 Allowed           | The methods allowed by this URI, such as POST
 Content-Encoding  | The compression method of this data
 Content-Language  | The language in which this document is written
 Content-Length    | Length, in bytes, of data to follow
 Content-Type      | MIME type of this data
 Date              | The current date (GMT)
 Expires           | Date the document expires
 Last-Modified     | Date the document was last modified
 Link              | The URL of this document's "parent," if any
 Location          | The location of the document in redirection responses
 ETag              | Opaque ID for this version of the document
 Message-Id        | The ID of this document, if any
 MIME-Version      | The version of MIME used (currently 1.0)
 Pragma            | Hints to the browser, such as "no-cache"
 Public            | The requests that this URL responds to (rarely used)
 Server            | Name and version of the server software
 Set-Cookie        | Give the browser a client-side cookie
 WWW-Authenticate  | Used in the various authorization schemes
 Vary              | Criteria that can be used to select this document

bytes_sent()

This method will retrieve the number of bytes of information sent by the server to the client, excluding the length of the HTTP headers. It is only of value after the send_http_header() method (see below) has been called. This method is normally used by log handlers to record and summarize network usage. See Chapter 7 for examples.

Example:

 my $bytes_sent = $r->bytes_sent;

cgi_header_out()

This method is similar to the header_out() function. Given a key/value pair, it sets the corresponding outgoing HTTP response header field to the indicated value, replacing whatever was there before. However, unlike header_out(), which blindly sets the field to whatever you tell it, cgi_header_out() recognizes certain special keys and takes the appropriate action. This is used to emulate the magic fields recognized by Apache's own mod_cgi CGI-handling routines.

Table 9.3 lists the headers that trigger special actions by cgi_header_out().

Table 9.3: Special Actions Triggered by cgi_header_out()

 Header            | Actions
 -----------------------------------------------------------------------------
 Content-Type      | Set $r->content_type to the given value
 Status            | Set $r->status to the integer value in the string
                   | Set $r->status_line to the given value
 Location          | Set Location in the headers_out table to the given value
                   | and perform an internal redirect if URI is relative
 Content-Length    | Set Content-Length in the headers_out table to the
                   | given value
 Transfer-Encoding | Set Transfer-Encoding in the headers_out table to
                   | the given value
 Last-Modified     | Parse the string date, feeding the time value to
                   | ap_update_mtime() and invoke ap_set_last_modified()
 Set-Cookie        | Call ap_table_add() to support multiple Set-Cookie headers
 Other             | Call ap_table_merge() with given key and value

You generally can use the Apache::Table or header_out() methods to achieve the results you want. cgi_header_out() is provided for those who wish to create a CGI emulation layer, such as Apache::Registry. Those who are designing such a system should also look at send_cgi_header(), described below in Sending Data to the Client.
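
To make the dispatch in Table 9.3 concrete, here is a toy model in plain Perl. It is a sketch under several assumptions, not Apache's code: a plain hash stands in for the request record, toy_cgi_header_out() is an invented name, keys are assumed to match case-insensitively, "merge" is assumed to join values with a comma, and the internal redirect and date parsing are left out.

```perl
use strict;
use warnings;

my %PLAIN_SET = map { $_ => 1 }
    qw(location content-length transfer-encoding last-modified);

# Toy model of the special-key dispatch; $req is a plain hash reference.
sub toy_cgi_header_out {
    my ($req, $key, $value) = @_;
    my $lkey = lc $key;
    if ($lkey eq 'content-type') {
        $req->{content_type} = $value;
    }
    elsif ($lkey eq 'status') {
        ($req->{status}) = $value =~ /^(\d+)/;    # integer part of the string
        $req->{status_line} = $value;
    }
    elsif ($PLAIN_SET{$lkey}) {
        # Plain set; redirect handling and date parsing are omitted here.
        $req->{headers_out}{$key} = $value;
    }
    elsif ($lkey eq 'set-cookie') {
        push @{ $req->{headers_out}{$key} }, $value;   # add: keep every value
    }
    else {
        # merge: append to any existing value
        my $old = $req->{headers_out}{$key};
        $req->{headers_out}{$key} = defined $old ? "$old, $value" : $value;
    }
    return;
}

my %req;
toy_cgi_header_out(\%req, 'Content-Type', 'text/html');
toy_cgi_header_out(\%req, 'Status',       '404 Not Found');
toy_cgi_header_out(\%req, 'Set-Cookie',   'a=1');
toy_cgi_header_out(\%req, 'Set-Cookie',   'b=2');
toy_cgi_header_out(\%req, 'X-Tag',        'one');
toy_cgi_header_out(\%req, 'X-Tag',        'two');
```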

content_encoding()

This method gets or sets the document encoding. Content encoding fields are strings like ``gzip'' or ``compress'', and indicate that the document has been compressed or otherwise encoded. Browsers that handle the particular encoding scheme can decode or decompress the document on the fly.

Getting or setting content_encoding() is equivalent to using headers_out() or header_out() to change the value of the ``Content-encoding'' header. Chapters 4 and 7 give examples of querying and manipulating the content encoding field.

Examples:

 my $enc = $r->content_encoding;
 if ($r->filename =~ /\.gz$/) {
     $r->content_encoding("gzip");
 }

content_languages()

The content_languages() method gets or sets the ``Content-language'' HTTP header field. Called without arguments, it returns an array reference consisting of two-letter language identifiers, for example ``en'' for English and ``no'' for Norwegian. You can also pass it an array reference to set the list of languages to a new value. This method can be used to implement support for multi-language documents. See the Apache::MIME module in Chapter 7 for an example.

content_languages() is a convenient interface to the lower-level header_out and headers_out methods.

Examples:

 my $languages = $r->content_languages;
 $r->content_languages(['en']);

content_type()

This method corresponds to the Content-type header field, which tells the browser the MIME type of the returned document. Common MIME types include ``text/plain'', ``text/html'' and ``image/gif''. content_type() can be used either to get or set the current value of this field. It is important to use content_type() to set the content type rather than calling headers_out() or header_out() to change the outgoing HTTP header directly. This is because a copy of the content type is kept in the request record, and other modules and core protocol components will consult this value rather than the outgoing headers table.

Examples:

 my $ct = $r->content_type;
 $r->content_type('text/plain');

custom_response()

When a handler returns a code other than OK, DECLINED or DONE, Apache aborts processing and throws an error. When an error is thrown, application programs can catch it and replace Apache's default processing with their own custom error handling routines by using the ErrorDocument configuration directive. The arguments to ErrorDocument are the status code to catch and a custom string, static document, or CGI script to invoke when the error occurs.

The module-level interface to Apache's error handling system is custom_response(). Like the directive, the method call takes two arguments.* The first argument is a valid response code from Table 3.1. The second is either a string to return in response to the error, or a URI to invoke to handle the request. This URI can be a static document, a CGI script, or even a content handler in an Apache module. Chapters 4 and 6 have more extensive coverage of the error handling system.

Examples:

 use Apache::Constants qw(:common);
 $r->custom_response(AUTH_REQUIRED, "sorry, I don't know you.");
 $r->custom_response(SERVER_ERROR, "/perl/server_error_handler.pl");

footnote: *Of course, the method actually takes 3 arguments, the first of which is the request_rec object, but you know what we mean.

err_headers_out()

Apache actually keeps two sets of outgoing response headers, one set to use when the transaction is successful, and another to use in the case of a module returning an error code. Although maintaining a dual set of headers may seem redundant, it makes custom error handlers much easier to write, as we shall see in the next chapter. err_headers_out() is equivalent to headers_out(), but it gets and sets values in the table of HTTP header response fields that are sent in the case of an error.

Unlike ordinary header fields, error fields are sent to the browser even when the module aborts or returns an error status code. This allows modules to do such things as setting cookies when errors occur, or implementing custom authorization schemes. Error fields also persist across internal redirects when one content handler passes the buck to another. This feature is necessary to support the ErrorDocument mechanism.

Examples:

 my %err_headers_out = $r->err_headers_out;
 my $err_headers_out = $r->err_headers_out;
 $r->err_headers_out->{'X-Odor'} = "Something's rotten in Denmark";
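
The relationship between the two tables can be pictured with a toy model. This sketch uses two plain hashes rather than Apache's tables, and it assumes that on success both sets of fields reach the browser while on error only the error fields do; headers_to_send() and the header values are illustrative.

```perl
use strict;
use warnings;

# Toy model: decide which fields reach the browser.
sub headers_to_send {
    my ($headers_out, $err_headers_out, $is_error) = @_;
    # On error the ordinary fields are dropped; the error fields
    # are included either way (an assumption of this sketch).
    my %sent = $is_error ? () : %$headers_out;
    %sent = (%sent, %$err_headers_out);
    return \%sent;
}

my %headers_out     = ('X-Page-Id'  => '17');
my %err_headers_out = ('Set-Cookie' => 'SESSION=1A91933A');

my $ok    = headers_to_send(\%headers_out, \%err_headers_out, 0);
my $error = headers_to_send(\%headers_out, \%err_headers_out, 1);

print "on success: ", join(', ', sort keys %$ok), "\n";
print "on error:   ", join(', ', sort keys %$error), "\n";
```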

err_header_out()

Like the header_in() and header_out() methods, err_header_out() predates the Apache::Table class. It can be used to get or set a single field in the error headers table. As with the other header methods, the key lookups are done in a case insensitive manner. Its syntax is identical to that of header_out().

Example:

 my $loc = $r->err_header_out('Location');
 $r->err_header_out(Location => 'http://www.modperl.com/');
 $r->err_header_out(Location => undef);

headers_out()

headers_out() provides modules with the ability to get or set any of the outgoing HTTP response header fields. When called in a list context, headers_out() returns a list of key/value pairs corresponding to the current server response headers. The capitalization of the field names is not canonicalized prior to copying them into the list.

When called in a scalar context, this method returns a hash reference tied to the Apache::Table class. This class provides an interface to the underlying headers_out data structure. Fetching a key from the tied hash will retrieve the corresponding HTTP field in a case insensitive fashion, and assigning to the hash will change the value of the header so that it is seen by other handlers further down the line, and ultimately affects the header that is sent to the browser.

The headers that are set with headers_out() are cleared when an error occurs, and do not persist across internal redirects (in which a module hands off its content-handling responsibility to a different URI). To create headers that persist across errors and internal redirects, use err_headers_out(), described above.

Examples:

 my %headers_out = $r->headers_out;
 my $headers_out = $r->headers_out;
 $headers_out->{'Set-Cookie'} = 'SESSION_ID=3918823';

The ``Content-type'', ``Content-encoding'' and ``Content-language'' response fields have special meaning to the Apache server and its modules. These fields occupy their own slots of the request record itself and should always be accessed using their dedicated methods rather than the generic headers_out() method. If you forget, and use headers_out() instead, Apache and other modules may not recognize your changes, leading to confusing results. In addition, the ``Pragma: no-cache'' idiom, used to tell browsers not to cache the document, should be set indirectly using the no_cache() method.
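As a rough sketch of this advice, the helper below describes a document entirely through the dedicated methods before sending the header. The helper name and the particular values are hypothetical; the methods themselves are the ones named in the preceding paragraph, plus send_http_header(), which is described later in this chapter:

```perl
use strict;
use warnings;

# Hypothetical helper: describe a volatile HTML document using the
# dedicated methods rather than the generic headers_out() table.
sub send_volatile_html_header {
    my ($r) = @_;
    $r->content_type('text/html');    # rather than headers_out->{'Content-type'}
    $r->content_language('en');       # rather than headers_out->{'Content-language'}
    $r->no_cache(1);                  # rather than a hand-written Pragma field
    $r->send_http_header;
}
```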

The many features of the Apache::Table class are described in more detail in its own section.

header_out()

Like the header_in() method, header_out() predates the Apache::Table class, when it was the way to get or set the value of an individual HTTP field. It remains for backwards compatibility and as a bit of a shortcut to using the headers_out() method.

If passed a single argument, header_out() returns the value of the corresponding field from the outgoing HTTP response header. If passed a key/value pair, header_out() changes the value of the corresponding header field. A field can be removed entirely by passing undef as its value. The key lookups are done in a case insensitive manner.

Examples:

 my $loc = $r->header_out('Location');
 $r->header_out(Location => 'http://www.modperl.com/');
 $r->header_out(Location => undef);

handler()

The handler method gets or sets the name of the module that is responsible for the content generation phase of the current request. For example, for requests to run CGI scripts, this will be the value ``cgi-script.'' Ordinarily this value is set in the configuration file using the SetHandler or AddHandler directives. However your handlers can set this value during earlier phases of the transaction, typically the MIME type checking or fixup phases.

Chapter 7 gives examples of how to use handler() to create a handler that dispatches to other modules based on the document's type.

Example:

 my $handler = $r->handler;
 if ($handler eq "cgi-script") {
     warn "shame on you. Fixing.\n";
     $r->handler('perl-script');
 }

handler() cannot be used to set handlers for anything but the response phase. Use set_handlers() or push_handlers() to change the handlers for other phases (see mod_perl Specific Methods).

no_cache()

The no_cache() method gets or sets a boolean flag that indicates that the data being returned is volatile. Browsers that respect this flag will avoid writing the document out to the client-side cache. Setting this flag to true will cause Apache to emit an ``Expires'' field with the same date and time as the original request.

Examples:

 my $current_flag = $r->no_cache();
 $r->no_cache(1); # set no-cache to true

request_time()

This method returns the time at which the request started, expressed as a Unix timestamp in seconds since the start of an arbitrary period called the ``epoch''.* You can pass this to Perl's localtime() function to get a human readable string, or to any of the available time and date handling Perl modules to manipulate it in various ways.

Unlike most of the other methods, this one is read only.

Example:

 my $date = scalar localtime $r->request_time;
 warn "request started at $date";

footnote: *In case you were wondering, the epoch began at 00:00:00 GMT on January 1, 1970, and is due to end in 2038. There's probably a good explanation for this choice.

status()

The status() method allows you to get or set the status code of the outgoing HTTP response. Usually you will set this value indirectly by returning the status code as the handler's function result. However, there are rare instances when you want to trick Apache into thinking that the module returned an OK status code, but actually send the browser a non-OK status.

Call the method with no arguments to retrieve the current status code. Call it with a numeric value to set the status. Constants for all the standard status codes can be found in Apache::Constants.

Examples:

 use Apache::Constants qw(:common);

 my $rc = $r->status;
 $r->status(SERVER_ERROR);

status_line()

status_line() is used to get or set the error code and the human-readable status message that gets sent to the browser. Ordinarily you should use status() to set the numeric code and let Apache worry about translating this into a human readable string. However, if you want to generate an unusual response line, you can use this method to set the line. To be successful, the response line must begin with one of the valid HTTP status codes.

Example:

 my $status_line = $r->status_line;
 $r->status_line("200 Bottles of Beer on the Wall");

If you update the status line, you probably want to update status() accordingly as well.

Sending Data to the Client

The methods in this section are invoked by content handlers to send header and document body data to the waiting browser. Non-content handlers should not call these methods.

print()

The Apache C API provides several functions for sending formatted data to the client. However, Perl is more flexible in its string handling functions, so only one method, print(), is needed.

The print() method is similar to Perl's built-in print() function except that all the data you print eventually winds up being displayed on the user's browser. Like the built-in print() this method will accept a variable number of strings to print out. However, the Apache print() method does not accept a filehandle argument for obvious reasons.

Like the read() method, print() sets a timeout so that if the client connection is broken the handler won't hang around indefinitely trying to send data. If a timeout does occur, the script will be aborted.

The method also checks the Perl autoflush global $|. If the variable is non-zero, print() will flush the buffer after every command, rather than after every line. This is consistent with the way the built-in print() works.

Example:

 $r->print("hello" , " ", "world!");

An interesting feature of the Apache Perl API is that the STDOUT filehandle is tied to Apache so that if you use the built-in print() to print to standard output, the data will be redirected to the request object's print() method. This allows CGI scripts to run unmodified under Apache::Registry, and also allows one content handler's output to be transparently ``chained'' to another handler's input. The TieHandle Interface section later in this chapter goes into more detail on how filehandles can be tied to the Perl API, and Chapter 4 has more to say about chained handlers.

Example:

 print "hello world!"; # automatically invokes Apache::print()

There is also an optimization built into print(). If any of the arguments to the method are scalar references to strings, they are automatically dereferenced for you. This avoids needless copying of large strings when passing them to subroutines.

Example:

 $a_large_string = join '', <GETTYSBURG_ADDRESS>;
 $r->print(\$a_large_string);

printf()

The printf() method works just like the built-in function of the same name, except that the data is sent to the client. Calling the built-in printf() on STDOUT will indirectly invoke this method because STDOUT is tied.

Example:

 $r->printf("Hello %s", $r->connection->user);

rflush()

For efficiency's sake, Apache usually buffers the data printed by the handler and sends it to the client only when its internal buffers fill (or the handler is done). The rflush() method causes Apache to flush and send its buffered outgoing data immediately. You may wish to do this if you have a long-running content handler and you wish the client to see the data start to appear sooner.

Don't call rflush() if you don't need to, as it causes a performance hit.* This method is also called automatically after each print() if the Perl global variable $| is set to non-zero.

Example:

 $r->rflush;
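For the long-running handler case, a sketch along the following lines may help. The helper name print_progress and the idea of passing the slow steps as code references are our own assumptions, not part of the API:

```perl
use strict;
use warnings;

# Hypothetical helper: run a list of slow steps, letting the client see
# each result as soon as it is ready. $steps is an array reference of
# code references, each of which returns one line of text.
sub print_progress {
    my ($r, $steps) = @_;
    for my $step (@$steps) {
        $r->print($step->(), "\n");
        $r->rflush;    # send this line now instead of waiting for a full buffer
    }
    return scalar @$steps;
}
```

Flushing once per step, rather than once per print() call, keeps the number of flushes (and the associated performance cost) proportional to the number of visible progress updates.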

footnote: *If you are wondering why this method has an r prefix, it is carried over from the C API I/O methods (described in Chapter 10), all of which have an ap_r prefix. This is the only I/O method from the group for which there is a direct Perl interface. If you find that the r prefix is not pleasing to the eye, this is no accident. It is intended to discourage the use of rflush() due to the performance implications.

send_cgi_header()

As we mentioned in the section on cgi_header_out(), the mod_cgi module scans for and takes special action on certain header fields emitted by CGI scripts. Developers who wish to develop a CGI emulation layer can take advantage of send_cgi_header(). It accepts a single string argument formatted like a CGI header, parses it into fields, and passes the parsed fields to cgi_header_out(). cgi_header_out() then calls send_http_header() to send the completed header to the browser.

Don't forget to put a blank line at the end of the headers, just as a CGI script would:

 $r->send_cgi_header(<<EOF);
 Status: 200 Just Fine
 Content-type: text/html
 Set-Cookie: open=sesame

EOF

You're welcome to use this method even if you aren't emulating the CGI environment, since it provides a convenient one-shot way to set and send the entire HTTP header. However, there is a performance hit associated with parsing the header string.

As an aside, this method is used to implement the behavior of the PerlSendHeader directive. When this directive is set to ``On'', mod_perl scans the first lines of text printed by the content handler until it finds a blank line. Everything above the blank line is then sent to send_cgi_header().

send_fd()

Given an open filehandle, filehandle glob or glob reference as argument, this method sends the contents of the file to the client. Internally the Perl interface extracts the file descriptor from the filehandle and uses that directly, which is generally faster than calling the higher-level Perl methods. The confusing naming of this method (it takes a filehandle, not a file descriptor) is to be consistent with the naming of the corresponding C API function call.

This method is generally used by content handlers that wish to send the browser the unmodified contents of a file.

Example:

 my $fh = Apache::gensym(); # generate a new filehandle name
 open($fh, $r->filename) || return NOT_FOUND;
 $r->send_fd($fh);
 close($fh);
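A slightly fuller sketch combines send_fd() with send_http_header(), which is described in the next section. It assumes a Perl recent enough to support lexical filehandles (which are glob references) and uses a hypothetical helper name; the helper returns a simple true or false value so that the choice of status code stays with the calling handler:

```perl
use strict;
use warnings;

# Hypothetical helper: send the file mapped to the current request.
# Returns 1 on success and 0 if the file could not be opened.
sub send_request_file {
    my ($r, $type) = @_;
    open(my $fh, '<', $r->filename) or return 0;
    $r->send_http_header($type);    # header first, then the body
    $r->send_fd($fh);
    close($fh);
    return 1;
}
```

A content handler could then finish with something like return send_request_file($r, 'text/plain') ? OK : NOT_FOUND; with both constants imported from Apache::Constants.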

send_http_header()

This method formats the outgoing response data into a proper HTTP response and sends it to the client. The header is constructed from values previously set by calls to content_type(), content_encoding(), content_language(), status_line(), and headers_out(). Naturally, this method should be called before any other methods for sending data to the client.

Because setting the document's MIME type is such a common operation, the Perl version of this API call allows you to save a few keystrokes by specifying the content type as an optional argument to send_http_header(). This is exactly equivalent to calling content_type() followed by send_http_header().

Examples:

 $r->send_http_header;
 $r->send_http_header('text/plain');

A content type passed to send_http_header() will override any previous calls to content_type().

Server Core Functions

This section covers the API methods that are available for your use during the processing of a request, but are not directly related to incoming or outgoing data.

chdir_file()

Given a filename as argument, change from the current directory to the directory in which the file is contained. This is a convenience routine for modules that implement scripting engines, since it is common to run the script from the directory in which it lives. The current directory will remain here, unless your module changes back to the previous directory. As there is significant overhead associated with determining the current directory, we suggest using the $Apache::Server::CWD variable or the server_root_relative() method if you wish to return to the previous directory afterward.

Example:

 $r->chdir_file($r->filename);

child_terminate()

Calling this method will cause the current child process to shut down gracefully after the current transaction is completed and the logging and cleanup phases are done. This method is not available on Win32 systems.

Example:

 $r->child_terminate;

hard_timeout()

kill_timeout()

reset_timeout()

soft_timeout()

The timeout API governs the interaction of Apache with the client. At various points during the request/response cycle a browser that is no longer responding can be timed out so that it doesn't continue to hold the connection open. Timeouts are primarily of concern to C API programmers, as mod_perl handles the details of timeouts internally for read and write methods. However, these calls are included in the Perl API for completeness.

The hard_timeout() method initiates a ``hard'' timeout. If the client read or write operation takes longer than the time specified by Apache's Timeout directive, then the current handler will be aborted immediately and Apache will immediately enter the logging phase. hard_timeout() takes a single string argument which should contain the name of your module or some other identification. This identification will be incorporated into the error message that is written to the server error log when the timeout occurs.

soft_timeout(), in contrast, does not immediately abort the current handler. Instead, when a timeout occurs control returns to the handler, but all read and write operations are replaced with no-ops so that no further data can be sent to or received from the client. In addition, the Apache::Connection object's aborted() method will return true. As with hard_timeout(), you should pass this method the name of your module in order to be able to identify the source of the timeout in the error log.

The reset_timeout() method can be called to set a previously initiated timer back to zero. It is usually used between a series of read or write operations in order to avoid killing the timeout and restarting it completely.

Finally, the kill_timeout() method is called to cancel a previously initiated timeout. It is generally called when a series of I/O operations are completely done.

The examples below will give you the general idea of how these four methods are used. Remember, however, that in the Perl API these methods are not really necessary because they are called internally by the read() and print() methods.

 # typical hard_timeout() usage
 $r->hard_timeout("Apache::Example while reading data");
 while (... read data loop ...) {
     ...
     $r->reset_timeout;
 }
 $r->kill_timeout;

 # typical soft_timeout() usage
 $r->soft_timeout("Apache::Example while reading data");
 while (... read data loop ...) {
     ...
     $r->reset_timeout;
 }
 $r->kill_timeout;

internal_redirect()

Unlike a full HTTP redirect in which the server tells the browser to look somewhere else for the requested document, the internal_redirect() method tells Apache to return a different URI without telling the client. This is a lot faster than a full redirect.

The required argument is an absolute URI path on the current server. The server will process the URI as if it were a whole new request, running the URI translation, MIME type checking, and other phases before invoking the appropriate content handler for the new URI. The content handler that eventually runs is not necessarily the same as the one that invoked internal_redirect(). This method should only be called within a content handler.

Do not use internal_redirect() to redirect to a different server. You'll need to do a full redirect for that. Both redirection techniques are described in more detail in the next chapter.

Example:

 $r->internal_redirect("/new/place");

Apache implements its ErrorDocument feature as an internal redirect, so many of the techniques that apply to internal redirects also apply to custom error handling.

internal_redirect_handler()

This method does the same thing as internal_redirect(), but arranges for the content handler used to process the redirected URI to be the same as the current content handler.

Example:

 $r->internal_redirect_handler("/new/place");

is_initial_req()

There are several instances in which an incoming URI request can trigger one or more secondary internal requests. An internal request is triggered when internal_redirect() is called explicitly, and also happens behind the scenes when lookup_file() and lookup_uri() are called.

With the exception of the logging phase, which is run just once for the primary request, secondary requests are run through each of the transaction processing phases, and the appropriate handlers are called each time. There may be times when you don't want a particular handler running on a subrequest or internal redirect, either to avoid performance overhead or to avoid infinite recursion. The is_initial_req() method will return a true value if the current request is the primary one, and false if the request is the result of a subrequest or an internal redirect.

Example:

 return DECLINED unless $r->is_initial_req;

is_main()

This method can be used to distinguish between subrequests triggered by handlers and the ``main'' request triggered by a browser's request for a URI or an internal redirect. is_main() returns a true value for the primary request and for internal redirects, and false for subrequests. Notice that this is slightly different from is_initial_req(), which returns false for internal redirects as well as subrequests.

is_main() is commonly used to prevent infinite recursion when a handler gets reinvoked after it has made a subrequest.

 return DECLINED unless $r->is_main;

Like is_initial_req() this is a read-only method.

last()

main()

next()

prev()

When a handler is called in response to a series of internal redirects, ErrorDocuments or subrequests, it is passed an ordinary-looking request object and can usually proceed as if it were processing a normal request. However, if a module has special needs, it can use these methods to walk the chain to examine the request objects passed to other requests in the series.

main() will return the request object of the parent request, the top of the chain. last() will return the last request in the chain. prev() and next() will return the previous and next requests in the chain, respectively. Each of these methods will return a reference to an object belonging to the Apache class, or undef if the request doesn't exist.

The prev() method is handy inside an ErrorDocument handler to get at the information from the request that triggered the error. For example, this code fragment will find the URI of the failed request:

 my $failed_uri = $r->prev->uri;
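Because prev() returns undef when there is no earlier request in the chain, an ErrorDocument handler that might also be requested directly may want to guard the call. A minimal sketch, with a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper for an ErrorDocument handler: return the URI of
# the request that failed, or undef when there is no previous request
# (for example, when the error page itself was requested directly).
sub failed_uri {
    my ($r) = @_;
    my $prev = $r->prev;
    return defined $prev ? $prev->uri : undef;
}
```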

The last() method is mainly used by logging modules. Since Apache may have performed several subrequests while attempting to resolve the request, the last object will always point to the final result.

Example:

 my $bytes_sent = $r->last->bytes_sent;

Should your module wish to log all internal requests, the next() method will come in handy. Example:

 use Apache::Constants qw(OK);

 sub My::logger {
     my $r = shift;

     my $first = $r->uri;
     my $last = $r->last->uri;
     warn "first: $first, last: $last\n";

     # walk the chain from the initial request to the final one
     for (my $rr = $r; $rr; $rr = $rr->next) {
         my $uri = $rr->uri;
         my $status = $rr->status;
         warn "request: $uri, status: $status\n";
     }

     return OK;
 }

Assuming the requested URI was /, which was mapped to /index.html by the DirectoryIndex configuration, the example above would output these messages to the ErrorLog:

 first: /, last: /index.html
 request: /, status: 200
 request: /index.html, status: 200

The next() and main() methods are rarely used, but are included for completeness. Handlers that need to determine whether they are in the main request should call $r->is_main() rather than !$r->main(), as the former is marginally more efficient.

location()

If the current handler was triggered by a Perl*Handler directive within a <Location> section, this method will return the path indicated by the <Location> directive.

For example, given this <Location> section:

 <Location /images/dynamic_icons>
 SetHandler perl-script
 PerlHandler Apache::Icon
 </Location>

then location() will return /images/dynamic_icons.

This method is handy for converting the current document's URI into a relative path. Example:

 my $base = $r->location;
 (my $relative = $r->uri) =~ s/^\Q$base\E//;  # \Q...\E protects any regex metacharacters in the path

lookup_file()

lookup_uri()

lookup_file() and lookup_uri() invoke Apache subrequests. A subrequest is treated exactly like an ordinary request, except that the post read request, header parser, response generation and logging phases are not run. This allows modules to pose ``what-if'' questions to the server. Subrequests can be used to learn the MIME type mapping of an arbitrary file, map a URI to a filename, or find out whether a file is under access control. After a successful lookup, the response phase of the request can optionally be invoked.

Both methods take a single argument corresponding to an absolute filename or a URI path respectively. lookup_uri() performs the URI translation on the provided URI, passing the request to the access control and authorization handlers, if any, and then proceeds to the MIME type checking phase. lookup_file() behaves similarly, but bypasses the initial URI translation phase and treats its argument as a physical file path.

Both methods return an Apache::SubRequest object, which is identical for all intents and purposes to a plain old Apache request object, as it inherits all methods from the Apache class. You can call the returned object's content_type(), filename() and other methods to retrieve the information left there during subrequest processing.

The subrequest mechanism is extremely useful, and there are many practical examples of using it in Chapters 4, 5 and 6. The following code snippets show how to use subrequests to look up the content type of a file and a URI:

 my $subr = $r->lookup_file('/home/http/htdocs/images/logo.tif');
 my $ct = $subr->content_type;

 my $ct = $r->lookup_uri('/images/logo.tif')->content_type;

In the lookup_uri() example, /images/logo.tif will be passed through the same series of Alias, ServerRoot and URI rewriting translations that the URI would be subjected to if it were requested by a browser.

If you need to pass certain HTTP header fields to the subrequest, such as a particular value of Accept, you can do so by calling headers_in() before invoking lookup_uri() or lookup_file().

It is often a good idea to check the status of a subrequest in case something went wrong. If the subrequest was successful, the status value will be that of HTTP_OK. Example:

 use Apache::Constants qw(:common HTTP_OK);
 my $subr = $r->lookup_uri("/path/file.html");
 my $status = $subr->status;

 unless ($status == HTTP_OK) {
 die "subrequest failed with status: $status";
 }
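The status check and the lookup can be wrapped together. The following is only a sketch: the helper name is hypothetical, and the local HTTP_OK constant is a stand-in for the one exported by Apache::Constants so that the fragment compiles on its own:

```perl
use strict;
use warnings;

# Stand-in for Apache::Constants::HTTP_OK; a real module would import it.
use constant HTTP_OK => 200;

# Hypothetical helper: map a URI to a physical filename with a
# subrequest, returning nothing if the lookup did not succeed
# (for instance, because access control refused it).
sub uri_to_filename {
    my ($r, $uri) = @_;
    my $subr = $r->lookup_uri($uri);
    return unless $subr->status == HTTP_OK;
    return $subr->filename;
}
```

Returning nothing rather than dying lets the caller decide whether a failed lookup is fatal.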

notes()

There are times when handlers need to communicate among themselves in a way that goes beyond setting the values of HTTP header fields. To accommodate this, Apache maintains a ``notes'' table in the request record. This table is simply a list of key/value pairs. One handler can add its own key/value entry to the notes table, and later the handler for a subsequent phase can retrieve the note. Notes are maintained for the life of the current request, and are deleted when the transaction is finished.

When called with two arguments this method sets a note. When called with a single argument, it retrieves the value of that note. Both the keys and the values must be simple strings.

Examples:

 $r->notes('CALENDAR' => 'Julian');
 my $cal = $r->notes('CALENDAR');

When called in a scalar context with no arguments, a hash reference tied to the Apache::Table class will be returned. Example:

 my $notes = $r->notes;
 my $cal = $notes->{CALENDAR};

This method comes in handy for communication between a module written in Perl and one written in C. For example, the logging API saves error messages under a key named ``error-notes'', which could be used by ErrorDocuments to provide a more informative error message.

The LogFormat directive, part of the standard mod_log_config module, can incorporate notes into log messages using the formatting character %n. See the Apache documentation for details.

subprocess_env()

The subprocess_env() method is used to examine and change the Apache environment table. Like other table-manipulation functions, this method has a variety of behaviors depending on the number of arguments it is called with and the context in which it is called. Call the method with no arguments in a scalar context to return a hash reference tied to the Apache::Table class:

 my $env = $r->subprocess_env;
 my $docroot = $env->{'DOCUMENT_ROOT'};

Call the method with a single argument to retrieve the current value of the corresponding entry in the environment table, or undef if no entry by that name exists:

 my $doc_root = $r->subprocess_env("DOCUMENT_ROOT");

You may also call the method with a key/value pair to set the value of an entry in the table:

 $r->subprocess_env(DOOR => "open");

Finally, if you call subprocess_env() in a void context with no arguments, it will reinitialize the table to contain the standard variables that Apache adds to the environment before invoking CGI scripts and server-side include files:

 $r->subprocess_env;

Changes made to the environment table only persist for the length of the request. The table is cleared out and reinitialized at the beginning of every new transaction.

In the Perl API, the primary use for this method is to set environment variables for other modules to see and use. For example, a fixup handler could use this call to set up environment variables that are later recognized by mod_include and incorporated into server-side include pages. You do not ordinarily need to call subprocess_env() to read environment variables, because mod_perl automatically copies the environment table into the Perl %ENV hash before entering the response handler phase.

A potential confusion arises when a Perl API handler needs to launch a subprocess itself using system(), backticks, or a piped open. If you need to pass environment variables to the subprocess, set the appropriate keys in %ENV just as you would in an ordinary Perl script. subprocess_env() is only required if you need to change the environment in a subprocess launched by a different handler or module.
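To illustrate the fixup-handler case described above, here is a small sketch. The helper name, the SITE_SECTION variable and the rule for deriving it from the URI are all invented for the example:

```perl
use strict;
use warnings;

# Hypothetical fixup-phase helper: publish the first component of the
# URI path as an environment variable for later modules to pick up.
sub export_site_section {
    my ($r) = @_;
    my ($section) = $r->uri =~ m{^/([^/]+)};
    $section = 'home' unless defined $section;
    $r->subprocess_env(SITE_SECTION => $section);
    return $section;
}
```

A server-side include page processed by mod_include could then display the value with <!--#echo var="SITE_SECTION" -->.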

register_cleanup()

The register_cleanup() method registers a subroutine that will be called after the logging stage of a request. This is much the same as installing a cleanup handler with the PerlCleanupHandler directive. See Chapter 7 for some practical examples of using register_cleanup().

The method expects a code reference argument:

 sub callback {
     my $r = shift;
     my $uri = $r->uri;
     warn "process $$ all done with $uri\n";
 }
 $r->register_cleanup(\&callback);
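Since any code reference will do, an anonymous subroutine that closes over a local variable is often convenient. The sketch below, with a hypothetical helper name, arranges for a scratch file to be removed once the request is finished:

```perl
use strict;
use warnings;

# Hypothetical helper: remember $path in a closure and delete the file
# when the cleanup callback eventually runs.
sub remove_when_done {
    my ($r, $path) = @_;
    $r->register_cleanup(sub {
        unlink $path if -e $path;
    });
}
```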

Server Configuration Methods

Several methods give you access to the Apache server's configuration settings. You can inspect the configuration, and in many cases, change it dynamically. The most commonly-needed configuration information can be obtained directly from the methods given in this section. More esoteric information can be obtained via the Apache::Server object returned by the request object's server() method. See The Apache::Server Class for details.

dir_config()

The dir_config() method and the PerlSetVar configuration directive together are the primary way of passing configuration information to Apache Perl modules.

The PerlSetVar directive can occur in the main part of a configuration file, in a <VirtualHost>, <Directory>, <Location> or <Files> section, or in a .htaccess file. It takes a key/value pair separated by whitespace.

In the following two examples, the first directive sets a key named ``Gate'' to a value of ``open''. The second sets the same key to a value of ``wide open and beckoning''. Notice how quotes are used to protect arguments that contain whitespace:

 PerlSetVar Gate open
 PerlSetVar Gate "wide open and beckoning"

Configuration files can contain any number of PerlSetVar directives. If multiple directives try to set the same key, the usual rules of directive precedence apply. A key defined in a .htaccess file has precedence over a key defined in a <Directory>, <Location>, or <Files> section, which in turn has precedence over a key defined in a <VirtualHost> section. Keys defined in the main body of the configuration file have the lowest precedence of all.

Configuration keys set with PerlSetVar can be recovered within Perl handlers using dir_config(). The interface is simple. Called with the name of a key, dir_config() looks up the key and returns its value if found, or undef otherwise.

Example:

 my $value = $r->dir_config('Gate');

If called in a scalar context with no arguments, dir_config() returns a hash reference tied to the Apache::Table class. See The Apache::Table Class for details.

 my $dir_config = $r->dir_config; 
 my $value = $dir_config->{'Gate'};

Only scalar values are allowed in configuration variables set by PerlSetVar. If you want to pass an array or hash, separate the items by a character that doesn't appear elsewhere in the string and call split() to break the retrieved variable into its components.
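The split() technique can be sketched as follows. The helper name, the Colors key and the choice of a comma as the default separator are assumptions for illustration; the configuration file is presumed to contain a line such as PerlSetVar Colors "red, green, blue":

```perl
use strict;
use warnings;

# Hypothetical helper: fetch a PerlSetVar value and break it into a
# list, ignoring whitespace around the separator.
sub dir_config_list {
    my ($r, $key, $sep) = @_;
    $sep = ',' unless defined $sep;
    my $raw = $r->dir_config($key);
    return () unless defined $raw;
    return split /\s*\Q$sep\E\s*/, $raw;
}
```

A handler would then call my @colors = dir_config_list($r, 'Colors'); and receive an empty list if the key is not set.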

document_root()

The document_root() method returns the value of the document root directory. The value of the document root is set by the server configuration directive DocumentRoot, and usually varies between different virtual hosts. Apache uses the document root to translate the URI into a physical pathname unless a more specific translation rule, such as Alias, applies.

Example:

 my $doc_root = $r->document_root;

If you are used to using the environment variable DOCUMENT_ROOT within your CGI scripts in order to resolve URIs into physical pathnames, be aware that there's a much better way to do this in the Apache API. Perform a subrequest with the URI you want to resolve, and then call the returned object's filename() method. This works correctly even when the URI is affected by Alias directives or refers to user-maintained virtual directories:

 my $image = $r->lookup_uri('/~fred/images/cookbook.gif')->filename;

If you're interested in fetching the physical file corresponding to the current request, call the current request object's filename() method:

 my $file = $r->filename;

get_server_port()

This method returns the port number on which the server is listening.

Example:

 my $port = $r->get_server_port;

If UseCanonicalName is configured to be On (the default), this method will return the value of the Port configuration directive. If no Port directive is present, the default port 80 is returned. If UseCanonicalName is Off and the client sent a Host header, then the method returns the actual port specified there, regardless of the value of the Port directive.

get_server_name()

This read-only method returns the name of the server handling the request.

Example:

 my $name = $r->get_server_name;

This method is sensitive to the value of the UseCanonicalName configuration directive. If UseCanonicalName is On (the default), the method will always return the value of the current ServerName configuration directive. If UseCanonicalName is Off, then this method will return the value of the incoming request's Host header if present, or the value of the ServerName directive otherwise. These values can be different if the server has several different DNS names.

The lower-level server_name() method in the Apache::Server class always acts as if UseCanonicalName were on.
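One common use of get_server_name() and get_server_port() is to build a URL that refers back to the current document. The following sketch uses a hypothetical helper name and assumes a plain (non-SSL) server, so the scheme is hard-coded:

```perl
use strict;
use warnings;

# Hypothetical helper: build a self-referencing URL, leaving out the
# port number when it is the HTTP default. Assumes an http:// server.
sub self_url {
    my ($r) = @_;
    my $host = $r->get_server_name;
    my $port = $r->get_server_port;
    $host .= ":$port" unless $port == 80;
    return "http://$host" . $r->uri;
}
```

Because both methods honor UseCanonicalName, the resulting URL follows whatever naming policy the webmaster has configured.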

server_root_relative()

Called without any arguments, the server_root_relative() method returns the currently-configured ServerRoot directory (in which Apache's binaries, configuration files and logs commonly reside). If you pass this method a relative pathname, it will resolve the relative pathname to an absolute one based on the value of the server root. This is the preferred way to locate configuration and log files that are stored beneath the server root.

Examples:

 # return ServerRoot
 my $ServerRoot = $r->server_root_relative;

 # return $ServerRoot/logs/my.log
 my $log = $r->server_root_relative("logs/my.log");

The server_root_relative method can also be invoked without a request object by calling it directly from the Apache class. The example below, which might be found at the beginning of a Perl startup file, first imports the Apache module, and then uses server_root_relative() to add a site-specific library directory to the search path. It does this in a BEGIN {} block to ensure that this code is evaluated first. It then loads a local module named My::App, which presumably will be found in the site-specific directory.

 #!/usr/bin/perl
 # modify the search path
 BEGIN {
 use Apache ();
 use lib Apache->server_root_relative("lib/my_app");
 }
 use My::App ();

Logging Methods

This section covers request object methods that generate entries in the server error log. They are handy for debugging and error reporting. Prior to Apache 1.3, the error logging API was a very simple one that didn't distinguish between different levels of severity. Apache now has a more versatile logging API similar to the Unix syslog system.* Each entry is associated with a severity level from low (``debug'') to high (``emerg''). By adjusting the value of the LogLevel directive, the webmaster can control which error messages are recorded to the error log file.

First we cover the interface to the earlier API. Later we'll discuss the Apache::Log class, which implements the 1.3 interface.

footnote: *In fact, the loglevel API now provides direct syslog support. See the Apache documentation for the ErrorLog directive, which explains how to enable logging via syslog.

Pre 1.3 API Methods

log_error()

The log_error() method writes a nicely timestamped error message to the server error log. It takes one or more string arguments, concatenates them into a line, and writes out the result. This method logs at the ``error'' log level according to the newer API.

For example, this code:

 $r->log_error("Can't open index.html $!");

results in the following ErrorLog entry:

 [Tue Jul 21 16:28:51 1998] [error] Can't open index.html No such file or directory

log_reason()

The log_reason() method behaves like log_error() but generates additional information about the request that can help with the post-mortem. The format of the entries this method produces is:

 [$DATE] [error] access to $URI failed for $HOST, reason: $MESSAGE

where $DATE is the time and date of the request, $URI is the requested URI, $HOST is the remote host, and $MESSAGE is a message that you provide. For example, this code fragment:

 $r->log_reason("Can't open index.html $!");

might generate the following entry in the error log:

 [Tue Jul 21 16:30:47 1998] [error] access to /perl/index.pl
 failed for w15.yahoo.com, reason: Can't open index.html No such file
 or directory

The argument to log_reason() is the message you wish to display in the error log. If you provide an additional second argument, it will be displayed rather than the URI of the request. This is usually used to display the physical path of the requested file:

 $r->log_reason("Can't open file $!", $r->filename);

This type of log message is most often used by content handlers that need to open and process the requested file before transmitting it to the browser, such as server-side include systems.

warn()

warn() is similar to log_error(), but on post-1.3.0 versions of Apache it will result in the logging of a message only when LogLevel is set to warn or to a more verbose level (notice, info, or debug).

Example:

 $r->warn("Attempting to open index.html");

as_string()

The as_string() method is a handy debugging aid for working out obscure problems with HTTP headers. It formats the current client request and server response fields into an HTTP header, and returns it as a multi-line string. The request headers will come first, followed by a blank line, followed by the response. Here is an example of using as_string() within a call to warn() and the output it might produce:

 $r->warn("HTTP dump:\n", $r->as_string);

 [Tue Jul 21 16:51:51 1998] [warn] HTTP dump:
 GET /perl/index.pl HTTP/1.0
 User-Agent: lwp-request/1.32
 Host: localhost:9008

 200 OK
 Connection: close
 Content-Type: text/plain

The Apache::Log Class

Apache version 1.3 introduced the notion of a log level. There are eight log levels, ranging in severity from emerg to debug. When modules call the new API logging routines, they provide the severity level of the message. You can control which messages appear in the server error log by adjusting the new LogLevel directive. Messages greater than or equal to the severity level given by LogLevel appear in the error log. Messages below the cutoff are discarded.

The Apache::Log API provides eight methods named for each of the severity levels. Each acts like the request object's log_error() method, except that it logs the provided message using the corresponding severity level.

In order to use the new logging methods, you must use Apache::Log in the Perl startup file or within your module. You must then fetch an Apache::Log object by calling the log() method of either an Apache object ($r->log()) or an Apache::Server object ($r->server->log()). Both objects have access to the same methods described below. However, the object returned from $r->log() provides some additional functionality. It will include the client IP address, in dotted decimal form, with the log message. In addition, the message will be saved in the request's notes table, under a key named ``error-notes''. It is the equivalent of the C language API's ap_log_rerror() function (Chapter 10).

The methods described below can be called with one or more string arguments or a subroutine reference. If a subroutine reference is used, it is expected to return a string which will be used in the log message. The subroutine will only be invoked if the LogLevel is set to the given level or higher. This is most useful to provide verbose debugging information during development, while saving CPU cycles during production.
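To see why passing a subroutine reference saves work, here is a small standalone sketch of the idea. It does not use Apache::Log and is not how that class is implemented; make_logger() and the variable names are hypothetical, and messages are collected in an array rather than written to an error log.

```perl
use strict;
use warnings;

# Severity names from most to least severe, in the order the text lists them.
my @LEVELS = qw(emerg alert crit error warn notice info debug);
my %RANK;
@RANK{@LEVELS} = (0 .. $#LEVELS);

# Hypothetical stand-in for a log object: returns a closure that stores a
# message in @$sink only when its level passes the configured cutoff.
sub make_logger {
    my ($cutoff, $sink) = @_;
    return sub {
        my ($level, @args) = @_;
        return 0 if $RANK{$level} > $RANK{$cutoff};    # below the cutoff
        # Code references are called only here, after the level check.
        my @parts;
        for my $arg (@args) {
            push @parts, ref($arg) eq 'CODE' ? $arg->() : $arg;
        }
        push @$sink, "[$level] " . join('', @parts);
        return 1;
    };
}

my @lines;
my $calls = 0;
my $expensive = sub { $calls++; return "dump of a large structure" };

my $production = make_logger('warn', \@lines);
$production->('debug', $expensive);     # discarded, $expensive never runs
$production->('error', "Parse failed");

my $development = make_logger('debug', \@lines);
$development->('debug', $expensive);    # now the subroutine is invoked

print "$_\n" for @lines;
print "expensive sub ran $calls time(s)\n";
```

The point of the design is that the cost of building the message is paid only when the message will actually be recorded.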

log()

The log() method returns an object blessed into the Apache::Log class. log() is implemented both for the Apache class and for the Apache::Server class.

Example:

 use Apache::Log ();
 my $log = $r->log; # messages will include client ip address
 my $log = $r->server->log; # message will not include client ip address

emerg()

This logs the provided message at the emergency log level, a level ordinarily reserved for problems that render the server unusable.

 $log->emerg("Cannot open lock file!");

alert()

This logs the message using the alert level, which is intended for problems that require immediate attention.

 $log->alert("getpwuid: couldn't determine user name from uid");

crit()

This logs the message at the critical level, intended for severe conditions.

 $log->crit("Cannot open configuration database!");

error()

This logs the message at the error level, a catchall for non-critical error conditions.

 $log->error("Parse of script failed: $@");

warn()

The warn level is intended for warnings that may or may not require someone's attention.

 $log->warn("No database host specified, using default");

notice()

notice() is used for normal but significant conditions.

 $log->notice("Cannot connect to master database, trying slave $host");

info()

This method is used for informational messages.

 $log->info("CGI.pm version is old, consider upgrading") if
 $CGI::VERSION < 2.42;

debug()

This logs messages at the debug level, the lowest of them all. It is used for messages you wish to print during development and debugging. The debug level will also include the filename and line number of the caller in the log message.

 $log->debug("Reading configuration from file $fname");

 $log->debug(sub {
 "The request: " . $r->as_string;
 });

Access Control Methods

The Apache API provides several methods that are used for access control, authentication and authorization. We gave complete examples of using these methods in Chapter 6.

allow_options()

The allow_options() method gives module writers access to the per-directory Options configuration. It returns a bitmap in which a bit is set to one if the corresponding option is enabled. The Apache::Constants module provides symbolic constants for the various options when you import the tag :options. You will typically perform a bitwise AND (&) on the options bitmap to check which ones are enabled.

For example, a script engine such as Apache::Registry or Apache::SSI might want to check if it's OK to execute a script in the current location using this code:

 use Apache::Constants qw(:common :options);

 unless($r->allow_options & OPT_EXECCGI) {
 $r->log_reason("Options ExecCGI is off in this directory",
 $r->filename);
 return FORBIDDEN;
 }

A full list of option constants can be found in the Apache::Constants manual page.
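If bitmaps are unfamiliar, the following standalone sketch shows the bitwise AND test in isolation. The DEMO_OPT_* constants and their values are invented for this illustration; real handlers should import the OPT_* constants from Apache::Constants and take the bitmap from $r->allow_options.

```perl
use strict;
use warnings;

# Hypothetical flag values for illustration only; they are not claimed to
# match the values used by Apache.
use constant {
    DEMO_OPT_INDEXES  => 1,
    DEMO_OPT_INCLUDES => 2,
    DEMO_OPT_EXECCGI  => 8,
};

# Return true if every flag in $wanted is set in the $options bitmap.
sub has_options {
    my ($options, $wanted) = @_;
    my $masked = $options & $wanted;
    return $masked == $wanted;
}

# Pretend this value came back from allow_options().
my $options = DEMO_OPT_INDEXES | DEMO_OPT_EXECCGI;

print "ExecCGI is on\n"   if has_options($options, DEMO_OPT_EXECCGI);
print "Includes is off\n" unless has_options($options, DEMO_OPT_INCLUDES);
```

Comparing the masked value against $wanted, rather than just testing it for truth, lets the same helper check several options at once.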

auth_name()

This method will return the current value of the per-directory configuration directive AuthName, which is used in conjunction with password-protected directories. AuthName declares an authorization ``realm'', which is intended as a high-level grouping of an authentication scheme and a URI tree to which it applies.

If the requested file or directory is password protected, auth_name() will return the realm name. An authentication module can then use this realm name to figure out which database to authenticate the user against. This method can also be used to set the value of the realm for use by later handlers.

Examples:

 my $auth_name = $r->auth_name();
 $r->auth_name("Protected Area");

auth_type()

Password-protected files and directories will also have an authorization type, which is usually one of ``Basic'' or ``Digest.'' The authorization type is set with the configuration directive AuthType and retrieved with the API method auth_type(). Here's an example from a hypothetical authentication handler that can only authenticate using the Basic method:

 my $auth_type = $r->auth_type;
 unless (lc($auth_type) eq "basic") {
 $r->warn(__PACKAGE__, " can't handle AuthType $auth_type");
 return DECLINED;
 }

The differences between Basic and Digest authentication are discussed in Chapter 6.

get_basic_auth_pw()

The get_basic_auth_pw() method returns a two-element list. If the current request is protected with Basic authentication, the first element of the returned list will be OK and the second will be the plaintext password entered by the user. Other possible return codes include DECLINED, SERVER_ERROR and AUTH_REQUIRED, each of which is described in Chapter 6.

Example:

 my($ret, $sent_pw) = $r->get_basic_auth_pw;

You can get the username part of the pair by calling $r->connection->user as described in The Apache::Connection Class.

note_basic_auth_failure()

If a URI is protected by Basic authentication and the browser fails to provide a valid username/password combination (or none at all), authentication handlers are expected to call the note_basic_auth_failure() method. This sets up the outgoing HTTP headers in such a way that the user will be (re)challenged to provide his username and password for the current security realm. For example:

 my($ret, $sent_pw) = $r->get_basic_auth_pw; 
 unless($r->connection->user and $sent_pw) {
 $r->note_basic_auth_failure;
 $r->log_reason("Both a username and password must be provided");
 return AUTH_REQUIRED;
 }

Although it would make sense for note_basic_auth_failure() to return a status code of AUTH_REQUIRED, it actually returns no value.

requires()

This method returns information about each of the require directives currently in force for the requested URI. Since there may be many require directives, this method returns an array reference. Each item in the array is a hash that contains information about a different require directive. The format of this data structure is described in detail in Chapter 6, under A Gender-Based Authorization Module.

satisfies()

Documents can be under access control (e.g., access limited by hostname or password) and authentication/authorization control (password protection) simultaneously. The satisfy directive determines how Apache combines the two types of restriction. If Satisfy All is specified, Apache will not grant access to the requested document unless both the access control and authentication/authorization rules are satisfied. If Satisfy Any is specified, the remote user is allowed to retrieve the document if he meets the requirements of either one of the restrictions.

Authorization and access control modules gain access to this configuration variable through the satisfies() method. It will return one of the three constants SATISFY_ALL, SATISFY_ANY or SATISFY_NOSPEC. The latter is returned when there is no applicable satisfy directive at all. These constants can be imported by requesting the ``:satisfy'' tag from Apache::Constants.

The following code fragment illustrates an access control handler that checks the status of the satisfy directive. If the current document is forbidden by access control rules the code checks whether satisfy any is in effect, and if so, whether authentication is also required (using the some_auth_required() method call described next). Unless both these conditions are true, the handler logs an error message. Otherwise it just returns the result code, knowing that any error logging will be performed by the authentication handler.

 use Apache::Constants qw(:common :satisfy);

 if ($ret == FORBIDDEN) {
 $r->log_reason("Client access denied by server configuration")
 unless $r->satisfies == SATISFY_ANY && $r->some_auth_required;
 return $ret;
 }

some_auth_required()

If the configuration for the current request requires some form of authentication or authorization, this method returns true. Otherwise it returns an undef value.

Example:

 unless ($r->some_auth_required) {
 $r->log_reason("I won't go further unless the user is authenticated");
 return FORBIDDEN;
 }

mod_perl Specific Methods

There are a handful of Perl API methods for which there is no C language counterpart. Those who are only interested in learning the C API can skip this section.

exit()

It is common to come across Perl CGI scripts that use the Perl builtin exit() function to leave the script prematurely. Calling exit() from within a CGI script, which owns its process, is harmless, but calling exit() from within mod_perl would have the unfortunate effect of making the entire child process exit unceremoniously, in most cases before completing the request or logging the transaction. On Win32 systems, calling exit() will make the whole server quit. Oops!

For this reason mod_perl's version of this function call, Apache::exit(), does not cause the process to exit. Instead, it calls Perl's croak() function to halt script execution, but does not log a message to the ErrorLog. If you really want the child server process to exit, call Apache::exit() with an optional status argument of DONE (available in Apache::Constants). The child process will be shut down, but only after it has had a chance to properly finish handling the current request.

In scripts running under Apache::Registry, Perl's built-in exit() is overridden by Apache::exit() so that legacy CGI scripts don't inadvertently shoot themselves in the foot. In Perl versions 5.005 and higher, exit() is overridden everywhere, including within handlers. In versions of mod_perl built with Perl 5.004 handlers can still inadvertently invoke the built-in exit(), so you should be on the watch for this mistake. One way to avoid it is to explicitly import the ``exit'' symbol when you load the Apache module.

Here are various examples of exit():

 $r->exit;
 Apache->exit;
 $r->exit(0);
 $r->exit(DONE);

 use Apache 'exit'; # this overrides Perl's builtin
 exit;

If a handler needs direct access to the Perl builtin version of exit() after it has imported Apache's version, it should call CORE::exit().

gensym()

This function creates an anonymous glob and returns a reference to it for use as a safe file or directory handle. Ordinary bareword filehandles are prone to namespace clashes. The IO::File class avoids this, but some users have found that IO::File carries too much overhead. Apache::gensym avoids this overhead but still avoids namespace clashes.

 my $fh = Apache->gensym;
 open $fh, $r->filename or die $!;
 $r->send_fd($fh);
 close $fh;

Because of its cleanliness, most of the examples in this book use the Apache::File interface for reading and writing files (see The Apache::File Class). If you wish to squeeze out a bit of overhead, you may wish to use Apache::gensym() with Perl's builtin open() function instead.

current_callback()

If a module wishes to know what handler is currently being run, it can find out with the current_callback method. This method is most useful to PerlDispatchHandlers who wish to only take action for certain phases.

 if($r->current_callback eq "PerlLogHandler") {
 $r->warn("Logging request");
 }

get_handlers()

The get_handlers method will return an array reference containing the list of all handlers that are configured to handle the current request. This method takes a single argument specifying which handlers to return.

 my $handlers = $r->get_handlers('PerlAuthenHandler');

set_handlers()

If you would like to change the list of handlers configured for the current request, you can change it with set_handlers(). This method takes two arguments, the name of the handler you wish to change, and an array reference pointing to one or more references to the handler subroutines you want to run for that phase. If any handlers were previously defined, such as with a Perl*Handler directive, they are replaced by this call. You can provide a second argument of undef if you wish to remove all handlers for that phase.

Examples:

 $r->set_handlers(PerlAuthenHandler => [\&auth_one, \&auth_two]);
 $r->set_handlers(PerlAuthenHandler => undef);

push_handlers()

The push_handlers() method is used to add a new Perl handler routine to the current request's handler ``stack''. Instead of replacing the list of handlers, it just appends a new handler to the list. Each handler is run in turn until one returns an error code. You'll find more information about using stacked handlers and examples in Chapters 4, 6 and 7.

This method takes two arguments, the name of the phase you want to manipulate, and a reference to the subroutine you want to handle that phase.

Example:

 $r->push_handlers(PerlLogHandler => \&my_logger);
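The idea of a handler stack can be modeled in a few lines of plain Perl. The sketch below is a simplified model and not mod_perl's own code: push_demo_handler(), run_phase() and the DEMO_* status values are hypothetical, and we assume that a status other than OK or DECLINED counts as the error that ends the run.

```perl
use strict;
use warnings;

# Hypothetical status codes for this sketch; real handlers import OK,
# DECLINED and the others from Apache::Constants.
use constant {
    DEMO_OK           => 0,
    DEMO_DECLINED     => -1,
    DEMO_SERVER_ERROR => 500,
};

my %stack;    # phase name => array reference of code references

# Append a handler to the list for a phase, leaving earlier entries alone.
sub push_demo_handler {
    my ($phase, $code) = @_;
    push @{ $stack{$phase} }, $code;
    return;
}

# Run the handlers for a phase in order, stopping at the first one that
# returns something other than DEMO_OK or DEMO_DECLINED.
sub run_phase {
    my ($phase, @args) = @_;
    for my $handler (@{ $stack{$phase} || [] }) {
        my $status = $handler->(@args);
        return $status
            unless $status == DEMO_OK || $status == DEMO_DECLINED;
    }
    return DEMO_OK;
}

my @trace;
push_demo_handler(PerlLogHandler => sub { push @trace, 'first';  return DEMO_OK });
push_demo_handler(PerlLogHandler => sub { push @trace, 'second'; return DEMO_SERVER_ERROR });
push_demo_handler(PerlLogHandler => sub { push @trace, 'third';  return DEMO_OK });

my $status = run_phase('PerlLogHandler');
print "status=$status ran=@trace\n";
```

In this run the third handler is never called, because the second one returned an error status.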

module()

If you need to find out if a Perl module has already been loaded, the module() method will tell you. Pass it the package name of the module you're interested in. It will return a true value if the module is loaded.

Example:

 do_something() if Apache->module('My::Module'); # do_something() is a placeholder

This method can also be used to test if a C module is loaded. In this case, pass it the filename of the module, just as you would use with the IfModule directive. It will return a true value if the module is loaded.

Example:

 do_something() if Apache->module('mod_proxy.c'); # do_something() is a placeholder

define()

Apache version 1.3.1 added a -D command line switch that can be used to pass the server parameter names for conditional configuration with the IfDefine directive. These names exist for the lifetime of the server and can be accessed at any time by Perl modules using the define method. Example:

 if(Apache->define("SSL")) {
 #the server was started with -DSSL
 }

post_connection()

This is simply an alias for the register_cleanup() method described in the Server Core Functions section.

request()

The Apache->request() class method returns a reference to the current request object, if any. Handlers that use the vanilla Perl API will not need to call this method because the request object is passed to them in their argument list. However, some modules may not have a subroutine entry point and therefore need a way to gain access to the request object. For example, CGI.pm uses this method to provide proper mod_perl support.

Called with no arguments, request() returns the stored Apache request object. It may also be called with a single argument to set the stored request object. This is what Apache::Registry does before invoking a script.

Example:

 my $r = Apache->request; # get the request
 Apache->request($r); # set the request

Actually, it's a little known fact that Apache::Registry scripts can access the request object directly via @_. This is slightly faster than using Apache->request, but has the disadvantage of being obscure. This technique is demonstrated in Subclassing the Apache Class.

httpd_conf()

The httpd_conf() method allows you to pass new directives to Apache at startup time. Pass it a multi-line string containing the configuration directive(s) that you wish Apache to process. Using string interpolation, you can use this method to dynamically configure Apache according to arbitrarily complex rules.

httpd_conf() can only be called during server startup, usually from within a Perl startup file. Because there is no request object at this time, you must invoke httpd_conf() directly through the Apache class.

Example:

 my $ServerRoot = '/local/web';
 Apache->httpd_conf(<<EOF);
 Alias /perl $ServerRoot/perl
 Alias /cgi-bin $ServerRoot/cgi-bin
 EOF

Should a syntax error occur, Apache will log an error and the server will exit, just as it would if the error were present in the httpd.conf configuration file. A more sophisticated way of configuring Apache at startup time via <Perl> sections is discussed in Chapter 8.
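As a sketch of the "arbitrarily complex rules" mentioned above, the configuration text can be assembled by ordinary Perl code before it is handed to the server. The helper alias_directives() and the directory table are invented for this illustration, and the script only prints the result so that it can be run outside of Apache.

```perl
use strict;
use warnings;

# Build a block of Alias directives from a table that maps URI prefixes to
# directories beneath the server root. The table contents are made up.
sub alias_directives {
    my ($server_root, %dirs) = @_;
    my $conf = '';
    for my $uri (sort keys %dirs) {
        $conf .= "Alias $uri $server_root/$dirs{$uri}\n";
    }
    return $conf;
}

my $conf = alias_directives(
    '/local/web',
    '/perl'    => 'perl',
    '/cgi-bin' => 'cgi-bin',
);
print $conf;

# In a Perl startup file the string would then be passed on:
#   Apache->httpd_conf($conf);
```

Printing the generated text first is a convenient way to check it before letting the server parse it.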

Other Core Perl API Classes

The vast bulk of the functionality of the Perl API is contained in the Apache object. However, a number of auxiliary classes, including Apache::Table, Apache::Connection, and Apache::Server provide additional methods for accessing and manipulating the state of the server. This section discusses these classes.

The Apache TIEHANDLE Interface

In the CGI environment, the standard input and standard output file descriptors are redirected so that data read and written is passed through Apache for processing. In the Apache module API, handlers ordinarily use the Apache read() and print() methods to communicate with the client. However, as a convenience, mod_perl ties the STDIN and STDOUT filehandles to the Apache class prior to invoking Perl API modules. This allows handlers to read from standard input and write to standard output exactly as if they were in the CGI environment.

The Apache class supports the full TIEHANDLE interface, as described in perltie(1). STDIN and STDOUT are already tied to Apache by the time your handler is called. If you wish to tie your own input or output filehandle, you may do so by calling tie() with the request object as the function's third parameter:

 tie *BROWSER, 'Apache', $r;
 print BROWSER 'Come out, come out, wherever you are!';

Of course, it is better not to hard-code the Apache class name, as $r might be blessed into a subclass:

 tie *BROWSER, ref $r, $r;

The Apache::SubRequest Class

The Apache methods lookup_uri() and lookup_file() return a request record object blessed into the Apache::SubRequest class. The Apache::SubRequest class is a subclass of Apache, and inherits most of its methods from there. Here are two examples of fetching subrequest objects:

 my $subr = $r->lookup_file($filename);
 my $subr = $r->lookup_uri($uri);

The Apache::SubRequest class adds a single new method, run().

run()

When a subrequest is created, the URI translation, access checks, and MIME checking phases are run, but unlike a real request, the content handler for the response phase is not actually run. If you would like to invoke the content handler, the run() method will do it:

 my $status = $subr->run;

When you invoke the subrequest's response handler in this way, it will do everything a response handler is supposed to, including sending the HTTP headers and the document body. run() returns the content handler's status code as its function result. If you are invoking the subrequest run() method from within your own content handler, you must not send the HTTP header and document body yourself, as this would be appended to the bottom of the information that has already been sent. Most handlers that invoke run() will immediately return its status code, pretending to Apache that they handled the request themselves:

 my $status = $subr->run;
 return $status;

The Apache::Server Class

The Apache::Server class provides the Perl interface to the C API server_rec data structure, which contains lots of low-level information about the server configuration. Within a handler, the current Apache::Server object can be obtained by calling the Apache request object's server() method. At Perl startup time (such as within a startup script or a module loaded with PerlModule) you can fetch the server object by invoking Apache->server directly. By convention, we use the variable $s for server objects.

Examples:

 #at request time
 sub handler {
 my $r = shift;
 my $s = $r->server;
 ....
 }

 #at server startup time, e.g. PerlModule or PerlRequire
 my $s = Apache->server;

This section discusses the various methods that are available to you via the server object. They correspond closely to the fields of the server_rec structure, which we revisit in Chapter 10.

is_virtual()

This method returns true if the current request is being applied to a virtual server. This is a read-only method.

Example:

 my $is_virtual = $s->is_virtual;

log()

The log() method retrieves an object blessed into the Apache::Log class. You can then use this object to access the full-featured logging API. See The Apache::Log class for the details.

Example:

 use Apache::Log ();
 my $log = $s->log;

The Apache::Server::log() method is identical in most respects to the Apache::log() method discussed earlier. The difference is that messages logged with Apache::log() will include the IP address of the browser and add the messages to the notes table under a key named ``error-notes''. See the description of notes() under Server Core Functions.

port()

Returns the port on which this (virtual) server is listening. If no port is explicitly listed in the server configuration file (that is, the server is listening on the default port 80) this method will return 0. Use the higher-level Apache::get_server_port() method if you wish to avoid this pitfall.

Example:

 my $port = $r->server->port || 80;

This method is read-only.

server_admin()

This method returns the e-mail address of the person responsible for this server as configured by the ServerAdmin directive.

Example:

 my $admin = $s->server_admin;

This method is read-only.

server_hostname()

Returns the (virtual) hostname used by this server, as set by the ServerName directive.

Example:

 my $hostname = $s->server_hostname;

This method is read-only.

names()

If this server is configured to use virtual hosts, the names() method will return the names by which the current virtual host is recognized as specified by the ServerAlias directives (including wild-carded names). The function result is an array reference containing the host names. If no alias names are present or the server is not using virtual hosts, this will return a reference to an empty list.

Example:

 my $s = $r->server;
 my $names = $s->names;

next()

Apache maintains a linked list of all configured virtual servers, which can be accessed with the next method.

Example:

 for(my $s = Apache->server; $s; $s = $s->next) {
 printf "Contact %s regarding problems with the %s site\n",
 $s->server_admin, $s->server_hostname;
 }

log_error()

This method is the same as the Apache::log_error() method, except that it's available through the Apache::Server object. This allows you to use it in Perl startup files and other places where the request object isn't available. Example:

 my $s = Apache->server;
 $s->log_error("Can't open config file $!");

warn()

This method is the same as the Apache::warn() method, but it's available through the Apache::Server object. This allows you to use it in Perl startup files and other places where the request object isn't available. Example:

 my $s = Apache->server;
 $s->warn("Can't preload script $file $!");

The Apache::Connection Class

The Apache::Connection class provides a Perl interface to the C language conn_rec data structure, which provides various low-level details about the network connection back to the client. Within a handler, the connection object can be obtained by calling the Apache request object's connection() method. The connection object is not available outside of handlers for the various request phases because there is no connection established in those cases. By convention, we use the variable $c for connection objects.

Example:

 sub handler {
 my $r = shift;
 my $c = $r->connection;
 ...
 }

In this section we discuss the various methods that are available through the connection. They correspond closely to the fields of the C API conn_rec structure discussed in Chapter 10.

aborted()

This method returns true if the client has broken the connection prematurely. This can happen if the remote user's computer has crashed, a network error has occurred, or, more trivially, if the user pressed the ``stop'' button before the request or response was fully transmitted. However, this value is only set if the timeout was set with soft_timeout().

Example:

 if($c->aborted) {
 warn "uh,oh, the client has gone away!";
 }

auth_type()

If authentication was used to access a password-protected document, this method returns the type of authentication that was used, currently either ``Basic'' or ``Digest.'' This method is different from the request object's auth_type() method, which we discussed earlier, because the latter returns the value of the AuthType configuration directive, in other words the type of authentication the server would like to use. The connection object's auth_type() method returns a value only when authentication was successfully completed, undef otherwise.

Example:

 if($c->auth_type ne 'Basic') {
 warn "phew, I feel a bit better";
 }

This method is read-only.

local_addr()

This method returns a packed SOCKADDR_IN structure in the same format as returned by the Perl Socket module's pack_sockaddr_in() function. This packed structure contains the port and IP address at the server's side of the connection. This is set by the server when the connection record is created so it is always defined.

Example:

 use Socket ();

 sub handler {
 my $r = shift;
 my $local_add = $r->connection->local_addr;
 my($port, $ip) = Socket::unpack_sockaddr_in($local_add);
 ...
 }
 
For obvious reasons, this method is read-only.
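Note that the address returned by unpack_sockaddr_in() is itself packed. The standalone sketch below builds a packed structure by hand, standing in for the value a handler would get from local_addr(), and converts it to readable form with the Socket module's inet_ntoa(). The endpoint 127.0.0.1:8080 is made up for the example.

```perl
use strict;
use warnings;
use Socket qw(pack_sockaddr_in unpack_sockaddr_in inet_aton inet_ntoa);

# Stand-in for the value returned by $c->local_addr: a packed SOCKADDR_IN
# for the made-up endpoint 127.0.0.1:8080.
my $packed = pack_sockaddr_in(8080, inet_aton('127.0.0.1'));

# unpack_sockaddr_in() returns the port and the address as four packed bytes.
my ($port, $ip_bytes) = unpack_sockaddr_in($packed);

# inet_ntoa() converts the packed address to dotted decimal form.
my $ip = inet_ntoa($ip_bytes);

print "$ip:$port\n";
```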

remote_addr()

This method returns a packed SOCKADDR_IN structure for the port and IP address at the client's side of the connection. This is set by the server when the connection record is created so it is always defined.

Among other things, the information returned by this method and local_addr() can be used to perform RFC1413 ident lookups on the remote client even when the configuration directive IdentityCheck is turned off. Using Jan-Pieter Cornet's Net::Ident module for example:

 use Net::Ident qw(lookupFromInAddr);
 ...
 my $remoteuser = lookupFromInAddr ($c->local_addr,
 $c->remote_addr, 2);

remote_host()

This method returns the hostname of the remote client. It only returns the name if the HostNameLookups directive is set to On and the DNS lookup was successful -- that is, the DNS contains a reverse name entry for the remote host. If hostname based access control is in use for the given request, a double-reverse lookup will occur regardless of the HostNameLookups setting, in which case, the cached hostname will be returned. If unsuccessful, the method returns undef.

It is almost always better to use the high-level get_remote_host() method available from the Apache request object (see above). The high level method returns the dotted IP address of the remote host if its DNS name isn't available, and it caches the results of previous lookups, avoiding overhead if you call the method multiple times.

Example:

 my $remote_host = $c->remote_host || "nohost";
 my $remote_host = $r->get_remote_host(REMOTE_HOST); # better

This method is read-only.

remote_ip()

This method returns the dotted decimal representation of the remote client's IP address. It is set by the server when the connection record is created and is always defined.

Example:

 my $remote_ip = $c->remote_ip;

The value returned by remote_ip() can also be changed, which is helpful if your server is behind a proxy such as the squid accelerator. By using the X-Forwarded-For header sent by the proxy, remote_ip can be set to this value so that logging modules record the address of the real client. The only subtle point is that X-Forwarded-For may be multi-valued when a single request has been forwarded across multiple proxies. It's safest to choose the last IP address in the list, since this is the address recorded by the proxy nearest your server; earlier entries were supplied further upstream and are easier to forge.

Example:

 my $header = $r->headers_in->{'X-Forwarded-For'};
 if( my $ip = (split /,\s*/, $header)[-1] ) {
 $r->connection->remote_ip($ip);
 }
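
The example above assumes that the header is present and well formed. As a minimal sketch of a more defensive version, the helper below (last_forwarded_ip() is a hypothetical name, not part of mod_perl) trims whitespace, skips empty entries, and returns nothing when the header is missing.

```perl
use strict;
use warnings;

# last_forwarded_ip() is a hypothetical helper, not part of mod_perl.
# It returns the last address in an X-Forwarded-For value, or nothing
# if the header is missing or contains no usable entries.
sub last_forwarded_ip {
    my ($header) = @_;
    return unless defined $header;

    my @hops;
    for my $hop (split /,/, $header) {
        $hop =~ s/^\s+//;                 # trim leading whitespace
        $hop =~ s/\s+$//;                 # trim trailing whitespace
        push @hops, $hop if length $hop;  # ignore empty entries
    }
    return @hops ? $hops[-1] : ();
}

my $ip = last_forwarded_ip('10.1.2.3, 192.168.0.7');
print defined $ip ? "$ip\n" : "no forwarded address\n";
```

Under mod_perl, a defined result could then be passed to $r->connection->remote_ip() exactly as in the example above.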

remote_logname()

This method returns the login name of the remote user, provided that the configuration directive IdentityCheck is set to On and the remote user's machine is running an identd daemon. If one or both of these conditions is false, the method returns undef.

It is better to use the high level get_remote_logname() method which is provided by the request object. When the high level method is called the result is cached and reused if called again. This is not true of remote_logname().

Example:

 my $remote_logname = $c->remote_logname || "nobody";
 my $remote_logname = $r->get_remote_logname; # better

user()

When Basic authentication is in effect, user() returns the name that the remote user provided when prompted for his username and password. The password itself can be recovered from the request object by calling get_basic_auth_pw().

Example:

 my $username = $c->user;

The Apache::Table Class

The HTTP message protocol is simple in large part due to its consistent use of the key/value paradigm in its request and response header fields. Because much of an external module's work is getting and setting these header fields, Apache provides a simple yet powerful interface called the table structure. Apache tables are keyed case-insensitive lookup tables. API function calls allow you to obtain the list of defined keys, iterate through them, get the value of a key, and set key values. Since many HTTP header fields are potentially multi-valued, Apache also provides functionality for getting, setting and merging the contents of multi-valued fields.

The five C data structures listed below are implemented as tables. This list is likely to grow in the future.

headers_in
headers_out
err_headers_out
notes
subprocess_env

As discussed in The Apache Request Record, the Perl API provides five method calls named headers_in(), headers_out(), err_headers_out(), notes() and subprocess_env() that retrieve these tables. The Perl manifestation of the Apache table API is the Apache::Table class. It provides a TIEHASH interface that allows transparent access to its methods via a tied hash reference, as well as API methods that can be called directly.

The TIEHASH interface is easy to use. Simply call one of the methods listed above in a scalar context to return a tied hash reference. For example:

 my $table = $r->headers_in;

The returned object can now be used to get and set values in the headers_in table by treating it as an ordinary hash reference, but the keys are looked up case insensitively. Examples:

 my $type = $table->{'Content-type'};
 my $type = $table->{'CONTENT-TYPE'}; # same thing
 $table->{'Expires'} = 'Sat, 08 Aug 1998 01:39:20 GMT';

If the field you are trying to access is multi-valued, then the tied hash interface suffers the limitation that fetching the key will only return the first defined value of the field. You can get around this by using the object-oriented interface to access the table (we show an example of this below), or use the each operator to access each key and value sequentially. The following code snippet shows one way to fetch all the Set-Cookie fields in the outgoing HTTP header:

 while (my($key, $value) = each %{$r->headers_out}) {
 push @cookies, $value if lc($key) eq 'set-cookie';
 }

When you treat an Apache::Table object as a hash reference, you are accessing its internal get() and set() methods (among others) indirectly. To gain access to the full power of the table API, you can invoke these methods directly by using the method call syntax.

Here is the list of publicly available methods in Apache::Table, along with brief examples of usage.

add()

The add() method will add a key/value pair to the table. Because Apache tables can contain multiple instances of a key, you may call add() multiple times with different values for the same key. Instead of the new value of the key replacing the previous one, it will simply be appended to the list. This is useful for multi-valued HTTP header fields such as Set-Cookie. The outgoing HTTP header will contain multiple instances of the field.

 my $out = $r->headers_out;
 for my $cookie (@cookies) {
 $out->add("Set-Cookie" => $cookie);
 }

Another way to add multiple values is to pass an array reference as the second argument. This code has the same effect as the previous example:

 my $out = $r->headers_out;
 $out->add("Set-Cookie" => \@cookies);

clear()

This method wipes the current table clean, discarding its current contents. It's unlikely that you would want to perform this on a public table, but here's an example that clears the notes table:

 $r->notes->clear;

do()

This method provides a way to iterate through an entire table item by item. Pass it a reference to a code subroutine to be called once for each table entry. The subroutine should accept two arguments corresponding to the key and value respectively, and should return a true value. The routine can return a false value to terminate the iteration prematurely.

This example dumps the contents of the headers_in table to the browser:

 $r->headers_in->do(sub {
 my($key, $value) = @_;
 $r->print("$key => $value\n");
 1;
 });

For another example of do(), see listing 7.12 from the previous chapter, where we use it to transfer the incoming headers from the incoming Apache request to an outgoing LWP HTTP::Request object.

get()

Probably the most frequently-called method, the get() function returns the table value at the given key. For multi-valued keys, get() implements a little syntactic sugar. Called in a scalar context, it returns the first value in the list. Called in an array context, it returns all values of the multi-valued key.

 my $ua = $r->headers_in->get('User-Agent');
 my @cookies = $r->headers_in->get('Cookie');

get() is the underlying method that is called when you use the tied hash interface to retrieve a key. However the ability to fetch a multi-valued key as an array is only available when you call get() directly using the object-oriented interface.

merge()

merge() behaves like add() but each time it is called the new value is merged into the previous one, creating a single HTTP header field containing multiple comma-delimited values.

In the HTTP protocol a comma separated list of header values is equivalent to the same values specified by repeated header lines. Some clients are buggy enough that it is worthwhile for the server to control the merging explicitly and avoid merging headers that cause trouble (like Set-Cookie).

As with add(), you can either merge a series of entries one at a time:

 my @languages = qw(en fr de);
 foreach (@languages) {
 $r->headers_out->merge("Content-Language" => $_);
 }

or merge a bunch of entries in a single step by passing an array reference:

 $r->headers_out->merge("Content-Language" => \@languages);

new()

The new() method is available to create an Apache::Table object from scratch. It requires an Apache object to allocate the table and, optionally, the number of entries to initially allocate. Note that just like the other Apache::Table objects returned by API methods, references cannot be used as values, only strings. Examples:

 my $tab = Apache::Table->new($r); #default, allocates 10 entries

 my $tab = Apache::Table->new($r, 20); #allocate 20 entries

set()

set() takes a key/value pair and updates the table with it, creating the key if it didn't exist before, or replacing its previous value(s) if it did. The resulting header field will be single-valued. Internally this method is called when you assign a value to a key using the tied hash interface.

Here's an example of using set() to implement an HTTP redirect:

 $r->headers_out->set(Location => 'http://www.modperl.com/');

unset()

This method can be used to remove a key and its contents. If there are multiple entries with the same key, they will all be removed.

Example:

 $r->headers_in->unset('Referer');
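
The difference between add(), merge(), set() and unset() is easiest to see side by side. The following is a pure-Perl toy model of the semantics described in this section. ToyTable is a hypothetical class written for illustration only; it is not Apache::Table and says nothing about how Apache implements tables internally.

```perl
use strict;
use warnings;

# ToyTable: a hypothetical, pure-Perl model of the table semantics
# described above. It is NOT Apache::Table.
package ToyTable;

sub new {
    my $class = shift;
    return bless { entries => [] }, $class;
}

# add(): append key/value pairs, keeping any existing entries.
sub add {
    my ($self, $key, $value) = @_;
    push @{ $self->{entries} }, [$key, $_]
        for ref($value) ? @$value : ($value);
}

# unset(): remove every entry whose key matches, ignoring case.
sub unset {
    my ($self, $key) = @_;
    @{ $self->{entries} } =
        grep { lc($_->[0]) ne lc($key) } @{ $self->{entries} };
}

# set(): replace all existing entries for the key with a single value.
sub set {
    my ($self, $key, $value) = @_;
    $self->unset($key);
    $self->add($key, $value);
}

# merge(): fold each new value into the first existing entry as a
# comma-separated list; behave like add() if the key is absent.
sub merge {
    my ($self, $key, $value) = @_;
    for my $v (ref($value) ? @$value : ($value)) {
        my ($first) = grep { lc($_->[0]) eq lc($key) } @{ $self->{entries} };
        if ($first) { $first->[1] .= ", $v" }
        else        { $self->add($key, $v) }
    }
}

# get(): first value in scalar context, all values in list context.
sub get {
    my ($self, $key) = @_;
    my @values = map  { $_->[1] }
                 grep { lc($_->[0]) eq lc($key) } @{ $self->{entries} };
    return wantarray ? @values : $values[0];
}

package main;

my $t = ToyTable->new;
$t->add('Set-Cookie' => 'a=1');
$t->add('set-cookie' => 'b=2');
my @cookies = $t->get('SET-COOKIE');              # two separate entries

$t->merge('Content-Language' => [qw(en fr de)]);
my $langs = $t->get('content-language');          # one merged entry

$t->set('Set-Cookie' => 'c=3');                   # replaces both cookies
my @after = $t->get('Set-Cookie');

print scalar(@cookies), " cookies; languages: $langs; after set: @after\n";
```

The model keeps entries in an ordered list rather than a hash, which is what allows one key to appear more than once.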

The Apache::URI Class

Apache version 1.3 introduced a utility module for parsing URIs, manipulating their contents and ``unparsing'' them back into string form. Since this functionality is part of the server C API, Apache::URI offers a lightweight alternative to the URI::URL module that ships with the libwww-perl package.*

An Apache::URI object is returned when you call the request object's parsed_uri() method. You may also call the Apache::URI parse() constructor to parse an arbitrary string and return a new Apache::URI object.

Example:

 use Apache::URI ();
 my $parsed_uri = $r->parsed_uri;

*At the time of this writing, URI::URL was scheduled to be replaced by URI.pm, which will be distributed separately from the libwww-perl package.

fragment()

This method returns or sets the fragment component of the URI. You know this as the part that follows the hash mark (``#'') in links. The fragment component is generally used only by clients and some Web proxies.

Examples:

 my $fragment = $uri->fragment;
 $uri->fragment('section_1');

hostinfo()

This method gets or sets the remote host information, which usually consists of a hostname and port number in the format hostname:port. Some rare URIs, such as those used for non-anonymous FTP, attach a username and password to this information, for use in accessing private resources. In this case, the information returned is in the format username:password@hostname:port.

This method returns the host information when called without arguments, or sets the information when called with a single string argument.

Examples:

 my $hostinfo = $uri->hostinfo;
 $uri->hostinfo('www.modperl.com:8000');

hostname()

This method returns or sets the hostname component of the URI object.

 my $hostname = $uri->hostname;
 $uri->hostname('www.modperl.com');

parse()

The parse() method is a constructor used to create a new Apache::URI object from a URI string. Its first argument is an Apache request object, and the second is a string containing an absolute or relative URI. In the case of a relative URI, the parse() method uses the request object to determine the location of the current request and resolve the relative URI. Example:

 my $uri = Apache::URI->parse($r, 'http://www.modperl.com/');

If the URI argument is omitted, the parse() method will construct a fully qualified URI from the $r object, including the scheme, hostname, port, path and query string. Example:

 my $self_uri = Apache::URI->parse($r);

password()

This method gets or sets the password part of the hostinfo component:

 my $password = $uri->password;
 $uri->password('rubble');

path()

This method returns or sets the path component of the URI object.

 my $path = $uri->path;
 $uri->path('/perl/hangman.pl');

path_info()

After the ``real path'' part of the URI comes the ``additional path information''. This component of the URI is not defined by the official URI RFC, because it is an internal concept of Web servers that need to do something with the part of the path information that is left over after translating the rest into a valid filename.

path_info() gets or sets the additional path information portion of the URI, using the current request object to determine what part of the path is real and what part is additional.

Example:

 $uri->path_info('/foo/bar');

Warning: the unparse() method does not take the additional path information into account. It returns the URI minus the additional information.

port()

This method returns or sets the port component of the URI object.

 my $port = $uri->port;
 $uri->port(80);

query()

This method gets or sets the query string component of the URI, in other words, the part after the ``?''.

Examples:

 my $query = $uri->query;
 $uri->query('one+two+three');

rpath()

This method returns the ``real path,'' that is the path() minus the path_info().

Example:

 my $path = $uri->rpath();

scheme()

This method returns and/or sets the scheme component of the URI. This is the part that identifies the URI's protocol, such as http or ftp. Called without arguments, the current scheme is retrieved. Called with a single string argument, the current scheme is set.

Examples:

 my $scheme = $uri->scheme;
 $uri->scheme('http');

unparse()

This method returns the string representation of the URI. Relative URIs are resolved into absolute ones.

 my $string = $uri->unparse;

user()

This method gets or sets the username part of the hostinfo component:

 my $user = $uri->user;
 $uri->user('barney');
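
To show how the components named in this section relate to one another, here is a minimal pure-Perl sketch that splits a URI string using a simplified form of the generic regular expression from the URI specification, then breaks the hostinfo part down further. split_uri() is a hypothetical helper, not Apache's parser: it does not resolve relative URIs, does not separate path_info, and does not handle bracketed IPv6 hosts.

```perl
use strict;
use warnings;

# split_uri() is a hypothetical helper for illustration only.
sub split_uri {
    my ($uri) = @_;
    my %part;

    # Generic URI layout: scheme://hostinfo/path?query#fragment
    @part{qw(scheme hostinfo path query fragment)} =
        $uri =~ m{^(?:([^:/?\#]+):)?(?://([^/?\#]*))?([^?\#]*)(?:\?([^\#]*))?(?:\#(.*))?$}s;

    # hostinfo may be hostname:port or username:password@hostname:port
    if (defined $part{hostinfo}) {
        my $hostport = $part{hostinfo};
        if ($hostport =~ s/^([^\@]*)\@//) {
            @part{qw(user password)} = split /:/, $1, 2;
        }
        @part{qw(hostname port)} = split /:/, $hostport, 2;
    }
    return %part;
}

my %u = split_uri(
    'ftp://barney:rubble@www.modperl.com:8000/perl/hangman.pl?one+two+three#section_1'
);
printf "%-9s %s\n", $_, defined $u{$_} ? $u{$_} : '(none)'
    for qw(scheme user password hostname port path query fragment);
```

Each key in the returned hash corresponds to the Apache::URI accessor of the same name.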

The Apache::Util Class

The Apache API provides several utility functions that are used by various standard modules. The Perl API makes these available as function calls in the Apache::Util package.

Although nothing here is unavailable from some existing Perl module, these C versions are considerably faster than their corresponding Perl functions and avoid the memory bloat of pulling in yet another Perl package.

To make these functions available to your handlers, import the Apache::Util module with an import tag of ``:all'':

 use Apache::Util qw(:all);

escape_uri()

This function encodes all unsafe characters in a URI into %XX hex escape sequences. This is equivalent to the URI::Escape::uri_escape() function from the LWP package. Example:

 use Apache::Util qw(escape_uri);
 my $escaped = escape_uri($url);

escape_html()

This function replaces unsafe HTML character sequences (``<'', ``>'' and ``&'') with their entity representations. This is equivalent to the HTML::Entities::encode() function. Example:

 use Apache::Util qw(escape_html);
 my $display_html = escape_html("<h1>Header Level 1 Example</h1>");
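
For readers without mod_perl at hand, the substitution that escape_html() is described as performing can be sketched in a few lines of plain Perl. toy_escape_html() is a hypothetical stand-in, not the C implementation. The one detail that matters is the order: the ampersand must be replaced first, or the ampersands introduced by the other two replacements would be escaped a second time.

```perl
use strict;
use warnings;

# toy_escape_html() is a hypothetical pure-Perl stand-in for illustration.
sub toy_escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;   # must come first
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

my $display_html = toy_escape_html("<h1>Header Level 1 Example</h1>");
print "$display_html\n";
```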

ht_time()

This function produces dates in the format required by the HTTP protocol. You will usually call it with a single argument, the number of seconds since the ``epoch''. The current time expressed in these units is returned by the Perl built-in time() function.

You may also call ht_time() with optional second and third arguments. The second argument, if present, is a format string that follows the same conventions as the strftime() function in the POSIX library. The default format is %a, %d %b %Y %H:%M:%S %Z, where ``%Z'' is an Apache extension that always expands to ``GMT''. The optional third argument is a flag that selects whether to express the returned time in GMT or using the local timezone. A true value (the default) selects GMT, which is what you will want in nearly all cases.

Unless you have a good reason to use a non-standard time format, you should content yourself with the one-argument form of this function. The function is equivalent to the LWP package's HTTP::Date::time2str() function when passed a single argument.

Examples:

 use Apache::Util qw(ht_time);
 my $str = ht_time(time);
 my $str = ht_time(time, "%d %b %Y %H:%M %Z"); # 06 Nov 1994 08:49 GMT
 my $str = ht_time(time, "%d %b %Y %H:%M %Z",0); # 06 Nov 1994 13:49 EST
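
The default format can be reproduced in plain Perl with the built-in gmtime() function, which is a convenient way to check what a given timestamp should look like. http_date() below is a hypothetical helper, not part of Apache::Util; it uses fixed English day and month names so that the result does not depend on the locale.

```perl
use strict;
use warnings;

my @DAY = qw(Sun Mon Tue Wed Thu Fri Sat);
my @MON = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# http_date() is a hypothetical helper that formats seconds since the
# epoch in the same layout as the default ht_time() format.
sub http_date {
    my ($secs) = @_;
    my ($s, $m, $h, $mday, $mon, $year, $wday) = gmtime($secs);
    return sprintf '%s, %02d %s %04d %02d:%02d:%02d GMT',
        $DAY[$wday], $mday, $MON[$mon], $year + 1900, $h, $m, $s;
}

print http_date(784111777), "\n";   # prints Sun, 06 Nov 1994 08:49:37 GMT
```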

parsedate()

This function is the inverse of ht_time(), parsing HTTP dates and returning the number of seconds since the epoch. You can then pass this value to Time::localtime (or another of Perl's date-handling modules) and extract the date fields that you want.

parsedate() recognizes and handles date strings in any of three standard formats:

 Sun, 06 Nov 1994 08:49:37 GMT ; RFC 822, the modern HTTP format
 Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, the old obsolete HTTP format
 Sun Nov 6 08:49:37 1994 ; ANSI C's asctime() format

Example:

 use Apache::Util qw(parsedate);
 my $secs;
 if (my $if_modified = $r->headers_in->{'If-modified-since'}) {
 $secs = parsedate $if_modified;
 }
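
As a rough illustration of the conversion involved, the sketch below parses only the first of the three formats, using the standard Time::Local module. parse_http_date() is a hypothetical helper, not the parsedate() implementation; it returns nothing for any string it does not recognize.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my %MON = (Jan => 0, Feb => 1, Mar => 2, Apr => 3, May => 4,  Jun => 5,
           Jul => 6, Aug => 7, Sep => 8, Oct => 9, Nov => 10, Dec => 11);

# parse_http_date() is a hypothetical helper that handles only the
# "Sun, 06 Nov 1994 08:49:37 GMT" layout.
sub parse_http_date {
    my ($str) = @_;
    my ($mday, $mon, $year, $h, $m, $s) =
        $str =~ /^\w{3}, (\d{2}) (\w{3}) (\d{4}) (\d{2}):(\d{2}):(\d{2}) GMT$/
        or return;
    return unless exists $MON{$mon};

    # timegm() is the inverse of gmtime(): fields in, epoch seconds out.
    return timegm($s, $m, $h, $mday, $MON{$mon}, $year);
}

my $secs = parse_http_date('Sun, 06 Nov 1994 08:49:37 GMT');
print "$secs\n";   # prints 784111777
```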

size_string()

This function converts the given file size into a formatted string. The size given in the string will be in units of bytes, kilobytes or megabytes, depending on the size of the file. This function formats the string just as the C ap_send_size() API function does, but returns the string rather than sending it directly to the client. The ap_send_size() function is used in mod_autoindex to display the size of files in automatic directory listings, and by mod_include to implement the fsize directive.

This example shows size_string() being used to get the formatted size of the currently requested file:

 use Apache::Util qw(size_string);
 my $size = size_string -s $r->finfo;

unescape_uri()

This function decodes all %XX hex escape sequences in the given uri. It is equivalent to the URI::Escape::uri_unescape() function from the LWP package.

Example:

 use Apache::Util qw(unescape_uri);
 my $unescaped = unescape_uri($safe_url);

unescape_uri_info()

This function is similar to unescape_uri() but is specialized to remove escape sequences from the query string portion of the URI. The main difference is that it translates the ``+'' character into spaces as well as recognizing and translating the hex escapes.

Example:

 use Apache::URI ();
 use Apache::Util qw(unescape_uri_info);
 my $string = $r->parsed_uri->query;
 my %data = map { unescape_uri_info($_) } split /[=&]/, $string, -1;

This would correctly translate the query string ``name=Fred+Flintstone&town=Bedrock'' into the hash:

 name => 'Fred Flintstone',
 town => 'Bedrock'
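
The two translations described above, plus signs to spaces and hex escapes to characters, can be sketched in plain Perl as follows. decode_query_value() is a hypothetical stand-in for illustration, not the C function. The plus signs are translated before the hex escapes so that a literal plus sent as %2B survives decoding.

```perl
use strict;
use warnings;

# decode_query_value() is a hypothetical pure-Perl stand-in.
sub decode_query_value {
    my ($s) = @_;
    $s =~ tr/+/ /;                               # "+" means space in a query string
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # %XX becomes the character
    return $s;
}

my $string = 'name=Fred+Flintstone&town=Bedrock';
my %data = map { decode_query_value($_) } split /[=&]/, $string, -1;

print "$_ => $data{$_}\n" for sort keys %data;
```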

The mod_perl Class

Among the packages installed by the Perl API is a tiny one named, simply enough, ``mod_perl.'' You can query this class to determine what version of mod_perl is installed and what features it makes available.

import()

If your Apache Perl API modules depend on version-specific features of mod_perl, you can use the import() method to require that a certain version of mod_perl be installed. The syntax is simple:

 use mod_perl 1.16; # require version 1.16 or higher

When mod_perl is built, you can control which handlers and other features are enabled. import() can also be used to check for the presence of individual features.

 #require Authen and Authz handlers to be enabled
 use mod_perl qw(PerlAuthenHandler PerlAuthzHandler);

Here is the list of features that you can check for:

PerlDispatchHandler
PerlChildInitHandler
PerlChildExitHandler
PerlPostReadRequestHandler
PerlTransHandler
PerlHeaderParserHandler
PerlAccessHandler
PerlAuthenHandler
PerlAuthzHandler
PerlTypeHandler
PerlFixupHandler
PerlHandler
PerlLogHandler
PerlInitHandler
PerlCleanupHandler
PerlStackedHandlers
PerlMethodHandlers
PerlDirectiveHandlers
PerlSections
PerlSSI

hook()

The hook() function can be used at runtime to determine whether the current mod_perl installation provides support for a certain feature. This is the internal function that import() uses to check for configured features. This function is not exported, so you have to refer to it using its fully-qualified name, mod_perl::hook(). hook() recognizes the same list of features that import() does.

Example:

 use mod_perl ();
 unless(mod_perl::hook('PerlAuthenHandler')) {
 die "PerlAuthenHandler is not enabled!";
 }

The Apache::Constants Class

All of the HTTP status codes are defined in the httpd.h file, along with server specific status codes such as OK, DECLINED and DONE. The Apache::Constants class provides access to these codes as constant subroutines. As there are many of these constants, they are not all exported by default. By default, only those listed in the :common export tag are exported. A variety of export tags are defined, allowing you to bring in various sets of constants to suit your needs. You are also free to bring in individual constants, just as you can with any other Perl module.

Here are the status codes listed by export tag group:

:common

This tag imports the most commonly used constants.

 OK
 DECLINED
 DONE
 NOT_FOUND
 FORBIDDEN
 AUTH_REQUIRED
 SERVER_ERROR

:response

This tag imports the common response codes, plus these response codes:

 DOCUMENT_FOLLOWS
 MOVED
 REDIRECT
 USE_LOCAL_COPY
 BAD_REQUEST
 BAD_GATEWAY
 RESPONSE_CODES
 NOT_IMPLEMENTED
 CONTINUE
 NOT_AUTHORITATIVE

CONTINUE and NOT_AUTHORITATIVE are aliases for DECLINED.

:methods

These are the method numbers, commonly used with the Apache method_number method.

 METHODS
 M_GET 
 M_PUT 
 M_POST 
 M_DELETE 
 M_CONNECT 
 M_OPTIONS 
 M_TRACE 
 M_PATCH
 M_PROPFIND
 M_PROPPATCH
 M_MKCOL
 M_COPY
 M_MOVE
 M_LOCK
 M_UNLOCK
 M_INVALID

Each of the M_* constants corresponds to an integer value, where M_GET..M_UNLOCK is 0..14. The METHODS constant is the number of M_* constants, 15 at the time of this writing. This is designed to accommodate support for other request methods.

 for (my $i = 0; $i < METHODS; $i++) {
 ...
 }

:options

These constants are most commonly used with the Apache allow_options method:

 OPT_NONE
 OPT_INDEXES
 OPT_INCLUDES 
 OPT_SYM_LINKS
 OPT_EXECCGI
 OPT_UNSET
 OPT_INCNOEXEC
 OPT_SYM_OWNER
 OPT_MULTI
 OPT_ALL

:satisfy

These constants are most commonly used with the Apache satisfy() method:

 SATISFY_ALL
 SATISFY_ANY
 SATISFY_NOSPEC

:remotehost

These constants are most commonly used with the Apache get_remote_host method:

 REMOTE_HOST
 REMOTE_NAME
 REMOTE_NOLOOKUP
 REMOTE_DOUBLE_REV

:http

This is the full set of HTTP response codes: (NOTE: this list is not definitive. See the Apache source code for the most up to date listing).

 HTTP_OK
 HTTP_MOVED_TEMPORARILY
 HTTP_MOVED_PERMANENTLY
 HTTP_METHOD_NOT_ALLOWED 
 HTTP_NOT_MODIFIED
 HTTP_UNAUTHORIZED
 HTTP_FORBIDDEN
 HTTP_NOT_FOUND
 HTTP_BAD_REQUEST
 HTTP_INTERNAL_SERVER_ERROR
 HTTP_NOT_ACCEPTABLE 
 HTTP_NO_CONTENT
 HTTP_PRECONDITION_FAILED
 HTTP_SERVICE_UNAVAILABLE
 HTTP_VARIANT_ALSO_VARIES

:server

These are constants related to the version of the Apache server software:

 MODULE_MAGIC_NUMBER
 SERVER_VERSION
 SERVER_BUILT

:config

These are constants most commonly used with configuration directive handlers:

 DECLINE_CMD

:types

These are constants which define internal request types:

 DIR_MAGIC_TYPE

:override

These constants are used to control and test the context of configuration directives.

 OR_NONE
 OR_LIMIT
 OR_OPTIONS
 OR_FILEINFO
 OR_AUTHCFG
 OR_INDEXES
 OR_UNSET
 OR_ALL
 ACCESS_CONF
 RSRC_CONF

:args_how

These are the constants which define configuration directive prototypes.

 RAW_ARGS
 TAKE1
 TAKE2
 TAKE12
 TAKE3
 TAKE23
 TAKE123
 ITERATE
 ITERATE2
 FLAG
 NO_ARGS

As you may notice, the list above is shorter than what is defined in Apache's include/httpd.h header file. The missing constants are available as subroutines via Apache::Constants; they are just not exportable by default. The less frequently used constants were left out of this list to keep memory consumption at a reasonable level.

There are two options if you need to access a constant that is not exportable by default. One is simply to use the fully qualified subroutine name, for example:

 return Apache::Constants::HTTP_MULTIPLE_CHOICES();

Or, use the export method in a server startup file to add exportable names. Example:

 #startup script
 Apache::Constants->export(qw( HTTP_MULTIPLE_CHOICES ));

 #runtime module
 use Apache::Constants qw(:common HTTP_MULTIPLE_CHOICES);

 ...
 return HTTP_MULTIPLE_CHOICES;

While the HTTP constants are generally used as return codes from handler subroutines, it is also possible to use the builtin die() function to jump out of a handler with a status code that will be propagated back to Apache. Example:

 unless (-r _) {
 die FORBIDDEN;
 }

Configuration Classes

Two classes, Apache::ModuleConfig and Apache::CmdParms, provide access to the custom configuration directive API.

The Apache::ModuleConfig Class

Most Apache Perl API modules use the simple PerlSetVar directive to declare per-directory configuration variables. However, with a little more effort, you can create entirely new configuration directives. This process is discussed in detail in Chapter 8.

Once the configuration directives have been created, they can be retrieved from within handlers using the Apache::ModuleConfig->get() class method. get() returns the current command configuration table as an Apache table blessed into the Apache::Table class. get() takes one or two arguments. The first argument can be the current request object to retrieve per-directory data or an Apache::Server object to retrieve per-server data. The second, optional, argument is the name of the module whose configuration table you are interested in. If not specified, this argument defaults to the current package, which is usually what you want.

Here's an example:

 use Apache::ModuleConfig ();
 ...
 sub handler {
 my $r = shift;
 my $cfg = Apache::ModuleConfig->get($r);
 my $printer = $cfg->{'printer-address'};
 ...
 }

The Apache::CmdParms Class

The Apache::CmdParms class provides a Perl interface to the Apache cmd_parms data structure. When Apache encounters a directive, it invokes a command handler that is responsible for processing the directive's arguments. The Apache::CmdParms object is passed to the responsible handler and contains information that may be useful when processing these arguments.

An example of writing a directive handler is given in Chapter 8. In this section, we just summarize the methods that Apache::CmdParms makes available.

path()

If the configuration directive applies to a certain <Location>, <Directory> or <Files> section, the path() method returns the path or filename pattern to which the section applies.

Example:

 my $path = $parms->path;

server()

This method returns an object blessed into the Apache::Server class. This is the same Apache::Server object which is retrieved at request time via the Apache method named server(). See above. Example:

 my $s = $parms->server;

cmd()

This method returns an object blessed into the Apache::Command class. The Apache::Module package from CPAN must be installed to access Apache::Command methods. Example:

 use Apache::Module ();
 ...
 my $name = $parms->cmd->name;

info()

If the directive handler has stashed any info in the cmd_data slot, this method will return that data. This is generally somewhat static information, normally used to reuse a common configuration function. For example, the fancy directory indexer, mod_autoindex and its family of AddIcon* directives uses this technique quite effectively to manipulate the directive arguments.

 my $info = $parms->info;

limited()

The methods present in the current Limit configuration are converted into a bitmask, which is returned by this method. For example:

 # httpd.conf
 <Limit GET POST>
 SomeDirective argument_1 argument_2
 </Limit>

 # Perl module
 use Apache::Constants qw(:methods);

 sub SomeDirective ($$$$) {
 my($parms, $cfg, @args) = @_;
 my $method_mask = $parms->limited;
 if($method_mask & (1 << M_POST)) {
 ...
 }
 }
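
The bit arithmetic can be tried outside the server. The sketch below defines its own copies of four method numbers, assuming the values implied by the statement earlier in this chapter that M_GET..M_UNLOCK correspond to 0..14 in the order listed. Under mod_perl you would import the real constants from Apache::Constants instead.

```perl
use strict;
use warnings;

# Assumed values: GET is 0, PUT is 1, POST is 2 and DELETE is 3,
# following the order of the :methods list.
use constant { M_GET => 0, M_PUT => 1, M_POST => 2, M_DELETE => 3 };

# The mask that <Limit GET POST> would be expected to produce:
# one bit set per listed method.
my $method_mask = (1 << M_GET) | (1 << M_POST);

# Test each method's bit in turn and keep the ones that are set.
my @names   = qw(GET PUT POST DELETE);
my @limited = grep { $method_mask & (1 << $_) } (M_GET, M_PUT, M_POST, M_DELETE);

print "limited methods: @names[@limited]\n";
```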

override()

This method converts the current value of the AllowOverride directive into a bitmask and returns it. You can then import the Apache::Constants :override tag to retrieve the values of individual bits in the mask. Modules don't generally need to check this value; the internal configuration functions take care of the required context checking.

Example:

 use Apache::Constants qw(:override);

 my $override_mask = $parms->override; 
 if($override_mask & OR_ALL) {
 #this directive is allowed anywhere in the configuration files
 }

getline()

If the directive handler needs to read from the configuration file directly, it may do so with the getline() method. The first line returned in the example below is the line immediately following the line on which the directive appeared. It's up to your handler to decide when to stop reading lines; in the example below we use pattern matching.

Reading from the configuration file directly is normally done when a directive is declared with a prototype of RAW_ARGS. With this prototype, arguments are not parsed by Apache; that job is left up to the directive handler. Let's say you need to implement a configuration container, in the same format as the standard <Directory> and <Location> directives:

 <Container argument>
 ....
 </Container>

Here is a directive handler to parse it:

 sub Container ($$$*) {
 my($parms, $cfg, $arg, $fh) = @_;
 $arg =~ s/>//;

 my $line;
 while($parms->getline($line)) {
 last if $line =~ m:</Container>:i;
 ...
 }
 }

There is an alternative to using the getline method when the RAW_ARGS prototype is used, a tied filehandle which is passed as the directive handler's last argument. Perl's builtin read and getc functions may be used on this filehandle, along with the <> readline operator:

 sub Container ($$$*) {
 my($parms, $cfg, $arg, $fh) = @_;
 $arg =~ s/>//;

 while(defined(my $line = <$fh>)) {
 last if $line =~ m:</Container>:i;
 ...
 }
 }
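
The read-until-closing-tag loop can be exercised without a running server by substituting an in-memory filehandle for the one that Apache passes in. In this sketch the configuration text and the directive names inside it are invented for illustration.

```perl
use strict;
use warnings;

# Invented configuration text standing in for the lines that follow
# an opening <Container argument> directive.
my $config = <<'END';
Alpha one
Beta two
</Container>
NotMine three
END

# An in-memory filehandle plays the role of the tied filehandle.
open my $fh, '<', \$config or die "Can't open in-memory file: $!";

my @body;
while (defined(my $line = <$fh>)) {
    last if $line =~ m:</Container>:i;   # stop at the closing tag
    chomp $line;
    push @body, $line;
}

# Reading resumes with the line after the closing tag.
my $next = <$fh>;

print "container holds: @body\n";
```

Because the loop stops as soon as it sees the closing tag, the line that follows it is left unread for whatever processes the rest of the file.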

The Apache::File Class

The Perl API includes a class named Apache::File, which, when loaded, provides advanced functions for opening and manipulating files at the server side.

Apache::File does two things. First, it provides an object-oriented interface to filehandles similar to Perl's standard IO::File class. While the Apache::File module does not provide all the functionality of IO::File, its methods are approximately twice as fast as the equivalent IO::File methods. Secondly, when you use Apache::File, it adds several new methods to the Apache class which provide support for handling files under the HTTP/1.1 protocol.

Like IO::File, the main advantage of accessing filehandles through Apache::File's object-oriented interface is the ability to create new anonymous filehandles without worrying about namespace collision. Furthermore, you don't have to close the filehandle explicitly before exiting the subroutine that uses it; this is done automatically when the filehandle object goes out of scope.

Example:

 {
 use Apache::File;
 my $fh = Apache::File->new($config);
 # no need to close
 }

However, Apache::File is still not as fast as using Perl's native open() and close() functions. If you wish to get the highest performance possible, you should use open() and close() in conjunction with the standard Symbol::gensym or Apache::gensym functions.

Example:

 { # using standard Symbol module
 use Symbol 'gensym';
 my $fh = gensym;
 open $fh, $config;
 close $fh;
 }

 { # Using Apache::gensym() method
 my $fh = Apache->gensym;
 open $fh, $config;
 close $fh;
 }

A little known feature of Perl is that when lexically defined variables go out of scope, any indirect filehandle stored in them is automatically closed. So in fact there's really no reason to perform an explicit close() on the filehandles in the two examples above unless you want to test the close operation's return value. As always with Perl, there's more than one way to do it.
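
A short experiment makes the automatic close visible. The sketch below writes to a scratch file through a lexical filehandle, lets the handle go out of scope without calling close(), and then reads the data back. It uses the standard File::Temp module only to obtain a scratch file name.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Obtain a scratch file name; the file is removed when the program exits.
my (undef, $path) = tempfile(UNLINK => 1);

{
    open my $out, '>', $path or die "Can't write $path: $!";
    print {$out} "hello\n";
    # No explicit close: $out is flushed and closed when this block ends.
}

open my $in, '<', $path or die "Can't read $path: $!";
my $line = <$in>;
close $in;

print $line;
```

If the handle were still open with unflushed output after the block, the read would find an empty file.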

Apache::File Methods

These are methods associated directly with Apache::File objects. They form a subset of what's available from the Perl IO::File and FileHandle classes.

new()

This method creates a new filehandle, returning the filehandle object on success, undef on failure. If an additional argument is given, it will be passed to the open() method automatically.

Examples:

 use Apache::File ();
 my $fh = Apache::File->new;

 my $fh = Apache::File->new($filename) or die "Can't open $filename $!";

open()

Given an Apache::File object previously created with new(), this method opens a file and associates it with the object. The open() method accepts the same types of arguments as the standard Perl open() function, including support for file modes.

Examples:

 $fh->open($filename);

 $fh->open(">$out_file");

 $fh->open("|$program");

close()

The close() method is equivalent to the Perl builtin close function. It returns true upon success and false upon failure.

 $fh->close or die "Can't close $filename $!";

tmpfile()

The tmpfile() method is responsible for opening up a unique temporary file. It is similar to the tmpnam() function in the POSIX module, but doesn't come with all the memory overhead that loading POSIX does. It will choose a suitable temporary directory (which must be writable by the Web server process). It then generates a series of filenames using the current process ID and the $TMPNAM package global. Once a unique name is found, it is opened for writing, using flags that will cause the file to be created only if it does not already exist. This prevents race conditions in which the function finds what seems to be an unused name, but someone else claims the same name before it can be created.

As an added bonus, tmpfile() calls the register_cleanup() method behind the scenes to make sure the file is unlinked after the transaction is finished.

Called in a list context, tmpfile() returns the temporary file name and a filehandle opened for reading and writing. In a scalar context only the filehandle is returned.

Example:

 my($tmpnam, $fh) = Apache::File->tmpfile;

 my $fh = Apache::File->tmpfile;
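The ``create only if it does not already exist'' technique described above can be sketched in plain Perl with sysopen() and the O_EXCL flag. This is an illustration of the idea only, not the Apache::File implementation: the helper name exclusive_tmpfile(), the file name pattern and the limit of 100 attempts are all assumptions made for the example.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT O_EXCL);
use File::Spec;

# Hypothetical helper, NOT the Apache::File implementation: try a series of
# candidate names and let the operating system reject any that already exist.
sub exclusive_tmpfile {
    my $dir = File::Spec->tmpdir;
    for my $n (1 .. 100) {
        my $name = File::Spec->catfile($dir, "sketch$$-$n.tmp");
        # O_CREAT|O_EXCL makes "check that it is unused" and "create it"
        # a single atomic step, so no other process can slip in between.
        if (sysopen(my $fh, $name, O_RDWR | O_CREAT | O_EXCL, 0600)) {
            return ($name, $fh);
        }
    }
    die "Could not create a temporary file in $dir: $!";
}

my ($tmpnam, $fh) = exclusive_tmpfile();
print $fh "scratch data\n";
close $fh or die "Can't close $tmpnam: $!";
unlink $tmpnam or die "Can't remove $tmpnam: $!";
```

Unlike tmpfile(), this sketch registers no cleanup handler, so the caller has to unlink the file itself.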

Apache Methods added by Apache::File

When a handler pulls in Apache::File, the module adds a number of new methods to the Apache request object. These methods are generally of interest to handlers that wish to serve static files from disk or memory using the features of the HTTP/1.1 protocol that provide increased performance through client-side document caching.

To take full advantage of the HTTP/1.1 protocol, your content handler should test the meets_conditions() method before sending the body of a static document. This avoids sending a document that is already cached and up to date on the browser's side of the connection. You will then want to call set_content_length() and update_mtime() so that the outgoing HTTP headers reflect the correct size and modification time of the requested file. Finally, you may want to call set_etag() in order to set the file's ``entity tag'' when communicating with HTTP/1.1-compliant browsers.

In the section following this one, we demonstrate these methods fully by writing a pure Perl replacement for the http_core module's default document retrieval handler.

discard_request_body()

The majority of GET method handlers do not deal with incoming client data, unlike POST and PUT handlers. However, according to the HTTP/1.1 specification, any method, including GET, can include a request body. The discard_request_body() method tests for the existence of a request body and, if one is present, simply throws away the data. This discarding is especially important when persistent connections are being used, so that the request body will not be attached to the next request. If the request is malformed, an error code will be returned, which the module handler should propagate back to Apache.

Example:

 if ((my $rc = $r->discard_request_body) != OK) {
 return $rc;
 }

meets_conditions()

In the interest of HTTP/1.1 compliance, the meets_conditions() method is used to implement ``conditional GET'' rules. These rules include inspection of client headers, including If-Modified-Since, If-Unmodified-Since, If-Match and If-None-Match. Consult RFC 2068 section 9.3 (which you can find at http://www.w3.org/Protocols) if you are interested in the nitty gritty details.

As far as Apache modules are concerned, they need only check the return value of this method before sending a response body. If the return value is anything other than OK, the module should return from the handler with that value. A common return value other than OK is HTTP_NOT_MODIFIED, which is sent when the document is already cached on the client side, and has not changed since it was cached.

 if((my $rc = $r->meets_conditions) != OK) {
 return $rc;
 }
 #else ... go and send the response body ...
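To make the conditional GET rules more concrete, here is a deliberately simplified sketch of the kind of decision meets_conditions() makes. It is not Apache's implementation: the helper name check_conditions() and its named arguments are hypothetical, each condition is assumed to be already parsed (times as seconds since the epoch, a single entity tag rather than a list), and the status codes are defined as plain numbers so the example runs without Apache::Constants.

```perl
use strict;
use warnings;

# Status codes as plain numbers so the sketch runs outside of Apache.
use constant {
    OK                       => 0,
    HTTP_NOT_MODIFIED        => 304,
    HTTP_PRECONDITION_FAILED => 412,
};

# Hypothetical helper: %c holds the file's state (mtime, etag) and the
# client's conditions, already parsed from the request headers.
sub check_conditions {
    my (%c) = @_;

    # If-Match: proceed only if the client's tag matches the current one.
    if (defined $c{if_match}) {
        return HTTP_PRECONDITION_FAILED
            unless $c{if_match} eq '*' || $c{if_match} eq $c{etag};
    }
    # If-Unmodified-Since: fail if the file changed after the given time.
    if (defined $c{if_unmodified_since}) {
        return HTTP_PRECONDITION_FAILED
            if $c{mtime} > $c{if_unmodified_since};
    }
    # If-None-Match: the client's cached copy already carries this tag.
    if (defined $c{if_none_match}) {
        return HTTP_NOT_MODIFIED
            if $c{if_none_match} eq '*' || $c{if_none_match} eq $c{etag};
    }
    # If-Modified-Since: nothing new since the client last fetched the file.
    elsif (defined $c{if_modified_since}) {
        return HTTP_NOT_MODIFIED
            if $c{mtime} <= $c{if_modified_since};
    }
    return OK;
}

my $rc = check_conditions(
    mtime             => 1_000,
    etag              => '"abc"',
    if_modified_since => 2_000,
);
print "status: $rc\n";    # prints "status: 304"
```

In a real handler there is no reason to reimplement any of this; calling meets_conditions() and propagating its return value is all that is needed.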

mtime()

This method returns the last modified time of the requested file, expressed as seconds since the epoch. The last modified time may also be changed using this method, although the update_mtime() method is better suited to this purpose.

Example:

 my $date_string = localtime $r->mtime;

set_content_length()

This method sets the outgoing Content-length header based on its argument, which should be expressed in bytes. If no argument is specified, the method will use the size of the file named by $r->filename. This method is a bit faster and more concise than setting Content-length in the headers_out table yourself. Examples:

 $r->set_content_length;
 $r->set_content_length(-s $r->finfo); #same as above
 $r->set_content_length(-s $filename);

set_etag()

This method is used to set the outgoing ETag header corresponding to the requested file. ETag is an opaque string that identifies the current version of the file and changes whenever the file is modified. This string is tested by the meets_conditions() method if the client provides an If-Match or If-None-Match header.

 $r->set_etag;

set_last_modified()

This method is used to set the outgoing Last-Modified header from the value returned by $r->mtime. The method checks that the specified time is not in the future. In addition, using set_last_modified() is faster and more concise than setting Last-Modified in the headers_out table yourself.

You may provide an optional time argument, in which case the method will first call update_mtime() to set the file's last modification date. It will then set the outgoing Last-Modified header as before.

Examples:

 $r->update_mtime((stat $r->finfo)[9]);
 $r->set_last_modified;

 $r->set_last_modified((stat $r->finfo)[9]); #same as the two lines above

update_mtime()

Rather than setting the request record mtime field directly, you can use the update_mtime() method to change the value of this field. It will only be updated if the new time is more recent than the current mtime. If no time argument is present, the default is the last modified time of $r->filename.

Example:

 $r->update_mtime;
 $r->update_mtime((stat $r->finfo)[9]); #same as above
 $r->update_mtime(time);

Using Apache::File to Send Static Files

Apache's http_core module already has a default handler to send files straight from disk to the client. Such files include static HTML, plain text, compressed archives and image files in a number of different formats. A bare bones handler in Perl only requires a few lines of code as Listing 9.1 shows. After the standard preamble, the handler() function attempts to open $r->filename. If the file cannot be opened, the handler simply assumes file permission problems and returns FORBIDDEN. Otherwise, the entire contents of the file are passed down the HTTP stream using the request object's send_fd() method. It then does a little tidying up by calling close() on the filehandle and returns OK so that Apache knows the response has been sent.

Listing 9.1 A simple, but flawed way to send static files

 package Apache::EchoFile;
 
 use strict;
 use Apache::Constants qw(:common);
 use Apache::File ();
 
 sub handler {
 my $r = shift;
 my $fh = Apache::File->new($r->filename) or return FORBIDDEN;
 $r->send_fd($fh);
 close $fh;
 return OK;
 }
 
 1;
 __END__

While this works well in most cases, there is more involved in sending a file over HTTP than you might think. To fully support the HTTP/1.1 protocol, one has to handle the PUT and OPTIONS methods, handle GET requests that contain a request body, and provide support for ``If-Modified-Since'' requests.

Listing 9.2 is the Apache::SendFile module, a Perl version of the http_core module default handler. It starts off as before by loading the Apache::Constants module. However it brings in more constants than usual. The :response group pulls in the constants we normally see using the :common tag, plus a few more including the NOT_IMPLEMENTED constant. The :methods group brings in the method number constants including M_INVALID, M_OPTIONS, M_PUT and M_GET. The :http tag imports a few of the less commonly used status codes, including HTTP_METHOD_NOT_ALLOWED.

We next bring in the Apache::File module in order to open and read the contents of the file to be sent and to load the HTTP/1.1-specific file handling methods.

The first step we take upon entering the handler() function is to call the discard_request_body() method. Unlike HTTP/1.0, where only POST and PUT requests may contain a request body, in HTTP/1.1 any method may include a body. We have no use for it, so we throw it away to avoid potential problems.

We now check the request method by calling the request object's method_number() method. Like the http_core handler, we only handle GET requests (method number M_GET). For any other type of request we return an error, but in each case the error is slightly different. For the method M_INVALID, which is set when the client specifies a request that Apache doesn't understand, we return an error code of NOT_IMPLEMENTED. For M_OPTIONS, which is sent by an HTTP/1.1 client that is seeking information about the capabilities of the server, we return DECLINED in order to allow Apache's core to handle the request (it sends a list of allowed methods).

The PUT method is applicable even if the resource doesn't exist, but we don't support it, so we return HTTP_METHOD_NOT_ALLOWED in this case. At this point we test for existence of the requested file by applying the -e file test to the cached stat() information returned by the request object's finfo() method. If the file does not exist, we log an error message and return NOT_FOUND. Finally, we specifically check for a request method of M_GET and again return HTTP_METHOD_NOT_ALLOWED if this is not the case.

Provided the request has passed all these checks, we attempt to open the requested file with Apache::File. If the file cannot be opened, the handler logs an error message and returns FORBIDDEN.

At this point, we know that the request method is valid and the file exists and is accessible. But this doesn't mean we should actually send the file because the client may have cached it previously and has asked us to transmit it only if it has changed. The update_mtime(), set_last_modified() and set_etag() methods together set up the HTTP/1.1 headers that indicate when the file was changed and assign it a unique ``entity tag'' that changes when the file changes.

We then call the meets_conditions() method to find out if the file has already been cached by the client. If this is the case, or some other condition set by the client fails, meets_conditions() returns a response code other than OK, which we propagate back to Apache. Apache then does whatever is appropriate.

Otherwise we call the set_content_length() method to set the outgoing Content-length header to the length of the file, then call send_http_header() to send the client the full set of HTTP headers. The return value of header_only() is tested to determine whether the client has requested the header only; if the method returns false, then the client has requested the body of the file as well as the headers, and we send the file contents using the send_fd() method. Lastly, we tidy up by closing the filehandle and returning OK.

The real default handler found in http_core.c actually does a bit more work than this. It includes logic for sending files from memory via mmap() if USE_MMAP_FILES is defined, along with support for HTTP/1.1 byte ranges and Content-MD5.

After reading through this you'll probably be completely happy to return DECLINED when the appropriate action for your module is just to return the unmodified contents of the requested file!

Listing 9.2 A 100% pure Perl implementation of the default http_core content handler

 package Apache::SendFile;
 
 use strict;
 use Apache::Constants qw(:response :methods :http);
 use Apache::File ();
 use Apache::Log ();
 
 sub handler {
 my $r = shift;
 if ((my $rc = $r->discard_request_body) != OK) {
 return $rc;
 }
 
 if ($r->method_number == M_INVALID) {
 $r->log->error("Invalid method in request ", $r->the_request);
 return NOT_IMPLEMENTED;
 }
 
 if ($r->method_number == M_OPTIONS) {
 return DECLINED; #http_core.c:default_handler() will pick this up
 }
 
 if ($r->method_number == M_PUT) {
 return HTTP_METHOD_NOT_ALLOWED;
 }
 
 unless (-e $r->finfo) {
 $r->log->error("File does not exist: ", $r->filename);
 return NOT_FOUND;
 }
 
 if ($r->method_number != M_GET) {
 return HTTP_METHOD_NOT_ALLOWED;
 }
 
 my $fh = Apache::File->new($r->filename);
 unless ($fh) {
 $r->log->error("file permissions deny server access: ", 
 $r->filename);
 return FORBIDDEN;
 }
 
 $r->update_mtime((stat $r->finfo)[9]);
 $r->set_last_modified;
 $r->set_etag;
 
 if((my $rc = $r->meets_conditions) != OK) {
 return $rc;
 }
 
 $r->set_content_length;
 $r->send_http_header;
 
 unless ($r->header_only) {
 $r->send_fd($fh);
 }
 
 close $fh;
 return OK;
 }
 
 1;
 
 __END__

Special Global Variables, Subroutines and Literals

As you know, Perl has several ``magic'' global variables, subroutines and literals that have the same meaning no matter what package they are used from. A handful of these variables have special meaning when running under mod_perl. Here we will describe these and other global variables maintained by mod_perl. Don't forget that Perl code has a much longer lifetime and lives among many more namespaces in the mod_perl environment than it does in the mod_cgi CGI environment. When modifying a Perl global variable, we recommend that you always localize the variable so modifications do not trip up other Perl code running in the server.

Global variables

We begin with the list of magic global variables that have special significance to mod_perl.

$0

When running under Apache::Registry or Apache::PerlRun, this variable is set to the value of the filename field of the request_rec.

When running inside of a <Perl> section, the value of $0 is the path to the configuration file that the Perl section is located in, such as httpd.conf or srm.conf.

$^X

Normally, this variable holds the path to the Perl program that was executed from the shell. Under mod_perl, there is no Perl program, just the Perl library linked with Apache. So this variable is set to the path of the Apache binary in which Perl is currently running, such as /usr/local/apache/bin/httpd or C:\Apache\apache.exe.

$|

As the perlvar(1) manpage explains, if this variable is set to nonzero, it forces a flush right away and after every write or print on the currently selected output channel. Under mod_perl, setting $| when the STDOUT filehandle is selected will cause the rflush() method to be invoked after each print(). Because of the overhead associated with rflush(), you should avoid making this a general practice.

$/

The perlvar manpage describes this global variable as the input record separator, newline by default. The same is true under mod_perl. However, mod_perl ensures that it is reset to the newline default after each request.
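Even with a reset at the end of the request, a changed record separator would still affect any other code that runs later in the same request, so localizing it remains the safer habit. The following plain-Perl sketch shows the idiom; the helper name slurp() is hypothetical, and the scratch file is created with the core File::Temp module purely so that the example is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Prepare a two-line scratch file to read back.
my ($out, $path) = tempfile(UNLINK => 1);
print $out "line one\nline two\n";
close $out or die "Can't close $path: $!";

# Hypothetical helper: read a whole file in one go.
sub slurp {
    my ($file) = @_;
    open my $fh, '<', $file or die "Can't open $file: $!";
    local $/;             # undef inside this sub only: read to end of file
    return scalar <$fh>;
}

my $contents  = slurp($path);
my $separator = $/;       # back to "\n" once slurp() has returned
```

Because the change is made with local, the previous value is restored automatically when slurp() returns, even if it exits early by way of die.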

%@

You are most likely familiar with Perl's $@ variable, which holds the Perl error message or exception value from the last eval() command, if any. There is also an undocumented %@ hash global, which is used internally for certain eval bookkeeping. This variable is put to good use by mod_perl, by saving the value of $@ keyed by the URI which triggered the error. This allows an ErrorDocument to provide some more clues as to what went wrong. Example:

 my $previous_uri = $r->prev->uri;
 my $errmsg = $@{$previous_uri};

This looks a bit weird, but it's just a hash key lookup on the hash named %@. Mentally substitute %SAVED_ERRORS for %@ and you'll see what's going on here.

%ENV

As with the Perl binary, this global hash contains the current environment. When the Perl interpreter is first created by mod_perl, this hash is emptied, with the exception of those variables passed and set via PerlPassEnv and PerlSetEnv configuration directives.

The usual configuration scoping rules apply. PerlSetEnv directives located in the main part of the configuration file will influence all Perl handlers, while those located in <Directory>, <Location> and <Files> sections will only affect handlers in the areas to which those sections apply.

The Apache SetEnv and PassEnv directives also influence %ENV, but they don't take effect until the fixup phase. If you need to influence %ENV via server configuration for an earlier phase, such as authentication, be sure to use PerlSetEnv and PerlPassEnv instead, because these directives take effect as soon as possible.

There are a number of standard variables that Apache adds to the environment prior to invoking the content handler. These include DOCUMENT_ROOT and SERVER_SOFTWARE. By default, the complete %ENV hash is not set up until the content response phase. Only variables set by PerlPassEnv, PerlSetEnv and by mod_perl itself will be visible. Should you need the complete set of variables to be available sooner, your handler code can do so with the subprocess_env method. Example:

 my $r = shift;
 my $env = $r->subprocess_env;
 %ENV = %$env;

Unless you plan to spawn subprocesses, however, it will usually be more efficient to access the subprocess variables directly:

 my $tmp = $r->subprocess_env->{'TMPDIR'};

If you need to get at the environment variables that are set automatically by Apache before spawning CGI scripts, and you want to do this outside of a content handler, remember to call subprocess_env() once in a void context in order to initialize the environment table with the standard CGI and server-side include variables:

 $r->subprocess_env;
 my $port = $r->subprocess_env('SERVER_PORT');

There's rarely a legitimate reason to do this, however, because all the information you need can be fetched directly from the request object.

Filling in the %ENV hash before the response phase introduces a little overhead into each mod_perl content handler. If you don't want the %ENV hash to be filled at all by mod_perl, add this to your server configuration file:

 PerlSetupEnv Off

Regardless of the setting of PerlSetupEnv, or whether subprocess_env() has been called, mod_perl always adds a few special keys of its own to %ENV.

MOD_PERL

The value of this key is set to a true value, so that code can test whether it is running in the mod_perl environment. Example:

 if(exists $ENV{MOD_PERL}) {
 ... do something ...
 }
 else {
 ... do something else ...
 }

GATEWAY_INTERFACE

When running under the mod_cgi CGI environment, this value is CGI/1.1. However, when running under the mod_perl CGI environment, GATEWAY_INTERFACE will be set to CGI-Perl/1.1. Code can also use this value to test whether it is running under mod_perl. However, testing for the presence of the MOD_PERL key is faster than using a regular expression or substr to test GATEWAY_INTERFACE.

PERL_SEND_HEADER

If the PerlSendHeader directive is set to On, this environment variable will also be set to On; otherwise, the variable will not exist. This is intended for scripts which do not use the CGI.pm header() method, which always sends proper HTTP headers no matter what the settings. Example:

 if($ENV{PERL_SEND_HEADER}) {
 print "Content-type: text/html\n\n";
 }
 else {
 my $r = Apache->request;
 $r->content_type('text/html');
 $r->send_http_header;
 }

%SIG

The Perl %SIG global variable is used to set signal handlers for various signals.

There is always one handler set by mod_perl for catching the PIPE signal. This signal is sent by Apache when a timeout occurs, triggered when the client drops the connection prematurely (e.g. by hitting the ``stop'' button). The internal Apache::SIG class catches this signal to ensure the Perl interpreter state is properly reset after a timeout.

The Apache::SIG handler does have one side-effect that you might want to take advantage of. If a transaction is aborted prematurely because of a PIPE signal, Apache::SIG will set the environment variable SIGPIPE to the number ``1'' before it exits. You can pick this variable up with a custom log handler statement and record it if you are interested in compiling statistics on the number of remote users who abort their requests prematurely.

Below is a LogFormat directive that will capture the SIGPIPE environment variable. If the transaction was terminated prematurely, the last field in the log file line will be ``1'', otherwise ``-''.

 LogFormat "%h %l %u %t \"%r\" %s %b %{SIGPIPE}e"

As for all other signals, you should be most careful not to stomp on Apache's own signal handlers, such as that for ALRM. It is best to localize the handler inside of a block so it can be restored as soon as possible.

Example:

 {
 local $SIG{ALRM} = sub { ... };
 ...
 }

At the end of each request, mod_perl will restore the %SIG hash to the same state it was in at server startup time.
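As an illustration of keeping a signal handler confined to a block, here is a sketch of the familiar eval/alarm timeout idiom in plain Perl. The helper name with_timeout() is hypothetical, and the example is meant to be run outside of Apache; inside the server, an alarm set this way could interfere with Apache's own use of ALRM for its timeouts, which is exactly why the handler should be held for as short a time as possible.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, giving up after $seconds.
sub with_timeout {
    my ($seconds, $code) = @_;
    my $result = eval {
        # The handler exists only inside this block; the previous one
        # is restored automatically when the block is left.
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;
        my $value = $code->();
        alarm 0;              # cancel the pending alarm
        $value;
    };
    alarm 0;                  # also cancel if $code died for another reason
    return $result;           # undef if the block died, e.g. on timeout
}

my $fast = with_timeout(5, sub { 42 });
# select() with a timeout is used as a sleep that does not rely on alarm.
my $slow = with_timeout(1, sub { select(undef, undef, undef, 3); 'finished' });
print defined $slow ? "completed\n" : "timed out\n";
```

Note that this sketch cannot distinguish a timeout from a callback that legitimately returns undef; a fuller version would inspect $@ as well.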

@INC

As the perlvar manpage explains:

The array @INC contains the list of places to look for Perl scripts to be evaluated by the do EXPR, require, or use constructs.

The same is true under mod_perl. However, two additional paths are automatically added to the end of the array. These are the value of the configured ServerRoot and $ServerRoot/lib/perl.

At the end of each request, mod_perl will restore the value of @INC to the same value it was during server startup time. This includes any modifications made by code pulled in via PerlRequire and PerlModule. So, be warned, if a script compiled by Apache::Registry contains a use lib or other @INC modification statement, this modification will not ``stick''. That is, once the script is cached, the modification is undone until the script has changed on disk and is re-compiled. If one script relies on another to modify the @INC path, that modification should be moved to a script or module pulled in at server startup time, such as the perl startup script.
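A startup script that makes such a modification might look like the sketch below. The directory name is a hypothetical placeholder, and the file is assumed to be loaded once at server startup with a PerlRequire directive, so that the change to @INC is part of the state that mod_perl restores after each request.

```perl
# startup.pl: a sketch of a file loaded once at server startup.
use strict;
use warnings;

# Hypothetical directory holding site-specific modules. "use lib" puts it
# at the front of @INC, ahead of the standard library directories.
use lib '/usr/local/apache/lib/site-perl';

1;
```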

%INC

As the perlvar manpage explains, the %INC hash contains entries for each filename that has been included via do or require. The key is the filename you specified, and the value is the location of the file actually found. The require command uses this array to determine whether a given file has already been included.

The same is true in the mod_perl environment. However, this Perl feature may seem like a mod_perl bug at times. One such case is when .pm modules that are modified are not automatically recompiled the way that Apache::Registry script files are. The reason this behavior hasn't been changed is that calling the stat function to test the last modified time for each file in %INC requires considerable overhead and would affect Perl API module performance noticeably. If you need it, the Apache::StatINC module provides the ``re-compile when modified'' functionality, which the authors only recommend using during development. On a production server, it's best to set the PerlFreshRestart directive to on and to restart the server whenever you change a .pm file and want to see the changes take effect immediately.

Another problem area is pulling in ``library'' files which do not declare a package namespace. As all Apache::Registry and Apache::PerlRun script files are compiled inside their own unique namespace, pulling in such a file via require causes it to be compiled within this unique namespace. Since %INC persists for the life of the interpreter, the library file will only be pulled in once per child process, and only the first script to require it will be able to see the subroutines it declares. Other scripts that try to call routines in the library will trigger a server error along the lines of:

 [Thu Sep 11 11:03:06 1998] Undefined subroutine 
 &Apache::ROOT::perl::test_2epl::some_function called at 
 /opt/www/apache/perl/test.pl line 79.

The mod_perl_traps manual page describes this problem in more detail, along with providing solutions.
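The effect is easy to reproduce in plain Perl, without Apache. In the sketch below, two hypothetical packages (Script::One and Script::Two) stand in for the unique namespaces of two Apache::Registry scripts, and a small package-less library file is written to a scratch directory so that the example is self-contained. None of these names come from mod_perl.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Write a package-less "library" file to a scratch directory.
my $dir = tempdir(CLEANUP => 1);
my $lib = File::Spec->catfile($dir, 'mylib.pl');
open my $out, '>', $lib or die "Can't write $lib: $!";
print $out "sub some_function { 'hello' }\n1;\n";
close $out or die "Can't close $lib: $!";

# Two hypothetical packages stand in for two Apache::Registry scripts.
package Script::One;
sub run { require $lib; return some_function() }

package Script::Two;
sub run { require $lib; return some_function() }

package main;

my $first  = Script::One::run();            # compiles mylib.pl into Script::One
my $second = eval { Script::Two::run() };   # %INC says "already loaded"
my $error  = $@;
print "first:  $first\n";
print "second: ", (defined $second ? $second : "failed: $error");
```

The second call fails with an ``Undefined subroutine &Script::Two::some_function'' error, because the file was compiled only once, into the first caller's package. A common remedy is to give the library file its own package declaration and to call its routines by their fully qualified names.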

Subroutines

Subroutines with names that are all in capitals have special meaning to Perl. Familiar examples may include DESTROY and BEGIN. mod_perl also recognizes a few subroutines and treats them specially.

BEGIN

Perl executes BEGIN blocks during the compile time of code as soon as possible. The same is true under mod_perl. However, since mod_perl normally only compiles scripts and modules once, in the parent server or once per-child, BEGIN blocks in that code will only be run once.

Once a BEGIN block has run, it is immediately undefined by removing it from the symbol table. In the mod_perl environment, this means BEGIN blocks will not be run during each incoming request unless that request happens to be the one that is compiling the code. When a .pm module or other Perl code file is pulled in via require or use, its BEGIN blocks will be executed:

 - Once at startup time if pulled in by the parent process by a
 PerlModule directive or in the perl startup script.
 - Once per-child process if not pulled in by the parent process.
 - An additional time in each child process if Apache::StatINC is loaded
 and the module is modified.
 - An additional time in the parent process on each restart if
 PerlFreshRestart is On.
 - At unpredictable times if you fiddle with %INC yourself. Don't
 do this unless you know what you are doing.

Apache::Registry scripts can contain BEGIN blocks as well. In this case, they will be executed:

 - Once at startup time if pulled in by the parent process via
 Apache::RegistryLoader.
 - Once per-child process if not pulled in by the parent process.
 - An additional time in each child process if the script file is modified.
 - An additional time in the parent process on each restart if
 the script was pulled in by the parent process with
 Apache::RegistryLoader and PerlFreshRestart is On.

END

In Perl, an END subroutine defined in a module or script is executed as late as possible, that is, when the interpreter is being exited. In the mod_perl environment, the interpreter does not exit until the server is shut down. However, mod_perl does make a special case for Apache::Registry scripts.

Normally, END blocks are executed by Perl during its perl_run() function, which is called once each time the Perl program is executed, e.g. once per (mod_cgi) CGI script. However, mod_perl only calls perl_run() once during server startup. Any END blocks that are encountered during main server startup, such as those pulled in by the PerlRequire or PerlModule directives, are suspended and run at server shutdown time during the child_exit phase.

Any END blocks that are encountered during compilation of Apache::Registry scripts are called after the script has completed the response, including subsequent invocations when the script is cached in memory. All other END blocks encountered during other Perl*Handler callbacks, e.g. PerlChildInitHandler, will be suspended while the process is running and called during child_exit when the process is shutting down.

Module authors may wish to use $r->register_cleanup as an alternative to END blocks if this behavior is not desirable.

Magic Literals

Perl recognizes a few magic literals during script compilation. By and large, they act exactly like their counterparts in the standalone Perl interpreter.

__END__

This token works just as it does with the standalone Perl interpreter, causing compilation to terminate. However this causes a problem for Apache::Registry scripts. Since the scripts are compiled inside of a subroutine, using __END__ will cut off the enclosing brace, causing script compilation to fail. If your Apache::Registry scripts use this literal, they will not run.

In partial compensation for this deficiency, mod_perl lets you use the __END__ token anywhere in your server configuration files to cut out experimental configuration or to make a notepad space that doesn't require you to use the # comment token on each line. Everything below the __END__ token will be ignored.

Special Package Globals

There are a number of useful globals located in the Apache::Server namespace that you are free to use in your own modules. Treat them as read-only. Changing their values will lead to unpredictable results.

$Apache::Server::CWD

This variable is set to the directory from which the server was started.

$Apache::Server::Starting

If the code being run is in the parent server, when the server is first being started, the value is set to 1, zero otherwise.

$Apache::Server::ReStarting

If the code being run is in the parent server, when the server is being restarted, this variable will be true, false otherwise. The value is incremented each time the server is restarted.

$Apache::Server::SaveConfig

As described in Chapter 8, <Perl> configuration sections are compiled inside the Apache::ReadConfig namespace. This namespace is normally flushed after mod_perl has finished processing the section. However, if the $Apache::Server::SaveConfig variable is set to a true value, the namespace will not be flushed, making configuration data available to Perl modules at request time. Example:

 <Perl>
 $Apache::Server::SaveConfig = 1;

 $DocumentRoot = ...
 ...
 </Perl>

At request time, the value of $DocumentRoot can be accessed with the fully qualified name $Apache::ReadConfig::DocumentRoot.

The next chapters show the Apache API from the perspective of the C language programmer, telling you everything you need to know to squeeze the last drop of performance out of Apache by writing extension modules in a fast compiled language.