This chapter is the definitive list of all the Perl API classes and method
calls. They are organized functionally by class starting with the Apache request object, and moving onward through
Apache::SubRequest, Apache::Server, Apache::Connection,
Apache::URI, Apache::Util, Apache::Log, and other classes.
At the end of this chapter we discuss the Apache::File class, which provides advanced functionality for HTTP/1.1 requests, and the
various ``magic'' globals, subroutines and literals that mod_perl recognizes.
The Apache request object implements a huge number of methods. To help you
find the method you're looking for, we've broken them down into eight broad
categories:
- Client Request Methods
-
These methods have to do with retrieving information about the current
request, such as fetching the requested URI, learning the request
document's filename, or reading incoming HTTP headers.
- Server Response Methods
-
These methods are concerned with setting outgoing information, such as
setting outgoing headers and controlling the document language and
compression.
- Sending Data to the Client
-
These are methods for sending document content data to the client.
- Server Core Functions
-
These are methods that control key aspects of transaction processing but
are not directly related to processing browser data input or output. For
example, the subrequest API is covered in this section.
- Server Configuration Methods
-
These are methods for retrieving configuration information about the
server.
- Logging
-
These are methods for logging error messages and warnings to the server
error log.
- Access Control Methods
-
These are methods for controlling access to restricted documents and for
authenticating remote users.
- mod_perl-Specific Methods
-
These are methods that use special features of mod_perl which have no counterpart in the C API. They include such things as the
gensym() method for generating anonymous filehandles, and
set_handlers() for altering the list of subroutines that will handle the current request.
Should you wish to subclass the Apache object in order to add application-specific features, you'll be pleased to
find that it's easy to do so. Please see Chapter 7, Subclassing the Apache Class
for instructions.
This section covers the request object methods that are used to query or
modify the incoming client request. These methods allow you to retrieve
such information as the URI the client has requested, the request method in
use, the content of any submitted HTML forms, and various items of
information about the remote host.
- args()
-
The args() method returns the contents of the URI query string (that part of the
request URI that follows the ``?'' mark, if any). When called in a scalar
context, args() returns the entire string. When called in a list context, the method
returns a list of parsed key/value pairs.
my $query = $r->args;
my %in = $r->args;
One trap to be wary of: if the same argument name is present several times
(as can happen with a multi-selection list in a fill-out form), assignment
of args() to a hash will discard all but the last argument. To avoid this, you'll
need to use the more complex argument processing scheme described in the
next chapter.
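The trap can be reproduced without a running server. The following is a minimal sketch in plain Perl: the flat list in @pairs is a hand-written stand-in for what args() returns in a list context, and collect_args() is a hypothetical helper name of our own, not part of the Apache API.

```perl
use strict;
use warnings;

# Hypothetical helper: fold a flat key/value list into a hash of array
# references, so that repeated keys keep every value.
sub collect_args {
    my @pairs = @_;
    my %multi;
    while (my ($key, $value) = splice(@pairs, 0, 2)) {
        push @{ $multi{$key} }, $value;
    }
    return \%multi;
}

# Stand-in for the list args() might return for "?color=red&color=blue&size=L"
my @pairs = (color => 'red', color => 'blue', size => 'L');

my %flat  = @pairs;                  # plain assignment keeps only the last "color"
my $multi = collect_args(@pairs);    # every "color" value survives

print "flat:  $flat{color}\n";
print "multi: @{ $multi->{color} }\n";
```

The hash-of-arrays layout is only one possible design; it costs an extra dereference for single-valued fields but never silently drops data.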
- connection()
-
This method returns an object blessed into the Apache::Connection
class. See The Apache::Connection Class for information on what you can do with this object once you get it.
my $c = $r->connection;
- content()
-
When the client request method is POST, which generally occurs when the remote client is submitting the contents
of a fill-out form, the $r->content method returns the submitted
information, but only if the request content type is application/x-www-form-urlencoded. When called in a scalar context, the entire string is returned. When
called in a list context, a list of parsed name=value pairs is returned.
To handle other types of PUT or POSTed content, you'll need to use a module such as CGI.pm or Apache::Request, or use the
read() method and parse the data yourself. Ways of doing this, as well as a module
that simplifies the task, are described in the next chapter.
NOTE: you can only call content() once. If you call the method more than once, it will return undef (or an
empty list) after the first try.
- filename()
-
The filename() method sets or returns the result of the URI translation phase. During the
URI translation phase, your handler will call this method with the physical
path to a file in order to set the filename. During later phases of the
transaction, calling this method with no arguments returns its current
value.
Examples:
my $fname = $r->filename;
unless (open(FH, $fname)) {
die "can't open $fname $!";
}
my $fname = do_translation($r->uri);
$r->filename($fname);
- finfo()
-
During the default translation phase, Apache walks along the components of
the requested URI trying to determine where the physical file path ends and
the additional path information begins (this is described at greater length
at the beginning of Chapter 4). In the course of this walk, Apache makes
the system stat() call one or more times to read the directory information along the path.
When the translation phase is finished, the stat() information for the translated filename is cached in the request record,
where it can be recovered using the
finfo() method. If you need to stat() the file, you can take advantage of this cached stat structure rather than
repeating this expensive system call.
When finfo() is called, it points the cached stat information into Perl's special
filehandle _ which Perl uses to cache its own stat operations. You can then perform file
test operations directly on this filehandle rather than on the file itself,
which would incur the penalty of another stat() system call. For convenience,
finfo() returns a reference to the _ filehandle, so file tests can be done directly on the return value of finfo().
The following three examples all yield the same value for
$size. However, the first two avoid the overhead of the implicit
stat() performed by the last.
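# file test directly on the value returned by finfo()
my $size = -s $r->finfo;
# call finfo() first, then test the _ filehandle
$r->finfo;
my $size = -s _;
# test the filename itself, which triggers a fresh stat()
my $size = -s $r->filename;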
It is possible for a module to be called upon to process a URL that does
not correspond to a physical file. In this case, the stat()
structure will contain the result of testing for a nonexistent file, and
Perl's various file test operations will all return false.
The Apache::Util package contains a number of routines that are useful for manipulating the
contents of the stat structure. For example, the ht_time() routine turns Unix timestamps into HTTP-compatible human readable strings.
See the Apache::Util
manpage and The Apache::Util Class section later in this chapter for more details.
Example:
use Apache::Util qw(ht_time);
if(-d $r->finfo) {
printf "%s is a directory\n", $r->filename;
}
else {
printf "Last Modified: %s\n", ht_time((stat _)[9]);
}
- get_client_block()
-
- setup_client_block()
-
- should_client_block()
-
The get_, setup_, and should_client_block methods are lower-level ways to read the data sent by the client in
POST and
PUT requests. This protocol exactly mirrors the C language API described in
Chapter 10 and provides for timeouts and other niceties. Although the Perl
API supports them, Perl programmers should generally use the simpler read() method instead.
- get_remote_host()
-
This method can be used to look up the remote client's DNS hostname or
simply return its IP address. When a DNS lookup is successful, its result
is cached and returned on subsequent calls to
get_remote_host() to avoid costly multiple lookups. This cached value can also be retrieved
with the Apache::Connection object's
remote_host() method.
This method takes an optional argument. The type of lookup performed by
this method is affected by this argument as well as the value of the HostNameLookups directive. Possible arguments to this method, whose symbolic names can be
imported from the Apache::Constants
module using the :remotehost import tag, are one of:
- REMOTE_HOST
-
If this argument is specified, Apache will try to look up the DNS name of
the remote host. This lookup may fail if the Apache configuration directive HostNameLookups is set to Off or the hostname cannot be determined by a DNS lookup, in which case the
function will return undef.
- REMOTE_NAME
-
When called with this argument, the method will return the DNS name of the
remote host if possible, or the dotted decimal representation of the
client's IP address otherwise. This is the default lookup type when no
argument is specified.
- REMOTE_NOLOOKUP
-
When this argument is specified, get_remote_host() will not perform a new DNS lookup (even if the HostNameLookups directive says so). If a successful lookup was done earlier in the request,
the cached hostname will be returned. Otherwise, the method returns the
dotted decimal representation of the client's IP address. This is really
the same as REMOTE_NAME, but the call returns instantly with the
information that is available.
- REMOTE_DOUBLE_REV
-
This argument will trigger a double reverse DNS lookup regardless of the
setting of the HostNameLookups directive. Apache will first call the DNS to return the hostname that maps
to the IP number of the remote host. It will then make another call to map
the returned hostname back to an IP address. If the new IP address that is
returned matches the original one, then the method returns the hostname.
Otherwise it returns undef. The reason for this baroque procedure is that
standard DNS lookups are susceptible to DNS spoofing in which a remote
machine temporarily assumes the apparent identity of a trusted host. Double
reverse DNS lookups make spoofing much harder, and are recommended if you
are using the hostname to distinguish between trusted clients and untrusted
ones. However double reverse DNS lookups are also twice as expensive.
In recent versions of Apache, double-reverse name lookups are always
performed for the name-based access checking implemented by
mod_access.
Examples:
my $remote_host = $r->get_remote_host;
# same as above
use Apache::Constants qw(:remotehost);
my $remote_host = $r->get_remote_host(REMOTE_NAME);
# double-reverse DNS lookup
use Apache::Constants qw(:remotehost);
my $remote_host = $r->get_remote_host(REMOTE_DOUBLE_REV) || "nohost";
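The double-reverse check described under REMOTE_DOUBLE_REV can be sketched in plain Perl. This is an illustration of the idea only, not Apache's implementation: double_reverse() is a hypothetical helper, and the two lookups are passed in as code references (an assumption made so the logic can be exercised without touching DNS). When they are omitted, the sketch falls back to Perl's built-in gethostbyaddr() and gethostbyname().

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa AF_INET);

# Hypothetical helper: return the hostname for $ip only if that hostname
# maps back to the same address; otherwise return undef.
sub double_reverse {
    my ($ip, $reverse, $forward) = @_;
    $reverse ||= sub { scalar gethostbyaddr(inet_aton($_[0]), AF_INET) };
    $forward ||= sub {
        my @h = gethostbyname($_[0]);
        return map { inet_ntoa($_) } @h[4 .. $#h];   # addresses start at index 4
    };

    my $name = $reverse->($ip);                      # first lookup: IP -> name
    return undef unless defined $name;
    # second lookup: name -> IPs; accept only if the original IP is among them
    return (grep { $_ eq $ip } $forward->($name)) ? $name : undef;
}

# Stub lookup tables (made-up example data) standing in for DNS
my %ptr  = ('192.0.2.10' => 'trusted.example.com');
my %addr = ('trusted.example.com' => ['192.0.2.10']);

my $host = double_reverse('192.0.2.10',
                          sub { $ptr{ $_[0] } },
                          sub { @{ $addr{ $_[0] } || [] } });
print defined $host ? "verified: $host\n" : "lookup failed\n";
```

A spoofed reverse record fails the second step, because the forward lookup for the claimed name does not return the client's address.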
- get_remote_logname()
-
This method returns the login name of the remote user, or undef if that
information could not be determined. This generally only works if the
remote user is logged into a Unix or VMS host, and that machine is running
the identd daemon (which implements a protocol known as RFC 1413).
The success of the call also depends on the status of the
IdentityCheck configuration directive. Since identity checks can adversely impact
Apache's performance, this directive is off by default.
Example:
my $remote_logname = $r->get_remote_logname;
- headers_in()
-
When called in a list context, the headers_in() method returns a list of key/value pairs corresponding to the client
request headers. When called in a scalar context, it returns a hash
reference tied to the Apache::Table class. This class provides methods for manipulating several of Apache's
internal key/value table structures, and to all intents and purposes acts
just like an ordinary hash table. However, it also provides object methods
for dealing correctly with multi-valued entries. See
The Apache::Table Class for details.
Examples:
my %headers_in = $r->headers_in;
my $headers_in = $r->headers_in;
Once you have copied the headers to a hash, you can refer to them by name.
See Table 9.1 for a list of incoming headers that you may need to use. For
example, you can view the length of the data that the client is sending by
retrieving the key ``Content-length'':
my %headers_in = $r->headers_in;
my $cl = $headers_in{'Content-length'};
You'll need to be aware that browsers are not required to be consistent in
their capitalization of header field names. For example, some may refer to
``Content-Type'' and others to ``Content-type''. The Perl API copies the
field names into the hash as is, and like any other Perl hash, the keys are
case-sensitive. This is a potential trap.
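To make the trap concrete, here is a small sketch in plain Perl. The list in @raw is a hand-written stand-in for what headers_in() returns in a list context, with capitalization chosen by an imaginary browser, and normalize_headers() is a hypothetical helper, not an Apache method.

```perl
use strict;
use warnings;

# Stand-in for the key/value list copied out of the request headers
my @raw = ('Content-type' => 'text/html', 'CONTENT-LENGTH' => '42');

# Hypothetical helper: lowercase every field name while copying
sub normalize_headers {
    my @pairs = @_;
    my %h;
    while (my ($name, $value) = splice(@pairs, 0, 2)) {
        $h{ lc $name } = $value;
    }
    return %h;
}

my %plain = @raw;                      # keys keep the browser's capitalization
my %lower = normalize_headers(@raw);   # keys are all lowercase

# An exact-case lookup on the plain copy misses the field entirely
print defined $plain{'Content-Length'} ? "found\n" : "missed\n";
print "length: $lower{'content-length'}\n";
```

Normalizing the keys works for a one-off copy, but the copy is still detached from the request: changes made to it are not seen by anyone else.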
For these reasons it's better to call headers_in() in a scalar context and use the returned tied hash. Since Apache::Table sits on top of the C table API, lookup comparisons are performed in a
case-insensitive manner. The tied interface also allows you to add or
change the value of a header field, in case you want to modify the request
headers seen by handlers downstream. This code fragment shows the tied hash
being used to get and set fields:
my $headers_in = $r->headers_in;
my $cl = $headers_in->{'Content-Length'};
$headers_in->{'User-Agent'} = 'Block this robot';
It is often convenient to refer to header fields without creating an
intermediate hash or assigning a variable to the
Apache::Table reference. This is the usual idiom:
my $cl = $r->headers_in->{'Content-Length'};
Certain request header fields such as ``Accept,'' ``Cookie'' and several
other request fields are multivalued. When you retrieve their values, they
will be packed together into one long string separated by commas. You will
need to parse the individual values out yourself. Individual values can
include parameters which will be separated by semicolons. Cookies are
common examples of this:
Set-Cookie: SESSION=1A91933A; domain=acme.com; expires=Wed, 21-Oct-1998 20:46:07 GMT
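As a rough sketch of the parsing this involves, the plain-Perl fragment below splits a comma-separated field into its values and then splits each value's semicolon-separated parameters. parse_header_values() is a hypothetical helper and the sample Accept string is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "v1;p=1, v2" into a list of
# { value => ..., params => { ... } } records.
sub parse_header_values {
    my $header = shift;
    my @values;
    for my $item (split /\s*,\s*/, $header) {
        my ($value, @params) = split /\s*;\s*/, $item;
        # each parameter looks like name=value
        my %params = map { my ($k, $v) = split /=/, $_, 2; ($k => $v) } @params;
        push @values, { value => $value, params => \%params };
    }
    return @values;
}

my @accept = parse_header_values('text/html;q=0.9, image/gif, */*;q=0.1');
printf "%s (q=%s)\n", $_->{value}, $_->{params}{q} // 1 for @accept;
```

Note that a naive comma split is not safe for every field: a date such as the one in the cookie line above contains a comma of its own, so fields that carry dates need more careful handling.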
A few clients send headers with the same key on multiple lines. In this
case you can use the Apache::Table::get() method to retrieve all of the values at once.
For full details on the various incoming headers, see the documents at
http://www.w3.org/Protocols. Non-standard headers, such as those that experimental browsers transmit,
can also be retrieved with this method call.
- Table 9.1: Incoming HTTP Request Headers
-
Field             | Description
-----------------------------------------------------------------------------
Accept            | MIME types that client accepts
Accept-encoding   | Compression methods that client accepts
Accept-language   | Language(s) that client accepts
Authorization     | Used by various authorization/authentication schemes
Connection        | Connection options, such as Keep-alive
Content-length    | Length, in bytes, of data to follow
Content-type      | MIME type of data to follow
Cookie            | Client-side Data
From              | E-mail address of the requesting user (deprecated)
Host              | Virtual host to retrieve data from
If-modified-since | Return document only if modified since specified
If-none-match     | Return document if it has changed
Referer           | URL of document that linked to the requested one
User-agent        | Name and version of the client software
- header_in()
-
The header_in() method (singular, not plural) is used to get or set the value of a client
incoming request field. If the given value is
undef , the header will be removed from the list of header fields:
my $cl = $r->header_in('Content-length');
$r->header_in($key, $val); #set the value of header '$key'
$r->header_in('Content-length' => undef); #remove the header
The key lookup is done in a case insensitive manner. The header_in() method predates the Apache::Table class, but remains for backwards compatibility and as a bit of a shortcut
to using the headers_in method.
- header_only()
-
If the client issues a
HEAD request, it wants to receive the HTTP response headers only. Content
handlers should check for this by calling header_only() before generating the document body. The method will return true in the
case of a HEAD request, and false in the case of other requests. Alternatively, you could
examine the string value returned by method() directly, although this would be less portable if the HTTP protocol were
some day expanded to support more than one header-only request method.
Example:
# generate the header & send it
$r->send_http_header;
return OK if $r->header_only;
# now generate the document...
Do not try to check the numeric value returned by method_number() to identify a header request. Internally, Apache uses the M_GET
number for both HEAD and GET methods.
- method()
-
This method will return the string version of the request method, such as
GET, HEAD or POST. Passing an argument will change the method, which is occasionally useful
for internal redirects (Chapter 4) and for testing authorization
restriction masks (Chapter 6).
Examples:
my $method = $r->method;
$r->method('GET');
If you update the method, you probably want to update the method_number
accordingly as well.
- method_number()
-
This method will return the request method number, one of the internal constants
defined by the Apache API. The method numbers are available to Perl
programmers from the Apache::Constants module by importing the
:methods set. The relevant constants include
M_GET, M_POST,
M_PUT and M_DELETE. Passing an argument will set this value, which is mainly of use for internal
redirects and for testing authorization restriction masks. If you update
the method number, you probably want to update method() accordingly as well.
Note that there isn't an M_HEAD constant. This is because Apache sets the method number to M_GET when it receives a HEAD request and sets header_only() to return true.
Example:
use Apache::Constants qw(:methods);
if ($r->method_number == M_POST) {
# change the request method
$r->method_number(M_GET);
$r->method("GET");
$r->internal_redirect('/new/place');
}
There is no particular advantage to using method_number() over
method() for Perl programmers, other than being very slightly more efficient.
- parsed_uri()
-
When Apache parses the incoming request, it will turn the request URI into
a predigested
uri_components structure. The parsed_uri()
method will return an object blessed into the Apache::URI class, which provides methods for fetching and setting various parts of the
URI. See The Apache::URI Class for details.
Example:
use Apache::URI ();
my $uri = $r->parsed_uri;
my $host = $uri->hostname;
- path_info()
-
The path_info() method will return what is left in the path after the URI translation
phase. Apache's default translation method, described at the beginning of
the next chapter, uses a simple directory-walking algorithm to decide what
part of the URI is the file, and what part is the additional path
information.
If you provide an argument to path_info(), you can change the value of the additional path information.
Examples:
my $path_info = $r->path_info;
$r->path_info("/some/additional/information");
Note that in most cases, changing the path_info() requires you to sync the uri() with the update. In this example, we calculate the original uri minus any
path info, change the existing path info, then properly update the uri:
my $path_info = $r->path_info;
my $uri = $r->uri;
my $orig_uri = substr $uri, 0, length($uri) - length($path_info);
$r->path_info($new_path_info);
$r->uri($orig_uri . $r->path_info);
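The string arithmetic in that fragment can be checked in isolation. The sketch below is plain Perl with made-up sample values; replace_path_info() is a hypothetical helper name, and it assumes, as the fragment above does, that the URI ends with the current path info.

```perl
use strict;
use warnings;

# Hypothetical helper: strip the old path info from the end of the URI
# and append the new one.
sub replace_path_info {
    my ($uri, $path_info, $new_path_info) = @_;
    my $base = substr $uri, 0, length($uri) - length($path_info);
    return $base . $new_path_info;
}

# Sample values: the script is /cgi-bin/search.pl, the rest is path info
print replace_path_info('/cgi-bin/search.pl/old/info', '/old/info', '/new/info'), "\n";
```

With these inputs the base works out to /cgi-bin/search.pl, so the rebuilt URI is /cgi-bin/search.pl/new/info.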
- protocol()
-
The $r->protocol method will return a string identifying the protocol
that the client speaks. Typical values will be ``HTTP/1.0'' or
``HTTP/1.1''.
my $protocol = $r->protocol;
This method is read-only.
- proxyreq()
-
The proxyreq() method returns true if the current HTTP request is for a proxy URI; that
is, the actual document resides on a foreign server somewhere and the
client wishes Apache to fetch the document on its behalf. This method is
mainly intended for use during the filename translation phase of the
request. See Chapter 7 for examples.
Example:
sub handler {
my $r = shift;
return DECLINED unless $r->proxyreq;
# do something interesting...
}
- read()
-
The read() method provides Perl API programmers with a simple way to get at the data
submitted by the browser in
POST and PUT
requests. It should be used when the information submitted by the browser
is not in the application/x-www-form-urlencoded format that the content() method knows how to handle.
Call read() with a scalar variable to retrieve the read data, and the length of the
data to read. Generally you will want to ask for the entire data sent by
the client, which can be recovered from the incoming Content-length field:*
my $buff;
$r->read($buff, $r->header_in('Content-length'));
Internally, Perl sets up a timeout in case the client breaks the connection
prematurely. The exact value of the timeout is set by the
Timeout directive in the server configuration file. If a timeout does occur, the
script will be aborted.
Within a handler you may also recover client data by simply reading from
STDIN using Perl's read(), getc() and readline (<>) functions. This works because the Perl API ties STDIN to
Apache::read() before entering handlers.
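The following plain-Perl sketch shows the general shape of reading a body of declared length from a filehandle with the built-in read(). It is an illustration under assumptions: an in-memory filehandle stands in for STDIN, the length of the sample string stands in for the Content-length value, and read_exactly() is a hypothetical helper. It stops early if the input ends before the declared length arrives.

```perl
use strict;
use warnings;

# Hypothetical helper: read up to $length bytes from $fh, appending to the
# buffer until the length is satisfied or the input runs out.
sub read_exactly {
    my ($fh, $length) = @_;
    my $buffer = '';
    while (length($buffer) < $length) {
        my $got = read($fh, $buffer, $length - length($buffer), length($buffer));
        die "read failed: $!" unless defined $got;
        last if $got == 0;    # end of input before $length bytes arrived
    }
    return $buffer;
}

# In-memory filehandle standing in for STDIN inside a handler
my $data = 'name=fred&color=red';
open my $fh, '<', \$data or die "can't open in-memory handle: $!";

my $body = read_exactly($fh, length $data);
print "read ", length($body), " bytes: $body\n";
```

Comparing the length of the returned buffer with the declared length is a simple way to detect a truncated submission.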
- footnote
-
*At the time of this writing, HTTP/1.1 requests which do not
have a
Content-Length header, such as one that uses chunked encoding, are not properly handled by
this API.
- server()
-
This method returns a reference to an Apache::Server
object, from which you can retrieve all sorts of information about
low-level aspects of the server's configuration. See The
Apache::Server Class for details.
Example:
my $s = $r->server;
- the_request()
-
This method returns the unparsed request line sent by the client.
the_request() is primarily used by log handlers, since other handlers will find it more
convenient to use methods that return the information in preparsed form.
This method is read-only.
Example:
my $request_line = $r->the_request;
print LOGFILE $request_line;
Note that the_request() is roughly equivalent to this code fragment (the actual request line also carries any query string, which uri() omits):
my $request_line = join ' ', $r->method, $r->uri, $r->protocol;
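Going the other way, a log analysis script may need to take a recorded request line apart again. The plain-Perl sketch below does this with a made-up sample line; parse_request_line() is a hypothetical helper. The middle field is the URI exactly as the client sent it, so it still carries any query string.

```perl
use strict;
use warnings;

# Hypothetical helper: split "METHOD URI PROTOCOL" into its three fields
sub parse_request_line {
    my $line = shift;
    my ($method, $uri, $protocol) = split ' ', $line, 3;
    return ($method, $uri, $protocol);
}

my ($method, $uri, $protocol) =
    parse_request_line('GET /index.html?lang=en HTTP/1.0');
print "method=$method uri=$uri protocol=$protocol\n";
```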
- uri()
-
The uri() method returns the URI requested by the browser. You may also pass this
method a string argument in order to set the URI seen by handlers further
down the line, something that a translation handler might want to do.
Examples:
my $uri = $r->uri;
$r->uri("/something/else");
This section covers the API methods used to build and query the outgoing
server response message. These methods allow you to set the type and length
of the outgoing document, set HTTP cookies, assign the document a language
or compression method, and set up authorization and authentication schemes.
Most of the methods in this section are concerned with setting the values
of the outgoing HTTP response header fields. We give a list of all of the
fields you are likely to use in Table 9.2. For a comprehensive list, see
the HTTP/1.0 and HTTP/1.1 specifications found at http://www.w3.org/Protocols.
- Table 9.2: Response Header Fields
-
Field            | Description
-----------------------------------------------------------------------------
Allow            | The methods allowed by this URI, such as POST
Content-Encoding | The compression method of this data
Content-Language | The language in which this document is written
Content-Length   | Length, in bytes, of data to follow
Content-Type     | MIME type of this data
Date             | The current date (GMT)
Expires          | Date the document expires
Last-Modified    | Date the document was last modified
Link             | The URL of this document's "parent," if any
Location         | The location of the document in redirection responses
ETag             | Opaque ID for this version of the document
Message-Id       | The ID of this document, if any
MIME-Version     | The version of MIME used (currently 1.0)
Pragma           | Hints to the browser, such as "no-cache"
Public           | The requests that this URL responds to (rarely used)
Server           | Name and version of the server software
Set-Cookie       | Give the browser a client-side cookie
WWW-Authenticate | Used in the various authorization schemes
Vary             | Criteria that can be used to select this document
- bytes_sent()
-
This method will retrieve the number of bytes of information sent by the
server to the client, excluding the length of the HTTP headers. It is only
of value after the send_http_header() method (see below) has been called. This method is normally used by log
handlers to record and summarize network usage. See Chapter 7 for examples.
Example:
my $bytes_sent = $r->bytes_sent;
- cgi_header_out()
-
This method is similar to the header_out() function. Given a key/value pair, it sets the corresponding outgoing HTTP
response header field to the indicated value, replacing whatever was there
before. However, unlike header_out(), which blindly sets the field to whatever you tell it, cgi_header_out() recognizes certain special keys and takes the appropriate action. This is
used to emulate the magic fields recognized by Apache's own mod_cgi
CGI-handling routines.
Table 9.3 lists the headers that trigger special actions by
cgi_header_out().
- Table 9.3: Special Actions Triggered by cgi_header_out()
-
Header | Actions
-----------------------------------------------------------------------------
Content-Type | Set $r->content_type to the given value
Status | Set $r->status to the integer value in the string
| Set $r->status_line to the given value
Location | Set Location in the headers_out table to the given value
| and perform an internal redirect if URI is relative
Content-Length | Set Content-Length in the headers_out table to the
| given value
Transfer-Encoding | Set Transfer-Encoding in the headers_out table to
| the given value
Last-Modified | Parse the string date, feeding the time value to
| ap_update_mtime() and invoke ap_set_last_modified()
Set-Cookie | Call ap_table_add() to support multiple Set-Cookie headers
Other | Call ap_table_merge() with given key and value
You generally can use the Apache::Table or header_out()
methods to achieve the results you want. cgi_header_out() is provided for those who wish to create a CGI emulation layer, such as
Apache::Registry. Those who are designing such a system should also look at send_cgi_header(), described below in Sending Data
to the Client.
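To illustrate the kind of dispatch Table 9.3 describes, here is a toy model in plain Perl. It is a sketch under assumptions, not mod_perl's code: cgi_header_sketch() and the $state hash are hypothetical, field names are assumed to match case-insensitively, the redirect for Location and the date parsing for Last-Modified are omitted, and "merge" is modeled as joining values with a comma.

```perl
use strict;
use warnings;

# Hypothetical model: $state holds status, content_type and a header table
# of the form { field => [ values ] }.
sub cgi_header_sketch {
    my ($state, $key, $value) = @_;
    my $field = lc $key;
    if ($field eq 'status') {
        ($state->{status}) = $value =~ /^\s*(\d+)/;    # integer part only
        $state->{status_line} = $value;
    }
    elsif ($field eq 'content-type') {
        $state->{content_type} = $value;
    }
    elsif ($field =~ /^(?:location|content-length|transfer-encoding)$/) {
        $state->{headers}{$key} = [$value];            # set: replace old value
    }
    elsif ($field eq 'last-modified') {
        $state->{last_modified} = $value;              # date parsing omitted
    }
    elsif ($field eq 'set-cookie') {
        push @{ $state->{headers}{$key} }, $value;     # add: one entry per call
    }
    else {
        my $list = $state->{headers}{$key} ||= [];
        if (@$list) { $list->[0] .= ", $value" }       # merge: fold into one value
        else        { push @$list, $value }
    }
    return $state;
}

my $state = {};
cgi_header_sketch($state, 'Status',     '404 Not Found');
cgi_header_sketch($state, 'Set-Cookie', 'a=1');
cgi_header_sketch($state, 'Set-Cookie', 'b=2');
cgi_header_sketch($state, 'Vary',       'Accept');
cgi_header_sketch($state, 'Vary',       'Accept-Language');
print "status: $state->{status}\n";
print "cookies: ", scalar @{ $state->{headers}{'Set-Cookie'} }, "\n";
print "vary: $state->{headers}{Vary}[0]\n";
```

The point of the distinction is that cookies must stay on separate header lines, while most other repeated fields can safely be folded into a single comma-separated value.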
- content_encoding()
-
This method gets or sets the document encoding. Content encoding fields are
strings like ``gzip'' or ``compress'', and indicate that the document has
been compressed or otherwise encoded. Browsers that handle the particular
encoding scheme can decode or decompress the document on the fly.
Getting or setting content_encoding() is equivalent to using
headers_out() or header_out() to change the value of the ``Content-encoding'' header. Chapters 4 and 7
give examples of querying and manipulating the content encoding field.
Examples:
my $enc = $r->content_encoding;
if($r->filename =~ /\.gz$/) {
$r->content_encoding("gzip");
}
- content_languages()
-
The content_languages() method gets or sets the ``Content-language'' HTTP header field. Called
without arguments, it returns an array reference consisting of two-letter
language identifiers, for example ``en'' for English and ``no'' for
Norwegian. You can also pass it an array reference to set the list of
languages to a new value. This method can be used to implement support for
multi-language documents. See the Apache::MIME module in Chapter 7 for an example.
content_languages() is a convenient interface to the lower-level
header_out and headers_out methods.
Examples:
my $languages = $r->content_languages;
$r->content_languages(['en']);
- content_type()
-
This method corresponds to the Content-type header field, which tells the browser the MIME type of the returned
document. Common MIME types include ``text/plain'', ``text/html'' and
``image/gif''.
content_type() can be used either to get or set the current value of this field. It is
important to use content_type() to set the content type rather than calling headers_out() or header_out() to change the outgoing HTTP header directly. This is because a copy of the
content type is kept in the request record, and other modules and core
protocol components will consult this value rather than the outgoing
headers table.
Examples:
my $ct = $r->content_type;
$r->content_type('text/plain');
- custom_response()
-
When a handler returns a code other than
OK, DECLINED or
DONE, Apache aborts processing and throws an error. When an error is thrown,
application programs can catch it and replace Apache's default processing
with their own custom error handling routines by using the ErrorDocument
configuration directive. The arguments to ErrorDocument are the status code to catch and a custom string, static document, or CGI
script to invoke when the error occurs.
The module-level interface to Apache's error handling system is
custom_response(). Like the directive, the method call takes two arguments.* The first
argument is a valid response code from Table 3.1. The second is either a
string to return in response to the error, or a URI to invoke to handle the
request. This URI can be a static document, a CGI script, or even a content
handler in an Apache module. Chapters 4 and 6 have more extensive coverage
of the error handling system.
Examples:
use Apache::Constants qw(:common);
$r->custom_response(AUTH_REQUIRED, "sorry, I don't know you.");
$r->custom_response(SERVER_ERROR, "/perl/server_error_handler.pl");
- footnote
-
*Of course, the method actually takes three arguments, the first of which is the
request_rec object, but you know what we mean.
- err_headers_out()
-
Apache actually keeps two sets of outgoing response headers, one set to use
when the transaction is successful, and another to use in the case of a
module returning an error code. Although maintaining a dual set of headers
may seem redundant, it makes custom error handlers much easier to write, as
we shall see in the next chapter.
err_headers_out() is equivalent to headers_out(), but it gets and sets values in the table of HTTP header response fields
that are sent in the case of an error.
Unlike ordinary header fields, error fields are sent to the browser even
when the module aborts or returns an error status code. This allows modules
to do such things as setting cookies when errors occur, or implementing
custom authorization schemes. Error fields also persist across internal
redirects when one content handler passes the buck to another. This feature
is necessary to support the
ErrorDocument mechanism.
Examples:
my %err_headers_out = $r->err_headers_out;
my $err_headers_out = $r->err_headers_out;
$r->err_headers_out->{'X-Odor'} = "Something's rotten in Denmark";
- err_header_out()
-
Like the header_in() and header_out() methods, err_header_out()
predates the Apache::Table class. It can be used to get or set a single field in the error headers
table. As with the other header methods, the key lookups are done in a case-insensitive
manner. Its syntax is identical to that of header_out().
Example:
my $loc = $r->err_header_out('Location');
$r->err_header_out(Location => 'http://www.modperl.com/');
$r->err_header_out(Location => undef);
- headers_out()
-
headers_out() provides modules with the ability to get or set any of the outgoing HTTP
response header fields. When called in a list context, headers_out() returns a list of key/value pairs corresponding to the current server
response headers. The capitalization of the field names is not
canonicalized prior to copying them into the list.
When called in a scalar context, this method returns a hash reference tied
to the Apache::Table class. This class provides an interface to the underlying headers_out data structure. Fetching a key from the tied hash will retrieve the
corresponding HTTP field in a case insensitive fashion, and assigning to
the hash will change the value of the header so that it is seen by other
handlers further down the line, and ultimately affects the header that is
sent to the browser.
The headers that are set with headers_out() are cleared when an error occurs, and do not persist across internal
redirects (in which a module hands off its content-handling responsibility
to a different URI). To create headers that persist across errors and
internal redirects, use err_headers_out(), described above.
Examples:
my %headers_out = $r->headers_out;
my $headers_out = $r->headers_out;
$headers_out->{'Set-Cookie'} = 'SESSION_ID=3918823';
The ``Content-type'', ``Content-encoding'' and ``Content-language''
response fields have special meaning to the Apache server and its modules.
These fields occupy their own slots of the request record itself and should
always be accessed using their dedicated methods rather than the generic headers_out() method. If you forget, and use
headers_out() instead, Apache and other modules may not recognize your changes, leading
to confusing results. In addition, the ``Pragma: no-cache'' idiom, used to
tell browsers not to cache the document, should be set indirectly using the no_cache() method.
The many features of the Apache::Table class are described in more detail in its own section.
- header_out()
-
header_out() gets or sets the value of an individual HTTP field. Like the header_in() method, it predates the
Apache::Table class, but remains for backwards compatibility and as a bit of a shortcut
to using the headers_out() method.
If passed a single argument, header_out() returns the value of the corresponding field from the outgoing HTTP
response header. If passed a key/value pair, header_out() changes the value of the corresponding header field. A field can be
removed entirely by passing undef as its value. The key lookups are done in
a case insensitive manner.
Examples:
my $loc = $r->header_out('Location');
$r->header_out(Location => 'http://www.modperl.com/');
$r->header_out(Location => undef);
- handler()
-
The handler method gets or sets the name of the module that is responsible for the
content generation phase of the current request. For example, for requests
to run CGI scripts, this will be the value ``cgi-script.'' Ordinarily this
value is set in the configuration file using the SetHandler or AddHandler directives. However, your handlers can set this value during earlier phases
of the transaction, typically the MIME type checking or fixup phases.
Chapter 7 gives examples of how to use handler() to create a handler that dispatches to other modules based on the
document's type.
Example:
my $handler = $r->handler;
if($handler eq "cgi-script") {
warn "shame on you. Fixing.\n";
$r->handler('perl-script');
}
handler() cannot be used to set handlers for anything but the response phase. Use set_handlers() or push_handlers() to change the handlers for other phases (see mod_perl Specific Methods).
- no_cache()
-
The no_cache() method gets or sets a boolean flag that indicates that the data being
returned is volatile. Browsers that respect this flag will avoid writing
the document out to the client-side cache. Setting this flag to true will
cause Apache to emit an ``Expires'' field with the same date and time as
the original request.
Examples:
$current_flag = $r->no_cache();
$r->no_cache(1); # set no-cache to true
- request_time()
-
This method returns the time at which the request started, expressed as a
Unix timestamp in seconds since the start of an arbitrary period called the
``epoch''.* You can pass this to Perl's localtime()
function to get a human readable string, or to any of the available time
and date handling Perl modules to manipulate it in various ways.
Unlike most of the other methods, this one is read only.
Example:
my $date = scalar localtime $r->request_time;
warn "request started at $date";
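The timestamp can also be handed to POSIX::strftime() when a specific layout is needed. The sketch below assumes you want an HTTP-style date in GMT. http_date() is a name of our own invention, not part of the API, and a literal timestamp stands in for $r->request_time so that the code runs outside the server:

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: format a Unix timestamp (such as the value
# returned by $r->request_time) as an HTTP-style date in GMT.
sub http_date {
    my $time = shift;
    return strftime("%a, %d %b %Y %H:%M:%S GMT", gmtime($time));
}

# 0 is the first second of the epoch; a handler would pass
# $r->request_time here instead.
print http_date(0), "\n";
```

The day and month names are produced by the C library, so they may vary with the locale of the process.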
- footnote
-
*In case you were wondering, the epoch began at 00:00:00 GMT
on January 1, 1970, and is due to end in 2038. There's probably a good
explanation for this choice.
- status()
-
The status() method allows you to get or set the status code of the outgoing HTTP
response. Usually you will set this value indirectly by returning the
status code as the handler's function result. However, there are rare
instances when you want to trick Apache into thinking that the module
returned an
OK status code, but actually send the browser a non-OK status.
Call the method with no arguments to retrieve the current status code. Call
it with a numeric value to set the status. Constants for all the standard
status codes can be found in Apache::Constants.
Examples:
use Apache::Constants qw(:common);
my $rc = $r->status;
$r->status(SERVER_ERROR);
- status_line()
-
status_line() is used to get or set the error code and the human-readable status message
that gets sent to the browser. Ordinarily you should use status() to set the numeric code and let Apache worry about translating this into a
human readable string. However, if you want to generate an unusual response
line, you can use this method to set the line. To be successful, the
response line
must begin with one of the valid HTTP status codes.
Example:
my $status_line = $r->status_line;
$r->status_line("200 Bottles of Beer on the Wall");
If you update the status line, you probably want to update
status() accordingly as well.
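One way to keep the two values in step is to pull the numeric code out of the line before setting it. This is only a sketch: status_code_of() is a hypothetical helper, and it checks the shape of the line (three leading digits), not whether the code is one that HTTP actually defines. In a handler you would pass the result to $r->status() and the whole line to $r->status_line():

```perl
use strict;
use warnings;

# Hypothetical helper: return the three-digit code that starts a
# response line, or undef if the line does not begin with one.
sub status_code_of {
    my $line = shift;
    return $line =~ /^(\d{3})(?:\s|$)/ ? $1 : undef;
}

my $line = "200 Bottles of Beer on the Wall";
my $code = status_code_of($line);
print defined $code ? "code $code\n" : "not a usable response line\n";
```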
The methods in this section are invoked by content handlers to send header
and document body data to the waiting browser. Non-content handlers should
not call these methods.
- print()
-
The Apache C API provides several functions for sending formatted data to
the client. However, Perl is more flexible in its string handling
functions, so only one method, print(), is needed.
The print() method is similar to Perl's built-in print()
function except that all the data you print eventually winds up being
displayed on the user's browser. Like the built-in print() this method will accept a variable number of strings to print out. However,
the Apache print() method does not accept a filehandle argument for obvious reasons.
Like the read() method, print() sets a timeout so that if the client connection is broken the handler won't
hang around indefinitely trying to send data. If a timeout does occur, the
script will be aborted.
The method also checks the Perl autoflush global
$|. If the variable is non-zero, print() will flush the buffer after every command, rather than after every line.
This is consistent with the way the built-in print() works.
Example:
$r->print("hello" , " ", "world!");
An interesting feature of the Apache Perl API is that the STDOUT filehandle
is tied to Apache so that if you use the built-in
print() to print to standard output, the data will be redirected to the request
object's print() method. This allows CGI scripts to run unmodified under Apache::Registry, and also allows one content handler's output to be transparently
``chained'' to another handler's input. The TieHandle Interface section later in this chapter goes into more detail on how filehandles can
be tied to the Perl API, and Chapter 4 has more to say about chained
handlers.
Example:
print "hello world!"; # automatically invokes Apache::print()
There is also an optimization built into print(). If any of the arguments to the method are scalar references to strings,
they are automatically dereferenced for you. This avoids needless copying
of large strings when passing them to subroutines.
Example:
$a_large_string = join '', <GETTYSBURG_ADDRESS>;
$r->print(\$a_large_string);
- printf()
-
The printf() method works just like the built-in function of the same name, except
that the data is sent to the client. Calling the built-in printf() on STDOUT will indirectly invoke this method because STDOUT is tied.
Example:
$r->printf("Hello %s", $r->connection->user);
- rflush()
-
For efficiency's sake, Apache usually buffers the data printed by the
handler and sends it to the client only when its internal buffers fill (or
the handler is done). The rflush() method causes Apache to flush and send its buffered outgoing data
immediately. You may wish to do this if you have a long-running content
handler and you wish the client to see the data start to appear sooner.
Don't call rflush() if you don't need to, as it causes a performance hit.* This method is also
called automatically after each
print() if the Perl global variable $| is set to non-zero.
Example:
$r->rflush;
- footnote
-
*If you are wondering why this method has an r prefix, it is carried over from the C API I/O methods (described in Chapter
10), all of which have an ap_r prefix. This is the only I/O method from the group for which there is a
direct Perl interface. If you find that the r
prefix is not pleasing to the eye, this is no accident. It is intended to
discourage the use of rflush() due to the performance implications.
- send_cgi_header()
-
As we mentioned in the section on cgi_header_out(), the mod_cgi
module scans for and takes special action on certain header fields emitted
by CGI scripts. Developers who wish to develop a CGI emulation layer can
take advantage of send_cgi_header(). It accepts a single string argument formatted like a CGI header, parses
it into fields, and passes the parsed fields to cgi_header_out().
cgi_header_out() then calls send_http_header() to send the completed header to the browser.
Don't forget to put a blank line at the end of the headers, just as a CGI
script would:
$r->send_cgi_header(<<EOF);
Status: 200 Just Fine
Content-type: text/html
Set-Cookie: open=sesame

EOF
You're welcome to use this method even if you aren't emulating the CGI
environment, since it provides a convenient one-shot way to set and send
the entire HTTP header. However, there is a performance hit associated with
parsing the header string.
As an aside, this method is used to implement the behavior of the PerlSendHeader directive. When this directive is set to ``On'',
mod_perl scans the first lines of text printed by the content handler until it finds
a blank line. Everything above the blank line is then sent to send_cgi_header().
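The split at the first blank line can be pictured in plain Perl. What follows is a sketch of the idea only, not mod_perl's actual implementation. split_cgi_output() is a hypothetical helper that separates CGI-style output into header fields and body:

```perl
use strict;
use warnings;

# Hypothetical helper: split CGI-style output at the first blank line.
# Returns a reference to a list of [name, value] pairs, and the body.
sub split_cgi_output {
    my $text = shift;
    my ($head, $body) = split /\r?\n\r?\n/, $text, 2;
    $head = '' unless defined $head;
    $body = '' unless defined $body;
    my @fields;
    for my $line (split /\r?\n/, $head) {
        # skip anything that doesn't look like "Name: value"
        my ($name, $value) = $line =~ /^([^:\s]+):\s*(.*)$/ or next;
        push @fields, [$name, $value];
    }
    return (\@fields, $body);
}

my ($fields, $body) = split_cgi_output(
    "Status: 200 Just Fine\nContent-type: text/html\n\n<h1>Hello</h1>\n");
print "$_->[0] => $_->[1]\n" for @$fields;
print "body: $body";
```

Without the blank line there is nothing to mark where the header ends, which is why the whole text is treated as header in that case.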
- send_fd()
-
Given an open filehandle, filehandle glob or glob reference as argument,
this method sends the contents of the file to the client. Internally the
Perl interface extracts the file descriptor from the filehandle and uses
that directly, which is generally faster than calling the higher-level Perl
methods. The confusing naming of this method (it takes a filehandle, not a
file descriptor) is to be consistent with the naming of the corresponding C
API function call.
This method is generally used by content handlers that wish to send the
browser the unmodified contents of a file.
Example:
my $fh = Apache::gensym(); # generate a new filehandle name
open($fh, $r->filename) || return NOT_FOUND;
$r->send_fd($fh);
close($fh);
- send_http_header()
-
This method formats the outgoing response data into a proper HTTP response
and sends it to the client. The header is constructed from values
previously set by calls to content_type(),
content_encoding(), content_language(), status_line(), and
headers_out(). Naturally, this method should be called before any other methods for
sending data to the client.
Because setting the document's MIME type is such a common operation, the
Perl version of this API call allows you to save a few keystrokes by
specifying the content type as an optional argument to
send_http_header(). This is exactly equivalent to calling
content_type() followed by send_http_header().
Examples:
$r->send_http_header;
$r->send_http_header('text/plain');
A content type passed to send_http_header() will override any previous calls to content_type().
This section covers the API methods that are available for your use during
the processing of a request, but are not directly related to incoming or
outgoing data.
- chdir_file()
-
Given a filename as argument, change from the current directory to the
directory in which the file is contained. This is a convenience routine for
modules that implement scripting engines, since it is common to run the
script from the directory in which it lives. The current directory will
remain here, unless your module changes back to the previous directory. As
there is significant overhead associated with determining the current
directory, we suggest using the
$Apache::Server::CWD variable or the server_root_relative()
method if you wish to return to the previous directory afterward.
Example:
$r->chdir_file($r->filename);
- child_terminate()
-
Calling this method will cause the current child process to shut down
gracefully after the current transaction is completed and the logging and
cleanup phases are done. This method is not available on Win32 systems.
Example:
$r->child_terminate;
- hard_timeout()
-
- kill_timeout()
-
- reset_timeout()
-
- soft_timeout()
-
The timeout API governs the interaction of Apache with the client. At
various points during the request/response cycle a browser that is no
longer responding can be timed out so that it doesn't continue to hold the
connection open. Timeouts are primarily of concern to C API programmers, as
mod_perl handles the details of timeouts internally for read and write
methods. However, these calls are included in the Perl API for
completeness.
The hard_timeout() method initiates a ``hard'' timeout. If the client read or write operation
takes longer than the time specified by Apache's Timeout directive, then the current handler will be aborted immediately and Apache
will immediately enter the logging phase. hard_timeout() takes a single string argument which should contain the name of your module
or some other identification. This identification will be incorporated into
the error message that is written to the server error log when the timeout
occurs.
soft_timeout(), in contrast, does not immediately abort the current handler. Instead,
when a timeout occurs control returns to the handler, but all read and
write operations are replaced with no-ops so that no further data can be
sent to or received from the client. In addition, the Apache::Connection object's aborted() method will return true. As with hard_timeout(), you should pass this method the name of your module in order to be able to
identify the source of the timeout in the error log.
The reset_timeout() method can be called to set a previously initiated timer back to zero. It
is usually used between a series of read or write operations in order to
avoid killing the timeout and restarting it completely.
Finally, the kill_timeout() method is called to cancel a previously initiated timeout. It is generally
called when a series of I/O operations are completely done.
The examples below will give you the general idea of how these four methods
are used. Remember, however, that in the Perl API these methods are not
really necessary because they are called internally by the read() and print() methods.
# typical hard_timeout() usage
$r->hard_timeout("Apache::Example while reading data");
while (... read data loop ...) {
...
$r->reset_timeout;
}
$r->kill_timeout;
# typical soft_timeout() usage
$r->soft_timeout("Apache::Example while reading data");
while (... read data loop ...) {
...
$r->reset_timeout;
}
$r->kill_timeout;
- internal_redirect()
-
Unlike a full HTTP redirect in which the server tells the browser to look
somewhere else for the requested document, the
internal_redirect() method tells Apache to return a different URI without telling the client.
This is a lot faster than a full redirect.
The required argument is an absolute URI path on the current server. The
server will process the URI as if it were a whole new request, running the
URI translation, MIME type checking, and other phases before invoking the
appropriate content handler for the new URI. The content handler that
eventually runs is not necessarily the same as the one that invoked internal_redirect(). This method should only be called within a content handler.
Do not use internal_redirect() to redirect to a different server. You'll need to do a full redirect for
that. Both redirection techniques are described in more detail in the next
chapter.
Example:
$r->internal_redirect("/new/place");
Apache implements its ErrorDocument feature as an internal redirect, so many of the techniques that apply to
internal redirects also apply to custom error handling.
- internal_redirect_handler()
-
This method does the same thing as internal_redirect(), but arranges for the content handler used to process the redirected URI
to be the same as the current content handler.
Example:
$r->internal_redirect_handler("/new/place");
- is_initial_req()
-
There are several instances in which an incoming URI request can trigger
one or more secondary internal requests. An internal request is triggered
when internal_redirect() is called explicitly, and also happens behind the scenes when lookup_file() and
lookup_uri() are called.
With the exception of the logging phase, which is run just once for the
primary request, secondary requests are run through each of the transaction
processing phases, and the appropriate handlers are called each time. There
may be times when you don't want a particular handler running on a
subrequest or internal redirect, either to avoid performance overhead or to
avoid infinite recursion. The is_initial_req() method will return a true value if the current request is the primary one,
and false if the request is the result of a subrequest or an internal
redirect.
Example:
return DECLINED unless $r->is_initial_req;
- is_main()
-
This method can be used to distinguish between subrequests triggered by
handlers and the ``main'' request triggered by a browser's request for a
URI or an internal redirect. is_main() returns a true value for the primary request and for internal redirects,
and false for subrequests. Notice that this is slightly different from
is_initial_req(), which returns false for internal redirects as well as subrequests.
is_main() is commonly used to prevent infinite recursion when a handler gets
reinvoked after it has made a subrequest.
return DECLINED unless $r->is_main;
Like is_initial_req() this is a read-only method.
- last()
-
- main()
-
- next()
-
- prev()
-
When a handler is called in response to a series of internal redirects, ErrorDocuments or subrequests, it is passed an ordinary-looking request object and can
usually proceed as if it were processing a normal request. However, if a
module has special needs, it can use these methods to walk the chain to
examine the request objects passed to other requests in the series.
main() will return the request object of the parent request, the top of the chain. last() will return the last request in the chain. prev() and next() will return the previous and next requests in the chain, respectively. Each
of these methods will return a reference to an object belonging to the Apache class, or undef if the request doesn't exist.
The prev() method is handy inside an ErrorDocument handler to get at the information from the request that triggered the
error. For example, this code fragment will find the URI of the failed
request:
my $failed_uri = $r->prev->uri;
The last() method is mainly used by logging modules. Since Apache may have performed
several subrequests while attempting to resolve the request, the last object will always point to the final result.
Example:
my $bytes_sent = $r->last->bytes_sent;
Should your module wish to log all internal requests, the next()
method will come in handy. Example:
sub My::logger {
my $r = shift;
my $first = $r->uri;
my $last = $r->last->uri;
warn "first: $first, last: $last\n";
for (my $rr = $r; $rr; $rr = $rr->next) {
my $uri = $rr->uri;
my $status = $rr->status;
warn "request: $uri, status: $status\n";
}
return OK;
}
Assuming the requested URI was /, which was mapped to
/index.html by the DirectoryIndex configuration, the example above would output these messages to the ErrorLog:
first: /, last: /index.html
request: /, status: 200
request: /index.html, status: 200
The next() and main() methods are rarely used, but are included for completeness. Handlers that
need to determine whether they are in the main request should call $r->is_main() rather than
!$r->main(), as the former is marginally more efficient.
- location()
-
If the current handler was triggered by a Perl*Handler directive within a <Location> section, this method will return the path indicated by the <Location> directive.
For example, given this <Location> section:
<Location /images/dynamic_icons>
SetHandler perl-script
PerlHandler Apache::Icon
</Location>
then location() will return /images/dynamic_icons.
This method is handy for converting the current document's URI into a
relative path. Example:
my $base = $r->location;
(my $relative = $r->uri) =~ s/^\Q$base\E//; # \Q...\E protects regex metacharacters in the path
- lookup_file()
-
- lookup_uri()
-
lookup_file() and lookup_uri() invoke Apache subrequests. A subrequest is treated exactly like an ordinary
request, except that the post read request, header parser, response
generation and logging phases are not run. This allows modules to pose
``what-if'' questions to the server. Subrequests can be used to learn the
MIME type mapping of an arbitrary file, map a URI to a filename, or find
out whether a file is under access control. After a successful lookup, the
response phase of the request can optionally be invoked.
Both methods take a single argument corresponding to an absolute filename
or a URI path respectively. lookup_uri() performs the URI translation on the provided URI, passing the request to
the access control and authorization handlers, if any, and then proceeds to
the MIME type checking phase. lookup_file() behaves similarly, but bypasses the initial URI translation phase and
treats its argument as a physical file path.
Both methods return an Apache::SubRequest object, which is identical for all intents and purposes to a plain old Apache
request object, as it inherits all methods from the Apache class. You can call the returned object's content_type(),
filename() and other methods to retrieve the information left there during subrequest
processing.
The subrequest mechanism is extremely useful, and there are many practical
examples of using it in Chapters 4, 5 and 6. The following code snippets
show how to use subrequests to look up the content type of a file and a
URI:
my $subr = $r->lookup_file('/home/http/htdocs/images/logo.tif');
my $ct = $subr->content_type;
my $ct = $r->lookup_uri('/images/logo.tif')->content_type;
In the lookup_uri() example, /images/logo.tif will be passed through the same series of Alias, DocumentRoot and URI rewriting translations that the URI would be subjected to if it
were requested by a browser.
If you need to pass certain HTTP header fields to the subrequest, such as a
particular value of Accept, you can do so by calling
headers_in() before invoking lookup_uri() or lookup_file().
It is often a good idea to check the status of a subrequest in case
something went wrong. If the subrequest was successful, the status
value will be that of HTTP_OK. Example:
use Apache::Constants qw(:common HTTP_OK);
my $subr = $r->lookup_uri("/path/file.html");
my $status = $subr->status;
unless ($status == HTTP_OK) {
die "subrequest failed with status: $status";
}
- notes()
-
There are times when handlers need to communicate among themselves in a way
that goes beyond setting the values of HTTP header fields. To accommodate
this, Apache maintains a ``notes'' table in the request record. This table
is simply a list of key/value pairs. One handler can add its own key/value
entry to the notes table, and later the handler for a subsequent phase can
retrieve the note. Notes are maintained for the life of the current
request, and are deleted when the transaction is finished.
When called with two arguments this method sets a note. When called with a
single argument, it retrieves the value of that note. Both the keys and the
values must be simple strings.
Examples:
$r->notes('CALENDAR' => 'Julian');
my $cal = $r->notes('CALENDAR');
When called in a scalar context with no arguments, a hash reference tied to
the Apache::Table class will be returned. Example:
my $notes = $r->notes;
my $cal = $notes->{CALENDAR};
This method comes in handy for communication between a module written in
Perl and one written in C. For example, the logging API saves error
messages under a key named ``error-notes'', which could be used by ErrorDocuments to provide a more informative error message.
The LogFormat directive, part of the standard mod_log_config
module, can incorporate notes into log messages using the formatting
character %n (written %{NAME}n, where NAME is the note's key). See the Apache documentation for details.
- subprocess_env()
-
The subprocess_env() method is used to examine and change the Apache environment table. Like
other table-manipulation functions, this method has a variety of behaviors
depending on the number of arguments it is called with and the context in
which it is called. Call the method with no arguments in a scalar context
to return a hash reference tied to the Apache::Table class:
my $env = $r->subprocess_env;
my $docroot = $env->{'DOCUMENT_ROOT'};
Call the method with a single argument to retrieve the current value of the
corresponding entry in the environment table, or undef if no entry by that
name exists:
my $doc_root = $r->subprocess_env("DOCUMENT_ROOT");
You may also call the method with a key/value pair to set the value of an
entry in the table:
$r->subprocess_env(DOOR => "open");
Finally, if you call subprocess_env() in a void context with no arguments, it will reinitialize the table to
contain the standard variables that Apache adds to the environment before
invoking CGI scripts and server-side include files:
$r->subprocess_env;
Changes made to the environment table only persist for the length of the
request. The table is cleared out and reinitialized at the beginning of
every new transaction.
In the Perl API, the primary use for this method is to set environment
variables for other modules to see and use. For example, a fixup handler
could use this call to set up environment variables that are later
recognized by mod_include and incorporated into server-side include pages. You do not ordinarily need
to call subprocess_env()
to read environment variables, because mod_perl automatically copies the
environment table into the Perl %ENV hash before entering the response handler phase.
A potential confusion arises when a Perl API handler needs to launch a
subprocess itself using system(), backticks, or a piped open. If you need to pass environment variables to
the subprocess, set the appropriate keys in %ENV just as you would in an ordinary Perl script.
subprocess_env() is only required if you need to change the environment in a subprocess
launched by a different handler or module.
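The %ENV approach can be sketched as follows. The example assumes a Unix-like system and uses the running perl ($^X) as a convenient child program. The DOOR variable is the same illustrative name used above, and local() confines the change to the enclosing block:

```perl
use strict;
use warnings;

# Set a variable in %ENV for the duration of this block only, then
# launch a child process that reports what it sees.
my $seen;
{
    local $ENV{DOOR} = 'open';
    # list-form piped open: runs the child without going through a shell
    open(my $child, '-|', $^X, '-e', 'print $ENV{DOOR}')
        or die "can't launch child: $!";
    $seen = <$child>;
    close($child);
}
print "child saw DOOR=$seen\n";
```

Because the child inherits %ENV directly, no call to subprocess_env() is involved.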
- register_cleanup()
-
The register_cleanup() method registers a subroutine that will be called after the logging stage
of a request. This is much the same as installing a cleanup handler with
the PerlCleanupHandler directive. See Chapter 7 for some practical examples of using
register_cleanup().
The method expects a code reference argument:
sub callback {
my $r = shift;
my $uri = $r->uri;
warn "process $$ all done with $uri\n";
}
$r->register_cleanup(\&callback);
Several methods give you access to the Apache server's configuration
settings. You can inspect the configuration, and in many cases, change it
dynamically. The most commonly-needed configuration information can be
obtained directly from the methods given in this section. More esoteric
information can be obtained via the
Apache::Server object returned by the request object's server()
method. See The Apache::Server Class for details.
- dir_config()
-
The dir_config() method and the PerlSetVar configuration directive together are the primary way of passing
configuration information to Apache Perl modules.
The PerlSetVar directive can occur in the main part of a configuration file, in a <VirtualHost>, <Directory>, <Location> or
<Files> section, or in a .htaccess file. It takes a key/value pair separated by whitespace.
In the following two examples, the first directive sets a key named
``Gate'' to a value of ``open''. The second sets the same key to a value of
``wide open and beckoning''. Notice how quotes are used to protect
arguments that contain whitespace:
PerlSetVar Gate open
PerlSetVar Gate "wide open and beckoning"
Configuration files can contain any number of PerlSetVar
directives. If multiple directives try to set the same key, the usual rules
of directive precedence apply. A key defined in a .htaccess
file has precedence over a key defined in a <Directory>, <Location>, or <Files> section, which in turn has precedence over a key defined in a
<VirtualHost> section. Keys defined in the main body of the
configuration file have the lowest precedence of all.
Configuration keys set with PerlSetVar can be recovered within Perl handlers using dir_config(). The interface is simple. Called with the name of a key, dir_config() looks up the key and returns its value if found, or undef otherwise.
Example:
my $value = $r->dir_config('Gate');
If called in a scalar context with no arguments, dir_config()
returns a hash reference tied to the Apache::Table class. See The Apache::Table Class for details.
my $dir_config = $r->dir_config;
my $value = $dir_config->{'Gate'};
Only scalar values are allowed in configuration variables set by
PerlSetVar. If you want to pass an array or hash, separate the items by a character
that doesn't appear elsewhere in the string and call split()
to break the retrieved variable into its components.
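A sketch of the split() technique follows. The Gates key and its comma-separated value are hypothetical, and literal strings stand in for what $r->dir_config() would return so that the code runs outside the server:

```perl
use strict;
use warnings;

# Stand-in for the string that $r->dir_config('Gates') would return
# given a hypothetical "PerlSetVar Gates north,south,east" directive.
my $value = 'north,south,east';

my @gates = split /\s*,\s*/, $value;
print scalar(@gates), " gates: @gates\n";

# A hash can be packed the same way, with a second separator
# between each key and its value.
my %state = map { my ($k, $v) = split /=/, $_, 2; ($k => $v) }
            split /\s*,\s*/, 'north=open,south=shut';
print "north is $state{north}\n";
```

Whichever separators you pick, make sure they cannot occur inside the items themselves.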
- document_root()
-
The document_root() method returns the value of the document root directory. The value of the
document root is set by the server configuration directive DocumentRoot, and usually varies between different virtual hosts. Apache uses the
document root to translate the URI into a physical pathname unless a more
specific translation rule, such as Alias, applies.
Example:
my $doc_root = $r->document_root;
If you are used to using the environment variable DOCUMENT_ROOT within your
CGI scripts in order to resolve URIs into physical pathnames, be aware that
there's a much better way to do this in the Apache API. Perform a
subrequest with the URI you want to resolve, and then call the returned
object's filename() method. This works correctly even when the URI is affected by Alias directives or refers to user-maintained virtual directories:
my $image = $r->lookup_uri('/~fred/images/cookbook.gif')->filename;
If you're interested in fetching the physical file corresponding to the
current request, call the current request object's filename()
method:
my $file = $r->filename;
- get_server_port()
-
This method returns the port number on which the server is listening.
Example:
my $port = $r->get_server_port;
If UseCanonicalName is configured to be On (the default), this method will return the value of the Port configuration directive. If no Port directive is present, the default port 80 is returned. If UseCanonicalName is Off and the client sent a
Host header, then the method returns the actual port specified here, regardless
of the value of the Port directive.
- get_server_name()
-
This read-only method returns the name of the server handling the request.
Example:
my $name = $r->get_server_name;
This method is sensitive to the value of the UseCanonicalName
configuration directive. If UseCanonicalName is On (the default), the method will always return the value of the current
ServerName configuration directive. If UseCanonicalName is
Off, then this method will return the value of the incoming request's Host header if present, or the value of the ServerName directive otherwise. These values can be different if the server has
several different DNS names.
The lower-level server_name() method in the Apache::Server class always acts as if UseCanonicalName were on.
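A common use for get_server_name() and get_server_port() together is to build a URL that points back at the current server. The sketch below makes some assumptions: self_url() is a hypothetical helper, the scheme is taken to be plain http, and literal values stand in for the method calls so that the code runs outside the server:

```perl
use strict;
use warnings;

# Hypothetical helper: build an absolute http:// URL from a server
# name, a port and a URI path, leaving out the default port 80.
sub self_url {
    my ($name, $port, $path) = @_;
    my $host = $port == 80 ? $name : "${name}:${port}";
    return "http://$host$path";
}

# In a handler the arguments would come from $r->get_server_name,
# $r->get_server_port and $r->uri.
print self_url('www.modperl.com', 80,   '/index.html'), "\n";
print self_url('www.modperl.com', 8080, '/index.html'), "\n";
```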
- server_root_relative()
-
Called without any arguments, the server_root_relative() method returns the currently-configured ServerRoot directory (in which Apache's binaries, configuration files and logs
commonly reside). If you pass this method a relative pathname, it will
resolve the relative pathname to an absolute one based on the value of the
server root. This is the preferred way to locate configuration and log
files that are stored beneath the server root.
Examples:
# return ServerRoot
my $ServerRoot = $r->server_root_relative;
# return $ServerRoot/logs/my.log
my $log = $r->server_root_relative("logs/my.log");
The server_root_relative method can also be invoked without a request object by calling it directly
from the Apache class. The example below, which might be found at the
beginning of a Perl startup file, first imports the Apache module, and then
uses server_root_relative() to add a site-specific library directory to the search path. It does this
in a BEGIN {} block to ensure that this code is evaluated first. It then
loads a local module named My::App, which presumably will be found in the
site-specific directory.
#!/usr/bin/perl
# modify the search path
BEGIN {
use Apache ();
use lib Apache->server_root_relative("lib/my_app");
}
use My::App ();
This section covers request object methods that generate entries in the
server error log. They are handy for debugging and error reporting. Prior
to Apache 1.3, the error logging API was a very simple one that didn't
distinguish between different levels of severity. Apache now has a more
versatile logging API similar to the Unix syslog system.* Each entry is associated with a severity level from low
(``debug'') to high (``critical''). By adjusting the value of the LogLevel directive, the webmaster can control which error messages are recorded to
the error log file.
First we cover the interface to the earlier API. Later we'll discuss the Apache::Log class, which implements the 1.3 interface.
- footnote
-
*In fact, the loglevel API now provides direct syslog support.
See the Apache documentation for the ErrorLog directive, which explains how to enable logging via syslog.
- log_error()
-
The log_error() method writes a nicely timestamped error message to the server error log.
It takes one or more string arguments, concatenates them into a line, and
writes out the result. This method logs at the ``error'' log level according
to the newer API.
For example, this code:
$r->log_error("Can't open index.html $!");
results in the following ErrorLog entry:
[Tue Jul 21 16:28:51 1998] [error] Can't open index.html No such file or directory
- log_reason()
-
The log_reason() method behaves like log_error() but generates additional information about the request that can help with
the post-mortem. The format of the entries this method produces is:
[$DATE] [error] access to $URI failed for $HOST, reason: $MESSAGE
where $DATE is the time and date of the request,
$URI is the requested URI, $HOST is the remote
host, and $MESSAGE is a message that you provide. For example,
this code fragment:
$r->log_reason("Can't open index.html $!");
might generate the following entry in the error log:
[Tue Jul 21 16:30:47 1998] [error] access to /perl/index.pl
failed for w15.yahoo.com, reason: Can't open index.html No such file
or directory
The argument to log_reason() is the message you wish to display in the error log. If you provide an
additional second argument, it will be displayed rather than the URI of the
request. This is usually used to display the physical path of the requested
file:
$r->log_reason("Can't open file $!", $r->filename);
This type of log message is most often used by content handlers that need
to open and process the requested file before transmitting it to the
browser, such as server-side include systems.
- warn()
-
warn() is similar to log_error(), but on post-1.3.0 versions of Apache it will result in the logging of a
message only when
LogLevel is set to warn or to a more verbose level (notice, info, or debug).
Example:
$r->warn("Attempting to open index.html");
- as_string()
-
The as_string() method is a handy debugging aid for working out obscure problems with HTTP
headers. It formats the current client request and server response fields
into an HTTP header, and returns it as a multi-line string. The request
headers will come first, followed by a blank line, followed by the
response. Here is an example of using as_string() within a call to warn() and the output it might produce:
$r->warn("HTTP dump:\n", $r->as_string);
[Tue Jul 21 16:51:51 1998] [warn] HTTP dump:
GET /perl/index.pl HTTP/1.0
User-Agent: lwp-request/1.32
Host: localhost:9008
200 OK
Connection: close
Content-Type: text/plain
Apache version 1.3 introduced the notion of a log level. There are eight log levels, ranging in severity from emerg to debug. When modules call the new API logging routines, they provide the severity
level of the message. You can control which messages appear in the server
error log by adjusting the LogLevel directive. Messages greater than or equal to the severity level given by
LogLevel appear in the error log. Messages below the cutoff are discarded.
The Apache::Log API provides eight methods, one named for each of the severity levels. Each acts
like the request object's log_error()
method, except that it logs the provided message using the corresponding
severity level.
In order to use the new logging methods, you must use Apache::Log
in the Perl startup file or within your module. You must then fetch an Apache::Log object by calling the log() method of either an
Apache ($r->log()) or an Apache::Server object ($r->server->log()). Both objects have access to the same methods described below. However,
the object returned from
$r->log() provides some additional functionality. It will include the client IP
address, in dotted decimal form, with the log message. In addition, the
message will be saved in the request's
notes table, under a key named ``error-notes''. It is the equivalent of the C
language API's ap_log_rerror() function (Chapter 10).
The methods described below can be called with one or more string arguments
or a subroutine reference. If a subroutine reference is used, it is expected
to return a string which will be used in the log message. The subroutine
will only be invoked if the LogLevel is set to the given level or higher. This is most useful to provide verbose
debugging information during development, while saving CPU cycles during
production.
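The deferred evaluation described above can be sketched in plain Perl. This is not Apache::Log itself: the package name My::SketchLog, its constructor, and the log_at() method are assumptions invented for illustration. The sketch only shows why passing a subroutine reference saves work, since the closure runs after the level check rather than before it.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a level-aware logger (not the real Apache::Log).
{
    package My::SketchLog;

    # Severity names ordered from most to least severe, as in syslog.
    my @LEVELS = qw(emerg alert crit error warn notice info debug);
    my %RANK;
    @RANK{@LEVELS} = (0 .. $#LEVELS);

    sub new {
        my ($class, $threshold) = @_;
        return bless({ threshold => $RANK{$threshold}, lines => [] }, $class);
    }

    # Record a message unless it is less severe than the threshold.
    sub log_at {
        my ($self, $level, @args) = @_;
        return 0 if $RANK{$level} > $self->{threshold};
        # A code reference is called only now, after the level check.
        my $text = join '', map { ref $_ eq 'CODE' ? $_->() : $_ } @args;
        push @{ $self->{lines} }, "[$level] $text";
        return 1;
    }
}

my $expensive_calls = 0;
my $dump = sub { $expensive_calls++; return "expensive request dump" };

my $quiet = My::SketchLog->new('warn');
$quiet->log_at(debug => $dump);      # discarded; the closure never runs

my $chatty = My::SketchLog->new('debug');
$chatty->log_at(debug => $dump);     # recorded; the closure runs once

print "closure ran $expensive_calls time(s)\n";
```

With a production threshold such as warn, the cost of building the debugging string is never paid.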
- log()
-
The log() method returns an object blessed into the Apache::Log
class. log() is implemented both for the Apache class and for the Apache::Server class.
Example:
use Apache::Log ();
my $log = $r->log; # messages will include client ip address
my $log = $r->server->log; # message will not include client ip address
- emerg()
-
This logs the provided message at the emergency log level, a level ordinarily reserved for problems that render the server
unusable.
$log->emerg("Cannot open lock file!");
- alert()
-
This logs the message using the alert level, which is intended for problems that require immediate attention.
$log->alert("getpwuid: couldn't determine user name from uid");
- crit()
-
This logs the message at the critical level, intended for severe conditions.
$log->crit("Cannot open configuration database!");
- error()
-
This logs the message at the error level, a catchall for non-critical error conditions.
$log->error("Parse of script failed: $@");
- warn()
-
The warn level is intended for warnings that may or may not require someone's
attention.
$log->warn("No database host specified, using default");
- notice()
-
notice() is used for normal but significant conditions.
$log->notice("Cannot connect to master database, trying slave $host");
- info()
-
This method is used for informational messages.
$log->info("CGI.pm version is old, consider upgrading") if
$CGI::VERSION < 2.42;
- debug()
-
This logs messages at the debug level, the lowest of them all. It is used for messages you wish to print
during development and debugging. The debug level will also include the filename and line number of the caller in the
log message.
$log->debug("Reading configuration from file $fname");
$log->debug(sub {
"The request: " . $r->as_string;
});
The Apache API provides several methods that are used for access control,
authentication and authorization. We gave complete examples of using these
methods in Chapter 6.
- allow_options()
-
The allow_options() method gives module writers access to the per-directory Options configuration. It returns a bitmap in which a bit is set to one if the
corresponding option is enabled. The
Apache::Constants module provides symbolic constants for the various options when you import
the tag :options. You will typically perform a bitwise AND (&) on the options bitmap to
check which ones are enabled.
For example, a script engine such as Apache::Registry or
Apache::SSI might want to check if it's OK to execute a script in the current location
using this code:
use Apache::Constants qw(:common :options);
unless($r->allow_options & OPT_EXECCGI) {
$r->log_reason("Options ExecCGI is off in this directory",
$r->filename);
return FORBIDDEN;
}
A full list of option constants can be found in the
Apache::Constants manual page.
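To make the bitmap test concrete, here is a minimal standalone sketch. The SKETCH_OPT_* constants and their numeric values are assumptions chosen for illustration; a real handler should import the OPT_* constants from Apache::Constants and obtain the bitmap from allow_options().

```perl
use strict;
use warnings;

# Illustrative bit values only; real code should import the constants
# from Apache::Constants rather than define its own.
use constant SKETCH_OPT_INDEXES  => 1;
use constant SKETCH_OPT_INCLUDES => 2;
use constant SKETCH_OPT_EXECCGI  => 8;

# Return the sorted names of the options whose bits are set in $bitmap.
sub enabled_options {
    my ($bitmap) = @_;
    my %bit_for = (
        Indexes  => SKETCH_OPT_INDEXES,
        Includes => SKETCH_OPT_INCLUDES,
        ExecCGI  => SKETCH_OPT_EXECCGI,
    );
    my @on = grep { $bitmap & $bit_for{$_} } keys %bit_for;
    @on = sort @on;
    return @on;
}

# Pretend the directory was configured with "Options Indexes ExecCGI".
my $bitmap = SKETCH_OPT_INDEXES | SKETCH_OPT_EXECCGI;

print join(', ', enabled_options($bitmap)), "\n";
print "ExecCGI is ",  (($bitmap & SKETCH_OPT_EXECCGI)  ? "on" : "off"), "\n";
print "Includes is ", (($bitmap & SKETCH_OPT_INCLUDES) ? "on" : "off"), "\n";
```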
- auth_name()
-
This method will return the current value of the per directory
configuration directive AuthName, which is used in conjunction with password-protected directories. AuthName declares an authorization ``realm'', which is intended as a high-level
grouping of an authentication scheme and a URI tree to which it applies.
If the requested file or directory is password protected,
auth_name() will return the realm name. An authentication module can then use this
realm name to figure out which database to authenticate the user against.
This method can also be used to set the value of the realm for use by later
handlers.
Examples:
my $auth_name = $r->auth_name();
$r->auth_name("Protected Area");
- auth_type()
-
Password-protected files and directories will also have an authorization
type, which is usually one of ``Basic'' or ``Digest.'' The authorization
type is set with the configuration directive
AuthType and retrieved with the API method auth_type(). Here's an example from a hypothetical authentication handler that can
only authenticate using the Basic method:
my $auth_type = $r->auth_type;
unless (lc($auth_type) eq "basic") {
$r->warn(__PACKAGE__, " can't handle AuthType $auth_type");
return DECLINED;
}
The differences between Basic and Digest authentication are discussed in
Chapter 6.
- get_basic_auth_pw()
-
The get_basic_auth_pw() method returns a two-element list. If the current request is protected with
Basic authentication, the first element of the returned list will be
OK and the second will be the plaintext password entered by the user. Other
possible return codes include DECLINED, SERVER_ERROR and AUTH_REQUIRED; the meaning of each is described in Chapter 6.
Example:
my($ret, $sent_pw) = $r->get_basic_auth_pw;
You can get the username part of the pair by calling
$r->connection->user as described in The
Apache::Connection Class.
- note_basic_auth_failure()
-
If a URI is protected by Basic authentication and the browser fails to
provide a valid username/password combination (or none at all),
authentication handlers are expected to call the
note_basic_auth_failure() method. This sets up the outgoing HTTP headers in such a way that the user
will be (re)challenged to provide his username and password for the current
security realm. For example:
my($ret, $sent_pw) = $r->get_basic_auth_pw;
unless($r->connection->user and $sent_pw) {
$r->note_basic_auth_failure;
$r->log_reason("Both a username and password must be provided");
return AUTH_REQUIRED;
}
Although it would make sense for note_basic_auth_failure() to return a status code of AUTH_REQUIRED, it actually returns no value.
- requires()
-
This method returns information about each of the require
directives currently in force for the requested URI. Since there may be
many require directives, this method returns an array reference. Each item in the array
is a hash that contains information about a different require directive. The format of this data structure is described in detail in
Chapter 6, under A Gender-Based Authorization
Module.
- satisfies()
-
Documents can be under access control (e.g., access limited by hostname or
password) and authentication/authorization control (password protection)
simultaneously. The satisfy directive determines how Apache combines the two types of restriction. If
Satisfy All is specified, Apache will not grant access to the requested document unless
both the access control and authentication/authorization rules are
satisfied. If Satisfy Any
is specified, the remote user is allowed to retrieve the document if he
meets the requirements of either one of the restrictions.
Authorization and access control modules gain access to this configuration
variable through the satisfies() method. It will return one of the three constants SATISFY_ALL, SATISFY_ANY or
SATISFY_NOSPEC. The last of these is returned when there is no applicable
satisfy directive at all. These constants can be imported by requesting the
``:satisfy'' tag from Apache::Constants.
The following code fragment illustrates an access control handler that
checks the status of the satisfy directive. If the current document is forbidden by access control rules the
code checks whether
satisfy any is in effect, and if so, whether authentication is also required (using the some_auth_required() method call described next). Unless both these conditions are true, the
handler logs an error message. Otherwise it just returns the result code,
knowing that any error logging will be performed by the authentication
handler.
use Apache::Constants qw(:common :satisfy);
if ($ret == FORBIDDEN) {
$r->log_reason("Client access denied by server configuration")
unless $r->satisfies == SATISFY_ANY && $r->some_auth_required;
return $ret;
}
- some_auth_required()
-
If the configuration for the current request requires some form of
authentication or authorization, this method returns true. Otherwise it
returns an undef value.
Example:
unless ($r->some_auth_required) {
$r->log_reason("I won't go further unless the user is authenticated");
return FORBIDDEN;
}
There are a handful of Perl API methods for which there is no C language
counterpart. Those who are only interested in learning the C API can skip
this section.
- exit()
-
It is common to come across Perl CGI scripts that use the Perl builtin
exit() function to leave the script prematurely. Calling exit()
from within a CGI script, which owns its process, is harmless, but calling exit() from within mod_perl would have the unfortunate effect of making the entire child process exit
unceremoniously, in most cases before completing the request or logging the
transaction. On Win32 systems, calling exit() will make the whole server quit. Oops!
For this reason mod_perl's version of this function call,
Apache::exit(), does not cause the process to exit. Instead, it calls Perl's croak() function to halt script execution, but does not log a message to the ErrorLog. If you really want the child server process to exit, call Apache::exit() with an optional status argument of DONE (available in Apache::Constants). The child process will be shut down, but only after it has had a chance
to properly finish handling the current request.
In scripts running under Apache::Registry, Perl's built-in
exit() is overridden by Apache::exit() so that legacy CGI scripts don't inadvertently shoot themselves in the
foot. In Perl versions 5.005 and higher, exit() is overridden everywhere, including within handlers. In versions of mod_perl built with Perl 5.004, handlers can still inadvertently invoke the built-in exit(), so you should be on the watch for this mistake. One way to avoid it is to
explicitly import the ``exit'' symbol when you load the Apache module.
Here are various examples of exit():
$r->exit;
Apache->exit;
$r->exit(0);
$r->exit(DONE);
use Apache 'exit'; # this overrides Perl's builtin
exit;
If a handler needs direct access to the Perl builtin version of
exit() after it has imported Apache's version, it should call
CORE::exit().
- gensym()
-
This function creates an anonymous glob and returns a reference to it for
use as a safe file or directory handle. Ordinary bareword filehandles are
prone to namespace clashes. The IO::File class avoids this, but some users have found that IO::File carries too much overhead. Apache::gensym avoids this overhead but still avoids namespace clashes.
my $fh = Apache->gensym;
open $fh, $r->filename or die $!;
$r->send_fd($fh);
close $fh;
Because of its cleanliness most of the examples in this book use the
Apache::File interface for reading and writing files (See The
Apache::File Class). If you wish to squeeze out a bit of overhead, you may wish to use Apache::gensym() with Perl's builtin open()
function instead.
- current_callback()
-
If a module wishes to know what handler is currently being run, it can find
out with the current_callback method. This method is most useful to PerlDispatchHandlers who wish to only take action for certain phases.
if($r->current_callback eq "PerlLogHandler") {
$r->warn("Logging request");
}
- get_handlers()
-
The get_handlers method will return an array reference containing the list of all handlers
that are configured to handle the current request. This method takes a
single argument specifying which handlers to return.
my $handlers = $r->get_handlers('PerlAuthenHandler');
- set_handlers()
-
If you would like to change the list of handlers configured for the current
request, you can change it with set_handlers(). This method takes two arguments, the name of the handler you wish to
change, and an array reference pointing to one or more references to the
handler subroutines you want to run for that phase. If any handlers were
previously defined, such as with a Perl*Handler
directive, they are replaced by this call. You can provide a second
argument of undef if you wish to remove all handlers for that phase.
Examples:
$r->set_handlers(PerlAuthenHandler => [\&auth_one, \&auth_two]);
$r->set_handlers(PerlAuthenHandler => undef);
- push_handlers()
-
The push_handlers() method is used to add a new Perl handler routine to the current request's
handler ``stack''. Instead of replacing the list of handlers, it just
appends a new handler to the list. Each handler is run in turn until one
returns an error code. You'll find more information about using stacked
handlers and examples in Chapters 4, 6 and 7.
This method takes two arguments, the name of the phase you want to
manipulate, and a reference to the subroutine you want to handle that
phase.
Example:
$r->push_handlers(PerlLogHandler => \&my_logger);
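The rule that stacked handlers run in turn until one returns an error can be sketched as a plain loop. This is an illustration rather than mod_perl's actual dispatch code: the run_stack() function and the SKETCH_* status values are assumptions, and the sketch assumes that both OK and DECLINED count as non-error results.

```perl
use strict;
use warnings;

# Illustrative status values; real handlers import theirs from Apache::Constants.
use constant SKETCH_OK       => 0;
use constant SKETCH_DECLINED => -1;

# Run each handler in order, stopping at the first status that is
# neither OK nor DECLINED. Returns that status, or OK if none fails.
sub run_stack {
    my ($request, @handlers) = @_;
    for my $handler (@handlers) {
        my $status = $handler->($request);
        return $status
            unless $status == SKETCH_OK || $status == SKETCH_DECLINED;
    }
    return SKETCH_OK;
}

my @trace;
my @stack = (
    sub { push @trace, 'first';  return SKETCH_OK },
    sub { push @trace, 'second'; return 500 },        # simulated server error
    sub { push @trace, 'third';  return SKETCH_OK },  # never reached
);

my $status = run_stack({}, @stack);
print "status $status after: @trace\n";
```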
- module()
-
If you need to find out if a Perl module has already been loaded, the
module() method will tell you. Pass it the package name of the module you're
interested in. It will return a true value if the module is loaded.
Example:
if (Apache->module('My::Module')) {
    # do something
}
This method can also be used to test if a C module is loaded. In this case,
pass it the filename of the module, just as you would use with the
IfModule directive. It will return a true value if the module is loaded.
Example:
if (Apache->module('mod_proxy.c')) {
    # do something
}
- define()
-
Apache version 1.3.1 added a -D command line switch that can be used to pass the server parameter names for
conditional configuration with the IfDefine directive. These names exist for the lifetime of the server and can be
accessed at any time by Perl modules using the
define method. Example:
if(Apache->define("SSL")) {
#the server was started with -DSSL
}
- post_connection()
-
This is simply an alias for the register_cleanup() method described in the Server Core Functions section.
- request()
-
The
Apache->request() class method returns a reference to the current request object, if any.
Handlers that use the vanilla Perl API will not need to call this method
because the request object is passed to them in their argument list.
However, some modules may not have a subroutine entry point and therefore
need a way to gain access to the request object. For example, CGI.pm uses this
method to provide proper mod_perl support.
Called with no arguments, request() returns the stored Apache
request object. It may also be called with a single argument to set the
stored request object. This is what Apache::Registry does before invoking a script.
Example:
my $r = Apache->request; # get the request
Apache->request($r); # set the request
Actually, it's a little known fact that Apache::Registry scripts can access the request object directly via @_. This is slightly
faster than using Apache->request, but has the disadvantage of being obscure. This technique is demonstrated
in
Subclassing the Apache Class.
- httpd_conf()
-
The httpd_conf() method allows you to pass new directives to Apache at startup time. Pass it
a multi-line string containing the configuration directive(s)
that you wish Apache to process. Using string interpolation, you can use
this method to dynamically configure Apache according to arbitrarily
complex rules.
httpd_conf() can only be called during server startup, usually from within a Perl
startup file. Because there is no request object at this time, you must
invoke httpd_conf() directly through the
Apache class.
Example:
my $ServerRoot = '/local/web';
Apache->httpd_conf(<<EOF);
Alias /perl $ServerRoot/perl
Alias /cgi-bin $ServerRoot/cgi-bin
EOF
Should a syntax error occur, Apache will log an error and the server will
exit, just as it would if the error were present in the
httpd.conf configuration file. A more sophisticated way of configuring Apache at
startup time via
<Perl> sections is discussed in Chapter 9.
The vast bulk of the functionality of the Perl API is contained in the
Apache object. However, a number of auxiliary classes, including
Apache::Table, Apache::Connection, and Apache::Server
provide additional methods for accessing and manipulating the state of the
server. This section discusses these classes.
In the CGI environment, the standard input and standard output file
descriptors are redirected so that data read and written is passed through
Apache for processing. In the Apache module API, handlers ordinarily use
the Apache read() and print() methods to communicate with the client. However, as a convenience, mod_perl
ties the STDIN and STDOUT filehandles to the Apache class prior to invoking Perl API modules. This allows handlers to read from
standard input and write to standard output exactly as if they were in the
CGI environment.
The Apache class supports the full TIEHANDLE interface, as described in
perltie(1). STDIN and STDOUT are already tied to
Apache by the time your handler is called. If you wish to tie your own input or
output filehandle, you may do so by calling tie() with the request object as the function's third parameter:
tie *BROWSER, 'Apache', $r;
print BROWSER 'Come out, come out, wherever you are!';
Of course, it is better not to hard-code the Apache class name, as
$r might be blessed into a subclass:
tie *BROWSER, ref $r, $r;
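For readers unfamiliar with tied filehandles, the following standalone sketch shows the mechanism that makes such a tie work. The My::SketchRequest class is a hypothetical stand-in for a request object, written only for this illustration; it is not how the Apache class is implemented, but it follows the TIEHANDLE and PRINT conventions documented in perltie(1).

```perl
use strict;
use warnings;

# Hypothetical stand-in for a request object that collects its output.
{
    package My::SketchRequest;

    sub new {
        my $class = shift;
        return bless({ sent => '' }, $class);
    }

    # Ordinary method, analogous to calling print() on a request object.
    sub print {
        my $self = shift;
        $self->{sent} .= join '', @_;
        return 1;
    }

    # tie() calls TIEHANDLE with the class name plus any extra arguments.
    # Returning the existing object makes the handle a front end for it.
    sub TIEHANDLE {
        my ($class, $object) = @_;
        return $object;
    }

    # A print on the tied handle arrives here.
    sub PRINT {
        my $self = shift;
        return $self->print(@_);
    }
}

my $r = My::SketchRequest->new;

# Use ref($r) rather than a literal class name so subclasses still work.
tie *BROWSER, ref $r, $r;
print BROWSER 'Come out, ', 'come out!';

print "captured: $r->{sent}\n";
```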
The Apache methods lookup_uri() and lookup_file() return a request record object blessed into the Apache::SubRequest
class. The Apache::SubRequest class is a subclass of Apache, and inherits most of its methods from there. Here are two examples of
fetching subrequest objects:
my $subr = $r->lookup_file($filename);
my $subr = $r->lookup_uri($uri);
The Apache::SubRequest class adds a single new method, run().
- run()
-
When a subrequest is created, the URI translation, access checks, and MIME
checking phases are run, but unlike a real request, the content handler for
the response phase is not actually run. If you would like to invoke the
content handler, the run() method will do it:
my $status = $subr->run;
When you invoke the subrequest's response handler in this way, it will do
everything a response handler is supposed to, including sending the HTTP
headers and the document body. run() returns the content handler's status code as its function result. If you
are invoking the subrequest run() method from within your own content handler, you must not send the HTTP
header and document body yourself, as this would be appended to the bottom
of the information that has already been sent. Most handlers that invoke run() will immediately return its status code, pretending to Apache that they
handled the request themselves:
my $status = $subr->run;
return $status;
The Apache::Server class provides the Perl interface to the C API
server_rec data structure, which contains lots of low-level information about the
server configuration. Within a handler, the current Apache::Server object can be obtained by calling the Apache request object's server() method. At Perl startup time (such as within a startup script or a module
loaded with PerlModule) you can fetch the server object by invoking Apache->server directly. By convention, we use the variable $s for server objects.
Examples:
#at request time
sub handler {
my $r = shift;
my $s = $r->server;
....
}
#at server startup time, e.g. PerlModule or PerlRequire
my $s = Apache->server;
This section discusses the various methods that are available to you via
the server object. They correspond closely to the fields of the
server_rec structure, which we revisit in Chapter 10.
- is_virtual()
-
This method returns true if the current request is being applied to a
virtual server. This is a read-only method.
Example:
my $is_virtual = $s->is_virtual;
- log()
-
The log() method retrieves an object blessed into the
Apache::Log class. You can then use this object to access the full-featured logging
API. See The Apache::Log class for the details.
Example:
use Apache::Log ();
my $log = $s->log;
The Apache::Server::log() method is identical in most respects to the Apache::log() method discussed earlier. The difference is that messages logged with Apache::log() will include the IP address of the browser and add the messages to the notes table under a key named ``error-notes''. See the description of notes() under Server
Core Functions.
- port()
-
Returns the port on which this (virtual) server is listening. If no port is
explicitly listed in the server configuration file (that is, the server is
listening on the default port 80) this method will return 0. Use the
higher-level Apache::get_server_port() method if you wish to avoid this pitfall.
Example:
my $port = $r->server->port || 80;
This method is read-only.
- server_admin()
-
This method returns the e-mail address of the person responsible for this
server as configured by the ServerAdmin directive.
Example:
my $admin = $s->server_admin;
This method is read-only.
- server_hostname()
-
Returns the (virtual) hostname used by this server, as set by the
ServerName directive.
Example:
my $hostname = $s->server_hostname;
This method is read-only.
- names()
-
If this server is configured to use virtual hosts, the names()
method will return the names by which the current virtual host is
recognized as specified by the ServerAlias directives (including wild-carded names). The function result is an array
reference containing the host names. If no alias names are present or the
server is not using virtual hosts, this will return a reference to an empty
list.
Example:
my $s = $r->server;
my $names = $s->names;
- next()
-
Apache maintains a linked list of all configured virtual servers, which can
be accessed with the next method.
Example:
for(my $s = Apache->server; $s; $s = $s->next) {
printf "Contact %s regarding problems with the %s site\n",
$s->server_admin, $s->server_hostname;
}
- log_error()
-
This method is the same as the Apache::log_error() method, except that it's available through the Apache::Server object. This allows you to use it in Perl startup files and other places
where the request object isn't available. Example:
my $s = Apache->server;
$s->log_error("Can't open config file $!");
- warn()
-
This method is the same as the Apache::warn() method, but it's available through the Apache::Server object. This allows you to use it in Perl startup files and other places
where the request object isn't available. Example:
my $s = Apache->server;
$s->warn("Can't preload script $file $!");
The Apache::Connection class provides a Perl interface to the C language conn_rec data structure, which provides various low-level details about the network
connection back to the client. Within a handler, the connection object can
be obtained by calling the Apache request object's connection() method. The connection object is not available outside of handlers for the
various request phases because there is no connection established in those
cases. By convention, we use the variable $c for connection objects.
Example:
sub handler {
my $r = shift;
my $c = $r->connection;
...
}
In this section we discuss the various methods that are available through
the connection. They correspond closely to the fields of the C API conn_rec structure discussed in Chapter 10.
- aborted()
-
This method returns true if the client has broken the connection
prematurely. This can happen if the remote user's computer has crashed, a
network error has occurred, or, more trivially, if the user pressed the
``stop'' button before the request or response was fully transmitted.
However, this value is only set if the timeout was set with soft_timeout().
Example:
if($c->aborted) {
warn "uh,oh, the client has gone away!";
}
- auth_type()
-
If authentication was used to access a password protected document, this
method returns the type of authentication that was used, currently either
``Basic'' or ``Digest.'' This method is different from the request object's auth_type() method, which we discussed earlier, because the latter returns the value of
the AuthType
configuration directive, in other words the type of authentication the
server would like to use. The connection object's auth_type()
method returns a value only when authentication was successfully completed,
undef otherwise.
Example:
if($c->auth_type ne 'Basic') {
warn "phew, I feel a bit better";
}
This method is read-only.
- local_addr()
-
This method returns a packed SOCKADDR_IN structure in the same format as
returned by the Perl Socket module's pack_sockaddr_in()
function. This packed structure contains the port and IP address at the
server's side of the connection. This is set by the server when the
connection record is created so it is always defined.
Example:
use Socket ();
sub handler {
my $r = shift;
my $local_add = $r->connection->local_addr;
my($port, $ip) = Socket::unpack_sockaddr_in($local_add);
...
}
For obvious reasons, this method is read-only.
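Note that the address returned by unpack_sockaddr_in() is itself in packed binary form. The following standalone sketch builds a packed structure by hand (the port and address are made-up values standing in for what local_addr() would return) and shows how the Socket module's inet_ntoa() turns the address into dotted decimal text.

```perl
use strict;
use warnings;
use Socket qw(pack_sockaddr_in unpack_sockaddr_in inet_aton inet_ntoa);

# Build a packed address by hand; in a handler this value would come
# from the connection object instead. Port and address are made up.
my $packed = pack_sockaddr_in(8080, inet_aton('127.0.0.1'));

# unpack_sockaddr_in returns the port and the address, the latter still packed.
my ($port, $packed_ip) = unpack_sockaddr_in($packed);

# inet_ntoa converts the packed address to dotted decimal text.
my $ip = inet_ntoa($packed_ip);

print "$ip:$port\n";
```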
- remote_addr()
-
This method returns a packed SOCKADDR_IN structure for the port and IP
address at the client's side of the connection. This is set by the server
when the connection record is created so it is always defined.
Among other things, the information returned by this method and
local_addr() can be used to perform RFC1413 ident lookups on the remote client even when
the configuration directive IdentityCheck
is turned off. Using Jan-Pieter Cornet's Net::Ident module for example:
use Net::Ident qw(lookupFromInAddr);
...
my $remoteuser = lookupFromInAddr ($c->local_addr,
$c->remote_addr, 2);
- remote_host()
-
This method returns the hostname of the remote client. It only returns the
name if the HostNameLookups directive is set to On and the DNS lookup was successful -- that is, the DNS contains a reverse
name entry for the remote host. If hostname based access control is in use
for the given request, a double-reverse lookup will occur regardless of the HostNameLookups setting, in which case, the cached hostname will be returned. If
unsuccessful, the method returns undef.
It is almost always better to use the high-level get_remote_host()
method available from the Apache request object (see above). The high level
method returns the dotted IP address of the remote host if its DNS name
isn't available, and it caches the results of previous lookups, avoiding
overhead if you call the method multiple times.
Example:
my $remote_host = $c->remote_host || "nohost";
my $remote_host = $r->get_remote_host(REMOTE_HOST); # better
This method is read-only.
- remote_ip()
-
This method returns the dotted decimal representation of the remote
client's IP address. It is set by the server when the connection record is
created and is always defined.
Example:
my $remote_ip = $c->remote_ip;
The remote_ip() value can also be changed, which is helpful if your server is behind a proxy such
as the Squid accelerator. By using the
X-Forwarded-For header sent by the proxy, the remote_ip can be set to this value so logging modules include the address of the real
client. The only subtle point is that X-Forwarded-For may be multi-valued in the case of a single request that has been forwarded
across multiple proxies. It's safest to choose the last IP address in the
list, since this is the entry added by the proxy nearest your server and so is the hardest for a remote client to forge.
Example:
my $header = $r->headers_in->{'X-Forwarded-For'};
if( my $ip = (split /,\s*/, $header)[-1] ) {
$r->connection->remote_ip($ip);
}
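Because the header may be absent or malformed, it can help to isolate the parsing step and test it away from the server. The following is a minimal sketch in plain Perl. The helper name last_forwarded_ip() is hypothetical and not part of the mod_perl API, and the dotted-quad check is an assumption about what counts as a usable value.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of mod_perl): return the last address
# listed in an X-Forwarded-For header value, or undef if there is none.
sub last_forwarded_ip {
    my $header = shift;
    return undef unless defined $header && length $header;

    # Proxies append addresses, separated by commas and optional spaces.
    my @ips = grep { length } split /\s*,\s*/, $header;
    return undef unless @ips;

    my $ip = $ips[-1];
    $ip =~ s/^\s+|\s+$//g;    # trim stray whitespace

    # Assumption: only accept something shaped like a dotted-quad address.
    return $ip =~ /^\d{1,3}(?:\.\d{1,3}){3}$/ ? $ip : undef;
}

for my $value ('10.0.0.1', '192.168.1.5, 10.0.0.1', 'unknown', undef) {
    my $ip = last_forwarded_ip($value);
    print defined $ip ? "$ip\n" : "(none)\n";
}
```

In a handler, the result would be passed to $r->connection->remote_ip() only when it is defined, leaving the server-supplied address in place otherwise.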
- remote_logname()
-
This method returns the login name of the remote user, provided that the
configuration directive IdentityCheck is set to On and the remote user's machine is running an identd daemon. If one or both
of these conditions is false, the method returns undef.
It is better to use the high level get_remote_logname() method which is provided by the request object. When the high level method
is called the result is cached and reused if called again. This is not true
of remote_logname().
Example:
my $remote_logname = $c->remote_logname || "nobody";
my $remote_logname = $r->get_remote_logname; # better
- user()
-
When Basic authentication is in effect, user() returns the name that the remote user provided when prompted for his
username and password. The password itself can be recovered from the
request object by calling get_basic_auth_pw().
Example:
my $username = $c->user;
The HTTP message protocol is simple in large part due to its consistent use
of the key/value paradigm in its request and response header fields.
Because much of an external module's work is getting and setting these
header fields, Apache provides a simple yet powerful interface called the table structure. Apache tables are keyed case-insensitive lookup tables. API function calls allow you to
obtain the list of defined keys, iterate through them, get the value of a
key, and set key values. Since many HTTP header fields are potentially
multi-valued, Apache also provides functionality for getting, setting and
merging the contents of multi-valued fields.
The five C data structures listed below are implemented as tables. This
list is likely to grow in the future.
- headers_in
-
- headers_out
-
- err_headers_out
-
- notes
-
- subprocess_env
-
As discussed in The Apache Request Record, the Perl API provides five method calls named
headers_in(), headers_out(), err_headers_out(), notes() and
subprocess_env() that retrieve these tables. The Perl manifestation of the Apache table API
is the Apache::Table
class. It provides a TIEHASH interface that allows transparent access to
its methods via a tied hash reference, as well as API methods that can be
called directly.
The TIEHASH interface is easy to use. Simply call one of the methods listed
above in a scalar context to return a tied hash reference. For example:
my $table = $r->headers_in;
The returned object can now be used to get and set values in the
headers_in table by treating it as an ordinary hash reference, but the keys are looked
up case insensitively. Examples:
my $type = $table->{'Content-type'};
my $type = $table->{'CONTENT-TYPE'}; # same thing
$table->{'Expires'} = 'Sat, 08 Aug 1998 01:39:20 GMT';
If the field you are trying to access is multi-valued, then the tied hash
interface suffers the limitation that fetching the key will only return the first defined value of the field. You can get around this by using the
object-oriented interface to access the table (we show an example of this
below), or use the each operator to access each key and value sequentially. The following code
snippet shows one way to fetch all the Set-Cookie fields in the outgoing HTTP header:
while (my($key, $value) = each %{$r->headers_out}) {
push @cookies, $value if lc($key) eq 'set-cookie';
}
When you treat an Apache::Table object as a hash reference, you are accessing its internal get() and set() methods (among others) indirectly. To gain access to the full power of the
table API, you can invoke these methods directly by using the method call
syntax.
Here is the list of publicly available methods in Apache::Table, along with brief examples of usage.
- add()
-
The add() method will add a key/value pair to the table. Because Apache tables can
contain multiple instances of a key, you may call
add() multiple times with different values for the same key. Instead of the new
value of the key replacing the previous one, it will simply be appended to
the list. This is useful for multi-valued HTTP header fields such as Set-Cookie. The outgoing HTTP header will contain multiple instances of the field.
my $out = $r->headers_out;
for my $cookie (@cookies) {
$out->add("Set-Cookie" => $cookie);
}
Another way to add multiple values is to pass an array reference as the
second argument. This code has the same effect as the previous example:
my $out = $r->headers_out;
$out->add("Set-Cookie" => \@cookies);
- clear()
-
This method wipes the current table clean, discarding its current contents.
It's unlikely that you would want to perform this on a public table, but
here's an example that clears the notes table:
$r->notes->clear;
- do()
-
This method provides a way to iterate through an entire table item by item.
Pass it a reference to a code subroutine to be called once for each table
entry. The subroutine should accept two arguments corresponding to the key
and value respectively, and should return a true value. The routine can
return a false value to terminate the iteration prematurely.
This example dumps the contents of the headers_in field to the browser:
$r->headers_in->do(sub {
my($key, $value) = @_;
$r->print("$key => $value\n");
1;
});
For another example of do(), see listing 7.12 from the previous chapter, where we use it to transfer
the incoming headers from the incoming Apache request to an outgoing LWP HTTP::Request
object.
- get()
-
Probably the most frequently-called method, the get() function returns the table value at the given key. For multi-valued keys,
get() implements a little syntactic sugar. Called in a scalar context, it returns
the first value in the list. Called in a list context, it returns all
values of the multi-valued key.
my $ua = $r->headers_in->get('User-Agent');
my @cookies = $r->headers_in->get('Cookie');
get() is the underlying method that is called when you use the tied hash
interface to retrieve a key. However the ability to fetch a multi-valued
key as an array is only available when you call get()
directly using the object-oriented interface.
- merge()
-
merge() behaves like add() but each time it is called the new value is merged into the previous one,
creating a single HTTP header field containing multiple comma-delimited
values.
In the HTTP protocol a comma separated list of header values is equivalent
to the same values specified by repeated header lines. Some clients are
buggy enough that it is worthwhile for the server to control the merging
explicitly and avoid merging headers that cause trouble (like Set-Cookie).
merge() works like add(). You can either merge a series of entries one at a time:
my @languages = qw(en fr de);
foreach (@languages) {
$r->headers_out->merge("Content-Language" => $_);
}
or merge a bunch of entries in a single step by passing an array reference:
$r->headers_out->merge("Content-Language" => \@languages);
- new()
-
The new() method is available to create an Apache::Table object from scratch. It requires an Apache object to allocate the table and, optionally, the number of entries to
initially allocate. Note that just like the other Apache::Table objects returned by API methods, references cannot be used as values, only
strings. Examples:
my $tab = Apache::Table->new($r); #default, allocates 10 entries
my $tab = Apache::Table->new($r, 20); #allocate 20 entries
- set()
-
set() takes a key/value pair and updates the table with it, creating the key if
it didn't exist before, or replacing its previous value(s) if it did. The resulting header field will be single-valued. Internally this
method is called when you assign a value to a key using the tied hash
interface.
Here's an example of using set() to implement an HTTP redirect:
$r->headers_out->set(Location => 'http://www.modperl.com/');
- unset()
-
This method can be used to remove a key and its contents. If there are
multiple entries with the same key, they will all be removed.
Example:
$r->headers_in->unset('Referer');
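The differences between add(), set(), merge() and get() are easiest to see side by side. The following is a minimal sketch that can be run outside the server. ToyTable is a hypothetical stand-in written for illustration, not the real Apache::Table, and it only imitates the behavior described above: case-insensitive keys, repeated entries for add(), a single comma-delimited entry for merge(), and replacement for set().

```perl
use strict;
use warnings;

# ToyTable is a hypothetical stand-in, not the real Apache::Table.
# Entries are kept as an ordered list of [key, value] pairs so that a
# key may appear more than once; keys compare case-insensitively.
package ToyTable;

sub new { return bless { entries => [] }, shift }

# add(): append another entry, keeping any existing ones.
sub add {
    my ($self, $key, $value) = @_;
    push @{ $self->{entries} }, [$key, $value];
}

# unset(): remove every entry stored under the key.
sub unset {
    my ($self, $key) = @_;
    $self->{entries} =
        [ grep { lc($_->[0]) ne lc($key) } @{ $self->{entries} } ];
}

# set(): replace all previous values with a single one.
sub set {
    my ($self, $key, $value) = @_;
    $self->unset($key);
    $self->add($key, $value);
}

# get(): first value in scalar context, all values in list context.
sub get {
    my ($self, $key) = @_;
    my @values = map  { $_->[1] }
                 grep { lc($_->[0]) eq lc($key) } @{ $self->{entries} };
    return wantarray ? @values : $values[0];
}

# merge(): fold the new value into the first matching entry as a
# comma-delimited list, or add the entry if the key is new.
sub merge {
    my ($self, $key, $value) = @_;
    for my $entry (@{ $self->{entries} }) {
        next unless lc($entry->[0]) eq lc($key);
        $entry->[1] .= ", $value";
        return;
    }
    $self->add($key, $value);
}

package main;

my $t = ToyTable->new;
$t->add('Set-Cookie' => 'a=1');
$t->add('set-cookie' => 'b=2');
my @cookies = $t->get('SET-COOKIE');            # two separate entries

$t->merge('Content-Language' => $_) for qw(en fr de);
my $languages = $t->get('content-language');    # one merged entry

$t->set('Set-Cookie' => 'c=3');                 # replaces both cookies

print scalar(@cookies), " cookies; languages: $languages\n";
print "after set: ", join('|', $t->get('Set-Cookie')), "\n";
```

The list-of-pairs layout is a design choice made for the sketch: it preserves insertion order and duplicate keys, which a plain Perl hash cannot do.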
Apache version 1.3 introduced a utility module for parsing URIs,
manipulating their contents and ``unparsing'' them back into string form.
Since this functionality is part of the server C API, Apache::URI offers a
lightweight alternative to the URI::URL
module that ships with the libwww-perl package.*
An Apache::URI object is returned when you call the request object's parsed_uri() method. You may also call the Apache::URI
parse() constructor to parse an arbitrary string and return a new
Apache::URI object.
Example:
use Apache::URI ();
my $parsed_uri = $r->parsed_uri;
- footnote
-
*At the time of this writing, URI::URL was scheduled to be
replaced by URI.pm, which will be distributed separately from the
libwww-perl package.
- fragment()
-
This method returns or sets the fragment component of the URI. You know
this as the part that follows the hash mark (``#'') in links. The fragment
component is generally used only by clients and some Web proxies.
Examples:
my $fragment = $uri->fragment;
$uri->fragment('section_1');
- hostinfo()
-
This method gets or sets the remote host information, which usually
consists of a hostname and port number in the format hostname:port. Some rare URIs, such as those used for non-anonymous FTP, attach a
username and password to this information, for use in accessing private
resources. In this case, the information returned is in the format username:password@hostname:port.
This method returns the host information when called without arguments, or
sets the information when called with a single string argument.
Examples:
my $hostinfo = $uri->hostinfo;
$uri->hostinfo('www.modperl.com:8000');
- hostname()
-
This method returns or sets the hostname component of the URI object.
my $hostname = $uri->hostname;
$uri->hostname('www.modperl.com');
- parse()
-
The parse() method is a constructor used to create a new
Apache::URI object from a URI string. Its first argument is an Apache request object,
and the second is a string containing an absolute or relative URI. In the
case of a relative URI, the
parse() method uses the request object to determine the location of the current
request and resolve the relative URI. Example:
my $uri = Apache::URI->parse($r, 'http://www.modperl.com/');
If the URI argument is omitted, the parse() method will construct a fully qualified URI from the $r object, including the scheme, hostname, port, path and query string.
Example:
my $self_uri = Apache::URI->parse($r);
- password()
-
This method gets or sets the password part of the hostinfo component:
my $password = $uri->password;
$uri->password('rubble');
- path()
-
This method returns or sets the path component of the URI object.
my $path = $uri->path;
$uri->path('/perl/hangman.pl');
- path_info()
-
After the ``real path'' part of the URI comes the ``additional path
information''. This component of the URI is not defined by the official URI
RFC, because it is an internal concept of Web servers that need to do
something with the part of the path information that is left over after
translating the rest into a valid filename.
path_info() gets or sets the additional path information portion of the URI, using the
current request object to determine what part of the path is real and what
part is additional.
Example:
$uri->path_info('/foo/bar');
Warning: the unparse() method does not take the additional path information into account. It
returns the URI minus the additional information.
- port()
-
This method returns or sets the port component of the URI object.
my $port = $uri->port;
$uri->port(80);
- query()
-
This method gets or sets the query string component of the URI, in other
words, the part after the ``?''.
Examples:
my $query = $uri->query;
$uri->query('one+two+three');
- rpath()
-
This method returns the ``real path,'' that is the path() minus the
path_info().
Example:
my $path = $uri->rpath();
- scheme()
-
This method returns and/or sets the scheme component of the URI. This is
the part that identifies the URI's protocol, such as http or
ftp. Called without arguments, the current scheme is retrieved. Called with a
single string argument, the current scheme is set.
Examples:
my $scheme = $uri->scheme;
$uri->scheme('http');
- unparse()
-
This method returns the string representation of the URI. Relative URIs are
resolved into absolute ones.
my $string = $uri->unparse;
- user()
-
This method gets or sets the username part of the hostinfo component:
my $user = $uri->user;
$uri->user('barney');
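To see how the components named by these accessors relate to one another, it can help to take a URI string apart by hand. The following is a minimal sketch in plain Perl. The helper split_uri() is hypothetical, the pattern is the generic one given in RFC 2396, Appendix B, and nothing here claims to match how Apache's own parser works internally.

```perl
use strict;
use warnings;

# Hypothetical helper: split a URI string into the components that the
# accessor methods above expose. The pattern is the generic one from
# RFC 2396, Appendix B; it is not Apache's internal parser.
sub split_uri {
    my $uri = shift;
    my ($scheme, $hostinfo, $path, $query, $fragment) =
        $uri =~ m{^(?:([^:/?\#]+):)?(?://([^/?\#]*))?([^?\#]*)(?:\?([^\#]*))?(?:\#(.*))?};

    my %part = (
        scheme => $scheme, hostinfo => $hostinfo, path => $path,
        query  => $query,  fragment => $fragment,
    );

    # hostinfo may look like username:password@hostname:port
    if (defined $hostinfo) {
        my ($userinfo, $hostport) = $hostinfo =~ /^(?:(.*)\@)?(.*)$/;
        my ($user, $password) =
            defined $userinfo ? split(/:/, $userinfo, 2) : ();
        my ($hostname, $port) = $hostport =~ /^(.*?)(?::(\d+))?$/;
        @part{qw(user password hostname port)} =
            ($user, $password, $hostname, $port);
    }
    return \%part;
}

my $part = split_uri(
    'ftp://barney:rubble@www.modperl.com:8000/perl/hangman.pl?one+two+three#section_1'
);
for my $name (qw(scheme hostinfo user password hostname port path query fragment)) {
    printf "%-9s %s\n", $name,
        defined $part->{$name} ? $part->{$name} : '(undef)';
}
```

Note that the sketch has no notion of path_info() or rpath(): separating the real path from the additional path information requires knowledge of the filesystem that only the server has.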
The Apache API provides several utility functions that are used by various
standard modules. The Perl API makes these available as function calls in
the Apache::Util package.
Although there is nothing here that doesn't already exist in some existing
Perl module, these C versions are considerably faster than their
corresponding Perl functions and avoid the memory bloat of pulling in yet
another Perl package.
To make these functions available to your handlers, import the
Apache::Util module with an import tag of ``:all'':
use Apache::Util qw(:all);
- escape_uri()
-
This function encodes all unsafe characters in a URI into
%XX hex escape sequences. This is equivalent to the URI::Escape::uri_escape() function from the LWP package. Example:
use Apache::Util qw(escape_uri);
my $escaped = escape_uri($url);
- escape_html()
-
This function replaces unsafe HTML character sequences (``<'', ``>'' and ``&'') with their entity representations. This is
equivalent to the
HTML::Entities::encode() function. Example:
use Apache::Util qw(escape_html);
my $display_html = escape_html("<h1>Header Level 1 Example</h1>");
- ht_time()
-
This function produces dates in the format required by the HTTP protocol.
You will usually call it with a single argument, the number of seconds
since the ``epoch''. The current time expressed in these units is returned
by the Perl built-in time() function.
You may also call ht_time() with optional second and third arguments. The second argument, if present,
is a format string that follows the same conventions as the strftime() function in the POSIX library. The default format is %a, %d %b %Y %H:%M:%S %Z, where ``%Z'' is an Apache extension that always expands to ``GMT''. The
optional third argument is a flag that selects whether to express the
returned time in GMT or using the local timezone. A true value (the
default) selects GMT, which is what you will want in nearly all cases.
Unless you have a good reason to use a non-standard time format, you should
content yourself with the one-argument form of this function. The function
is equivalent to the LWP package's
HTTP::Date::time2str() function when passed a single argument.
Examples:
use Apache::Util qw(ht_time);
my $str = ht_time(time);
my $str = ht_time(time, "%d %b %Y %H:%M %Z"); # 06 Nov 1994 08:49 GMT
my $str = ht_time(time, "%d %b %Y %H:%M %Z",0); # 06 Nov 1994 13:49 EST
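When testing date-handling code away from the server, the default format can be approximated with the standard POSIX module. The following is a minimal sketch. The function name http_date() is hypothetical, the string ``GMT'' is written literally because %Z is platform-dependent in POSIX strftime(), and the day and month names assume the C (English) locale.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical stand-in for the one-argument form of ht_time().
# "GMT" is a literal here; the Apache %Z extension is not available
# in POSIX strftime(), where %Z depends on the platform.
sub http_date {
    my $secs = shift;
    return strftime("%a, %d %b %Y %H:%M:%S GMT", gmtime($secs));
}

# 784111777 seconds after the epoch is 06 Nov 1994, 08:49:37 GMT.
print http_date(784111777), "\n";
print http_date(time), "\n";
```

Loading POSIX is exactly the overhead that Apache::Util is meant to avoid, so this substitute is only appropriate for scripts that run outside the server.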
- parsedate()
-
This function is the inverse of ht_time(), parsing HTTP dates and returning the number of seconds since the epoch.
You can then pass this value to Time::localtime (or another of Perl's date-handling modules) and extract the date fields
that you want.
The parsedate() function recognizes and handles date strings in any of three standard formats:
Sun, 06 Nov 1994 08:49:37 GMT ; RFC 822, the modern HTTP format
Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, the old obsolete HTTP format
Sun Nov 6 08:49:37 1994 ; ANSI C's asctime() format
Example:
use Apache::Util qw(parsedate);
my $secs;
if (my $if_modified = $r->headers_in->{'If-modified-since'}) {
$secs = parsedate $if_modified;
}
- size_string()
-
This function converts the given file size into a formatted string. The
size given in the string will be in units of bytes, kilobytes or megabytes,
depending on the size of the file. This function formats the string just as
the C ap_send_size() API function does, but returns the string rather than sending it directly
to the client. The ap_send_size() function is used in mod_autoindex to display the size of files in automatic directory listings, and by
mod_include to implement the fsize directive.
This example shows size_string() being used to get the formatted size of the currently requested file:
use Apache::Util qw(size_string);
my $size = size_string -s $r->finfo;
- unescape_uri()
-
This function decodes all
%XX hex escape sequences in the given uri. It is equivalent to the URI::Escape::uri_unescape() function from the LWP package.
Example:
use Apache::Util qw(unescape_uri);
my $unescaped = unescape_uri($safe_url);
- unescape_uri_info()
-
This function is similar to unescape_uri() but is specialized to remove escape sequences from the query string portion
of the URI. The main difference is that it translates the ``+'' character
into spaces as well as recognizing and translating the hex escapes.
Example:
use Apache::Util qw(unescape_uri_info);
my $string = $r->args;  # raw query string in scalar context
my %data = map { unescape_uri_info($_) } split /[=&]/, $string, -1;
This would correctly translate the query string
``name=Fred+Flintstone&town=Bedrock'' into the hash:
name => 'Fred Flintstone',
town => 'Bedrock'
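The translation step is simple enough to imitate in plain Perl, which is useful for checking how a given query string will be decoded. The following is a minimal sketch. Both toy_unescape_info() and parse_query() are hypothetical names, and the pair-by-pair splitting is a variation on the one-line map above rather than anything Apache::Util provides.

```perl
use strict;
use warnings;

# Hypothetical pure-Perl stand-in for unescape_uri_info(): turn "+"
# into a space, then decode %XX hex escapes.
sub toy_unescape_info {
    my $string = shift;
    $string =~ tr/+/ /;
    $string =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $string;
}

# Split into pairs first, then into key and value, so that a key with
# no value or a literal "=" inside a value cannot shift the pairing.
sub parse_query {
    my $query = shift;
    my %data;
    for my $pair (split /&/, $query) {
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $key && length $key;
        $data{ toy_unescape_info($key) } =
            toy_unescape_info(defined $value ? $value : '');
    }
    return %data;
}

my %data = parse_query('name=Fred+Flintstone&town=Bedrock&pet=Dino%21&debug');
print "$_ => '$data{$_}'\n" for sort keys %data;
```

Like the one-line version, this keeps only the last value when a key is repeated; a form with multi-valued fields would need to collect values into array references instead.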
Among the packages installed by the Perl API is a tiny one named, simply
enough, ``mod_perl.'' You can query this class to determine what version of mod_perl is installed and what features it makes available.
- import()
-
If your Apache Perl API modules depend on version-specific features of
mod_perl, you can use the import() method to require that a certain version of mod_perl be installed. The
syntax is simple:
use mod_perl 1.16; # require version 1.16 or higher
When mod_perl is built, you can control which handlers and other features are enabled. import() can also be used to check for the presence of individual features.
#require Authen and Authz handlers to be enabled
use mod_perl qw(PerlAuthenHandler PerlAuthzHandler);
Here is the list of features that you can check for:
- PerlDispatchHandler
-
- PerlChildInitHandler
-
- PerlChildExitHandler
-
- PerlPostReadRequestHandler
-
- PerlTransHandler
-
- PerlHeaderParserHandler
-
- PerlAccessHandler
-
- PerlAuthenHandler
-
- PerlAuthzHandler
-
- PerlTypeHandler
-
- PerlFixupHandler
-
- PerlHandler
-
- PerlLogHandler
-
- PerlInitHandler
-
- PerlCleanupHandler
-
- PerlStackedHandlers
-
- PerlMethodHandlers
-
- PerlDirectiveHandlers
-
- PerlSections
-
- PerlSSI
-
- hook()
-
The hook() function can be used at runtime to determine whether the current mod_perl
installation provides support for a certain feature. This is the internal
function that import() uses to check for configured features. This function is not exported, so
you have to refer to it using its fully-qualified name, mod_perl::hook().
hook() recognizes the same list of features that import() does.
Example:
use mod_perl ();
unless(mod_perl::hook('PerlAuthenHandler')) {
die "PerlAuthenHandler is not enabled!";
}
All of the HTTP status codes are defined in the httpd.h file, along with server-specific status codes such as OK, DECLINED and
DONE. The Apache::Constants class provides access to these codes as constant subroutines. As there are
many of these constants, they are not all exported by default. By default,
only those listed in the :common export tag are exported. A variety of export tags are defined, allowing you
to bring in various sets of constants to suit your needs. You are also free
to bring in individual constants, just as you can with any other Perl
module.
Here are the status codes listed by export tag group:
- :common
-
This tag imports the most commonly used constants.
OK
DECLINED
DONE
NOT_FOUND
FORBIDDEN
AUTH_REQUIRED
SERVER_ERROR
- :response
-
This tag imports the common response codes, plus these response codes:
DOCUMENT_FOLLOWS
MOVED
REDIRECT
USE_LOCAL_COPY
BAD_REQUEST
BAD_GATEWAY
RESPONSE_CODES
NOT_IMPLEMENTED
CONTINUE
NOT_AUTHORITATIVE
CONTINUE and NOT_AUTHORITATIVE are aliases for DECLINED.
- :methods
-
These are the method numbers, commonly used with the Apache
method_number method.
METHODS
M_GET
M_PUT
M_POST
M_DELETE
M_CONNECT
M_OPTIONS
M_TRACE
M_PATCH
M_PROPFIND
M_PROPPATCH
M_MKCOL
M_COPY
M_MOVE
M_LOCK
M_UNLOCK
M_INVALID
Each of the M_* constants corresponds to an integer value, where
M_GET..M_UNLOCK is 0..14. The METHODS constant is the number of M_* constants, 15 at the time of this writing. This is designed to accommodate
support for other request methods.
for (my $i = 0; $i < METHODS; $i++) {
...
}
- :options
-
These constants are most commonly used with the Apache
allow_options method:
OPT_NONE
OPT_INDEXES
OPT_INCLUDES
OPT_SYM_LINKS
OPT_EXECCGI
OPT_UNSET
OPT_INCNOEXEC
OPT_SYM_OWNER
OPT_MULTI
OPT_ALL
- :satisfy
-
These constants are most commonly used with the Apache satisfy()
method:
SATISFY_ALL
SATISFY_ANY
SATISFY_NOSPEC
- :remotehost
-
These constants are most commonly used with the Apache
get_remote_host method:
REMOTE_HOST
REMOTE_NAME
REMOTE_NOLOOKUP
REMOTE_DOUBLE_REV
- :http
-
This is the full set of HTTP response codes: (NOTE: this list is not
definitive. See the Apache source code for the most up to date listing).
HTTP_OK
HTTP_MOVED_TEMPORARILY
HTTP_MOVED_PERMANENTLY
HTTP_METHOD_NOT_ALLOWED
HTTP_NOT_MODIFIED
HTTP_UNAUTHORIZED
HTTP_FORBIDDEN
HTTP_NOT_FOUND
HTTP_BAD_REQUEST
HTTP_INTERNAL_SERVER_ERROR
HTTP_NOT_ACCEPTABLE
HTTP_NO_CONTENT
HTTP_PRECONDITION_FAILED
HTTP_SERVICE_UNAVAILABLE
HTTP_VARIANT_ALSO_VARIES
- :server
-
These are constants related to the version of the Apache server software:
MODULE_MAGIC_NUMBER
SERVER_VERSION
SERVER_BUILT
- :config
-
These are constants most commonly used with configuration directive
handlers:
DECLINE_CMD
- :types
-
These are constants which define internal request types:
DIR_MAGIC_TYPE
- :override
-
These constants are used to control and test the context of configuration
directives.
OR_NONE
OR_LIMIT
OR_OPTIONS
OR_FILEINFO
OR_AUTHCFG
OR_INDEXES
OR_UNSET
OR_ALL
ACCESS_CONF
RSRC_CONF
- :args_how
-
These are the constants which define configuration directive prototypes.
RAW_ARGS
TAKE1
TAKE2
TAKE12
TAKE3
TAKE23
TAKE123
ITERATE
ITERATE2
FLAG
NO_ARGS
As you may notice, the list above is shorter than what is defined in
Apache's include/httpd.h header file. The missing constants are available as subroutines via Apache::Constants; they are just not exportable by default. The less frequently used
constants were left out of this list to keep memory consumption at a
reasonable level.
There are two options if you need to access a constant that is not
exportable by default. One is simply to use the fully qualified subroutine
name, for example:
return Apache::Constants::HTTP_MULTIPLE_CHOICES();
Or, use the export method in a server startup file to add exportable names. Example:
#startup script
Apache::Constants->export(qw( HTTP_MULTIPLE_CHOICES ));
#runtime module
use Apache::Constants qw(:common HTTP_MULTIPLE_CHOICES);
...
return HTTP_MULTIPLE_CHOICES;
While the HTTP constants are generally used as return codes from handler
subroutines, it is also possible to use the builtin die()
function to jump out of a handler with a status code that will be
propagated back to Apache. Example:
unless (-r _) {
die FORBIDDEN;
}
Two classes, Apache::ModuleConfig and Apache::CmdParms, provide access to the custom configuration directive API.
Most Apache Perl API modules use the simple PerlSetVar directive to declare per-directory configuration variables. However, with a
little more effort, you can create entirely new configuration directives.
This process is discussed in detail in Chapter 8.
Once the configuration directives have been created, they can be retrieved
from within handlers using the Apache::ModuleConfig->get()
class method. get() returns the current command configuration table as an Apache table blessed
into the Apache::Table
class. get() takes one or two arguments. The first argument can be the current request
object to retrieve per-directory data or an
Apache::Server object to retrieve per-server data. The second, optional, argument is the
name of the module whose configuration table you are interested in. If not
specified, this argument defaults to the current package, which is usually
what you want.
Here's an example:
use Apache::ModuleConfig ();
...
sub handler {
my $r = shift;
my $cfg = Apache::ModuleConfig->get($r);
my $printer = $cfg->{'printer-address'};
...
}
The Apache::CmdParms class provides a Perl interface to the Apache
cmd_parms data structure. When Apache encounters a directive, it invokes a command
handler that is responsible for processing the directive's arguments. The Apache::CmdParms object is passed to the responsible handler and contains information that
may be useful when processing these arguments.
An example of writing a directive handler is given in Chapter 8. In this
section, we just summarize the methods that Apache::CmdParms
makes available.
- path()
-
If the configuration directive applies to a certain <Location>,
<Directory> or <Files> section, the path() method returns the path or filename pattern to which the section applies.
Example:
my $path = $parms->path;
- server()
-
This method returns an object blessed into the Apache::Server
class. This is the same Apache::Server object which is retrieved at request time via the Apache method named server(). See above. Example:
my $s = $parms->server;
- cmd()
-
This method returns an object blessed into the Apache::Command
class. The Apache::Module package from CPAN must be installed to access Apache::Command methods. Example:
use Apache::Module ();
...
my $name = $parms->cmd->name;
- info()
-
If the directive handler has stashed any info in the cmd_data slot, this method will return that data. This is generally somewhat static
information, normally used to reuse a common configuration function. For
example, the fancy directory indexer,
mod_autoindex and its family of AddIcon* directives uses this technique quite effectively to manipulate the
directive arguments.
my $info = $parms->info;
- limited()
-
The methods present in the current Limit configuration are converted into a bitmask, which is returned by this
method. For example:
# httpd.conf
<Limit GET POST>
SomeDirective argument_1 argument_2
</Limit>
# Perl module
use Apache::Constants qw(:methods);
sub SomeDirective ($$$$) {
my($parms, $cfg, @args) = @_;
my $method_mask = $parms->limited;
if($method_mask & (1 << M_POST)) {
...
}
}
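The bit arithmetic behind the mask can be tried without a server. The following is a minimal sketch. The constant values are an assumption based on the ordering of the :methods list (M_GET is 0, M_PUT is 1, and so on); on a real server they should be imported from Apache::Constants rather than defined by hand, and mask_for() is a hypothetical helper.

```perl
use strict;
use warnings;

# Assumption: method numbers follow the order of the :methods list,
# starting with M_GET = 0. On a real server, import these from
# Apache::Constants instead of defining them here.
use constant { M_GET => 0, M_PUT => 1, M_POST => 2, M_DELETE => 3 };

my %number_for = (
    GET => M_GET, PUT => M_PUT, POST => M_POST, DELETE => M_DELETE,
);

# Hypothetical helper: build the mask that a section such as
# "<Limit GET POST>" would correspond to, one bit per method number.
sub mask_for {
    my $mask = 0;
    $mask |= 1 << $number_for{$_} for @_;
    return $mask;
}

my $mask = mask_for(qw(GET POST));
for my $name (sort keys %number_for) {
    my $limited = $mask & (1 << $number_for{$name}) ? 'yes' : 'no';
    print "$name: $limited\n";
}
```

Each method occupies the bit position given by its method number, which is why the test in the directive handler shifts 1 left by M_POST before masking.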
- override()
-
This method converts the current value of the AllowOverride directive into a bitmask and returns it. You can then import the
Apache::Constants :override tag to retrieve the values of individual bits in the mask. Modules don't
generally need to check this value, because the internal configuration functions
take care of the required context checking.
Example:
use Apache::Constants qw(:override);
my $override_mask = $parms->override;
if($override_mask & OR_ALL) {
#this directive is allowed anywhere in the configuration files
}
- getline()
-
If the directive handler needs to read from the configuration file
directly, it may do so with the getline() method. The first line returned in the example below is the line
immediately following the line on which the directive appeared. It's up to
your handler to decide when to stop reading lines; in the example below we
use pattern matching.
Reading from the configuration file directly is normally done when a
directive is declared with a prototype of RAW_ARGS. With this prototype, arguments are not parsed by Apache; that job is left
up to the directive handler. Let's say you need to implement a
configuration container, in the same format as the standard
<Directory> and <Location> directives:
<Container argument>
....
</Container>
Here is a directive handler to parse it:
sub Container ($$$*) {
my($parms, $cfg, $arg, $fh) = @_;
$arg =~ s/>//;
my $line;   # buffer that getline() fills in
while($parms->getline($line)) {
last if $line =~ m:</Container>:i;
...
}
}
There is an alternative to using the getline method when the
RAW_ARGS prototype is used: a tied filehandle which is passed as the directive
handler's last argument. Perl's builtin read and
getc functions may be used on this filehandle, along with the <> readline
operator:
sub Container ($$$*) {
my($parms, $cfg, $arg, $fh) = @_;
$arg =~ s/>//;
while(defined(my $line = <$fh>)) {
last if $line =~ m:</Container>:i;
...
}
}
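The reading loop can be exercised outside the server by substituting an in-memory filehandle for the one that mod_perl passes to the handler. The following is a minimal sketch. The configuration text and the directive names inside it are invented for illustration, and the filehandle here is an ordinary Perl one rather than the tied handle described above.

```perl
use strict;
use warnings;

# Stand-in for the configuration file: an in-memory filehandle
# positioned just after a hypothetical "<Container argument>" line.
my $config = <<'END';
Color red
Size  large
</Container>
ServerName www.modperl.com
END
open my $fh, '<', \$config or die "Can't open in-memory file: $!";

# Collect the container body, stopping at the closing tag.
my @body;
while (defined(my $line = <$fh>)) {
    last if $line =~ m:</Container>:i;
    chomp $line;
    push @body, $line;
}

# The next read picks up where the container ended.
my $next = <$fh>;
chomp $next;
print "body: $_\n" for @body;
print "next: $next\n";
```

The point to notice is that the handler consumes the closing tag itself, so whatever reads the configuration next resumes with the line after the container.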
The Perl API includes a class named Apache::File, which, when loaded, provides advanced functions for opening and
manipulating files at the server side.
Apache::File does two things. First, it provides an object-oriented interface to
filehandles similar to Perl's standard
IO::File class. While the Apache::File module does not provide all the functionality of IO::File, its methods are approximately twice as fast as the equivalent IO::File methods. Second, when you use Apache::File, it adds several new methods to the Apache
class which provide support for handling files under the HTTP/1.1 protocol.
Like IO::File, the main advantage of accessing filehandles through
Apache::File's object-oriented interface is the ability to create new anonymous
filehandles without worrying about namespace collision. Furthermore, you
don't have to close the filehandle explicitly before exiting the subroutine
that uses it; this is done automatically when the filehandle object goes
out of scope.
Example:
{
use Apache::File;
my $fh = Apache::File->new($config);
# no need to close
}
However, Apache::File is still not as fast as using Perl's native
open() and close() functions. If you wish to get the highest performance possible, you should
use open() and close() in conjunction with the standard Symbol::gensym or Apache::gensym
functions.
Example:
{ # using standard Symbol module
use Symbol 'gensym';
my $fh = gensym;
open $fh, $config;
close $fh;
}
{ # Using Apache::gensym() method
my $fh = Apache->gensym;
open $fh, $config;
close $fh;
}
A little known feature of Perl is that when lexically defined variables go
out of scope, any indirect filehandle stored in them is automatically
closed. So in fact there's really no reason to perform an explicit close() on the filehandles in the two examples above unless you want to test the
close operation's return value. As always with Perl, there's more than one
way to do it.
These are methods associated directly with Apache::File objects. They form a subset of what's available from the Perl IO::File and
FileHandle classes.
- new()
-
This method creates a new filehandle, returning the filehandle object on
success, undef on failure. If an additional argument is given, it will be
passed to the open() method automatically.
Examples:
use Apache::File ();
my $fh = Apache::File->new;
my $fh = Apache::File->new($filename) or die "Can't open $filename $!";
- open()
-
Given an Apache::File object previously created with new(), this method opens a file and associates it with the object. The open()
method accepts the same types of arguments as the standard Perl
open() function, including support for file modes.
Examples:
$fh->open($filename);
$fh->open(">$out_file");
$fh->open("|$program");
- close()
-
The close() method is equivalent to the Perl builtin close
function. It returns true upon success and false upon failure.
$fh->close or die "Can't close $filename $!";
- tmpfile()
-
The tmpfile() method is responsible for opening up a unique temporary file. It is similar
to the tmpnam() function in the POSIX module, but doesn't come with all the memory overhead
that loading POSIX does. It will choose a suitable temporary directory
(which must be writable by the Web server process). It then generates a
series of filenames using the current process ID and the
$TMPNAM package global. Once a unique name is found, it is
opened for writing, using flags that will cause the file to be created only
if it does not already exist. This prevents race conditions in which the
function finds what seems to be an unused name, but someone else claims the
same name before it can be created.
As an added bonus, tmpfile() calls the register_cleanup() method behind the scenes to make sure the file is unlinked after the
transaction is finished.
Called in a list context, tmpfile() returns the temporary file name and a filehandle opened for reading and
writing. In a scalar context only the filehandle is returned.
Example:
my($tmpnam, $fh) = Apache::File->tmpfile;
my $fh = Apache::File->tmpfile;
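The race-free creation step described above can be illustrated outside of Apache. What follows is a minimal sketch in core Perl, not the Apache::File implementation: the helper name make_tmpfile, the "sketch" filename prefix and the limit of 100 attempts are all assumptions made for illustration. The essential part is the O_EXCL flag, which makes the open fail if another process has already claimed the name.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT O_EXCL);
use File::Spec ();

# Hypothetical helper (not part of Apache::File): try successive names
# until one can be created exclusively.
sub make_tmpfile {
    my $dir = File::Spec->tmpdir;
    for my $n (0 .. 99) {
        my $name = File::Spec->catfile($dir, "sketch$$-$n");
        # O_CREAT|O_EXCL: create the file, but fail if the name exists
        if (sysopen(my $fh, $name, O_RDWR | O_CREAT | O_EXCL, 0600)) {
            return wantarray ? ($name, $fh) : $fh;
        }
    }
    return;    # no free name found
}

my ($tmpnam, $fh) = make_tmpfile() or die "no temporary file: $!";
print {$fh} "scratch data\n";
close $fh or die "Can't close $tmpnam: $!";
my $size = -s $tmpnam;    # bytes written to the temporary file
unlink $tmpnam;           # tmpfile() arranges this cleanup for you
```

Unlike this sketch, tmpfile() registers the unlink as a cleanup handler, so the file disappears when the transaction finishes.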
When a handler pulls in Apache::File, the module adds a number of new methods to the Apache request object. These methods are generally of interest to handlers that
wish to serve static files from disk or memory using the features of the
HTTP/1.1 protocol that provide increased performance through client-side
document caching.
To take full advantage of the HTTP/1.1 protocol, your content handler will
test the meets_conditions() method before sending the body of a static document. This avoids sending a
document that is already cached and up to date on the browser's side of the
connection. You will then want to call set_content_length() and update_mtime()
in order to make the outgoing HTTP headers correctly reflect the correct
size and modification time of the requested file. Finally, you may want to
call set_etag() in order to set the file's ``entity tag'' when communicating with
HTTP/1.1-compliant browsers.
In the section following this one, we demonstrate these methods fully by
writing a pure Perl replacement for the http_core module's default document retrieval handler.
- discard_request_body()
-
The majority of GET method handlers do not deal with incoming client data, unlike POST and PUT handlers. However, according to the HTTP/1.1 specification, any method,
including GET, can include a request body. The discard_request_body() method tests for the existence of a request body and, if present, simply
throws away the data. This discarding is especially important when
persistent connections are being used, so that the request body will not be
attached to the next request. If the request is malformed, an error code
will be returned, which the module handler should propagate back to Apache.
Example:
if ((my $rc = $r->discard_request_body) != OK) {
return $rc;
}
- meets_conditions()
-
In the interest of HTTP/1.1 compliance, the meets_conditions()
method is used to implement ``conditional GET'' rules. These rules include inspection of client headers, including If-Modified-Since,
If-Unmodified-Since, If-Match and If-None-Match. Consult RFC 2068 section 9.3 (which you can find at
http://www.w3.org/Protocols) if you are interested in the nitty gritty details.
As far as Apache modules are concerned, they need only check the return
value of this method before sending a response body. If the return value is
anything other than OK, the module should return from the handler with that value. A common
return value other than OK is HTTP_NOT_MODIFIED, which is sent when the document is already cached on the client side, and
has not changed since it was cached.
if((my $rc = $r->meets_conditions) != OK) {
return $rc;
}
#else ... go and send the response body ...
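To make the entity tag part of these rules concrete, here is a deliberately simplified sketch in plain Perl. It is not how Apache implements meets_conditions(): the function name sketch_meets_conditions and the SKETCH_* constants are hypothetical, only If-Match and If-None-Match are examined, and the date-based headers and weak validators are ignored.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for OK, HTTP_NOT_MODIFIED and
# HTTP_PRECONDITION_FAILED; the real values come from Apache::Constants.
use constant {
    SKETCH_OK                  => 0,
    SKETCH_NOT_MODIFIED        => 304,
    SKETCH_PRECONDITION_FAILED => 412,
};

# $headers is a hash ref of incoming header fields, $etag is the
# current entity tag of the document.
sub sketch_meets_conditions {
    my ($headers, $etag) = @_;

    # If-Match: proceed only if one of the listed tags is current
    if (defined(my $if_match = $headers->{'If-Match'})) {
        return SKETCH_PRECONDITION_FAILED
            unless $if_match eq '*'
                || grep { $_ eq $etag } split /\s*,\s*/, $if_match;
    }

    # If-None-Match: the client's cached copy is still good
    if (defined(my $if_none = $headers->{'If-None-Match'})) {
        return SKETCH_NOT_MODIFIED
            if $if_none eq '*'
                || grep { $_ eq $etag } split /\s*,\s*/, $if_none;
    }

    return SKETCH_OK;
}

my $cached   = sketch_meets_conditions({ 'If-None-Match' => '"v1", "v2"' }, '"v2"');
my $stale    = sketch_meets_conditions({ 'If-None-Match' => '"v1"' }, '"v2"');
my $mismatch = sketch_meets_conditions({ 'If-Match' => '"v1"' }, '"v2"');
```

In a real handler none of this is necessary; the single call to $r->meets_conditions covers all of the conditional headers.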
- mtime()
-
This method returns the last modified time of the requested file, expressed
as seconds since the epoch. The last modified time may also be changed
using this method, although the
update_mtime() method is better suited to this purpose.
Example:
my $date_string = localtime $r->mtime;
- set_content_length()
-
This method sets the outgoing Content-length header based on its argument, which should be expressed in byte units. If
no argument is specified, the method will use the size of the file named by $r->filename. This method is a bit faster and more concise than setting
Content-length in the headers_out table yourself. Examples:
$r->set_content_length;
$r->set_content_length(-s $r->finfo); #same as above
$r->set_content_length(-s $filename);
- set_etag()
-
This method is used to set the outgoing ETag header corresponding to the requested file. ETag is an opaque string that
identifies the current version of the file and changes whenever the file
is modified. This string is tested by the meets_conditions() method if the client provides an If-Match or If-None-Match header.
$r->set_etag;
- set_last_modified()
-
This method is used to set the outgoing Last-Modified header from the value returned by $r->mtime. The method checks that the specified time is not in the future. In
addition, using
set_last_modified() is faster and more concise than setting
Last-Modified in the headers_out table yourself.
You may provide an optional time argument, in which case the method will
first call update_mtime() to set the file's last modification date. It will then set the outgoing Last-Modified
header as before.
Examples:
$r->update_mtime((stat $r->finfo)[9]);
$r->set_last_modified;
$r->set_last_modified((stat $r->finfo)[9]); #same as the two lines above
- update_mtime()
-
Rather than setting the request record mtime field directly, you can use the update_mtime() method to change the value of this field. It will only be updated if the
new time is more recent than the current mtime. If no time argument is present, the default is the last modified time of $r->filename.
Example:
$r->update_mtime;
$r->update_mtime((stat $r->finfo)[9]); #same as above
$r->update_mtime(time);
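The ``only if more recent'' rule is easy to state in code. This is a sketch of the semantics only, using a hypothetical hash in place of the request record and a hypothetical function name; it does not call the Apache API.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the request record; only mtime is modeled.
my %request = (mtime => 0);

# Keep the newest time seen so far, ignoring anything older.
sub sketch_update_mtime {
    my ($req, $candidate) = @_;
    $req->{mtime} = $candidate if $candidate > $req->{mtime};
    return $req->{mtime};
}

sketch_update_mtime(\%request, 1_000);    # accepted
sketch_update_mtime(\%request, 500);      # older, so ignored
sketch_update_mtime(\%request, 2_000);    # newer, so accepted
```

This behavior is what makes update_mtime() convenient when a response depends on several files: call it once per file and the request ends up carrying the most recent modification time.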
Apache's http_core module already has a default handler to send files straight from disk to
the client. Such files include static HTML, plain text, compressed archives
and image files in a number of different formats. A bare bones handler in
Perl only requires a few lines of code as Listing 9.1 shows. After the
standard preamble, the
handler() function attempts to open $r->filename. If the file cannot be opened, the handler simply assumes file permission
problems and returns FORBIDDEN. Otherwise, the entire contents of the file are passed down the HTTP
stream using the request object's
send_fd() method. It then does a little tidying up by calling
close() on the filehandle and returns OK so that Apache knows the response has been sent.
- Listing 9.1 A simple, but flawed way to send static files
-
package Apache::EchoFile;
use strict;
use Apache::Constants qw(:common);
use Apache::File ();
sub handler {
my $r = shift;
my $fh = Apache::File->new($r->filename) or return FORBIDDEN;
$r->send_fd($fh);
close $fh;
return OK;
}
1;
__END__
While this works well in most cases, there is more involved in sending a
file over HTTP than you might think. To fully support the HTTP/1.1
protocol, one has to handle the PUT and OPTIONS methods, handle GET
requests that contain a request body, and provide support for
``If-Modified-Since'' requests.
Listing 9.2 is the Apache::SendFile module, a Perl version of the
http_core module default handler. It starts off as before by loading the Apache::Constants module. However, it brings in more constants than usual. The :response group pulls in the constants we normally see using the :common tag, plus a few more including the NOT_IMPLEMENTED constant. The :methods group brings in the method number constants including M_INVALID, M_OPTIONS, M_PUT
and M_GET. The :http tag imports a few of the less commonly used status codes, including HTTP_METHOD_NOT_ALLOWED.
We next bring in the Apache::File module in order to open and read the contents of the file to be sent and to
load the HTTP/1.1-specific file handling methods.
The first step we take upon entering the handler() function is to call the discard_request_body() method. Unlike HTTP/1.0, where only POST and PUT requests may contain a
request body, in HTTP/1.1 any method may include a body. We have no use for
it, so we throw it away to avoid potential problems.
We now check the request method by calling the request object's
method_number() method. Like the http_core handler, we only handle GET requests (method number M_GET). For any other
type of request we return an error, but in each case the error is slightly
different. For the method M_INVALID, which is set when the client specifies
a request that Apache doesn't understand, we return an error code of
NOT_IMPLEMENTED. For M_OPTIONS, which is sent by an HTTP/1.1 client that is
seeking information about the capabilities of the server, we return
DECLINED in order to allow Apache's core to handle the request (it sends a
list of allowed methods).
The PUT method is applicable even if the resource doesn't exist, but we
don't support it, so we return HTTP_METHOD_NOT_ALLOWED in this case. At this point we test for existence of the requested file by
applying the -e file test to the cached stat() information returned by the request object's finfo() method. If the file does not exist, we log an error message and return NOT_FOUND. Finally, we specifically check for a request method of M_GET and again
return HTTP_METHOD_NOT_ALLOWED if this is not the case.
Provided the request has passed all these checks, we attempt to open the
requested file with Apache::File. If the file cannot be opened, the handler logs an error message and
returns FORBIDDEN.
At this point, we know that the request method is valid and the file exists
and is accessible. But this doesn't mean we should actually send the file
because the client may have cached it previously and has asked us to
transmit it only if it has changed. The update_mtime(),
set_last_modified() and set_etag() methods together set up the HTTP/1.1 headers that indicate when the file
was changed and assign it a unique ``entity tag'' that changes when the
file changes.
We then call the meets_conditions() method to find out if the file has already been cached by the client. If
this is the case, or some other condition set by the client fails, meets_conditions() returns a response code other than OK, which we propagate back to Apache. Apache then does whatever is
appropriate.
Otherwise we call the set_content_length() method to set the outgoing Content-length header to the length of the file, then call
send_http_header() to send the client the full set of HTTP headers. The return value of header_only() is tested to determine whether the client has requested the header only; if
the method returns false, then the client has requested the body of the
file as well as the headers, and we send the file contents using the send_fd() method. Lastly, we tidy up by closing the filehandle and returning OK .
The real default handler found in http_core.c actually does a bit more work than this. It includes logic for sending
files from memory via mmap() if USE_MMAP_FILES is defined, along with support for HTTP/1.1 byte ranges and Content-MD5.
After reading through this you'll probably be completely happy to return
DECLINED when the appropriate action for your module is just to return the
unmodified contents of the requested file!
- Listing 9.2 A 100% pure Perl implementation of the default
http_core content handler
-
package Apache::SendFile;
use strict;
use Apache::Constants qw(:response :methods :http);
use Apache::File ();
use Apache::Log ();
sub handler {
my $r = shift;
if ((my $rc = $r->discard_request_body) != OK) {
return $rc;
}
if ($r->method_number == M_INVALID) {
$r->log->error("Invalid method in request ", $r->the_request);
return NOT_IMPLEMENTED;
}
if ($r->method_number == M_OPTIONS) {
return DECLINED; #http_core.c:default_handler() will pick this up
}
if ($r->method_number == M_PUT) {
return HTTP_METHOD_NOT_ALLOWED;
}
unless (-e $r->finfo) {
$r->log->error("File does not exist: ", $r->filename);
return NOT_FOUND;
}
if ($r->method_number != M_GET) {
return HTTP_METHOD_NOT_ALLOWED;
}
my $fh = Apache::File->new($r->filename);
unless ($fh) {
$r->log->error("file permissions deny server access: ",
$r->filename);
return FORBIDDEN;
}
$r->update_mtime((stat $r->finfo)[9]);
$r->set_last_modified;
$r->set_etag;
if((my $rc = $r->meets_conditions) != OK) {
return $rc;
}
$r->set_content_length;
$r->send_http_header;
unless ($r->header_only) {
$r->send_fd($fh);
}
close $fh;
return OK;
}
1;
__END__
As you know, Perl has several ``magic'' global variables, subroutines and
literals that have the same meaning no matter what package they are used
from. A handful of these variables have special meaning when running under mod_perl. Here we will describe these and other global variables maintained by mod_perl. Don't forget that Perl code has a much longer lifetime and lives among many
more namespaces in the mod_perl environment than it does in the mod_cgi CGI environment. When modifying a Perl global variable, we recommend that
you always localize the variable so modifications do not trip up other Perl
code running in the server.
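As an illustration of the localizing advice, the following core-Perl sketch changes the input record separator inside a subroutine and shows that the caller's value is untouched afterwards. The subroutine name slurp_all and the in-memory filehandle are choices made for this example only.

```perl
use strict;
use warnings;

my $data = "line one\nline two\n";

# Read everything from an in-memory filehandle in a single operation.
sub slurp_all {
    my ($string_ref) = @_;
    open my $in, '<', $string_ref
        or die "Can't open in-memory handle: $!";
    local $/;                # undef inside this sub only
    my $content = <$in>;     # slurps the whole "file"
    close $in;
    return $content;         # $/ is restored when the sub returns
}

my $content   = slurp_all(\$data);
my $separator = $/;          # still the newline default
```

Had the subroutine assigned to $/ without local, every later line-oriented read in the same server process would have been affected.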
We begin with the list of magic global variables that have special
significance to mod_perl.
- $0
-
When running under Apache::Registry or Apache::PerlRun, this variable is set to the value of the filename field of the
request_rec.
When running inside of a <Perl> section, the value of $0
is the path to the configuration file that the Perl section is located in,
such as httpd.conf or srm.conf.
- $^X
-
Normally, this variable holds the path to the Perl program that was
executed from the shell. Under mod_perl, there is no Perl program, just the Perl library linked with Apache. So,
this variable is set to the path of the Apache binary in which Perl is currently
running, such as /usr/local/apache/bin/httpd or
C:\Apache\apache.exe.
- $|
-
As the perlvar(1) manpage explains, if this variable is set to nonzero, it forces a flush
right away and after every write or print on the currently selected output
channel. Under mod_perl, setting
$| when the STDOUT filehandle is selected will cause the
rflush() method to be invoked after each print(). Because of the overhead associated with rflush(), you should avoid making this a general practice.
- $/
-
The perlvar manpage describes this global variable as the input record separator,
newline by default. The same is true under mod_perl, however, mod_perl
ensures it is reset back to the newline default after each request.
- %@
-
You are most likely familiar with Perl's
$@ variable, which holds the Perl error message or exception value from the
last eval() command, if any. There is also an undocumented %@ hash global, which is used internally for certain eval bookkeeping. This
variable is put to good use by mod_perl, by saving the value of $@ keyed by the URI which triggered the error. This allows an ErrorDocument to provide some more clues as to what went wrong. Example:
my $previous_uri = $r->prev->uri;
my $errmsg = $@{$previous_uri};
This looks a bit weird, but it's just a hash key lookup on the hash
named %@. Mentally substitute %SAVED_ERRORS for %@ and you'll see what's going on here.
- %ENV
-
As with the Perl binary, this global hash contains the current environment.
When the Perl interpreter is first created by mod_perl, this hash is
emptied, with the exception of those variables passed and set via PerlPassEnv and PerlSetEnv configuration directives.
The usual configuration scoping rules apply. PerlSetEnv
directives located in the main part of the configuration file will
influence all Perl handlers, while those located in
<Directory>, <Location> and <Files>
sections will only affect handlers in those areas that they apply to.
The Apache SetEnv and PassEnv directives also influence %ENV, but they don't take effect until the fixup phase. If you need to influence %ENV via server configuration for an earlier phase, such as authentication, be
sure to use PerlSetEnv and PerlPassEnv
instead, because these directives take effect as soon as possible.
There are a number of standard variables that Apache adds to the
environment prior to invoking the content handler. These include
DOCUMENT_ROOT and SERVER_SOFTWARE. By default, the complete %ENV
hash is not set up until the content response phase. Only variables set by PerlPassEnv, PerlSetEnv and by mod_perl itself will be visible. Should you need the complete set of variables to be
available sooner, your handler code can do so with the
subprocess_env method. Example:
my $r = shift;
my $env = $r->subprocess_env;
%ENV = %$env;
Unless you plan to spawn subprocesses, however, it will usually be more
efficient to access the subprocess variables directly:
my $tmp = $r->subprocess_env->{'TMPDIR'};
If you need to get at the environment variables that are set automatically
by Apache before spawning CGI scripts, and you want to do this outside of a
content handler, remember to call
subprocess_env() once in a void context in order to initialize the environment table with
the standard CGI and server-side include variables:
$r->subprocess_env;
my $software = $r->subprocess_env('SERVER_SOFTWARE');
There's rarely a legitimate reason to do this, however, because all the
information you need can be fetched directly from the request object.
Filling in the %ENV hash before the response phase introduces a little overhead into each mod_perl content handler. If you don't want the %ENV hash to be filled at all by mod_perl, add this to your server configuration file:
PerlSetupEnv Off
Regardless of the setting of PerlSetupEnv, or whether subprocess_env() has been called, mod_perl always adds a few special keys of its own to %ENV .
- MOD_PERL
-
The value of this key will be set to a true value for code to test if it is
running in the mod_perl environment or not. Example:
if(exists $ENV{MOD_PERL}) {
... do something ...
}
else {
... do something else ...
}
- GATEWAY_INTERFACE
-
When running under the mod_cgi CGI environment, this value is
CGI/1.1. However, when running under the mod_perl CGI environment, GATEWAY_INTERFACE will be set to CGI-Perl/1.1. This can also be used by code to test if it is running under mod_perl,
however, testing for the presence of the MOD_PERL key is faster than using a regular expression or
substr to test GATEWAY_INTERFACE.
- PERL_SEND_HEADER
-
If the PerlSendHeader directive is set to On, this environment variable will also be set to On; otherwise, the variable will not exist. This is intended for scripts
which do not use the CGI.pm
header() method, which always sends proper HTTP headers no matter what the
settings. Example:
if($ENV{PERL_SEND_HEADER}) {
print "Content-type: text/html\n\n";
}
else {
my $r = Apache->request;
$r->content_type('text/html');
$r->send_http_header;
}
- %SIG
-
The Perl
%SIG global variable is used to set signal handlers for various signals.
There is always one handler set by mod_perl for catching the
PIPE signal. This signal is sent by Apache when a timeout occurs, triggered when
the client drops the connection prematurely (e.g. by hitting the ``stop''
button). The internal Apache::SIG class catches this signal to ensure the Perl interpreter state is properly
reset after a timeout.
The Apache::SIG handler does have one side-effect that you might want to take advantage of.
If a transaction is aborted prematurely because of a PIPE signal, Apache::SIG will set the environment variable SIGPIPE to the number ``1'' before it
exits. You can pick this variable up with a custom log handler statement
and record it if you are interested in compiling statistics on the number
of remote users who abort their requests prematurely.
Below is a LogFormat directive that will capture the SIGPIPE environment variable. If the
transaction was terminated prematurely, the last field in the log file line
will be ``1'', otherwise ``-''.
LogFormat "%h %l %u %t \"%r\" %s %b %{SIGPIPE}e"
As for all other signals, you should be most careful not to stomp on
Apache's own signal handlers, such as that for ALRM. It is best to localize the handler inside of a block so it can be
restored as soon as possible.
Example:
{
local $SIG{ALRM} = sub { ... };
...
}
At the end of each request, mod_perl will restore the %SIG hash to the same state it was in at server startup time.
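A fuller version of the localized handler pattern is the familiar eval/alarm timeout. The sketch below runs in plain Perl on a Unix-like system; the one second limit and the use of four-argument select as a stand-in for a slow operation are assumptions made for this example.

```perl
use strict;
use warnings;

my $outcome = eval {
    # The handler exists only for the duration of this block
    local $SIG{ALRM} = sub { die "timed out\n" };
    alarm 1;                            # allow one second
    select(undef, undef, undef, 5);     # stand-in for a slow operation
    alarm 0;                            # cancel the timer on success
    'finished';
};
my $error = $@;
alarm 0;                                # make sure no timer is left pending

# Outside the block the previous ALRM disposition is back in place
my $still_ours = ref $SIG{ALRM} eq 'CODE' ? 1 : 0;
```

Because the handler is installed with local, whatever disposition was in effect before the block is restored as soon as the block exits, whether it ends normally or through die.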
- @INC
-
As the perlvar manpage explains:
The array @INC contains the list of places to look for Perl
scripts to be evaluated by the do EXPR, require, or use constructs.
The same is true under mod_perl. However, two additional paths are automatically added to the end of the
array. These are the value of the configured
ServerRoot and $ServerRoot/lib/perl.
At the end of each request, mod_perl will restore the value of @INC
to the same value it was during server startup time. This includes any
modifications made by code pulled in via PerlRequire and
PerlModule. So, be warned, if a script compiled by
Apache::Registry contains a use lib or other @INC
modification statement, this modification will not ``stick''. That is, once
the script is cached, the modification is undone until the script has
changed on disk and is re-compiled. If one script relies on another to
modify the @INC path, that modification should be moved to a script or module pulled in at
server startup time, such as the perl startup script.
- %INC
-
As the perlvar manpage explains, the
%INC hash contains entries for each
filename that has been included via do or require. The key is the filename you specified, and the value is the location of
the file actually found. The require command uses this hash to determine whether a given file has already been
included.
The same is true in the mod_perl environment. However, this Perl feature may seem like a mod_perl bug at times. One such case is when .pm modules that are modified are not
automatically recompiled the way that Apache::Registry script files are. The reason this behavior hasn't been changed is that
calling the
stat function to test the last modified time for each file in %INC
requires considerable overhead and would affect Perl API module performance
noticeably. If you need it, the Apache::StatINC module provides the ``re-compile when modified'' functionality, which the
authors only recommend using during development. On a production server,
it's best to set the PerlFreshRestart directive to on and to restart the server whenever you change a .pm file and want to see
the changes take effect immediately.
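The ``re-compile when modified'' idea can be sketched in a few lines of core Perl. This is an illustration of the technique only, not the source of Apache::StatINC: the function name reload_changed_modules, the %last_mtime bookkeeping hash and the throwaway SketchMod module are all hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec ();

my %last_mtime;    # %INC key => mtime seen at the previous check

# Stat every loaded file; reload those that changed since the last check.
sub reload_changed_modules {
    my @reloaded;
    for my $key (sort keys %INC) {
        my $file = $INC{$key};
        next unless defined $file && !ref $file && -f $file;
        my $mtime    = (stat _)[9];
        my $previous = $last_mtime{$key};
        $last_mtime{$key} = $mtime;
        next unless defined $previous && $mtime > $previous;
        delete $INC{$key};    # make require forget the file
        require $key;         # compile the new version
        push @reloaded, $key;
    }
    return @reloaded;
}

# Demonstration with a throwaway module in a temporary directory
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'SketchMod.pm');

sub write_module {
    my ($version) = @_;
    open my $out, '>', $file or die "Can't write $file: $!";
    print {$out} "package SketchMod; sub version { $version } 1;\n";
    close $out or die "Can't close $file: $!";
}

write_module(1);
unshift @INC, $dir;
require SketchMod;
reload_changed_modules();               # first pass only records mtimes

write_module(2);
utime time + 10, time + 10, $file;      # force a newer timestamp
my @reloaded = reload_changed_modules();
my $version  = SketchMod->version;
```

Note that the loop performs one stat per loaded file on every call, which is exactly the per-request overhead the text warns about.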
Another problem area is pulling in ``library'' files which do not declare a package namespace. As all Apache::Registry and
Apache::PerlRun script files are compiled inside their own unique namespace, pulling in
such a file via require causes it to be compiled within this unique namespace. Since the library
file will only be pulled in once per child process, only the first script to
require it will be able to see the subroutines it declares. Other scripts that try
to call routines in the library will trigger a server error along the lines
of:
[Thu Sep 11 11:03:06 1998] Undefined subroutine
&Apache::ROOT::perl::test_2epl::some_function called at
/opt/www/apache/perl/test.pl line 79.
The mod_perl_traps manual page describes this problem in more detail, along with providing
solutions.
Subroutines with names that are all in capitals have special meaning to
Perl. Familiar examples may include DESTROY and BEGIN.
mod_perl also recognizes a few subroutines and treats them specially.
- BEGIN
-
Perl executes BEGIN blocks during the compile time of code as soon as possible. The same is
true under mod_perl. However, since
mod_perl normally only compiles scripts and modules once, in the parent server or
once per-child, BEGIN blocks in that code will only be run once.
Once a BEGIN block has run, it is immediately undefined by removing it from the symbol
table. In the mod_perl environment, this means
BEGIN blocks will not be run during each incoming request unless that request
happens to be the one that is compiling the code. When a .pm module or
other Perl code file is pulled in via require or
use , its BEGIN blocks will be executed:
- Once at startup time if pulled in by the parent process by a
PerlModule directive or in the perl startup script.
- Once per-child process if not pulled in by the parent process.
- An additional time in each child process if Apache::StatINC is loaded
and the module is modified.
- An additional time in the parent process on each restart if
PerlFreshRestart is On.
- At unpredictable times if you fiddle with %INC yourself. Don't
do this unless you know what you are doing.
Apache::Registry scripts can contain BEGIN blocks as well. In this case, they will be executed:
- Once at startup time if pulled in by the parent process via
Apache::RegistryLoader.
- Once per-child process if not pulled in by the parent process.
- An additional time in each child process if the script file is modified.
- An additional time in the parent process on each restart if
the script was pulled in by the parent process with
Apache::RegistryLoader and PerlFreshRestart is On.
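The rule behind both lists is that a BEGIN block runs when its code is compiled, not when the compiled code is called. The plain-Perl sketch below models this with a string eval standing in for script compilation; the counter variable and the handler text are hypothetical.

```perl
use strict;
use warnings;

$main::begin_runs = 0;

# Source text of a pretend handler containing a BEGIN block
my $source = 'sub { BEGIN { $main::begin_runs++ } return "response" }';

# Compiling the source runs the BEGIN block once
my $handler = eval $source or die $@;

# Calling the compiled code any number of times does not run it again
$handler->() for 1 .. 3;
my $after_calls = $main::begin_runs;

# Recompiling, as happens when a script changes on disk, runs it again
my $recompiled = eval $source or die $@;
my $after_recompile = $main::begin_runs;
```

The three calls correspond to requests served from the cached code, and the second eval corresponds to a recompilation triggered by a modified file.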
- END
-
In Perl, an END subroutine defined in a module or script is executed as late as possible,
that is, when the interpreter is being exited. In the mod_perl environment, the interpreter does not exit until the server is shut down.
However, mod_perl does make a special case for Apache::Registry scripts.
Normally, END blocks are executed by Perl during its perl_run()
function, which is called once each time the Perl program is executed, e.g.
once per (mod_cgi) CGI script. However, mod_perl only calls perl_run() once during server startup. Any END blocks that are encountered during main server startup such as those pulled
in by PerlRequire or PerlModule, are suspended and run at server shutdown time during the child_exit phase.
Any END blocks that are encountered during compilation of
Apache::Registry scripts are called after the script has completed the response, including
subsequent invocations when the script is cached in memory. All other END blocks encountered during other Perl*Handler callbacks, e.g. PerlChildInitHandler, will be suspended while the process is running and called during child_exit
when the process is shutting down.
Module authors may wish to use $r->register_cleanup as an alternative to END blocks if this behavior is not desirable.
Perl recognizes a few magic literals during script compilation. By and
large, they act exactly like their counterparts in the standalone Perl
interpreter.
- __END__
-
This token works just as it does with the standalone Perl interpreter,
causing compilation to terminate. However this causes a problem for
Apache::Registry scripts. Since the scripts are compiled inside of a subroutine, using __END__ will cut off the enclosing brace, causing script compilation to fail. If
your Apache::Registry
scripts use this literal, they will not run.
In partial compensation for this deficiency, mod_perl lets you use the __END__ token anywhere in your server configuration files to cut out experimental
configuration or to make a notepad space that doesn't require you to use
the # comment token on each line. Everything below the __END__ token will be ignored.
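The compilation failure is easy to reproduce in plain Perl by wrapping script text in a subroutine yourself. The sketch below is a model of the problem, not the Apache::Registry source: the helper compile_wrapped is hypothetical, and stripping the trailer before wrapping is shown only as one possible workaround.

```perl
use strict;
use warnings;

# A small "script" that ends with an __END__ section
my $script = "my \$total = 1 + 1;\n"
           . "\$total;\n"
           . "__END__\n"
           . "free-form notes that are not Perl\n";

# Wrap script text in an anonymous subroutine and compile it
sub compile_wrapped {
    my ($body) = @_;
    my $code = eval "sub {\n$body\n}";
    return ($code, $@);
}

# The closing brace comes after __END__, so it is never seen
my ($broken, $error) = compile_wrapped($script);

# Assumed workaround: remove __END__ and everything after it first
(my $trimmed = $script) =~ s/^__END__\n.*//ms;
my ($working) = compile_wrapped($trimmed);
my $value = $working ? $working->() : undef;
```

The simplest fix in practice is to delete the __END__ token from the script and turn any notes that followed it into ordinary comments.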
There are a number of useful globals located in the Apache::Server
namespace that you are free to use in your own modules. Treat them as
read-only. Changing their values will lead to unpredictable results.
- $Apache::Server::CWD
-
This variable is set to the directory from which the server was started.
- $Apache::Server::Starting
-
If the code being run is in the parent server, when the server is first
being started, the value is set to
1 , zero otherwise.
- $Apache::Server::ReStarting
-
If the code being run is in the parent server, when the server is being
restarted, this variable will be true, false otherwise. The value is
incremented each time the server is restarted.
- $Apache::Server::SaveConfig
-
As described in Chapter 8, <Perl> configuration sections are compiled inside the Apache::ReadConfig namespace. This namespace is normally flushed after mod_perl has finished
processing the section. However, if the $Apache::Server::SaveConfig variable is set to a true value, the namespace will not be flushed, making
configuration data available to Perl modules at request time. Example:
<Perl>
$Apache::Server::SaveConfig = 1;
$DocumentRoot = ...
...
</Perl>
At request time, the value of $DocumentRoot can be accessed with the fully qualified name $Apache::ReadConfig::DocumentRoot.
The next chapters show the Apache API from the perspective of the C
language programmer, telling you everything you need to know to squeeze the
last drop of performance out of Apache by writing extension modules in a
fast compiled language.