mod_rewrite

From HalfgeekKB

mod_rewrite is a particularly useful tool for tricking the user agent. The point is to let a user reach one URI by requesting another. The net effect is almost the same as a redirect, except that no redirect happens: Apache rewrites the request internally and passes the correct content straight through, so the user agent never sees the new URI.

Hiding a script

See mod_rewrite/Hide a script.

General

To get started, simply stuff a few lines into your .htaccess. The first must be

RewriteEngine on

This essentially boots up mod_rewrite. (Additionally, you can set this to off instead of commenting out all mod_rewrite lines in the file.)

Afterward, you add a rule that screws with the request URI:

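RewriteRule ^(.*)\.txt$ $1.html [L]

This particular rule reroutes any request for a .txt file to the .html file with the same basename. Simple! The first argument is a regexp pattern matched against the request path (in .htaccess, the path relative to the current directory, with no leading slash). The second argument is the substitution, in which $1 stands for whatever the first parenthesized group matched. Note the \. in the pattern: an unescaped dot would match any character, not just a literal dot. Apache 2.x uses PCRE, so nearly all Perl syntax works, including a trailing ? for non-greedy matching; Apache 1.3 used POSIX extended regexps, which have no non-greedy quantifiers. You can lead the entire pattern with a ! to negate it, in which case no groups are captured and $N is unavailable in the substitution.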
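
As a further sketch of how backreferences line up with groups, the rule below captures two pieces of the request path and hands them to a script as query parameters. The script name article.php, the articles/ prefix, and the parameter names id and slug are hypothetical, chosen only for illustration.

```apache
RewriteEngine on
# Hypothetical layout: articles/42/some-title is served by article.php
# $1 is the first parenthesized group (the digits), $2 the second (the slug)
RewriteRule ^articles/([0-9]+)/([a-z0-9-]+)$ article.php?id=$1&slug=$2 [L]
```

With this in place, a request for articles/42/some-title would be handled as article.php?id=42&slug=some-title, assuming the .htaccess sits in the directory that contains article.php.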

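The trailing [L] flag is optional; it (like a Perl last or a C break) means to skip the remaining rewrite directives for the current pass. In .htaccess context the rewritten URI is then run through the rules again from the top, so [L] alone does not stop a rule that keeps matching its own output. There are many other flags (see the Apache doc). Multiple flags on the same rule are separated by commas. If there are no flags, also omit the brackets.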

[NC] makes the pattern case-insensitive (like Perl /i). [R] sends an HTTP redirect to the user agent instead of rewriting internally; the status is 302 unless you specify another, as in [R=301].
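
To illustrate combining flags, here is a minimal sketch that permanently redirects anything under an old/ directory to new/, ignoring case. Both directory names are assumptions made up for the example.

```apache
RewriteEngine on
# NC: match OLD/, Old/ and old/ alike
# R=301: answer with a permanent HTTP redirect rather than an internal rewrite
# L: skip the remaining rules once this one has matched
RewriteRule ^old/(.*)$ /new/$1 [NC,R=301,L]
```

Because of [R], the user agent sees the new URI and requests it itself; dropping R=301 would instead serve the new/ content under the old/ address.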

Before the RewriteRule, you may have one or more RewriteCond lines:

RewriteCond %{HTTP_USER_AGENT} !^CoralWebPrx/.*http://coralcdn\.org

Leading a RewriteRule with one or more RewriteConds means that the RewriteRule only runs if all of the RewriteCond lines match. The first argument is the test string, and the second is the pattern to match it against; as with RewriteRule, a leading ! negates the pattern. Adding an [OR] flag ties this and the following RewriteCond by OR instead of AND.

[NC] also works here.
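
A minimal sketch of [OR] and [NC] on conditions: the rule below refuses a request if either condition matches. The user agent name ExampleBot and the referring host spam.example.com are hypothetical placeholders.

```apache
RewriteEngine on
# Either condition is enough because of [OR]; without it both would have to match
RewriteCond %{HTTP_USER_AGENT} ^ExampleBot [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://spam\.example\.com/ [NC]
# "-" means no substitution; [F] answers with 403 Forbidden
RewriteRule ^ - [F]
```

The [OR] flag goes on the first of the two conditions it joins; the last condition in a chain never needs it.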

%{HTTP_USER_AGENT} is one of many variables available for interpolation. The doc lists the available variables.
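
Variables can be interpolated into the substitution as well as the test string, and groups captured by the last matching RewriteCond are available as %1, %2, and so on. A minimal sketch, assuming a hypothetical site whose canonical name is www.example.com and whose per-subdomain content lives under a made-up sites/ directory:

```apache
RewriteEngine on
# Leave requests for the www host alone
RewriteCond %{HTTP_HOST} !^www\. [NC]
# Capture the subdomain of something.example.com
RewriteCond %{HTTP_HOST} ^([a-z0-9-]+)\.example\.com$ [NC]
# %1 is the subdomain captured above; $1 is the path captured by this rule
RewriteRule ^(.*)$ http://www.example.com/sites/%1/$1 [R,L]
```

Under these assumptions, a request for page.html on blog.example.com would be redirected to http://www.example.com/sites/blog/page.html.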

There are plenty more details and interesting uses; see the docs.

Examples

Only show the real file to Coral Cache

This example reroutes all requests for .txt files to the script redherring with the basename as the query string (for example, mono.txt leads to redherring?mono), unless it detects the Coral Cache user agent, in which case it gives the real .txt file. This might be useful when you want a URI on your own server to track each access before handing the request off to the Coral CDN (CCDN).

RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} !^CoralWebPrx/.*http://coralcdn\.org
RewriteRule ^(.*)\.txt$ redherring?$1 [L]

Coral itself suggests checking for coral-no-serve in the query string in case Coral somehow decides to reject the request. This also involves adding an X-Coral-Control header (the Header directive requires mod_headers). The product then becomes

Header append X-Coral-Control "redirect-home"

RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} !^CoralWebPrx/.*http://coralcdn\.org
RewriteCond %{QUERY_STRING} !^coral-no-serve
RewriteRule ^(.*)\.txt$ redherring?$1 [L]

Available variables

  • HTTP headers
    • HTTP_USER_AGENT
    • HTTP_REFERER
    • HTTP_COOKIE
    • HTTP_FORWARDED
    • HTTP_HOST
    • HTTP_PROXY_CONNECTION
    • HTTP_ACCEPT
  • Connection and request
    • REMOTE_ADDR
    • REMOTE_HOST
    • REMOTE_USER
    • REMOTE_IDENT
    • REQUEST_METHOD
    • SCRIPT_FILENAME
    • PATH_INFO
    • QUERY_STRING
    • AUTH_TYPE
  • Server internals
    • DOCUMENT_ROOT
    • SERVER_ADMIN
    • SERVER_NAME
    • SERVER_ADDR
    • SERVER_PORT
    • SERVER_PROTOCOL
    • SERVER_SOFTWARE
  • System variables
    • TIME_YEAR
    • TIME_MON
    • TIME_DAY
    • TIME_HOUR
    • TIME_MIN
    • TIME_SEC
    • TIME_WDAY
    • TIME
  • Specials
    • API_VERSION: This is the version of the Apache module API (the internal interface between server and module) in the current httpd build, as defined in include/ap_mmn.h. The module API version corresponds to the version of Apache in use (in the release version of Apache 1.3.14, for instance, it is 19990320:10), but is mainly of interest to module authors.
    • THE_REQUEST: The full HTTP request line sent by the browser to the server (e.g., "GET /index.html HTTP/1.1"). This does not include any additional headers sent by the browser.
    • REQUEST_URI: The resource requested in the HTTP request line. (In the example above, this would be "/index.html".)
    • REQUEST_FILENAME: The full local filesystem path to the file or script matching the request.
    • IS_SUBREQ: Will contain the text "true" if the request currently being processed is a sub-request, "false" otherwise. Sub-requests may be generated by modules that need to resolve additional files or URIs in order to complete their tasks.
  • Environment and special access
    • ENV:variable
    • HTTP:header where header can be any HTTP MIME-header name. This is looked-up from the HTTP request. Example: %{HTTP:Proxy-Connection} is the value of the HTTP header "Proxy-Connection:".
    • LA-U:variable for look-aheads which perform an internal (URL-based) sub-request to determine the final value of variable. Use this when you want to use a variable for rewriting which is actually set later in an API phase and thus is not available at the current stage. For instance, when you want to rewrite according to the REMOTE_USER variable from within the per-server context (httpd.conf file), you have to use %{LA-U:REMOTE_USER} because this variable is set by the authorization phases, which come after the URL translation phase where mod_rewrite operates. On the other hand, because mod_rewrite implements its per-directory context (.htaccess file) via the Fixup phase of the API and because the authorization phases come before this phase, you can just use %{REMOTE_USER} there.
    • LA-F:variable which performs an internal (filename-based) sub-request to determine the final value of variable. Most of the time this is the same as LA-U above.
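
A few of the variables above can be sketched in use as follows. All file and script names here (foo.html, foo.day.html, foo.night.html, index.de.html, index.php) and the path query parameter are hypothetical, and the sketch assumes it lives in an .htaccess file.

```apache
RewriteEngine on

# Time-dependent content: the concatenated HHMM string is compared lexically
RewriteCond %{TIME_HOUR}%{TIME_MIN} >0700
RewriteCond %{TIME_HOUR}%{TIME_MIN} <1900
RewriteRule ^foo\.html$ foo.day.html [L]
RewriteRule ^foo\.html$ foo.night.html [L]

# %{HTTP:header} reads an arbitrary request header
RewriteCond %{HTTP:Accept-Language} ^de [NC]
RewriteRule ^index\.html$ index.de.html [L]

# Leave real files and directories alone, send everything else to a script
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?path=$1 [QSA,L]
```

The -f and -d tests check whether the test string names an existing file or directory, and [QSA] appends the original query string to the one given in the substitution. The last rule does not loop, because index.php exists and so fails the !-f condition on the next pass.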

See Also

Apache docs: mod_rewrite