Apache mod_rewrite Introduction
This document supplements the mod_rewrite reference documentation. It describes the basic concepts necessary for use of mod_rewrite. Other documents go into greater detail, but this document should help the beginner get their feet wet.
Introduction
The Apache module mod_rewrite is a very powerful and sophisticated module which provides a way to do URL manipulations. With it, you can do nearly all types of URL rewriting that you may need. It is, however, somewhat complex, and may be intimidating to the beginner. There is also a tendency to treat rewrite rules as magic incantations, using them without actually understanding what they do.
This document attempts to give sufficient background so that what follows is understood, rather than just copied blindly.
Remember that many common URL-manipulation tasks don't require the full power and complexity of mod_rewrite. For simple tasks, see mod_alias and the documentation on mapping URLs to the filesystem.
Finally, before proceeding, be sure to configure the RewriteLog. Although this log file can give an overwhelming amount of information, it is indispensable in debugging problems with mod_rewrite configuration, since it will tell you exactly how each rule is processed.
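As a minimal sketch of what configuring the log might look like, the fragment below enables it at a moderate verbosity. The log path and the level are illustrative assumptions, not recommendations. Note that RewriteLog and RewriteLogLevel exist in Apache 2.2 and earlier; in Apache 2.4 and later they were replaced by per-module settings on the LogLevel directive, shown here commented out.

```apache
# Apache 2.2 and earlier. The path and level are assumptions for illustration.
RewriteLog "/var/log/apache2/rewrite.log"
RewriteLogLevel 3

# Apache 2.4 and later: rewrite tracing is written to the error log instead.
# LogLevel alert rewrite:trace3
```

Higher levels produce considerably more output, so it is usually sensible to turn logging back down once a rule set behaves as expected.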
Regular Expressions
mod_rewrite uses the Perl Compatible Regular Expression vocabulary. In this document, we do not attempt to provide a detailed reference to regular expressions. For that, we recommend the PCRE man pages, the Perl regular expression man page, and Mastering Regular Expressions, by Jeffrey Friedl.
In this document, we attempt to provide enough of a regex vocabulary to get you started, without being overwhelming, in the hope that RewriteRules will be scientific formulae, rather than magical incantations.
Regex vocabulary
The following are the minimal building blocks you will need in order to write regular expressions and RewriteRules. They certainly do not represent a complete regular expression vocabulary, but they are a good place to start, and should help you read basic regular expressions, as well as write your own.
Character | Meaning | Example |
---|---|---|
. | Matches any single character | c.t will match cat, cot, cut, etc. |
+ | Repeats the previous match one or more times | a+ matches a, aa, aaa, etc. |
* | Repeats the previous match zero or more times | a* matches all the same things a+ matches, but will also match an empty string. |
? | Makes the match optional | colou?r will match color and colour. |
^ | Called an anchor, matches the beginning of the string | ^a matches a string that begins with a. |
$ | The other anchor, matches the end of the string | a$ matches a string that ends with a. |
( ) | Groups several characters into a single unit, and captures a match for use in a back-reference | (ab)+ matches ababab; that is, the + applies to the group. For more on back-references see below. |
[ ] | A character class, matches one of the characters | c[uoa]t matches cut, cot or cat. |
[^ ] | Negative character class, matches any character not specified | c[^/]t matches cat or c=t but not c/t. |
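To see how these building blocks combine, here is a small sketch of a single rule that uses several of them at once. The URL layout (/colors/..., /palette/...) is purely hypothetical.

```apache
# Hypothetical rule combining entries from the table above:
#   ^ and $    anchor the match to the whole URL-path
#   colou?r    matches both "color" and "colour"
#   ([a-z]+)   captures one or more lowercase letters for use as $1
RewriteRule ^/colou?rs/([a-z]+)$ /palette/$1.html
```

Under these assumptions, requests for /colors/red and /colours/red are both rewritten to /palette/red.html, while /colors/Red is left alone because the character class contains only lowercase letters.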
In mod_rewrite the ! character can be used before a regular expression to negate it. That is, a string will be considered to have matched only if it does not match the rest of the expression.
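A negated pattern might be used as in the following sketch, where the /static/ prefix and the dispatch script are assumed names chosen only for illustration.

```apache
# Hypothetical: any URL-path that does NOT begin with /static/ is handed
# to a single script. Because the pattern is negated, it captures nothing,
# so $1, $2, ... cannot be used in the substitution.
RewriteRule !^/static/ /app/dispatch.php
```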
Regex Back-Reference Availability
One important thing to remember: whenever you use parentheses in Pattern or in one of the CondPatterns, back-references are created internally, which can be used with the strings $N and %N (see below). These are available for creating the strings Substitution and TestString. Figure 2 shows to which locations the back-references are transferred for expansion.
Figure 2: The back-reference flow through a rule.
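The flow described above can be sketched in a short rule set. The hostnames and directory names here are assumptions for illustration only.

```apache
# $1 is captured by the RewriteRule Pattern and is available both in a
# RewriteCond TestString and in the Substitution.
# %1 is captured by the last matched RewriteCond pattern and is available
# in the Substitution.
RewriteCond $1 !^private/
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com$
RewriteRule ^/(.*)$ /sites/%1/$1
```

In this sketch, a request for http://shop.example.com/cart/view would set $1 to cart/view and %1 to shop, giving /sites/shop/cart/view. The condition that captures is placed last so that %1 clearly refers to its group.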
RewriteRule basics
A RewriteRule consists of three arguments separated by spaces. The arguments are:
- Pattern: which incoming URLs should be affected by the rule;
- Substitution: where should the matching requests be sent;
- [flags]: options affecting the rewritten request.
The Pattern is always a regular expression matched against the URL-Path of the incoming request (the part after the hostname but before any question mark indicating the beginning of a query string).
The Substitution can itself be one of three things:
- A full filesystem path to a resource
RewriteRule ^/games.* /usr/local/games/web
This maps a request to an arbitrary location on your filesystem, much like the Alias directive.
- A web-path to a resource
RewriteRule ^/foo$ /bar
If DocumentRoot is set to /usr/local/apache2/htdocs, then this directive would map requests for http://example.com/foo to the path /usr/local/apache2/htdocs/bar.
- An absolute URL
RewriteRule ^/product/view$ http://site2.example.com/seeproduct.html [R]
This tells the client to make a new request for the specified URL.
The Substitution can also contain back-references to parts of the incoming URL-path matched by the Pattern. Consider the following:
RewriteRule ^/product/(.*)/view$ /var/web/productdb/$1
The variable $1 will be replaced with whatever text was matched by the expression inside the parentheses in the Pattern. For example, a request for http://example.com/product/r14df/view will be mapped to the path /var/web/productdb/r14df.
If there is more than one expression in parentheses, they are available in order in the variables $1, $2, $3, and so on.
Rewrite Flags
The behavior of a RewriteRule can be modified by the application of one or more flags to the end of the rule. For example, the matching behavior of a rule can be made case-insensitive by the application of the [NC] flag:
RewriteRule ^puppy.html smalldog.html [NC]
For more details on the available flags, their meanings, and examples, see the Rewrite Flags document.
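Several flags can be combined in one bracketed, comma-separated list with no spaces. The sketch below uses assumed page names to show one common combination.

```apache
# Hypothetical rule using three flags together:
#   NC     match case-insensitively
#   R=301  answer with a permanent redirect instead of an internal rewrite
#   L      stop processing further rules once this one has been applied
RewriteRule ^/old-page\.html$ /new-page.html [NC,R=301,L]
```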
Rewrite conditions
One or more RewriteCond directives can be used to restrict the types of requests that will be subject to the following RewriteRule. The first argument is a variable describing a characteristic of the request, the second argument is a regular expression that must match the variable, and a third optional argument is a list of flags that modify how the match is evaluated.
For example, to send all requests from a particular IP range to a different server, you could use:
RewriteCond %{REMOTE_ADDR} ^10\.2\.
RewriteRule (.*) http://intranet.example.com$1
When more than one RewriteCond is specified, they must all match for the RewriteRule to be applied. For example, to deny requests that contain the word "hack" in their query string, except if they also contain a cookie containing the word "go", you could use:
RewriteCond %{QUERY_STRING} hack
RewriteCond %{HTTP_COOKIE} !go
RewriteRule .* - [F]
Notice that the exclamation mark specifies a negative match, so the rule is only applied if the cookie does not contain "go".
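The optional third argument of RewriteCond can be sketched as follows. The user-agent strings are invented for illustration.

```apache
# [NC] makes a condition case-insensitive. [OR] joins a condition to the
# next one with a logical OR instead of the default AND.
RewriteCond %{HTTP_USER_AGENT} badbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} evilscraper [NC]
RewriteRule .* - [F]
```

Here the request is forbidden if either condition matches, whereas without [OR] both would have to match.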
Matches in the regular expressions contained in the RewriteConds can be used as part of the Substitution in the RewriteRule using the variables %1, %2, etc. For example, this will direct the request to a different directory depending on the hostname used to access the site:
RewriteCond %{HTTP_HOST} (.*)
RewriteRule ^/(.*) /sites/%1/$1
If the request was for http://example.com/foo/bar, then %1 would contain example.com and $1 would contain foo/bar.
.htaccess files
Rewriting is typically configured in the main server configuration (outside any <Directory> section) or inside <VirtualHost> containers. This is the easiest way to do rewriting and is recommended. It is possible, however, to do rewriting inside <Directory> sections or .htaccess files at the expense of some additional complexity. This technique is called per-directory rewrites.
The main difference from per-server rewrites is that the path prefix of the directory containing the .htaccess file is stripped before matching in the RewriteRule. In addition, the RewriteBase directive should be used to ensure the request is properly mapped.
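A per-directory rule set might look like the sketch below. It assumes a hypothetical .htaccess file in a directory that is served under the URL prefix /shop/, and that the server configuration allows rewriting directives in .htaccess files.

```apache
# Hypothetical file: <DocumentRoot>/shop/.htaccess
RewriteEngine On

# The URL prefix that corresponds to this directory.
RewriteBase /shop/

# The /shop/ prefix has already been stripped, so the pattern has no
# leading slash: a request for /shop/old.html is matched as "old.html".
RewriteRule ^old\.html$ new.html
```

Compare this with the per-server examples earlier in this document, where every pattern begins with ^/ because the full URL-path is matched.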