Conditional Get is a way for web browsers to reuse a cached copy of a web page instead of downloading it again, provided the page hasn't been modified since the last visit. The purpose of this feature is to improve the general performance of web browsing, since you don't need to download the same content over and over again. It also reduces the bandwidth consumed by web pages.
Conditional Get is mostly used for static files, such as images and stylesheets. Dynamic pages are much harder to cache, because their content can change on every request. For normal files, you don't need to do anything special, since most server software automatically handles conditional get for static files. However, because PHP pages are generated dynamically on each page request, the server can't automatically determine when the content last changed.
Of course, most of the time it's not even possible to support conditional get, as the contents are dynamic, but sometimes you need to create almost static files with PHP. These could include, for example, images or stylesheets that are generated or modified by PHP, even though they don't change very often. In these cases, you will need to write support for conditional get yourself in PHP. This tutorial will show you how to do that.
Supporting conditional get with PHP is not particularly hard. The idea is that you send browsers two different headers: "Last-Modified" and "ETag". The "Last-Modified" header should contain the last time the file was modified, and "ETag" should contain a unique identifier for that version of the file. To put it simply, the contents of "ETag" should change whenever the page changes. The "ETag" value is opaque to the browser and only has to make sense to the origin server, so it can contain basically anything you want, as long as it changes with the content. A simple way to achieve this is, for example, to use the md5 hash of the last modification date. The value should be inside quotes, even though most browsers will simply treat the value of the header as a string.
Sending the headers is naturally done using the header() function.
The Last-Modified header should contain the date in GMT, in the
format specified in RFC 2616.
In other words, the proper format to use in gmdate() is
"D, d M Y H:i:s \G\M\T". The shorter "r" format is not a drop-in
replacement: it ends the date with "+0000" instead of "GMT".
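To see what the format strings actually produce, here is a small sketch. The helper name http_date() is my own invention for illustration, not a PHP built-in, and the int/string type declarations assume PHP 7 or newer. Timestamp 0 is used only because its date is easy to check by eye.

```php
<?php
// Hypothetical helper: format a Unix timestamp as an HTTP date in GMT.
function http_date(int $time): string
{
    // The backslashes make date() output the letters G, M, T literally.
    return gmdate('D, d M Y H:i:s \G\M\T', $time);
}

echo http_date(0), "\n";   // Thu, 01 Jan 1970 00:00:00 GMT
echo gmdate('r', 0), "\n"; // Thu, 01 Jan 1970 00:00:00 +0000
```

Note how the "r" output ends in a numeric offset rather than "GMT", so it is worth comparing the two before relying on it in a header.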
The biggest problem for you is determining the last modification date. This depends on what you are actually doing. With RSS feeds, for example, you usually want to use the last time an item was added to the feed. With other things, you just need to design your system so that there is a way to determine the last modification date of the content for which you wish to support Conditional Get.
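For the RSS case, the idea could be sketched roughly like this. The item layout (an array with an 'added' Unix timestamp) and the function name feed_last_modified() are assumptions made up for this example; your own feed data will probably look different.

```php
<?php
// Hypothetical: each feed item is an array with an 'added' Unix timestamp.
function feed_last_modified(array $items): int
{
    $latest = 0; // an empty feed falls back to the Unix epoch
    foreach ($items as $item) {
        $latest = max($latest, (int) $item['added']);
    }
    return $latest;
}

// Made-up sample data, not from a real feed.
$items = [
    ['title' => 'First post',  'added' => 1000],
    ['title' => 'Second post', 'added' => 3000],
    ['title' => 'Third post',  'added' => 2000],
];

$time = feed_last_modified($items); // 3000, the newest item wins
```

The resulting $time is a plain Unix timestamp, which is the form the header code in this tutorial expects.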
Once you have determined the last modification date, you can send
the proper headers like this (assuming you have the last modification date as
a Unix timestamp in $time):
$lastmod = gmdate('D, d M Y H:i:s \G\M\T', $time);
$etag = '"' . md5($lastmod) . '"';
header("Last-Modified: $lastmod");
header("ETag: $etag");
Now the browser gets the proper headers, and it will send their values back on the next request it makes to the same page.
To actually support conditional get, we need to check whether the browser sent
the appropriate headers and see whether they match the ones we are about to send.
The request headers that carry the last modification date and the etag are
"If-Modified-Since" and "If-None-Match". These are available to the PHP script
in $_SERVER['HTTP_IF_MODIFIED_SINCE'] and
$_SERVER['HTTP_IF_NONE_MATCH']. What you need to do is check
whether these two variables are set and, if they are, whether they match the
$lastmod and $etag which were previously set. If they
do, you should send a 304 header instead of the page contents.
The check can be done using code like:
$ifmod = isset($_SERVER['HTTP_IF_MODIFIED_SINCE'])
    ? $_SERVER['HTTP_IF_MODIFIED_SINCE'] == $lastmod
    : null;
$iftag = isset($_SERVER['HTTP_IF_NONE_MATCH'])
    ? $_SERVER['HTTP_IF_NONE_MATCH'] == $etag
    : null;
if (($ifmod || $iftag) && ($ifmod !== false && $iftag !== false))
{
    header('HTTP/1.0 304 Not Modified');
    die;
}
The previous code checks whether either of the headers is set. If every header that is set matches, it sends the 304 header and ends the page execution.
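The three possible states of each check (null for "not sent", true for "matches", false for "mismatch") are easy to get wrong, so here is the same decision written as a side-effect-free function that you can try out on the command line. The function name is_not_modified() and its parameters are hypothetical, and the nullable type declarations assume PHP 7.1 or newer.

```php
<?php
// Hypothetical helper mirroring the check above, without touching $_SERVER
// or sending headers. Pass null for a header the browser did not send.
function is_not_modified(?string $ifModifiedSince, ?string $ifNoneMatch,
                         string $lastmod, string $etag): bool
{
    $ifmod = $ifModifiedSince !== null ? $ifModifiedSince == $lastmod : null;
    $iftag = $ifNoneMatch !== null ? $ifNoneMatch == $etag : null;

    // At least one header must match, and none may be a mismatch.
    return ($ifmod || $iftag) && ($ifmod !== false && $iftag !== false);
}

$lastmod = 'Thu, 01 Jan 1970 00:00:00 GMT';
$etag = '"' . md5($lastmod) . '"';

var_dump(is_not_modified(null, null, $lastmod, $etag));       // bool(false)
var_dump(is_not_modified($lastmod, null, $lastmod, $etag));   // bool(true)
var_dump(is_not_modified($lastmod, '"old"', $lastmod, $etag)); // bool(false)
```

A first visit (no headers), a matching date alone, and a matching date with a stale etag give false, true and false respectively.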
To gain maximum benefit from supporting conditional get, you should do the check as early in the page processing as possible and kill the script if the 304 header is sent. This way, it won't do the unnecessary processing that is required to output the actual page. With RSS feeds, this means that you only build the actual feed once you have checked for the conditional get.
And that is all you need to do to support conditional get.
Here is an example function that I use myself to easily provide the support where I need:
// Function for checking conditional get
function conditionalget ($lastmod)
{
    $lastmod = gmdate('D, d M Y H:i:s', intval($lastmod)) . ' GMT';
    $etag = '"' . md5($lastmod) . '"';

    // ETag is sent even with 304 header
    header("ETag: $etag");

    $ifmod = isset($_SERVER['HTTP_IF_MODIFIED_SINCE'])
        ? $_SERVER['HTTP_IF_MODIFIED_SINCE'] == $lastmod
        : null;
    $iftag = isset($_SERVER['HTTP_IF_NONE_MATCH'])
        ? $_SERVER['HTTP_IF_NONE_MATCH'] == $etag
        : null;

    // If either matches and neither is a mismatch, send not modified header
    if (($ifmod || $iftag) && ($ifmod !== false && $iftag !== false))
    {
        header('HTTP/1.0 304 Not Modified');
        die;
    }

    // Last-Modified doesn't need to be sent with 304 response
    header("Last-Modified: $lastmod");
}
Using the function is very simple. Just call it with the last modification time as a Unix timestamp, and it will handle sending all the headers, including the 304 header if needed. Note that it will end the page execution if the 304 header is sent. Also, because the function deals with headers, it has to be called before any content is sent to the browser.
One small thing to note is that implementing conditional get this way is not exactly according to the HTTP/1.1 protocol standard. However, it is what I would call "Good Enough". This should work with most browsers, and I don't personally know of any browser that doesn't work with the above example.
The problem with the standard is that the handling of the "If-Modified-Since" and "If-None-Match" headers is quite a bit more complex. The modification date header should not be treated as a string, and a few other date formats should actually be accepted as well. However, the HTTP/1.1 standard does suggest that browsers should send the modification date header exactly as received, because servers tend to do exact string comparisons. That, however, is not enforced.
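If you want to get a bit closer to the standard here, one option is to parse the header into a timestamp and compare dates instead of strings. The following is only a sketch: the function name not_modified_since() is made up, the type declarations assume PHP 7 or newer, and it relies on strtotime() being able to parse whatever date format the browser sends, which I have not verified for every format the standard allows.

```php
<?php
// Hypothetical helper: compare If-Modified-Since as a date, not as a string.
// $header is the raw header value, $lastmod is a Unix timestamp.
function not_modified_since(string $header, int $lastmod): bool
{
    $since = strtotime($header);
    if ($since === false) {
        // Unparsable date: play it safe and treat the content as modified.
        return false;
    }
    // Not modified if the content is no newer than the browser's copy.
    return $lastmod <= $since;
}

$header = gmdate('D, d M Y H:i:s \G\M\T', 1000);

var_dump(not_modified_since($header, 1000)); // bool(true)
var_dump(not_modified_since($header, 1001)); // bool(false)
```

The result could be used in place of the string comparison that produces $ifmod in the earlier code.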
The "If-None-Match" header may actually contain multiple etags from different cached versions of the page, or a literal "*". The proper way to handle the header is to check whether any of the tags sent matches the tag that would be sent by the page. However, it appears that browsers treat the ETag field as a literal string and send it back exactly as it is, even if it is malformed. Thus, a plain string comparison as in the example above is a "good enough" way to handle the header.
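For the curious, a more careful check could look something like the sketch below. The function name etag_matches() is hypothetical and the type declarations assume PHP 7 or newer. It also makes two simplifying assumptions: tags never contain a comma (true for the md5 based tags used in this tutorial), and a weak "W/" prefix is simply ignored when comparing.

```php
<?php
// Hypothetical helper: does the If-None-Match header contain our etag?
function etag_matches(string $header, string $etag): bool
{
    $header = trim($header);
    if ($header === '*') {
        return true; // "*" matches any current version of the page
    }

    // Drop surrounding whitespace and an optional weak-validator prefix.
    $normalize = function (string $tag): string {
        $tag = trim($tag);
        return strncmp($tag, 'W/', 2) === 0 ? substr($tag, 2) : $tag;
    };

    $wanted = $normalize($etag);
    // Assumes no commas inside the tags themselves.
    foreach (explode(',', $header) as $candidate) {
        if ($normalize($candidate) === $wanted) {
            return true;
        }
    }
    return false;
}

var_dump(etag_matches('"abc", "def"', '"def"')); // bool(true)
var_dump(etag_matches('"abc"', '"def"'));        // bool(false)
```

The result could replace the string comparison that produces $iftag in the earlier code.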
You might be wondering why both of these headers are used, when logically one of them should be enough. Ironically, this is because the HTTP/1.1 standard suggests that both of them be sent whenever possible. In theory it should also offer better compatibility, because it will work on browsers that support only one of these headers.