
1 Jul 03, 2009 22:14    

I hope I'm not too late to make the case for including this modification in the next stable release.

Now that we have a wonderful disk-caching system, I thought I would implement cache-control directives in order to further reduce bandwidth and CPU consumption.

The idea is that if a page is still valid in the disk cache (its age is lower than the maximum age, $max_age_seconds) and the browser already holds that version, b2evo can very well respond with a 304 Not Modified status and so instruct the browser or proxy cache to present the user with the locally cached copy, instead of sending the same many KB of data again. If the age of the cached page is higher than $max_age_seconds, b2evo will regenerate the cache and respond with a normal 200 OK status and the newly minted data, with new Last-Modified info.
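The decision described above can be sketched as a small pure function. This is only an illustration, not b2evo's actual code: the function name choose_cache_response() and its return strings are hypothetical, and only $max_age_seconds comes from the patch.

```php
<?php
/**
 * Hypothetical helper: decide how to answer a request for a disk-cached page.
 *
 * @param int      $now               current server time (Unix timestamp)
 * @param int      $cached_ts         time the disk-cached copy was generated
 * @param int      $max_age_seconds   maximum lifetime of a cached copy
 * @param int|null $if_modified_since timestamp sent by the client, or null
 * @return string 'regenerate', 'not_modified' or 'send_cached'
 */
function choose_cache_response( $now, $cached_ts, $max_age_seconds, $if_modified_since = null )
{
	$cache_age = $now - $cached_ts;
	if( $cache_age > $max_age_seconds )
	{	// Cached copy expired: rebuild it and answer 200 OK with a new Last-Modified
		return 'regenerate';
	}
	if( $if_modified_since !== null && $cached_ts <= $if_modified_since )
	{	// Client copy is at least as recent as ours: answer 304 Not Modified, no body
		return 'not_modified';
	}
	// Cached copy is valid but the client has no copy (or an older one): 200 OK from cache
	return 'send_cached';
}
```

For example, with a 300 second lifetime, a copy generated 100 seconds ago and a client that reports the same generation time, the function returns 'not_modified'.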

The relevant RFC is [url=http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html]RFC2616[/url].

So the changes, surprisingly, are very few. First in inc/_blog_main.inc.php:

--- blogs/inc/_blog_main.inc.php        2009-07-03 21:35:02.000000000 +0200
+++ blog/inc/_blog_main.inc.php 2009-06-03 00:19:18.000000000 +0200
@@ -22,7 +22,7 @@
  * @author blueyed: Daniel HAHLER
  * @author fplanque: Francois PLANQUE
  *
- * @version $Id: _blog_main.inc.php,v 1.139 2009/06/29 02:14:04 fplanque Exp $
+ * @version $Id: _blog_main.inc.php,v 1.137 2009/05/31 14:03:31 tblue246 Exp $
  */
 
 if( !defined('EVO_CONFIG_LOADED') ) die( 'Please, do not access this page directly.' );
@@ -461,8 +461,8 @@
                debug_die( sprintf( T_( 'The skin [%s] is not installed on this system.' ), htmlspecialchars( $skin ) ) );
        }
        else if( ! empty( $tempskin ) )
-       { // By definition, we want to see the temporary skin (if we don't use feedburner... )
-               $redir = 'no';  
+       {
+               $redir = 'no';  // By definition, we want to see the temporary skin
        }
 }
 
@@ -512,6 +512,8 @@
        if( ! $PageCache->check() )
        {       // Cache miss, we have to generate:
 
+    header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $servertimenow) . ' GMT' ); // HTTP dates must be in GMT
+    header('Cache-Control: public, must-revalidate, max-age=0'/*. $PageCache->max_age_seconds*/);
                if( $skin_provided_by_plugin = skin_provided_by_plugin($skin) )
                {
                        $Plugins->call_method( $skin_provided_by_plugin, 'DisplaySkin', $tmp_params = array('skin'=>$skin) );

As you can see, I send Last-Modified and Cache-Control HTTP headers whenever the cached page is generated, either because the old cached page has expired (its age is greater than $max_age_seconds) or because it didn't exist (this is the first time the page is viewed).
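One detail worth spelling out: the value of Last-Modified has to be an HTTP-date, which RFC 2616 (section 3.3.1) defines in GMT with English day and month names, for example "Sun, 06 Nov 1994 08:49:37 GMT". Here is a minimal sketch of building both headers that way. The helper name http_date() and the example timestamp are my own assumptions, not part of b2evo.

```php
<?php
/**
 * Hypothetical helper: format a Unix timestamp as an HTTP-date
 * (RFC 2616 section 3.3.1), always in GMT and with English names.
 */
function http_date( $timestamp )
{
	// gmdate() is not affected by the server's timezone or locale settings
	return gmdate( 'D, d M Y H:i:s', $timestamp ).' GMT';
}

// Header values as they might be sent when the cached page is (re)generated:
$generated_ts = 784111777; // example timestamp, stands in for the generation time
$headers = array(
	'Last-Modified: '.http_date( $generated_ts ),
	'Cache-Control: public, must-revalidate, max-age=0',
);
```

Each string in $headers could then be passed to header() before any output is sent.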

So far so good.

Now we go to inc/_core/model/_pagecache.class.php:

--- blogs/inc/_core/model/_pagecache.class.php  2009-05-26 00:27:29.000000000 +0200
+++ blog/inc/_core/model/_pagecache.class.php   2009-06-25 22:42:49.000000000 +0200
@@ -38,7 +38,7 @@
   /**
         * How old can a cached object get before we consider it outdated
         */
-       var $max_age_seconds = 300;  // 5 minutes for now
+       var $max_age_seconds = 7200;  // 2 hours for now
 
   /**
         * After how many bytes should we output sth live while collecting cache content:
@@ -305,7 +305,7 @@
                        }
 
                        // timestamp of cache generation:
-                       $retrieved_ts = trim($lines[1]);
+                       $retrieved_ts = (int) trim($lines[1]);
                        unset($lines[1]);
                        $cache_age = $servertimenow-$retrieved_ts;
                        $Debuglog->add( 'Cache age: '.floor($cache_age/60).' min '.($cache_age % 60).' sec', 'cache' );
@@ -314,6 +314,17 @@
                                return false;
                        }
 
+      // Set some headers to help browser caches decide whether they need to
+      // fetch the page again or not.
+                       $if_mod_since = empty($_SERVER['HTTP_IF_MODIFIED_SINCE']) ? NULL : strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) ;
+                       if( $if_mod_since && $retrieved_ts <= $if_mod_since ) {
+        header('HTTP/1.1 304 Not Modified');
+        exit;
+      }
+      header('Cache-Control: public, must-revalidate, max-age=0'/*.sprintf('%d',$this->max_age_seconds - $cache_age )*/  );
+      //header(sprintf("Expires: %s", strftime('%a, %d %b %Y %H:%M:%S %z', $servertimenow+$this->max_age_seconds-$cache_age) ) );
+      header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $retrieved_ts) . ' GMT' ); // HTTP dates must be in GMT
+      header('X-b2evolution: From-Cache');
                        // Go through headers
                        $i = 2;
                        while( $headerline = trim($lines[++$i]) )

Here is the meat, so to speak.

If the browser already has the page in its local cache and knows the date of last modification (which we sent previously), the next time it requests the page it will send a conditional GET with the If-Modified-Since header. In that case, if the cached page is still valid (i.e., b2evo hasn't regenerated it since the browser fetched its copy), instead of spitting it out we send a 304 Not Modified and exit immediately, so we send only the headers and no data.

If this particular user is viewing this particular page for the first time (but another user already saw it and consequently caused b2evo to generate a cached version), then we send the data as usual with the Last-Modified header, so this user's browser can cache the page locally.
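To make the two cases concrete, here is a minimal sketch of the check on the request side. It is not the code from the patch: the function name client_copy_is_current() is hypothetical, and passing the request variables in as an array (normally $_SERVER) is just a choice that makes it easy to try out.

```php
<?php
/**
 * Hypothetical helper: does the client already hold a copy that is at least
 * as recent as the cached page?
 *
 * @param array $server    request variables, normally $_SERVER
 * @param int   $cached_ts Unix timestamp at which the cached page was generated
 * @return bool true if a 304 Not Modified response is appropriate
 */
function client_copy_is_current( array $server, $cached_ts )
{
	if( empty( $server['HTTP_IF_MODIFIED_SINCE'] ) )
	{	// Plain GET: first visit, or the browser has nothing cached
		return false;
	}
	// Some older browsers append "; length=NNN" to the date: drop it before parsing
	$value = preg_replace( '/;.*$/', '', $server['HTTP_IF_MODIFIED_SINCE'] );
	$since = strtotime( $value );
	if( $since === false )
	{	// Unparseable date: ignore the header and send the full page
		return false;
	}
	return $cached_ts <= $since;
}
```

When it returns true, the caller would send the 304 status and exit. Otherwise it would send the cached page together with its Last-Modified header.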

That's it.

I have been testing this on my blog (shameless plug here: http://liberal-venezolano.net/) since early June, and I can say bandwidth consumption has been reduced significantly, on the order of 40 to 50%. I would guess my CPU usage has dropped a lot too, but I have not measured it.

So, I hope others here take a look at this implementation and give their opinions.

Enjoy seeing your 304s soar in your Apache logs :)

