33 views

CHAPTER 14 - Database Query/Result Caching

Caching the results of database
queries can speed up your site and reduce the load on the database server. The
biggest challenge is to determine the best caching strategy. Should you cache
the results from every single query? Do you know in advance which queries
are going to be expensive?
The following example demonstrates an approach to this using the
Cache_DB class, which is part of the Cache PEAR package. It wraps a DB con-
nection object inside a proxy object that intercepts query() calls and uses a
Strategy pattern to determine a caching strategy for each query:
<?php
require_once ”DB.php”;
require_once ”Cache/DB.php”;
abstract class QueryStrategy
{
protected $cache;
abstract function query($query, $params);
}
class Cache1HourQueryStrategy extends QueryStrategy
{
function __construct($dsn, $cache_options) {
$this->cache = new Cache_DB(”file”, $cache_options, 3600);
$this->cache->setConnection($dsn);
}
function query($query, $params = array()) {
$hitmiss = $this->cache->isCached(md5($query), ”db_cache”)
? ” HIT” : “MISS”;
print “Cache 1h $hitmiss: $query\n”;
return $this->cache->query($query, $params);
}
}
class Cache5MinQueryStrategy extends QueryStrategy
{
function __construct($dsn, $cache_options) {
$this->cache = new Cache_DB(”file”, $cache_options, 300);
$this->cache->setConnection($dsn);
}
function query($query, $params = array()) {
$hitmiss = $this->cache->isCached(md5($query), ”db_cache”)
? ” HIT” : “MISS”;
print “Cache 5m $hitmiss: $query\n”;
return $this->cache->query($query, $params);
}
}
class UncachedQueryStrategy extends QueryStrategy
{
function __construct($dsn) {
$this->cache = DB::connect($dsn);
}
function query($query, $params = array()) {
print “Uncached: $query\n”;
return $this->cache->query($query, $params);
}
}
class QueryCacheStrategyWrapper
{
private $cache_1h = null;
private $cache_5m = null;
private $direct = null;
function __construct($dsn) {
$opts = array(
”cache_dir” => ”/tmp”,
”filename_prefix” => ”query”);
$this->cache_1h = new Cache1HourQueryStrategy($dsn, $opts);
$this->cache_5m = new Cache5MinQueryStrategy($dsn, $opts);
$this->direct = new UncachedQueryStrategy($dsn);
}
function query($query, $params = array()) {
$obj = $this->cache_5m;
$re = ”/\s+FROM\s+(\S+)\s*((AS\s+)?([A-Z0-9_]+))?(,*)/i”;
if (preg_match($re, $query, $m)) {
if ($m[1] == ”bids”) {
$obj = $this->direct;
} elseif ($m[5] == ”,”) { // a join
$obj = $this->cache_1h;
}
}
return $obj->query($query, $params);
}
function __call($method, $args) {
return call_user_func_array(array($this->dbh, $method),
$args);
}
}
$dbh = new QueryCacheStrategyWrapper(getenv(”DSN”));
test_query($dbh, “SELECT * FROM vendors”);
test_query($dbh, “SELECT v.name, p.name FROM vendors v, products p”.
” WHERE p.vendor = v.id”);
test_query($dbh, “SELECT * FROM bids”);
function test_query($dbh, $query) {
$u1 = utime();
$r = $dbh->query($query);
$u2 = utime();
printf(”elapsed: %.04fs\n\n”, $u2 - $u1);
}
function utime() {
list($usec, $sec) = explode(” “, microtime());
return $sec + (double)$usec;
}
The QueryCacheStrategyWrapper class implements the Strategy wrapper,
and uses a regular expression to determine whether the query should be
cached and if it should be cached for five minutes or one hour. If the query con-
tains a join across multiple database tables, it is cached for one hour; if it is a
SELECT on the bids table (for an auction), the query is not cached. The rest will
be cached for five minutes.
Here is the output from this example the first time the queries are run,
and the results are not cached:
Cache 5m MISS: SELECT * FROM vendors
elapsed: 0.0222s
Cache 1h MISS: SELECT v.name, p.name FROM vendors v, products p WHERE
p.vendor = v.id
elapsed: 0.0661s
Uncached: SELECT * FROM bids WHERE product = 42
elapsed: 0.0013s
As you can see, the join is relatively expensive compared to the other
queries. Now, look at the timings on the second run:
Cache 5m MISS: SELECT * FROM vendors
elapsed: 0.0098s
Cache 1h MISS: SELECT v.name, p.name FROM vendors v, products p WHERE
p.vendor = v.id
elapsed: 0.0055s
Uncached: SELECT * FROM bids WHERE product = 42
elapsed: 0.0015s
The cache gave a 125 percent speed-up for the first query, and a whop-
ping 1,100 percent speed-up for the join.
A good exercise to complete after reading the APD section, “Profiling with
ADP,” later in this chapter would be to adapt the caching strategy in your own
database (just change the “bids” table name), and use APD to compare the per-
formance of the wrapped caching solution with a regular non-caching
approach.

Call Caching
Call caching
means caching the return value of a
function given a set of parameters. Both the Cache and Cache_Lite PEAR pack-
ages provide this. Chapter 11, “Important PEAR Packages,” contains an exam-
ple of call caching.
 

Compiled Templates
Most template systems today compile tem-
plates to PHP code before displaying them. This not only makes the template
display faster, but it also allows an opcode cache to cache them between
requests so they do not need to be parsed on every request.
The only template packages in PEAR that do not compile to PHP code
are HTML_Template_IT and HTML_Template_PHPLIB. If you use one of the others,
such as Smarty or HTML_Template_Flexy, everything will be taken care of for you.
 

Output Caching
Finally, you may cache the printed output of an
entire script or just parts of it using PHP’s output buffering functions. Again,
the PEAR caching packages have wrappers in place for output caching. See
the Cache_Lite example in Chapter 11.

Post a Comment

You must be logged in to post a comment.