Recent Topics

[1.9.x] Database query caching

Started by on Mar 06, 2008 – Contents updated: Mar 06, 2008

Mar 06, 2008 10:47    

My b2evolution Version: 1.9.x

Now that my heavily used blog is becoming more...heavily used, I'm finding myself in performance optimization mode.

I've noticed that certain queries specify a full timestamp, eg:

SELECT COUNT( DISTINCT YEAR(post_datestart), MONTH(post_datestart) ) 
FROM evo_posts INNER JOIN evo_postcats ON post_ID = postcat_post_ID INNER JOIN evo_categories ON postcat_cat_ID = cat_ID 
WHERE (cat_blog_ID = 2) 
  AND ( ( post_status IN ('published') ) ) 
  AND (post_datestart <= '2008-03-06 04:34:57')

On one hand, this is good because I can publish articles in advance and know that they'll show up at the right time. But on the other hand, this is bad because the '2008-03-06 04:34:57' is effectively preventing this query from being cachec by MySQL.

Given my needs, it's acceptable for articles to be published on or around a given time - meaning, the query could look something like this:

SELECT COUNT( DISTINCT YEAR(post_datestart), MONTH(post_datestart) ) 
FROM evo_posts INNER JOIN evo_postcats ON post_ID = postcat_post_ID INNER JOIN evo_categories ON postcat_cat_ID = cat_ID 
WHERE (cat_blog_ID = 2) 
  AND ( ( post_status IN ('published') ) ) 
  AND (post_datestart <= '2008-03-06 04:00:00')

note the minutes & seconds have been zeroed - which should allow the query to be cached for a whole hour...

Two questions:

1) Is there any way to modify queries like this in b2evo across the board, or do I need to individually search-and-replace?

2) Does this approach/strategy seem to make sense? Am I missing any other obvious query caching mechanism that I could be leveraging?

There are 7 queries similar in nature that have a exec time of 0.35-0.40...so it seems feasible to reclaim around 2s or so of execution time.

Mar 06, 2008 17:41

I'm pretty sure you would have to edit everything you find, but what you really SHOULD do is upgrade. Optimization is always part of the upgrading process for the dev team is why.

Mar 07, 2008 04:22

I decided that I don't have the stomach for an upgrade to 2.x...yet.

So I modified the ./inc/MODEL/items/_itemquery.class.php file, changing the line:

$date_max = date('Y-m-d H:i:s', $timestamp_max + $time_difference );

to:

$date_max = date('Y-m-d H:00:00', $timestamp_max + $time_difference );

I noticed that quite a few queries that were being fired for page accesses weren't being cached by MySQL - I assumed because the full timestamp was passed in the query, resulting in different queries each time. The change I made was intended to cause the query to be cached for at least an hour - with the side effect that a published article may day up to an additional hour to display - something that I could live with for a boost in performance.

Overall, things got better but when I brought up the main page in debug mode, I noticed that - even though the same queries were being fired - sometimes they were being cached by MySQL and other times they weren't.

As an experiment, I cut-and-pasted a few of those queries into MySQL directly...and confirmed that if I fired off the same query quickly after the first, it was being cached; eg: the first execution would be something like 0.372 seconds but the second query would complete in something like 0.0009 seconds.

BUT...

After a couple of minutes - whether I keep firing the query or not - the execution time went back to 0.372 seconds again before dropping back down. Eg: it looked like at some point not too long after MySQL read from the cache, it wasn't able to again (Note : I also fired off the queries with SQL_NO_CACHE to benchmark the time and confirmed that it was in the 0.372 second range without the cache).

I spent 4 hours pouring over MySQL documentation to verify that my 512MB query cache was properly configured (it is)...but I cannot understand why the queries aren't being used from cache longer.

Here's an example of one of the queries:

Query #12: Get result count

SELECT COUNT( DISTINCT post_ID )
FROM evo_posts INNER JOIN evo_postcats ON post_ID = postcat_post_ID INNER JOIN evo_categories ON postcat_cat_ID = cat_ID
WHERE (cat_blog_ID = 2)
  AND ( ( post_status IN ('published') ) )
  AND (post_datestart <= '2008-03-06 22:00:00')

My brain is fried. Any MySQL gurus have any thoughts?


Form is loading...

Content Mangement System – This forum is powered by b2evolution CMS, a complete engine for your website.