MySQL, Memcache, PHP revised


Some time ago I was writing about the InnoDB Memcache Daemon plugin already for the MySQL server. Back then we had a labs release with a little preview only. Meanwhile quite some time passed and new developments were made - just in time for the MySQL 5.6 RC announced this weekend by Tomas.
The innodb_memcache daemon plugin is a plugin for the MySQL Server end contains an embedded memcached. This embedded memcached is configured to use MySQL's InnoDB engine as storage backend. By using this data stored inside an InnoDB table can be accessed using memcache's key-value protocol. Back in the times of the previous blog post this was limited to data from a single table, which maps easily to the key-value nature of memcache but is a clear limitation. The InnoDB obviously knows that and improvd it:
A user may now define multiple configurations at the same time and therefore access different tables at the same time - or the same table using different key-columns as memcache key, for accessing the data the memcache key names will then be prefixed by specially formatted configuration names.
Let's take a look at an simple example. Assume we have this configuration inside innodb_memcache.containers:
Name: prefix_test schema: test table: test key column: id key name: PRIMARY KEY value columns: lastname
We can then use the memcache configuration using a key like this:
set @@prefix_test.1 Schlüter get @@prefix_test.1
The first call will store my last name with id=1 in the test.test table. For accessing multiple configurtions we simply add entries to the containers list.
Of course we can still access miltiple columns, as in the previous version:
Name: test_first_last schema: test table: test key columns: id key name: PRIMARY value columns: firstname,lastname
And then we add my firstname:
set @@test_first_last.1 Johannes,Schlüter get @@test_first_last.1
The configurations above are, obviously, just a short introduction. For full information please check the documentation.
Now this blog entry is tagged a s PHP. Hs is that coming into play? - Well, on the one side we have this fast memcache interface, which allows to access almost arbitrarry data from the database. On the other side we have our PHP mysqlnd plugin interface where we can add special features, like query caching or load balancing, transparently to any PHP application. Why not combine those two things? - Good question. That's what we have done in the PECL mysqlnd_memcache PHP extension. This PHP extension is a plugin to mysqlnd intercepting queries sent to the server. In a quick analysis it tries to identfy whether an SQL statement can - transparently - be converted into memcache requests. We therefore exchange some computing power on the PHP server and gain more performance from the MySQL server. As SQL is a rather complex language and memcache is a quite limited key-value protocol this will only work for a limited subset of common queries though.
So let's take a look at some PHP code:
<?php $mysqli = new mysqli("localhost", "usr", "pass", "test"); $memcache = new memcached(); $memcache->addServer("localhost", 11211); mysqlnd_memcache_set($mysqli, $memcache); ?>
Here we ceate a MySQL connection using mysqli as well as a memcache connection using the pecl/memcached extension. Instead of mysqli we could, as with any mysqlnd plugin, use ext/mysql or the MySQL PDO driver. We then associate the MySQL connection with the memcache connection. As a consequence of this code the mysqlnd_memcache plugin will query the MySQL server for the current memcache configuration. Subsequently it will analyse SQL queries sent to the server:
<?php $q1 = $mysqli->query("SELECT firstname, lastname FROM test WHERE id = 1"); $q2 = $mysqli->query("SELECT * FROM test"); ?>
These are two normal queries and nothing special on first sight. In case there's no error $q1 and $q2 will hold mysqli_result instances where rows can be read using fetch_row() or in other provided ways. But there are things going on in the back: The PHP extension will see that the first one can be translated to a memcche request and then fetch the data using this shortcut. The second call tries to read all data from the table. The memcache protocol provides no way for doing that so this query will use the "classic" way of sending the SQL to the MySQL server.
In order to be fast and limit the overhead - mind: we have to check any query - we didn't add a full SQL parser to this plugin but the check is done using a regular expression which will be enriched using data collected from the MySQL Server. In case this reguar expression causes trouble it can be overriden when the inital association is established. There are a few other caveats in the initial 1.0.0-beta reease available from PECL therefore we'd love to hear from you to see what you need and how we can improve your MySQL experience!
Monday, October 1. 2012 at 08:36 (Link) (Reply)
a question though: I imagine that the mysqlnd_memcache extension would add some parsing overhead to every mysql query executed. How can we be sure that this overhead is less than the gain in response time we get from using the memcache-api of mysql instead of its native api?
and then a 2nd question: is there any difference in memory usage (both on the php and mysql server sides) for using memcached protocol instead of mysql one?
E.g. is data fetched via memcache stored in the mysql query cache?
Monday, October 1. 2012 at 15:49 (Link) (Reply)
about your 1st question: I can tell zou that in the current implementation it is less (as this plugin is using a relatively simple regular expression where the precompiled regexp is cached in PHP) but even if it wasn't you have a benefit: Scaling web servers is simpler than scaling web servers, so when ofloading work this can be beneficial. Though, in general the best answer is: Benchmark it yourself. These performance differences depend on soooooo many things in your setup. Any generic benchmark will not tell you much for your case
For the 2nd question: I haven't measured this but the innodb server plugin is relatively low on the server. Directly talking to InnoDB. Therefore it is not using the query cache, but it is using InnoDB 's buffer pool etc.
Monday, October 1. 2012 at 16:36 (Link) (Reply)
I presume the second web is supposed to be a DB.
Monday, October 1. 2012 at 21:46 (Link) (Reply)
Read only transactions are another good opportunity to exploit in 5.6. Either setting autocommit on or using START TRANSACTION READ ONLY.
gggeek, consider the round trips of the traditional MySQL SQL login and the CPU cost of the authentication process on the client compared to the CPU cost of a regular expression. The memcached API should be an easy win based on eliminating some round trips.
Views are my own, for an official Oracle view, consult a PR person.
James Day, MySQL Senior Principal Support Engineer, Oracle.
Tuesday, October 2. 2012 at 03:20 (Link) (Reply)
But in the example given above, it seems that the mysql login is also done, isn't it?
@Johannes sure I will benchmark, and come back with results.
Since we're using an ORM, there's a lot of table access based on primary keys.
Do you think that mapping all of them (about 100) for memcached access could work?
Wednesday, October 3. 2012 at 04:19 (Link) (Reply)
Thursday, October 4. 2012 at 12:17 (Link) (Reply)
Being faster than others is good but we can be even faster... one round trip plus a little per query is what I want to see. Then maybe one round trip for multiple queries carried out sequentially or in parallel, perhaps batched up by a connector transparently.
James Day
Thursday, October 4. 2012 at 12:21 (Link) (Reply)
With memcached API, that may turn out to be doable more quickly querying the database directly than splitting some of the work into memcached itself and some into the database server.
James
Wednesday, October 3. 2012 at 01:31 (Reply)
Do you support persist connection?
Wednesday, October 3. 2012 at 01:44 (Link) (Reply)