Entries tagged as mysql
Sep 17: Using JSON features to restructure results

Recently a question was asked in the MySQL Community Slack about which clients were connected to a server. The relevant information is available from the performance schema, as most connectors send information about themselves when connecting:
select * from performance_schema.session_connect_attrs;
+----------------+-----------------+------------------------+------------------+
| PROCESSLIST_ID | ATTR_NAME       | ATTR_VALUE             | ORDINAL_POSITION |
+----------------+-----------------+------------------------+------------------+
|            130 | _pid            | 17412                  |                0 |
|            130 | _platform       | x86_64                 |                1 |
|            130 | _os             | Linux-5.4.0            |                2 |
|            130 | _source_host    | maniacmansion          |                3 |
|            130 | _client_name    | mysql-connector-nodejs |                4 |
|            130 | _client_version | 8.0.22                 |                5 |
|            130 | _client_license | GPL-2.0                |                6 |
|            130 | program_name    | mysqlx-shell           |                7 |
|            129 | _client_name    | libmysqlxclient        |                0 |
|            129 | _client_version | 8.0.21                 |                1 |
|            129 | _os             | Linux                  |                2 |
|            129 | _platform       | x86_64                 |                3 |
|            129 | _client_license | GPL                    |                4 |
|            129 | _pid            | 17257                  |                5 |
|            129 | program_name    | mysqlsh                |                6 |
|            131 | _pid            | 17510                  |                0 |
|            131 | _platform       | x86_64                 |                1 |
|            131 | _os             | Linux                  |                2 |
|            131 | _client_name    | libmysql               |                3 |
|            131 | os_user         | johannes               |                4 |
|            131 | _client_version | 8.0.22                 |                5 |
|            131 | program_name    | mysql                  |                6 |
+----------------+-----------------+------------------------+------------------+
22 rows in set (0.0027 sec)
Now this gives us quite some information, but the person asking wanted it presented in a nice, transposed way, where for instance _client_name and _client_version appear in their own columns. As I like MySQL's JSON features I decided to use those for doing this. (There are other ways, which in many cases might be better - see window functions, CTEs, sub-selects, ... - this here is only to show JSON things.)
The first feature I needed is JSON aggregation. JSON aggregation works similarly to GROUP_CONCAT or other aggregate functions: if we have a GROUP BY clause we can build a JSON object, taking one column as key and one column as value:
SELECT PROCESSLIST_ID,
       JSON_PRETTY(JSON_OBJECTAGG(ATTR_NAME, ATTR_VALUE)) attribs
FROM performance_schema.session_connect_attrs
GROUP BY PROCESSLIST_ID;

| PROCESSLIST_ID | attribs |
| 129 | {
  "_os": "Linux",
  "_pid": "17257",
  "_platform": "x86_64",
  "_client_name": "libmysqlxclient",
  "program_name": "mysqlsh",
  "_client_license": "GPL",
  "_client_version": "8.0.21"
} |
| 130 | {
  "_os": "Linux-5.4.0",
  "_pid": "17412",
  "_platform": "x86_64",
  "_client_name": "mysql-connector-nodejs",
  "_source_host": "maniacmansion",
  "program_name": "mysqlx-shell",
  "_client_license": "GPL-2.0",
  "_client_version": "8.0.22"
} |
| 131 | {
  "_os": "Linux",
  "_pid": "17510",
  "os_user": "johannes",
  "_platform": "x86_64",
  "_client_name": "libmysql",
  "program_name": "mysql",
  "_client_version": "8.0.22"
} |
With this all data is properly aggregated. Now we can use my friend JSON_TABLE to convert the JSON data back into a table:
SELECT PROCESSLIST_ID, i.*
FROM (
    SELECT PROCESSLIST_ID, JSON_OBJECTAGG(ATTR_NAME, ATTR_VALUE) j
    FROM performance_schema.session_connect_attrs
    GROUP BY PROCESSLIST_ID
) s,
JSON_TABLE(
    s.j,
    '$' COLUMNS (
        client_name    VARCHAR(100) PATH "$._client_name",
        client_version VARCHAR(100) PATH "$._client_version"
    )
) AS i;

+----------------+------------------------+----------------+
| PROCESSLIST_ID | client_name            | client_version |
+----------------+------------------------+----------------+
|            129 | libmysqlxclient        | 8.0.21         |
|            130 | mysql-connector-nodejs | 8.0.22         |
|            131 | libmysql               | 8.0.22         |
+----------------+------------------------+----------------+
And yay, we got a readable result showing which active session is using which connector, and we can identify outdated ones. Of course we have all SQL things available, thus adding a WHERE i.client_version != '8.0.22' would filter the result accordingly.
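For instance, a sketch of the full query with that filter applied (same tables and columns as above):

SELECT PROCESSLIST_ID, i.*
FROM (
    SELECT PROCESSLIST_ID, JSON_OBJECTAGG(ATTR_NAME, ATTR_VALUE) j
    FROM performance_schema.session_connect_attrs
    GROUP BY PROCESSLIST_ID
) s,
JSON_TABLE(
    s.j,
    '$' COLUMNS (
        client_name    VARCHAR(100) PATH "$._client_name",
        client_version VARCHAR(100) PATH "$._client_version"
    )
) AS i
WHERE i.client_version != '8.0.22';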
Sep 27: Efficient Node.js Buffer usage
When building network libraries for Node.js, as we do at work, one quite quickly comes across Node's Buffer type. A Buffer gives access to a memory region in a quite raw form, allowing one to handle raw data and to interpret binary streams. The Buffer interface predates ES6 TypedArrays and has some optimisations.
Two optimisations are notable:
For one, the slice() method does not copy data but returns a view on the underlying data. This makes it quite efficient to work on a window of the data, but when writing one has to be careful. A simple example:
const buffer = Buffer.from("hello");
const slice = buffer.slice(1, 2);
slice[0] = 97;
console.log(buffer.toString('utf8')); // will print 'hallo'
The second one is that allocating a small buffer won't actually go to the operating system and allocate a memory area; instead Node.js keeps a memory region from which small buffers can be derived quickly. This actually works by using the slicing logic.
const buffer1 = Buffer.from("hello");
const buffer2 = Buffer.from("world");
console.log(buffer1.byteOffset); // 0
console.log(buffer2.byteOffset); // 8
This indicates that both buffers use the same memory region with an alignment of 8 bytes.
So how does this work? - Underlying the Buffer, in modern versions of Node, is an ArrayBuffer. We can ask the Buffer to provide us with the underlying ArrayBuffer using the .buffer property. One thing one has to be careful about is that for a slice the returned ArrayBuffer will be the full buffer and not only the sliced part. Given the two Buffers from above we can see this.
const buffer3 = Buffer.from(buffer2.buffer);
console.log(buffer3.length); // 8192

const buffer4 = buffer3.slice(0, 5);
console.log(buffer4.toString('utf8')); // hello
A raw ArrayBuffer doesn't provide many things, but we can create a new Buffer on top of it. This won't copy the data but use the same memory as above. We can see that the block pre-allocated by Node (in the version I'm using for this test) apparently is 8192 bytes. 8k is a common size used for multiple buffers. A factor in such a choice is that many filesystems use 512 byte blocks and 8k is a handleable multiple of that. Additionally CPU caches often are multiples of 8k. So this is not a fully arbitrary choice.
Since most of the data is most likely rubbish we slice it and look at the beginning, and notably we're not seeing "world" but "hello", confirming that these are indeed using the same underlying buffer.
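Node actually exposes this pool size via the documented Buffer.poolSize property, so we don't have to infer it from experiments:

// size of Node's internal pre-allocation pool for small buffers;
// 8192 bytes by default, and it is writable if you need a bigger pool
console.log(Buffer.poolSize); // 8192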
As said, Buffer is a Node-specific type and some Node.js libraries can't handle it. One of those libraries we use is the Google Protocol Buffer library. In simple terms protobuf is a format for serializing data, for example for exchange over a network. We can call it a typed and more efficient JSON alternative.
Protobuf's deserializeBinary() function now won't work with Buffer instances, but requires a TypedArray, like Uint8Array. A Uint8Array is the ES6 counterpart to Node's Buffer. It is a layer on top of ArrayBuffer for accessing bytes.
The easy way to make it work is converting a Buffer to Uint8Array like this:
const buf = stream.read(length);
const u8 = new Uint8Array(buf);
const msg = Message.deserializeBinary(u8);
However this is inefficient as the Uint8Array constructor will copy the data. But as we learned, both Buffer and Uint8Array are just views on top of an ArrayBuffer. The Buffer gives us access to the ArrayBuffer and if we are careful about our slicing offsets we can ask the Uint8Array to re-use the same memory without copying:
const buf = stream.read(length);
const u8 = new Uint8Array(buf.buffer, buf.byteOffset, buf.length);
const msg = Message.deserializeBinary(u8);
A tiny bit more to type, but a notable gain in performance, which we can verify with a trivial benchmark:
console.time("A"); for (let i = 0; i < 1000; ++i) { const b = new Buffer(10000); const u = new Uint8Array(b); } console.timeEnd("A"); console.time("B"); for (let i = 0; i < 1000; ++i) { const b = new Buffer(10000); const u = new Uint8Array(b.buffer, b.byteOffset, b.length); } console.timeEnd("B");
On my machine I see results like
A: 9.895ms
B: 5.216ms
The difference in a real application will of course vary.
Sep 15: MySQL 8.0: JSON Aggregation functions

In MySQL 5.7 the new JSON support was added. The JSON support consists of three features: a native JSON data type, a set of SQL functions to work with JSON values, and the ability to index JSON data via generated columns.
All three features combined allow building very powerful applications without committing to a fixed data structure in every part, but I was missing one thing a lot: aggregation.
A topic I'm thinking a lot about is finding ways to select nested data. In MySQL we typically have two ways for this: either one sends multiple queries to retrieve different nesting levels, or one builds JOINs which tend to deliver repetitive responses. A tool we've had for a while is GROUP_CONCAT. With GROUP_CONCAT we can get some of our nested information:
mysql> SELECT blog.*, GROUP_CONCAT(blog_comment.id) AS comments
    ->   FROM blog
    ->   JOIN blog_comment ON blog.id = blog_comment.blog_id
    -> GROUP BY blog_id;
+----+-------------+----------------------+----------+
| id | title       | body                 | comments |
+----+-------------+----------------------+----------+
|  1 | Hello World | This is my new Blog! | 1,2      |
|  2 | spam trap   | spam goes here       | 3        |
+----+-------------+----------------------+----------+
2 rows in set (0.07 sec)
This gives us the ids for all comments to a specific blog post. But it's not really nice to process. To improve this we can try to abuse GROUP_CONCAT like this:
mysql> SELECT blog.*,
    ->        CONCAT('[',
    ->               GROUP_CONCAT(JSON_OBJECT("author", blog_comment.author,
    ->                                        "body", blog_comment.body)),
    ->               ']') AS comments
    ->   FROM blog
    ->   JOIN blog_comment ON blog.id = blog_comment.blog_id
    -> GROUP BY blog_id;
+----+-------------+----------------------+--------------------------------------------------------------------------------------------------------+
| id | title       | body                 | comments                                                                                               |
+----+-------------+----------------------+--------------------------------------------------------------------------------------------------------+
|  1 | Hello World | This is my new Blog! | [{"body": "Great blog!", "author": "Your Friend"},{"body": "Great new blog", "author": "Best Friend"}] |
|  2 | spam trap   | spam goes here       | [{"body": "Buy these pills to enlarge your blog", "author": "Visitor"}]                                |
+----+-------------+----------------------+--------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
But this has multiple issues: not only is it kind of ugly and error-prone to build our JSON string manually using CONCAT() like this, but GROUP_CONCAT also uses an internal buffer with a default size of only 1024 bytes which, while configurable, can easily be too small for our use case.
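For reference, that limit is controlled by the group_concat_max_len system variable and can be inspected and raised per session:

SHOW VARIABLES LIKE 'group_concat_max_len';
SET SESSION group_concat_max_len = 1024 * 1024;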
Now in the new development milestone release for MySQL 8.0 we can see two new aggregation functions: JSON_ARRAYAGG() and JSON_OBJECTAGG() which build proper JSON arrays or objects without any need for a hack:
mysql> SELECT blog.*,
    ->        JSON_ARRAYAGG(JSON_OBJECT("author", blog_comment.author,
    ->                                  "body", blog_comment.body)) AS comments
    ->   FROM blog
    ->   JOIN blog_comment ON blog.id = blog_comment.blog_id
    -> GROUP BY blog_id;
+----+-------------+----------------------+--------------------------------------------------------------------------------------------------------+
| id | title       | body                 | comments                                                                                               |
+----+-------------+----------------------+--------------------------------------------------------------------------------------------------------+
|  1 | Hello World | This is my new Blog! | [{"body": "Great blog!", "author": "Your Friend"},{"body": "Great new blog", "author": "Best Friend"}] |
|  2 | spam trap   | spam goes here       | [{"body": "Buy these pills to enlarge your blog", "author": "Visitor"}]                                |
+----+-------------+----------------------+--------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
This is now quite nice and by using the proper aggregation function we can be sure that our result will always be valid JSON, which allows further easy processing.
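As a sketch of such further processing (reusing the tables and columns from the examples above; this exact query is mine, not from the examples), one could pull the first comment's author straight out of the aggregated JSON:

SELECT t.id, t.title,
       JSON_EXTRACT(t.comments, '$[0].author') AS first_author
FROM (
    SELECT blog.id, blog.title,
           JSON_ARRAYAGG(JSON_OBJECT("author", blog_comment.author,
                                     "body", blog_comment.body)) AS comments
    FROM blog
    JOIN blog_comment ON blog.id = blog_comment.blog_id
    GROUP BY blog.id
) AS t;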
Sep 3: Types in PHP and MySQL


Since PHP 7.0 has been released there's more attention on scalar types. Keeping track of types for data from within your application is relatively simple. But when talking to external systems, like a database, things aren't always as one might initially expect.
For MySQL the type we see -- to a first approximation -- is defined by the network protocol. The MySQL network protocol by default converts all data into strings. So if we fetch an integer from the database and use PHP 7's typing feature we get an error:
<?php
declare(strict_types=1);

function getInteger() : int {
    $mysqli = new mysqli(...);
    return $mysqli->query("SELECT 1")->fetch_row()[0];
}
var_dump(getInteger());
?>

Fatal error: Uncaught TypeError: Return value of getInteger() must be of the type integer, string returned in t.php:6
Of course the solution is easy: Either we cast ourselves or we disable the strict mode and PHP will cast for us.
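A minimal sketch of the casting variant (with the same elided connection details as above):

<?php
declare(strict_types=1);

function getInteger() : int {
    $mysqli = new mysqli(...);
    // the protocol delivered a string; cast explicitly to satisfy strict_types
    return (int)$mysqli->query("SELECT 1")->fetch_row()[0];
}
?>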
Now let's take a look at another case. Assume we have an application where we fetch an integer ID from the database. We know MySQL will send us a string and we treat the ID as opaque data anyways so we have the type check for a string. Now we refactor the code slightly and make use of prepared statements. What will the result be?
<?php
declare(strict_types=1);

function getId() : string {
    $mysqli = new mysqli(...);
    $stmt = $mysqli->prepare("SELECT 1");
    $stmt->execute();
    return $stmt->get_result()->fetch_row()[0];
}
var_dump(getId());
?>

Fatal error: Uncaught TypeError: Return value of getId() must be of the type string, integer returned in t.php:8
Wait! - What's up there!? -- Didn't I just say that the MySQL protocol will always send a string, thus we retrieve a string in PHP!? - Yes I did, and that's true for "direct queries." It's not true for results from prepared statements. With prepared statements the MySQL protocol uses a binary encoding of the data, and therefore mysqlnd and mysqli will try to find the matching PHP type. This isn't always possible, especially when we get into the range of big values. So let's query for PHP_INT_MAX and PHP_INT_MAX+1 and look at the types:
<?php
$mysqli = new mysqli(...);
$stmt = $mysqli->prepare("SELECT 9223372036854775807, 9223372036854775808");
$stmt->execute();
var_dump($stmt->get_result()->fetch_row());
?>

array(2) {
  [0]=>
  int(9223372036854775807)
  [1]=>
  string(19) "9223372036854775808"
}
Here 9223372036854775807 is the largest value a PHP integer can represent and thus becomes an integer. 9223372036854775808 however is too large and can't fit in a signed 64-bit integer, thus it is converted into a string, as this keeps all information and can be handled at least to some degree.
Similar things happen with other types which can't be properly represented in PHP:
<?php $mysqli = new mysqli(...); $stmt = $mysqli->prepare("SELECT 1.23"); $stmt->execute(); var_dump($stmt->get_result()->fetch_row()); ?> array(2) { [0]=> string(4) "1.23" }
Yay - yet another wtf! So what is going on this time? -- Well, a literal in SQL is treated as DECIMAL. A DECIMAL field is supposed to be precise. If this were converted into a PHP float aka double we would probably lose precision, thus treating it as a string again makes sure we're not losing information. If we had a FLOAT or DOUBLE field this could safely be represented as float in PHP:
<?php $mysqli = new mysqli(...); $stmt = $mysqli->prepare("SELECT RAND()"); $stmt->execute(); var_dump($stmt->get_result()->fetch_row()); ?> array(2) { [0]=> float(0.16519711461402206) }
So to summarize:
- For a direct query the MySQL server sends strings, PHP returns all data as string
- For prepared statements MySQL sends data in binary form and PHP will use a corresponding type
- If the value could only be represented with a potential data loss in PHP it is converted to a string by PHP, even with prepared statements
Now we might expect the same when using PDO. Let's check:
<?php
$pdo = new PDO("mysql:host=localhost", "...", "...");
$stmt = $pdo->prepare("SELECT 1, RAND()");
$stmt->execute();
var_dump($stmt->fetch(PDO::FETCH_NUM));
?>

array(2) {
  [0]=>
  string(1) "1"
  [1]=>
  string(18) "0.3217373297752229"
}
This example uses prepared statements, but returns strings!? The reason is that PDO by default doesn't use prepared statements on the network layer but an emulation within PHP. This means PHP will replace potential placeholders and then run a direct query. As mentioned above, with a direct query the MySQL server will send strings, thus PHP will represent all data as strings. However we can easily ask PDO to disable the emulation:
<?php
$pdo = new PDO("mysql:host=localhost", "...", "...");
$pdo->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
$stmt = $pdo->prepare("SELECT 1, RAND()");
$stmt->execute();
var_dump($stmt->fetch(PDO::FETCH_NUM));
?>

array(2) {
  [0]=>
  int(1)
  [1]=>
  float(0.24252333421495)
}
This leaves the question whether you should disable the emulation in order to get the correct types. Doing so has some impact on performance characteristics: with native prepared statements there will be a client-server round-trip during the prepare and another round-trip for the execute; with emulation, only during the execute. Native prepared statements also require some server resources to store the handle. However if a single statement is executed multiple times there might be some savings. Also the type representation means that different type conversions happen and a different amount of data is transferred. For most cases this shouldn't have a notable impact, but in the end only a benchmark will tell.
Hope this helps to give a better understanding - or more confusion.
Feb 24: On rumors of "PHP dropping MySQL"


Over the last few days different people asked me for comments about PHP dropping MySQL support. These questions confused me, but meanwhile I figured out where these rumors come from and what they mean.
The simple facts are: No, PHP is not dropping MySQL support and we, Oracle's MySQL team, continue working with the PHP community.
For the long story we first have to remember what "PHP's MySQL support" includes. The key parts are four extensions which are part of the main PHP tree:
- ext/mysql
- ext/mysqli
- ext/pdo_mysql
- ext/mysqlnd
The first one, ext/mysql, provides the mysql_* functions. This is the classic interface taught in many (old) books and used by lots of (old) software. mysqli is "MySQL improved"; this is a younger extension providing access to all MySQL features. pdo_mysql contains the driver for PDO, PHP's database API abstraction layer. mysqlnd is the MySQL native driver; this module goes mostly unseen and provides a common set of functionality for the three other modules in order to talk to the MySQL Server.
Additionally we maintain a bunch of PECL extensions plugging into mysqlnd, which I won't mention in detail, the PHP documentation has info on most.
Now that's a lot of code, so let's look at what is actually happening:
The old mysql extension, ext/mysql, is old. The code goes back to the early days of PHP and MySQL and carries some bad design decisions. For example, if no explicit connection resource is passed, all functions will try to use the last connection that was used. So take a simple example like this:
<?php
mysql_connect("mysql.example.com", "user", "password");
mysql_select_db("test");
$result = mysql_query("DELETE FROM t WHERE id = 23");
?>
This might do weird things. Let's assume the connect fails; as the error is not handled, the script continues to run and will call mysql_select_db(). This won't directly fail but will guess and try to connect to a server (most likely on localhost). If that fails the script still won't terminate, but mysql_query() will again guess and try to connect. If all things come together this will suddenly work and the script will operate on a completely different database than expected, which can have really bad consequences.
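For comparison, a sketch of the defensive style (with a hypothetical hostname), checking the connect result and passing the link around explicitly:

<?php
$link = mysql_connect("mysql.example.com", "user", "password");
if ($link === false) {
    // stop right away instead of letting later calls guess a connection
    die("Connect failed: " . mysql_error());
}
if (!mysql_select_db("test", $link)) {
    die(mysql_error($link));
}
$result = mysql_query("DELETE FROM t WHERE id = 23", $link);
?>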
But that's not all. As said, the code goes way back; it has grown with PHP and MySQL. It tries to be compatible with all versions of MySQL since at least 3.23. All this makes the code hard to maintain.
When PHP 5.0, which added mysqli, came along in 2004 it was decided that maintaining this was troublesome and that we wouldn't add new features to the old extension but only to mysqli (as well as to pdo_mysql, which came along a bit later in PHP 5.1, where it makes sense). We also started to advertise these newer extensions over the old one.
So we lived on for a while, adding features to mysqli and fixing a few bugs in mysql - normal operations. Over time we noticed that people still use the old extension even for new projects, which keeps them from features (i.e. prepared statements or support for multiple result sets as needed for stored procedures etc.), but we also knew that we can't simply deprecate and remove the extension as it is way too commonly used. So in 2012 we started a "soft deprecation" process, which meant adding deprecation warnings to the documentation and suggesting alternatives using mysqli or PDO.
A bit later, with PHP 5.5 which was released in June 2013, it was decided to add such a deprecation notice to the code, so each time a script is connecting to a MySQL server using the mysql extension a deprecation notice would be triggered.
That's the state we are in, and there is no date by which the old mysql extension will be removed from PHP. At some point in the future it will be - certainly not in the upcoming PHP 5.6, though.
Why not? - Because we are aware of many projects with a long history which can't simply swap this out. One of these projects is Wordpress. And this brings us to the current discussion:
Wordpress is an old project, going back to the days of PHP 4 when there was only the old mysql extension and nothing else. Wordpress also doesn't live on its own but with tons of plugins extending all kinds of features. Some of these go equally far back; many need database access, so many make more or less direct use of ext/mysql. After quite some discussions and heated debate in different channels the Wordpress developers have now decided to do the switch. As they are aware of the trouble this causes for the plugin ecosystem they are careful, though - they actually allow switching between both extensions, mysql and mysqli.
As always, such major changes become heated, imprecise statements lose their context, and thus wrong messages circulate. So there is nothing to worry about, while I'd like to encourage all users of the old mysql extension to follow Wordpress and other examples and do the switch.
I hope this helped to clear things up!
Oct 9: Sharding PHP with MySQL Fabric


PHP users who attended any of my recent PHP&MySQL related talks or read Ulf's blog will know our mysqlnd_ms plugin. This plugin hooks into PHP's mysqlnd library and provides transparent support for replication and load-balancing features. Without changing your application you get transparent load balancing and read-write splitting, so all your reading queries will be sent to a slave while the writes go to a master server. The exact strategies for that can be defined in mysqlnd_ms's configuration, so quite often no, or only few, application changes are needed. But we have one limitation: the MySQL servers have to be listed in the configuration on each of your PHP servers, which can be annoying when you're changing your environment, like adding a new slave or promoting a machine to master in case the original master fails. But there's help coming!
At this year's MySQL Connect conference we announced the initial labs release of MySQL Fabric. MySQL Fabric aims to be "an integrated environment for managing a farm of MySQL servers supporting high-availability and sharding." Part of this framework is an RPC interface to query the available servers managed by MySQL Fabric, which delivers the missing piece for mysqlnd_ms.
As this release of Fabric puts the focus on sharding, this is what I want to show here, too. A general introduction to MySQL Fabric and its sharding features can be found on VN's blog, so I'll be quite fast in some areas; for details please refer to the documentation and the mentioned blogs.
The first thing we need is the MySQL Utilities package with Fabric support which is available from labs.mysql.com. After installing this package you have to locate the main.cfg configuration file and configure the storage instance.
[storage]
database = fabric
user = fabric
address = localhost:3300
connection_timeout = 6
password = i do no now
This is a MySQL server where Fabric will store its configuration and such. After that we can initialize Fabric and start the daemon.
$ mysqlfabric setup
$ mysqlfabric start
The setup step creates tables in the configured database and the start starts the daemon process. Now we can go and configure our server groups. A server group contains a master server, to which the group's data is written, and a number of slaves to which MySQL will replicate the data. For our sample sharding setup I plan to create two shards and a global group. The purpose of the global group is to hold table definitions and data which Fabric will make available on all systems in our environment. Each of these groups will, in this example, have one master and one slave. This means we need six MySQL server instances running. These six instances should all be running MySQL 5.6. Apart from having binary logging enabled and different server ids there is no replication configuration needed before running these commands. In my example setup I'm running all of them on one machine; obviously that's only useful for tests:
$ mysqlfabric group create global
$ mysqlfabric group add global 127.0.0.1:3301 root secret
$ mysqlfabric group promote global
$ mysqlfabric group add global 127.0.0.1:3302 root secret

$ mysqlfabric group create shard1
$ mysqlfabric group add shard1 127.0.0.1:3303 root secret
$ mysqlfabric group promote shard1
$ mysqlfabric group add shard1 127.0.0.1:3304 root secret

$ mysqlfabric group create shard2
$ mysqlfabric group add shard2 127.0.0.1:3305 root secret
$ mysqlfabric group promote shard2
$ mysqlfabric group add shard2 127.0.0.1:3306 root secret
So this creates the three groups and configures the servers to replicate as needed. With this setup the server on port 3301 will be the global master; 3302, 3303 and 3305 will be 3301's direct slaves, and 3304 will be configured as a slave of 3303, as will 3306 of 3305.
Now we can define our sharding rules. I'm going to use range-based sharding with two shards. The first shard, which will be assigned to the server group shard1 created above, will hold shard key values 1 to 9999, and the second shard, in group shard2, will hold data for shard key values 10000 and up. We also define the table fabrictest in the test schema as our sharding table and id as the shard column.
$ mysqlfabric sharding define RANGE global
$ mysqlfabric sharding add_mapping 1 test.fabrictest id
$ mysqlfabric sharding add_shard 1 shard1 ENABLED 1
$ mysqlfabric sharding add_shard 1 shard2 ENABLED 10000
Note that for range-based sharding we don't have to define the upper bound as that is defined by the lower bound of the next shard.
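So, hypothetically, if we had created a third group shard3, registering it with a lower bound of 20000 (same command syntax as above) would implicitly cap shard2 at 19999:

$ mysqlfabric sharding add_shard 1 shard3 ENABLED 20000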
Now we have MySQL Fabric and our MySQL servers configured and can turn to PHP. As mentioned in the beginning we need mysqlnd_ms, to be precise the 1.6.0 alpha release, which we can install using pecl:
$ sudo pecl install mysqlnd_ms-alpha
To configure PHP we first need a mysqlnd_ms configuration file. mysqlnd_ms uses JSON, and a simple configuration using Fabric might look like this:
fabric.json:
{ "test" : { "fabric":{ "hosts": [ { "host": "localhost", "port": 8080 } ] } } }
This configures the application test to use a MySQL Fabric based setup where MySQL Fabric's RPC daemon runs on the local machine. Again: We put all on one machine for a test, not what one would do on a production setup.
Next we locate our system's php.ini file and enable mysqlnd_ms to use our config.
php.ini:

extension=mysqlnd_ms.so
mysqlnd_ms.enable=1
mysqlnd_ms.config_file=/path/to/fabric.json
And now we are finally done and run a test script.
<?php $c = new mysqli("test", "root", "", "test"); echo "Creating global table:\n"; mysqlnd_ms_fabric_select_global($c, "test.fabrictest"); var_dump($c->query("CREATE TABLE fabrictest (id INT NOT NULL)")); echo "Server used: ".mysqlnd_ms_get_last_used_connection($c)['scheme']."\n\n"; echo "Inserting with ID 10:\n"; mysqlnd_ms_fabric_select_shard($c, "test.fabrictest", 10); var_dump($c->query("INSERT INTO fabrictest VALUES (10)")); echo "Server used: ".mysqlnd_ms_get_last_used_connection($c)['scheme']."\n\n"; echo "Trying to read id 10 from 10010:\n"; mysqlnd_ms_fabric_select_shard($c, "test.fabrictest", 10010); $r = $c->query("SELECT * FROM fabrictest WHERE id = 10"); var_dump($r->fetch_row()); echo "Server used: ".mysqlnd_ms_get_last_used_connection($c)['scheme']."\n\n"; ?>
With this script we do a few things. The first observation is the hostname test: the mysqlnd_ms plugin will recognize that as the application name and will refer to its configuration. Second are the mysqlnd_ms_* functions. First we pick the global group and execute a CREATE TABLE operation there. mysqlnd_ms will detect that this is a write operation and therefore connect to the global group's master. This should be 127.0.0.1:3301, which hopefully is printed by the echo call. Then we select the shard responsible for id 10 in the fabrictest table and insert data. mysqlnd_ms will, again, detect that this is a write operation and will therefore figure out where writes to that shard have to go, which is 127.0.0.1:3303. Finally we do an operation which will not really succeed: we select the servers for shard key 10010, which maps to shard2 in our setup, and then query for id 10, the data we stored in shard1. This will query 127.0.0.1:3306 (slave of 3305 in the shard2 group), which will return an empty result set.
I hope this is enough to get you started, you can now add shards or migrate them or take servers down and promote current slaves to masters etc. and see how the system reacts.
In a future post we will combine this with Doctrine - stay tuned.
Note: This blog post features labs/alpha releases, which aim at demonstrating functionality. They are not for production use. There might be stability issues, and there certainly are performance restrictions we're working on. We'd love to receive feedback, however.
Apr 2: Making use of PHP mysqlnd statistics

One of the great things about mysqlnd as the base library for PHP's MySQL support is the collected statistics. mysqlnd collects about 160 different statistical values about all the things going on. With such an amount of raw data it is, obviously, quite hard to draw conclusions from it. Therefore I recently created a PHP library sitting on top of this feature to collect all data, run some analysis and then provide some guidance, and made it available in the JSMysqlndAnalytics GitHub repo (see there also for instructions on using Composer).
Using the library is relatively simple, as the short instructions show. The library consists of two main parts: on the one side the "Collector", a wrapper around mysqli_get_client_stats() (even though this function is part of mysqli it will also work for applications using ext/mysql or PDO_mysql), which creates the raw statistics that could be stored away or such; on the other side the actual Analytics engine comparing the values to predefined rules. The current set of rules is a bit limited, so I'm looking for input and ideas.
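To give an idea of what the Collector wraps, a minimal sketch of reading the raw statistics directly (mysqlnd's statistics collection has to be enabled via mysqlnd.collect_statistics, which is the default; the counter name shown is one example of the ~160 values):

<?php
$mysqli = new mysqli(/* ... */);
$mysqli->query("SELECT 1");
// returns an associative array with all mysqlnd counters
$stats = mysqli_get_client_stats();
var_dump($stats['result_set_queries']);
?>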
In case you're a Symfony user life is quite easy: some time ago I already provided a Symfony bundle with a Symfony Profiling Toolbar plugin showing the statistics. This JSMysqlndBundle has been extended to make use of these Analytics. The screenshot might give a rough idea of how this looks.
Hope this helps creating better applications! Happy hacking!
Oct 12: Analysing WHERE-clauses in INFORMATION_SCHEMA table implementations

The MySQL Server has a quite simple interface for plugins to create tables inside INFORMATION_SCHEMA. A minimal plugin for creating a table with nothing but a counter might look like this:
static int counter_fill_table(THD *thd, TABLE_LIST *tables, Item *cond)
{
  TABLE *table= tables->table;
  ulonglong value= 0;

  while (1) {
    table->field[0]->store(value++, true);
    /* add the current row to the result set */
    schema_table_store_record(thd, table);
  }
  return 0;
}

static ST_FIELD_INFO counter_table_fields[]=
{
  {"COUNT", 20, MYSQL_TYPE_LONGLONG, 0, MY_I_S_UNSIGNED, 0, 0},
  {0, 0, MYSQL_TYPE_NULL, 0, 0, 0, 0}
};

static int counter_table_init(void *ptr)
{
  ST_SCHEMA_TABLE *schema_table= (ST_SCHEMA_TABLE*)ptr;

  schema_table->fields_info= counter_table_fields;
  schema_table->fill_table= counter_fill_table;
  return 0;
}

static struct st_mysql_information_schema counter_table_info =
{ MYSQL_INFORMATION_SCHEMA_INTERFACE_VERSION };

mysql_declare_plugin(counter)
{
  MYSQL_INFORMATION_SCHEMA_PLUGIN,
  &counter_table_info,          /* type-specific descriptor */
  "COUNTER",                    /* plugin and table name */
  "My Name",                    /* author */
  "An I_S table with a counter",/* description */
  PLUGIN_LICENSE_GPL,           /* license type */
  counter_table_init,           /* init function */
  NULL,                         /* deinit function */
  0x10,                         /* version */
  NULL,                         /* no status variables */
  NULL,                         /* no system variables */
  NULL,                         /* no reserved information */
  0                             /* no flags */
}
mysql_declare_plugin_end;
This is quite straightforward and documented in the MySQL documentation. It also has an obvious issue: it will run forever (at least if we assume we don't run into an out-of-memory situation first). Luckily we might have a user who foresees this issue and adds a WHERE clause like here:
SELECT COUNT FROM INFORMATION_SCHEMA.COUNTER WHERE COUNT < 10
Sep 30: MySQL, Memcache, PHP revised


Some time ago I already wrote about the InnoDB Memcache Daemon plugin for the MySQL server. Back then we had a labs release with a little preview only. Meanwhile quite some time has passed and new developments were made - just in time for the MySQL 5.6 RC announced this weekend by Tomas.
The innodb_memcache daemon plugin is a plugin for the MySQL Server and contains an embedded memcached. This embedded memcached is configured to use MySQL's InnoDB engine as storage backend. By using this, data stored inside an InnoDB table can be accessed using memcache's key-value protocol. Back in the times of the previous blog post this was limited to data from a single table, which maps easily to the key-value nature of memcache but is a clear limitation. The InnoDB team obviously knew that and improved it:
A user may now define multiple configurations at the same time and therefore access different tables at the same time - or the same table using different key columns as memcache key. For accessing the data, the memcache key names are then prefixed by specially formatted configuration names.
Let's take a look at a simple example. Assume we have this configuration inside innodb_memcache.containers:
Name: prefix_test
schema: test
table: test
key column: id
key name: PRIMARY KEY
value columns: lastname
We can then use the memcache configuration using a key like this:
set @@prefix_test.1 Schlüter
get @@prefix_test.1
The first call will store my last name with id=1 in the test.test table. For accessing multiple configurations we simply add entries to the containers list.
Of course we can still access multiple columns, as in the previous version:
Name: test_first_last
schema: test
table: test
key columns: id
key name: PRIMARY
value columns: firstname,lastname
And then we add my firstname:
set @@test_first_last.1 Johannes,Schlüter
get @@test_first_last.1
The configurations above are, obviously, just a short introduction. For full information please check the documentation.
Now, this blog entry is tagged as PHP. How does that come into play? - Well, on the one side we have this fast memcache interface, which allows access to almost arbitrary data from the database. On the other side we have our PHP mysqlnd plugin interface where we can add special features, like query caching or load balancing, transparently to any PHP application. Why not combine those two things? - Good question. That's what we have done in the PECL mysqlnd_memcache PHP extension. This PHP extension is a plugin into mysqlnd intercepting queries sent to the server. In a quick analysis it tries to identify whether an SQL statement can - transparently - be converted into memcache requests. We thereby spend some computing power on the PHP server and gain more performance from the MySQL server. As SQL is a rather complex language and memcache is a quite limited key-value protocol, this will only work for a limited subset of common queries, though.
So let's take a look at some PHP code:
<?php $mysqli = new mysqli("localhost", "usr", "pass", "test"); $memcache = new memcached(); $memcache->addServer("localhost", 11211); mysqlnd_memcache_set($mysqli, $memcache); ?>
Here we create a MySQL connection using mysqli as well as a memcache connection using the pecl/memcached extension. Instead of mysqli we could, as with any mysqlnd plugin, use ext/mysql or the MySQL PDO driver. We then associate the MySQL connection with the memcache connection. As a consequence of this call the mysqlnd_memcache plugin will query the MySQL server for the current memcache configuration. Subsequently it will analyse SQL queries sent to the server:
<?php
$q1 = $mysqli->query("SELECT firstname, lastname FROM test WHERE id = 1");
$q2 = $mysqli->query("SELECT * FROM test");
?>
These are two normal queries, nothing special at first sight. In case there's no error, $q1 and $q2 will hold mysqli_result instances where rows can be read using fetch_row() or in other provided ways. But there are things going on behind the scenes: the PHP extension will see that the first query can be translated into a memcache request and will fetch the data using this shortcut. The second one tries to read all data from the table. The memcache protocol provides no way of doing that, so this query will use the "classic" way of sending the SQL to the MySQL server.
In order to be fast and limit the overhead - mind: we have to check every query - we didn't add a full SQL parser to this plugin; the check is done using a regular expression which is enriched using data collected from the MySQL Server. In case this regular expression causes trouble it can be overridden when the initial association is established. There are a few other caveats in the initial 1.0.0-beta release available from PECL, therefore we'd love to hear from you to see what you need and how we can improve your MySQL experience!
Jan 12: Upcoming talks


Over the last few weeks I had been quite silent, but that's about to change: Over the next few weeks I'll give a few presentations. Feel free to join any of those.
- January 18th: Erstellung hochperformanter PHP-Anwendungen mit MySQL (German), MySQL Webinar, Online
- February 9th: MySQL Konnektoren (German), OTN Developer Day: MySQL, Frankfurt, Germany
- February 24th/25th: PHP under the hood (English), PHP UK Conference, London, UK
Nov 17: High Performance PHP Session Storage on Scale


One of the great things about the HTTP protocol, besides status code 418, is that it's stateless. A web server therefore is not required to store any information on the user or allocate resources for a user after the individual request is done. By that a single web server can handle many, many, many different users easily - and if it can't anymore, one can add a new server, put a simple load balancer in front and scale out. Each of those web servers then handles its requests without the need for communication, which leads to linear scaling (assuming the network provides enough bandwidth etc.).
Now the Web isn't used for serving static documents only anymore but we have all these fancy web apps. And those applications often have the need for a state. The most trivial information they need is the current user. HTTP is a great protocol and provides a way to do authentication which works well with its stateless nature - unfortunately this authentication is implemented badly in current clients. Ugly popups, no logout button, ... I don't have to tell more I think. For having nicer login systems people want web forms. Now the stateless nature of HTTP is a problem: The user may login and then browse around. On later requests it should still be known who that user is - with a custom HTML form based login alone this is not possible. A solution might be cookies. At least one might think so for a second. But setting a cookie "this is an authorized user" alone doesn't make sense as it could easily be faked. Better is to simply store a random identifier in a cookie and then keep a state information on the server. Then all session data is protected and only the user who knows this random identifier is authenticated. If this identifier is wisely chosen and hard to guess this works quite well. Luckily this is a mostly PHP- and MySQL-focused blog and as PHP is a system for building web applications this functionality is part of the core language: The PHP session module.
The session module, which was introduced in PHP 4, partly based on work on the famous phplib library, is quite a fascinating piece of code. It is open and extendable in so many directions but still so simple to use that everybody uses it, often newcomers learn about it on their first day in PHP land. Of course you can not only store the information whether the user is logged in but cache some user-specific data or keep the state on some transactions by the user, like multi-page forms or such.
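For completeness, the classic usage pattern is only a few lines (a sketch; the actual credential check is application-specific):

<?php
session_start(); // reads/sends the cookie holding the random session identifier
if (!isset($_SESSION['user'])) {
    // not authenticated yet; after verifying credentials one would set
    // $_SESSION['user'] = $userId;
    header("Location: /login");
    exit;
}
?>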
In its default configuration session state will be stored on the web server's file system, each session's data in its own file in serialized form. If the filesystem does some caching or one uses a ramdisk or something, this can be quite efficient. But as we suddenly have state on the web server we can't scale as easily as before anymore: if we add a new server and then route a user with an existing session to the new server, all the session data won't be there. That is bad. This is often solved by configuring the load balancer to route all requests from the same user to the same web server. In some cases this works quite ok, but it often turns out to cause problems. Let's assume you want to take a machine down for maintenance: all sessions there will die. Or imagine there's a bunch of users who do complex and expensive tasks - then one of your servers will have a hard time, giving these users bad response times, which feels like bad service, even though your other systems are mostly idle.
A nice solution for this would be to store the sessions in a central repository which can be accessed from all web servers.
Nov 14: mysqli_result iterations


For the last few months I had quite a few MySQL blog posts and didn't have anything from my "new features in PHP [trunk|5.4]" series. This article is a bridge between those two. The PHP 5.4 NEWS file has a small entry:
MySQLi: Added iterator support in MySQLi. mysqli_result implements Traversable. (Andrey, Johannes)
From the outside it is a really small change and easy to miss. The mentioned class, mysqli_result, implements an interface which adds no new methods. What one can't see is that this relates to some internal C-level functions which can be called by the engine for doing a foreach iteration over objects of this class. So with PHP 5.4 you don't have to use an "ugly" while construct anymore to fetch rows from a mysqli result but can simply do a foreach:
mysqli_report(MYSQLI_REPORT_STRICT);
try {
    $mysqli = new mysqli(/* ... */);
    foreach ($mysqli->query("SELECT a, b, c FROM t") as $row) {
        /* Process $row which is an associative array */
    }
} catch (mysqli_sql_exception $e) {
    /* an error happened ... */
}
I'm configuring mysqli to throw exceptions on error. This is useful in this case as mysqli::query() might return false in the case of an error. Passing false to a foreach will give a fatal error, so I'd need a temporary variable and a check in front of the foreach loop; with exceptions I simply do the error handling in the catch block.
One thing to note is that mysqli is using buffered results ("store result") by default. If you want to use unbuffered result sets ("use result") you can easily do that by setting the flag accordingly:
foreach ($mysqli->query("SELECT a, b, c FROM t", MYSQLI_USE_RESULT) as $row) {
    /* ... */
}
People who are advanced with iterators in PHP might ask "Why did you implement Traversable only, not Iterator?" - the main reason is that we simply didn't want to. The mysqli_result class already has quite a few methods and we didn't want to make the interface confusing. If you need an Iterator class for some purpose you can simply wrap mysqli_result in an IteratorIterator.
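A quick sketch of such a wrapper (assuming a $mysqli connection as in the examples above):

<?php
$it = new IteratorIterator($mysqli->query("SELECT a, b, c FROM t"));
foreach ($it as $row) {
    /* ... */
}
// or drive it manually via the Iterator methods:
$it->rewind();
var_dump($it->current());
?>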
Sep 30: MySQL Query Analyzer and PHP


Today we've released a few things for PHP and MySQL users: one is the first (and probably only) beta of the mysqlnd_ms plugin for MySQL replication and load balancing support; the other is the first GA version of a PHP plugin for the Query Analyzer of the MySQL Enterprise Monitor.
Ulf blogged a lot about mysqlnd_ms, so I don't have to repeat him here. What I want to talk about is the other one. So what is that about?
When running a PHP-based application with MySQL it is often quite interesting to see what actually happens on the database server. Besides monitoring the system load etc., it is often interesting to see which queries are actually executed and which of them are expensive. A part of MySQL Enterprise Monitor is the MySQL Query Analyzer, which helps answer these questions.
Traditionally the MySQL Query Analyzer was based on MySQL Proxy which is configured to sit between the application and the MySQL server and collects all relevant data.
Now in the new 2.3.7 release of the MySQL Enterprise Monitor we have enhanced this for PHP users: We now provide a plugin which can be loaded in PHP and which will provide data for the Query Analyzer directly from within PHP.
By that we don't only reduce the latency of the data collection, but we can also provide more information about the current environment.
In the query detail window you don't only see general query statistics but also a stack trace from the application, so you can immediately identify the part of the application which should be improved. Above you can see a few screenshots I made from this server, showing some insights of this blog where I was testing the plugin.
If you want to learn more, check out the documentation and product pages. Hope you like it!
Sep 14: mysqlnd plugins and json


Some time ago I already wrote about the power included with mysqlnd plugins and how they can be used transparently to help you with your requirements without changing your code. But as mysqlnd plugins in fact are regular PHP extensions, they can also export functions to PHP userland and provide completely new functionality.
In my spare time I'm currently writing a shiny Web 2.0 application where I'm heavily using AJAX-like things. What I do quite often in this application is, basically, this: check some pre-conditions (permissions etc.), then select some data from the database, do a fetch_all to get the complete result set as an array and run it through json_encode; or, to have it in code:
<?php
$m = new MySQLi(/*...*/);
check_whether_the_user_is_checked_in_and_allowed_to_see_this();
$result = $m->query("SELECT a,b,c,d FROM t WHERE e=23");
echo json_encode($result->fetch_all());
?>
Of course that example is simplified as I'm using the Symfony 2 framework for this project. When writing a similar function for the 5th time I wondered whether I really need to create the temporary array and all these temporary elements in it.
So I wrote a mysqlnd plugin.
The mysqlnd_query_to_json plugin (hey, what a name!) provides a single function, mysqlnd_query_to_json(), which takes two parameters, a connection identifier and an SQL query, and returns a JSON string containing the result set. The connection identifier can be a mysql resource, a mysqli object or even a PDO object. The resulting JSON string will be created directly from the network buffer without the need for temporary complex structures. With it, the above example becomes code like this:
<?php
$m = new MySQLi(/*...*/);
check_whether_the_user_is_checked_in_and_allowed_to_see_this();
echo mysqlnd_query_to_json($m, "SELECT a,b,c,d FROM t WHERE e=23");
?>
The plugin, which you can find here, requires PHP 5.4 and has a few limitations, as it knows nothing about MySQL bit fields or escaping of unicode characters for creating fully valid JSON data, and Andrey called it, for good reasons, a hack. Neither did I benchmark it yet, as I merely share it to show what's possible and maybe start some discussion on what is actually needed.
If you want to learn more on these topics I also suggest checking the MySQL Webinar page frequently, as Ulf is going to hold a webinar on mysqlnd plugins in October!
Jul 1: OSCON 2011

This year I'll attend OSCON for the first time. I'll give two talks:
- PHP and MySQL - Recent Developments
  PHP’s MySQL support recently received many changes under the hood. PHP 5.3 introduced mysqlnd – the MySQL native driver which is a replacement for libmysql deeply bound into PHP. In this presentation you will learn what the PHP and MySQL development teams were up to. After starting with an introduction of the PHP stack, demystifying things like mysqli, mysqlnd or PDO, this presentation will show you how to build mysqlnd plugins as PHP C extensions and how to hook into mysqlnd from PHP userland. It will also discuss existing plugins like a client-side query cache or a module for doing read-write splitting, both working transparently, without changes to your application.
- PHP Under the hood
  The beauty of PHP is that everybody can read the code and see the inner workings of the software. But understanding concepts just from reading code often isn’t easy, especially if you are no pro in that language. This presentation will take apart many parts of the PHP runtime and describe the concepts behind them, so attendees understand the inner workings without actually reading C code. Concepts covered include HashTables, the foundation for PHP arrays and many other internal data structures, the reference counting mechanism, which is important for writing efficient code, as well as the overall executor.
In case you can't make it to these talks but want to talk to me, you'll probably find me at the Oracle booth, where I'll also try to give some short talks on topics to be defined (any wishes?).
In case you're not interested in me and my talks but in MySQL, there are a few sessions by other MySQL engineers:
- MySQL Replication Update
- InnoDB: Performance and Scalability Features
- Python Utilities for Managing MySQL Databases
- The MySQL Time Machine