Nov 19: Using Terraform to configure MySQL Database Service

Recently the MySQL Database Service (MDS) was launched in Oracle Cloud Infrastructure (OCI). As the cloud is all about automation, you don't have to use the Web Console to configure your instances, but can do it via API, for instance with the oci command line tool or your favorite programming language. However, it is often nice to define the world in a declarative way ("I want a network, like this, and a MySQL database like that") and let a tool figure out how to align the reality in the cloud with your wish. A tool doing this is Terraform.
With Terraform you can declare the desired state in description files; the tool creates a dependency graph and then applies whatever has to be applied. Of course it supports OCI, and the default OCI provider even comes with direct MDS support.
So let's build a set of description files for a cloud environment. I assume you have a tenancy and want to create a new compartment with its own VCN and a Compute instance with a client, like MySQL Shell, to access the database.
For this configuration create a new empty directory. The first thing we need in there is to tell Terraform that we want to use the OCI provider for accessing OCI. We will come back to it, but for now this will be quite short. Whether we put everything in one file or split it up, and what our files are called, doesn't matter to Terraform. It will scan for all files named something.tf and build its graph. I like relatively small files, one for each aspect of the configuration, but you are of course free to organize them differently. I start with oci.tf for my general configuration:
provider "oci" { version = "~> 3.95" region = var.region }
Here we request the OCI provider in a version compatible with 3.95 and configure our cloud region using a variable. All variables I use, and which can be set or overwritten, go into a file called variables.tf, where I declare region like this:
variable "region" {}
As said, the first thing I want to create is a Compartment. A Compartment in OCI is a grouping of instances from the different services you are using. You can use Compartments, for instance, to group the services different departments of your company use and give them different resource limits, to keep development and production systems separated, or whatever you might need. By using a compartment here, we won't get into conflict with other services you are already using.
This is my compartment.tf:
resource "oci_identity_compartment" "mds_terraform" { name = "mds_terraform" description = "Compartment to house the MySQL Database and Terraform experiment" compartment_id = var.compartment_ocid enable_delete = true }
In the first line we declare that the following is a description of a resource of type oci_identity_compartment, which inside our other Terraform resources will be called mds_terraform. Then we define the name of the compartment we want to have inside OCI; here I'm using the same name both times. This is followed by a description, which might help your colleagues or your later self to understand the purpose. The compartment_id property refers to the parent, as Compartments can be hierarchically nested. Finally, setting the property enable_delete means that Terraform will try to delete the Compartment when we tell it to destroy things.
As the parent Compartment is a variable again, we need to declare it, thus let's extend variables.tf:
variable "compartment_ocid" {}
With the compartment in place, the first thing we need is our network. This is my network.tf:
resource "oci_core_vcn" "mds_terraform_vcn" { cidr_block = "10.0.0.0/16" dns_label = "mdsterraform" compartment_id = oci_identity_compartment.mds_terraform.id display_name = "mds_terraform_vcn" } resource "oci_core_internet_gateway" "internet_gateway" { compartment_id = oci_identity_compartment.mds_terraform.id vcn_id = oci_core_vcn.mds_terraform_vcn.id display_name = "gateway" } resource "oci_core_default_route_table" "default-route-table-options" { manage_default_resource_id = oci_core_vcn.mds_terraform_vcn.default_route_table_id route_rules { network_entity_id = oci_core_internet_gateway.internet_gateway.id cidr_block = "0.0.0.0/0" } } resource "oci_core_subnet" "test_subnet" { cidr_block = "10.0.2.0/24" display_name = "mds_tf_subnet" dns_label = "mdssubnet" security_list_ids = [oci_core_security_list.securitylist1.id] compartment_id = oci_identity_compartment.mds_terraform.id vcn_id = oci_core_vcn.mds_terraform_vcn.id route_table_id = oci_core_vcn.mds_terraform_vcn.default_route_table_id dhcp_options_id = oci_core_vcn.mds_terraform_vcn.default_dhcp_options_id } resource "oci_core_security_list" "securitylist1" { display_name = "securitylist1" compartment_id = oci_identity_compartment.mds_terraform.id vcn_id = oci_core_vcn.mds_terraform_vcn.id egress_security_rules { protocol = "all" destination = "0.0.0.0/0" } ingress_security_rules { protocol = "6" source = "0.0.0.0/0" tcp_options { min = 22 max = 22 } } ingress_security_rules { protocol = "6" source = "0.0.0.0/0" tcp_options { min = 3306 max = 3306 } } ingress_security_rules { protocol = "6" source = "0.0.0.0/0" tcp_options { min = 33060 max = 33060 } } }
This is quite a lot and I won't go through everything here. In short, it declares a VCN with a single subnet, adds an internet gateway (so that we can expose services to the internet and reach the internet from our VCN), and sets ingress and egress firewall rules to only allow traffic to MDS (ports 3306 and 33060) and SSH (port 22). What you might notice is how we refer to the id of the Compartment we created before, by using oci_identity_compartment.mds_terraform.id, and how the different network resources refer to each other in similar ways.
Now it's time to create our MDS instance! Here is my mysql.tf:
data "oci_mysql_mysql_configurations" "shape" { compartment_id = oci_identity_compartment.mds_terraform.id
type = ["DEFAULT"] shape_name = var.mysql_shape } resource "oci_mysql_mysql_db_system" "mds_terraform" { display_name = "Terraform Experiment" admin_username = var.mysql_admin_user admin_password = var.mysql_admin_password shape_name = var.mysql_shape configuration_id =data.oci_mysql_mysql_configurations.shape.configurations[0].id subnet_id = oci_core_subnet.test_subnet.id compartment_id = oci_identity_compartment.mds_terraform.idm.id availability_domain = data.oci_identity_availability_domain.ad.name data_storage_size_in_gb = var.mysql_data_storage_in_gb } output "mysql_url" { value = "mysqlx://${var.mysql_admin_user}:${var.mysql_admin_password}@${oci_mysql_mysql_db_system.mds_terraform.ip_address}:${oci_mysql_mysql_db_system.mds_terraform.port_x}" }
The actual MySQL Database instance is declared in the second block, where we give it a name, configure the administrative user account, assign the subnet etc. Again we introduced some variables, so let's declare them in variables.tf:
variable "mysql_admin_user" { default = "root" } variable "mysql_admin_password" { } variable "mysql_shape" { default = "VM.Standard.E2.1" } variable "mysql_data_storage_in_gb" { default = 50 }
A few fields might need some extra explanation:
The shape is the machine type we want; it defines whether we get a VM, the CPU type, memory and so on. Here we default to VM.Standard.E2.1, which is the smallest type and good enough for an experiment. On a production system you probably want to override this and use a larger shape.
Then MDS allows you to use different configurations, so you can tune MySQL system variables for your application's needs. If you have a custom configuration you can provide its ID, but I want to use the default for that shape, so I use a data source to look it up.
In many cloud regions there are multiple Availability Domains (ADs): distinct data centers close to each other. The resources we created before span across ADs; the MDS instance, however, has to live in a single AD. To look up the AD's ID based on its number we can put this in oci.tf:
data "oci_identity_availability_domain" "ad" { compartment_id = var.compartment_ocid ad_number = var.availability_domain }
And, again, I add another variable to variables.tf.
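For completeness, a matching declaration could look like this (defaulting to the first AD is my assumption here; the original default isn't shown):

variable "availability_domain" {
  default = 1
}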
Now there's one more thing in mysql.tf: an output block. This asks Terraform to print a summary once it is done.
With all these things ready we can execute it! For a start I want to use the Web Console and OCI's Resource Manager. For that I have to package my files, which I do from my command line:
$ zip mds-terraform.zip *.tf
  adding: compartment.tf (deflated 38%)
  adding: mysql.tf (deflated 61%)
  adding: network.tf (deflated 75%)
  adding: oci.tf (deflated 35%)
  adding: variables.tf (deflated 50%)
With that file we can log in to the Console and navigate to the Resource Manager.
After clicking the "Create Stack" button we can use the checkbox to tell the system that we have zip file and then either drag the file from a file manager or browse for the file.
Now we are asked to fill in the configuration variables we defined previously. No surprise that our defaults are pre-filled; however, the system also identified our region and Compartment ID! The Compartment ID suggested is the one in which the Stack was created, which probably is the root, aka the tenancy's ID.
Now you could pick a password for the MySQL user and continue. However, MDS has specific requirements on password security and we would eventually fail later, so let's take a quick side tour and make this form a bit nicer. This can be done by providing a schema.yaml file:
title: "MySQL Terraform Experiment" description: "An experimental Terraform setup to create MySQL Database Service Instances" schemaVersion: 1.1.0 version: "20190304" locale: "en" groupings: - title: "Basic Hidden" visible: false variables: - compartment_ocid - tenancy_ocid - region - title: "General Configuration" variables: - mysql_admin_user - mysql_admin_password variables: compartment_ocid: type: oci:identity:compartment:id # type: string required: true title: "Compartment" description: "The compartment in which to create compute instance(s)" mysql_admin_user: type: string required: true title: "MySQL Admin User" description: "Username for MySQL Admin User" minLength: 1 maxLength: 14 pattern: "^[a-zA-Z][a-zA-Z0-9]+$" mysql_admin_password: type: string required: true title: "MySQL Password" description: "Password for MySQL Admin User" pattern: "^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[!@#$%^&*()_+\\-=\\[\\]{};':\\\"\\\\|,.<>\\/?]).{8,32}$" outputGroups: - title: "MySQL Database Service" outputs: - mysql_url outputs: mysql_url: type: string title: "MySQL Connection String" visible: true
And then packing the zip file again:
$ zip mds-terraform.zip *.tf schema.yaml
updating: compartment.tf (deflated 38%)
updating: mysql.tf (deflated 61%)
updating: network.tf (deflated 75%)
updating: oci.tf (deflated 35%)
updating: variables.tf (deflated 50%)
  adding: schema.yaml (deflated 57%)
In the stack configuration I now click Back, upload the new file and get a nicer form.
So, let's enter a password and we can continue. (IMPORTANT: The password will be stored insecurely in the stack; for production usage you should secure it.)
After completing the Wizard I come to an overview page for the Stack and can then pick the Terraform Apply Action. This will take about 15 minutes and create our resources.
After the process is done I browse to the MySQL Database Service page.
But oh wait, there is no system in the list!? - Yes, because it is in the newly created compartment, so on the left I can select the mds_terraform Compartment. If it doesn't appear in the compartment list, my browser has an outdated version cached and I simply reload the page.
Now we have a MySQL Database Service instance within a VCN, but we can't reach it. Not so good, so I add one more service to my Terraform configuration: a Compute instance with MySQL Shell pre-installed. Here's my compute.tf:
data "oci_core_images" "images_for_shape" { compartment_id = oci_identity_compartment.mds_terraform.id operating_system = "Oracle Linux" operating_system_version = "7.8" shape = var.compute_shape sort_by = "TIMECREATED" sort_order = "DESC" } resource "oci_core_instance" "compute_instance" { availability_domain = data.oci_identity_availability_domain.ad.name compartment_id = oci_identity_compartment.mds_terraform.id display_name = "MySQL Database Service and Terraform Test" shape = var.compute_shape source_details { source_type = "image" source_id = data.oci_core_images.images_for_shape.images[0].id } create_vnic_details { assign_public_ip = true display_name = "primaryVnic" subnet_id = oci_core_subnet.test_subnet.id hostname_label = "compute" } metadata = { ssh_authorized_keys = var.public_key user_data = filebase64("init-scripts/compute-init.sh") } } output "compute_public_ip" { value = oci_core_instance.compute_instance.public_ip }
This creates a VM using the latest Oracle Linux 7.8 image and asks for a public IP address, so we can reach it from the outside. I also reference a script called init-scripts/compute-init.sh, which simply installs MySQL Shell from MySQL's yum repository:
#!/bin/sh
cd /tmp
wget https://dev.mysql.com/get/mysql80-community-release-el7-3.noarch.rpm
sudo rpm -i mysql80-community-release-el7-3.noarch.rpm
sudo yum update
sudo yum install -y mysql-shell
In variables.tf two new variables are to be added: one which asks for an SSH public key, so we can log in to the machine, and one to configure the shape, with a sensible default:
variable "compute_shape" { default ="VM.Standard2.1" } variable "public_key" { }
For adding the new configuration field and the new output to our form in the Resource Manager, schema.yaml needs a few minor updates; for simplicity here is the complete file:
title: "MySQL Terraform Experiment" description: "An experimental Terraform setup to create MySQL Database Service Instances" schemaVersion: 1.1.0 version: "20190304" locale: "en" groupings: - title: "Basic Hidden" visible: false variables: - compartment_ocid - tenancy_ocid - region - title: "General Configuration" variables: - mysql_admin_user - mysql_admin_password - public_key variables: compartment_ocid: type: oci:identity:compartment:id # type: string required: true title: "Compartment" description: "The compartment in which to create compute instance(s)" mysql_admin_user: type: string required: true title: "MySQL Admin User" description: "Username for MySQL Admin User" minLength: 1 maxLength: 14 pattern: "^[a-zA-Z][a-zA-Z0-9]+$" mysql_admin_password: type: password required: true title: "MySQL Password" description: "Password for MySQL Admin User" pattern: "^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[!@#$%^&*()_+\\-=\\[\\]{};':\\\"\\\\|,.<>\\/?]).{8,32}$" public_key: type: string title: "SSH Public Key" description: "An OpenSSH public key for accessing your compute instance" outputGroups: - title: "MySQL Database Service" outputs: - mysql_url - title: "Compute Instance" outputs: - compute_public_ip outputs: mysql_url: type: string title: "MySQL Connection String" visible: true compute_public_ip: type: string title: "Public IP" visible: true
Now I can package it up, again:
$ zip mds-terraform.zip *.tf schema.yaml init-scripts/compute-init.sh
updating: compartment.tf (deflated 38%)
updating: mysql.tf (deflated 62%)
updating: network.tf (deflated 76%)
updating: oci.tf (deflated 35%)
updating: variables.tf (deflated 55%)
updating: schema.yaml (deflated 57%)
  adding: compute.tf (deflated 54%)
  adding: init-scripts/compute-init.sh (deflated 39%)
And go back to the Resource Manager ... oh wait ... the list is empty ... hah ... I'm in the wrong Compartment. Once that hurdle is bypassed I can select the Stack I created previously, click Edit and upload the new file. The wizard will now ask for the SSH key, which I copy from my $HOME/.ssh/id_rsa.pub, before completing the wizard. Then I again pick the Terraform Apply action and can observe how Terraform notices that most things already exist, only the Compute instance is missing, and creates it. A few minutes later the task is completed.
On top of the page a new tab, Application Information, appeared, which, based on information from the schema.yaml file, gives me a mysqlx URL and an IP address. I then use that IP address to connect to the machine, using my SSH key and the username opc. I have to confirm the server identity by typing yes and am on my Compute instance, which is in my VCN. There I can use MySQL Shell with the URL from the Terraform summary to connect to the MySQL instance.
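The SSH login then looks roughly like this (the address is just an example; use the IP from the output):

$ ssh opc@203.0.113.42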
MySQL Shell will by default start in JavaScript mode. If I'm not in the mood for that I can type \sql and switch to SQL mode. I can also install other programs as I like, including my own, and connect to the MDS instance just like to any other MySQL server.
[opc@compute ~]$ mysqlsh 'mysqlx://root:...@10.0.2.5:33060'
MySQL Shell 8.0.22

Copyright (c) 2016, 2020, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type '\help' or '\?' for help; '\quit' to exit.
WARNING: Using a password on the command line interface can be insecure.
Creating an X protocol session to 'root@10.0.2.5:33060'
Fetching schema names for autocompletion... Press ^C to stop.
Your MySQL connection id is 13 (X protocol)
Server version: 8.0.22-u2-cloud MySQL Enterprise - Cloud
No default schema selected; type \use <schema> to set one.
MySQL 10.0.2.5:33060+ ssl JS > \sql
Switching to SQL mode... Commands end with ;
MySQL 10.0.2.5:33060+ ssl SQL > CREATE DATABASE foo;
Query OK, 1 row affected (0.0052 sec)
MySQL 10.0.2.5:33060+ ssl SQL > use foo
Default schema set to `foo`.
Fetching table and column names from `foo` for auto-completion... Press ^C to stop.
MySQL 10.0.2.5:33060+ ssl foo SQL > CREATE TABLE t (id INT);
Query OK, 0 rows affected (0.0236 sec)
MySQL 10.0.2.5:33060+ ssl foo SQL > INSERT INTO t VALUES (1);
Query OK, 1 row affected (0.0139 sec)
MySQL 10.0.2.5:33060+ ssl foo SQL > SELECT * FROM t;
+----+
| id |
+----+
|  1 |
+----+
1 row in set (0.0007 sec)
Once you are done you can go back to the Web Console and the Stack's page and pick the Terraform Destroy Action and all things will be removed again.
Note: It can happen that the cloud-init script hasn't finished yet and MySQL Shell isn't installed. In that case wait a few moments and try again. You might also see an error like mysqlsh: error while loading shared libraries: libpython3.7m.so.1.0: cannot open shared object file: No such file or directory. If that happens, log out and back in. If the error persists, run

export LD_LIBRARY_PATH=/usr/lib/mysqlsh/
mysqlsh

as a workaround.
Now didn't I initially say that I want to automate things and not click around in a Web Console? - Yeah, I did, so let's install the terraform tool locally by downloading it from terraform.io and then change the oci.tf file. Previously I was inside OCI and could use my Web Session for authentication and gathering data. From my local machine I have to configure more. The provider entry now looks like this:
provider "oci" { version = "~> 3.95" region = var.region tenancy_ocid = var.tenancy_ocid user_ocid = var.user_ocid fingerprint = var.fingerprint private_key_path = var.private_key_path }
There are new variables again, so I add them to variables.tf (along with a declaration for tenancy_ocid, if that isn't there yet):
variable "user_ocid" {} variable "fingerprint" {} variable "private_key_path" {}
Now I can run terraform init, which will read the files and download the OCI provider. If I then run terraform apply it will ask me for all those variables. The best way to gather the values is to install the OCI command line tool and run oci setup bootstrap, which will guide you through setting up a client and put the relevant information into your $HOME/.oci/config file.
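To avoid retyping the values on every run, they can also be placed in a terraform.tfvars file, which Terraform picks up automatically. A sketch with made-up placeholder values:

# all OCIDs, the fingerprint and the path below are placeholders
region              = "eu-frankfurt-1"
tenancy_ocid        = "ocid1.tenancy.oc1..exampleuniqueid"
user_ocid           = "ocid1.user.oc1..exampleuniqueid"
fingerprint         = "12:34:56:78:9a:bc:de:f0:12:34:56:78:9a:bc:de:f0"
private_key_path    = "~/.oci/oci_api_key.pem"
compartment_ocid    = "ocid1.tenancy.oc1..exampleuniqueid"
availability_domain = 1
public_key          = "ssh-rsa AAAA... user@host"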
All these files are available on GitHub at https://github.com/johannes/mysql-database-service-terraform-example
Happy MySQLing.
If you want to see how to use a similar setup for running a serverless application using Node.js on OCI you can look at this Hands-on-Lab and I also suggest reserving some time to attend the Oracle Live event with a big MySQL announcement on December 2nd.
Sep 17: Using JSON features to restructure results

Recently, in the MySQL Community Slack, the question came up which clients were connected to a server. The relevant information is available from performance_schema, as most connectors send information about themselves when connecting:
select * from performance_schema.session_connect_attrs;
+----------------+-----------------+------------------------+------------------+
| PROCESSLIST_ID | ATTR_NAME       | ATTR_VALUE             | ORDINAL_POSITION |
+----------------+-----------------+------------------------+------------------+
|            130 | _pid            | 17412                  |                0 |
|            130 | _platform       | x86_64                 |                1 |
|            130 | _os             | Linux-5.4.0            |                2 |
|            130 | _source_host    | maniacmansion          |                3 |
|            130 | _client_name    | mysql-connector-nodejs |                4 |
|            130 | _client_version | 8.0.22                 |                5 |
|            130 | _client_license | GPL-2.0                |                6 |
|            130 | program_name    | mysqlx-shell           |                7 |
|            129 | _client_name    | libmysqlxclient        |                0 |
|            129 | _client_version | 8.0.21                 |                1 |
|            129 | _os             | Linux                  |                2 |
|            129 | _platform       | x86_64                 |                3 |
|            129 | _client_license | GPL                    |                4 |
|            129 | _pid            | 17257                  |                5 |
|            129 | program_name    | mysqlsh                |                6 |
|            131 | _pid            | 17510                  |                0 |
|            131 | _platform       | x86_64                 |                1 |
|            131 | _os             | Linux                  |                2 |
|            131 | _client_name    | libmysql               |                3 |
|            131 | os_user         | johannes               |                4 |
|            131 | _client_version | 8.0.22                 |                5 |
|            131 | program_name    | mysql                  |                6 |
+----------------+-----------------+------------------------+------------------+
22 rows in set (0.0027 sec)
Now this gives us quite some information, but the person asking wanted to have it presented in a nice, transposed way, where for instance _client_name and _client_version each appear in their own column. As I like MySQL's JSON features I decided to use those for the job. (There are other ways, which in many cases might be better: window functions, CTEs, sub-selects, ...; this here is only to show the JSON features.)
The first feature I needed is JSON aggregation. It works similarly to GROUP_CONCAT or other aggregate functions: if we have a GROUP BY clause we can build a JSON object, taking one column as key and one column as value:
SELECT PROCESSLIST_ID,
       JSON_PRETTY(JSON_OBJECTAGG(ATTR_NAME, ATTR_VALUE)) attribs
FROM performance_schema.session_connect_attrs
GROUP BY PROCESSLIST_ID;

| PROCESSLIST_ID | attribs |
| 129 | {
  "_os": "Linux",
  "_pid": "17257",
  "_platform": "x86_64",
  "_client_name": "libmysqlxclient",
  "program_name": "mysqlsh",
  "_client_license": "GPL",
  "_client_version": "8.0.21"
} |
| 130 | {
  "_os": "Linux-5.4.0",
  "_pid": "17412",
  "_platform": "x86_64",
  "_client_name": "mysql-connector-nodejs",
  "_source_host": "maniacmansion",
  "program_name": "mysqlx-shell",
  "_client_license": "GPL-2.0",
  "_client_version": "8.0.22"
} |
| 131 | {
  "_os": "Linux",
  "_pid": "17510",
  "os_user": "johannes",
  "_platform": "x86_64",
  "_client_name": "libmysql",
  "program_name": "mysql",
  "_client_version": "8.0.22"
} |
With this all data is properly aggregated. Now we can use my friend JSON_TABLE to convert the JSON data back into a table:
SELECT PROCESSLIST_ID, i.*
FROM (
    SELECT PROCESSLIST_ID, JSON_OBJECTAGG(ATTR_NAME, ATTR_VALUE) j
    FROM performance_schema.session_connect_attrs
    GROUP BY PROCESSLIST_ID
) s,
JSON_TABLE(
    s.j,
    '$' COLUMNS (
        client_name    VARCHAR(100) PATH "$._client_name",
        client_version VARCHAR(100) PATH "$._client_version"
    )
) AS i;
+----------------+------------------------+----------------+
| PROCESSLIST_ID | client_name            | client_version |
+----------------+------------------------+----------------+
|            129 | libmysqlxclient        | 8.0.21         |
|            130 | mysql-connector-nodejs | 8.0.22         |
|            131 | libmysql               | 8.0.22         |
+----------------+------------------------+----------------+
And yay, we got a readable result about which active session is using which connector and can identify outdated ones. Of course all SQL features remain available, thus adding WHERE i.client_version != '8.0.22' would filter the result accordingly.
Sep 15: MySQL 8.0: JSON Aggregation functions

In MySQL 5.7 the new JSON support had been added. The JSON support consists of three features: JSON as a native datatype, a set of functions to work with JSON documents, and the ability to index JSON data via generated columns.
All three features combined allow building very powerful applications without committing to a fixed data structure on every part, but I was missing one thing a lot: Aggregation.
A topic I'm thinking a lot about is finding ways to select nested data. In MySQL we typically have two ways for this: either one sends multiple queries to retrieve different nesting levels, or one builds JOINs which tend to deliver repetitive responses. A tool we have had for a while is GROUP_CONCAT. With GROUP_CONCAT we can get some of our nested information:
mysql> SELECT blog.*, GROUP_CONCAT(blog_comment.id) AS comments
       FROM blog
       JOIN blog_comment ON blog.id = blog_comment.blog_id
       GROUP BY blog_id;
+----+-------------+----------------------+----------+
| id | title       | body                 | comments |
+----+-------------+----------------------+----------+
|  1 | Hello World | This is my new Blog! | 1,2      |
|  2 | spam trap   | spam goes here       | 3        |
+----+-------------+----------------------+----------+
2 rows in set (0.07 sec)
This gives us the ids for all comments to a specific blog post. But it's not really nice to process. To improve this we can try to abuse GROUP_CONCAT like this:
mysql> SELECT blog.*,
              CONCAT('[',
                     GROUP_CONCAT(
                         JSON_OBJECT("author", blog_comment.author,
                                     "body", blog_comment.body)),
                     ']') AS comments
       FROM blog
       JOIN blog_comment ON blog.id = blog_comment.blog_id
       GROUP BY blog_id;
+----+-------------+----------------------+--------------------------------------------------------------------------------------------------------+
| id | title       | body                 | comments                                                                                               |
+----+-------------+----------------------+--------------------------------------------------------------------------------------------------------+
|  1 | Hello World | This is my new Blog! | [{"body": "Great blog!", "author": "Your Friend"},{"body": "Great new blog", "author": "Best Friend"}] |
|  2 | spam trap   | spam goes here       | [{"body": "Buy these pills to enlarge your blog", "author": "Visitor"}]                                |
+----+-------------+----------------------+--------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
But this has multiple issues, not only is it kind of ugly and error-prone to build our JSON string manually using CONCAT() like this, but also GROUP_CONCAT uses an internal buffer with a default size of only 1024, which, while configurable, can easily be too little for our use case.
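The buffer size is controlled by the group_concat_max_len system variable; when the limit is hit, the value is silently truncated, which here would produce invalid JSON. So if one sticks with this hack, the limit should at least be raised:

SET SESSION group_concat_max_len = 1048576;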
Now in the new development milestone release for MySQL 8.0 we can see two new aggregation functions: JSON_ARRAYAGG() and JSON_OBJECTAGG() which build proper JSON arrays or objects without any need for a hack:
mysql> SELECT blog.*,
              JSON_ARRAYAGG(
                  JSON_OBJECT("author", blog_comment.author,
                              "body", blog_comment.body)
              ) AS comments
       FROM blog
       JOIN blog_comment ON blog.id = blog_comment.blog_id
       GROUP BY blog_id;
+----+-------------+----------------------+--------------------------------------------------------------------------------------------------------+
| id | title       | body                 | comments                                                                                               |
+----+-------------+----------------------+--------------------------------------------------------------------------------------------------------+
|  1 | Hello World | This is my new Blog! | [{"body": "Great blog!", "author": "Your Friend"},{"body": "Great new blog", "author": "Best Friend"}] |
|  2 | spam trap   | spam goes here       | [{"body": "Buy these pills to enlarge your blog", "author": "Visitor"}]                                |
+----+-------------+----------------------+--------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
This is now quite nice and by using the proper aggregation function we can be sure that our result will always be valid JSON, which allows further easy processing.
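For example, the aggregate can be consumed directly by other JSON functions; a small sketch reusing the tables from above, picking each post's first commenter:

SELECT blog.id, blog.title,
       JSON_UNQUOTE(JSON_EXTRACT(
           JSON_ARRAYAGG(JSON_OBJECT("author", blog_comment.author,
                                     "body", blog_comment.body)),
           '$[0].author')) AS first_commenter
FROM blog
JOIN blog_comment ON blog.id = blog_comment.blog_id
GROUP BY blog.id, blog.title;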
Sep 3: Types in PHP and MySQL


Since PHP 7.0 has been released there's more attention on scalar types. Keeping track of types for data from within your application is relatively simple. But when talking to external systems, like a database, things aren't always as one might initially expect.
For MySQL the type we see -- in the first approximation -- is defined by the network protocol. The MySQL network protocol by default converts all data into strings. So if we fetch an integer from the database and use PHP 7's typing feature we get an error:
<?php
declare(strict_types=1);

function getInteger() : int {
    $mysqli = new mysqli(...);
    return $mysqli->query("SELECT 1")->fetch_row()[0];
}
var_dump(getInteger());
?>

Fatal error: Uncaught TypeError: Return value of getInteger() must be of the type integer, string returned in t.php:6
Of course the solution is easy: Either we cast ourselves or we disable the strict mode and PHP will cast for us.
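The cast variant as a minimal sketch:

<?php
declare(strict_types=1);

function getInteger() : int {
    $mysqli = new mysqli(...);
    // the explicit cast satisfies the declared return type
    return (int)$mysqli->query("SELECT 1")->fetch_row()[0];
}
?>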
Now let's take a look at another case. Assume we have an application where we fetch an integer ID from the database. We know MySQL will send us a string, and we treat the ID as opaque data anyway, so we declare the return type as string. Now we refactor the code slightly and make use of prepared statements. What will the result be?
<?php
declare(strict_types=1);

function getId() : string {
    $mysqli = new mysqli(...);
    $stmt = $mysqli->prepare("SELECT 1");
    $stmt->execute();
    return $stmt->get_result()->fetch_row()[0];
}
var_dump(getId());
?>

Fatal error: Uncaught TypeError: Return value of getId() must be of the type string, integer returned in t.php:8
Wait! - What's up there!? -- Didn't I just say that the MySQL protocol will always send a string, thus we retrieve a string in PHP!? - Yes I did and that's true for "direct queries." It's not true for results from prepared statements. With prepared statements the MySQL protocol uses a binary encoding of the data and therefore mysqlnd and mysqli will try to find the matching PHP type. This isn't always possible, especially if we're going into the range of big values. So let's query for PHP_INT_MAX and PHP_INT_MAX+1 and look at the types:
<?php $mysqli = new mysqli(...); $stmt = $mysqli->prepare("SELECT 9223372036854775807, 9223372036854775808"); $stmt->execute(); var_dump($stmt->get_result()->fetch_row()); ?> array(2) { [0]=> int(9223372036854775807) [1]=> string(19) "9223372036854775808" }
Here 9223372036854775807 is the largest value a PHP integer can represent and thus is returned as an integer. 9223372036854775808 however is too large and can't fit into a signed 64bit integer, thus it is converted into a string, as this keeps all information and can still be handled at least to some degree.
Similar things happen with other types which can't be properly represented in PHP:
<?php $mysqli = new mysqli(...); $stmt = $mysqli->prepare("SELECT 1.23"); $stmt->execute(); var_dump($stmt->get_result()->fetch_row()); ?> array(2) { [0]=> string(4) "1.23" }
Yay - yet another wtf! So what is going on this time? -- Well, a literal in SQL is treated as DECIMAL. A DECIMAL field is supposed to be precise. If this were converted into a PHP float aka. double we probably would lose precision, thus treating it as a string again makes sure we're not losing information. If we had a FLOAT or DOUBLE field this could safely be represented as float in PHP:
<?php $mysqli = new mysqli(...); $stmt = $mysqli->prepare("SELECT RAND()"); $stmt->execute(); var_dump($stmt->get_result()->fetch_row()); ?> array(2) { [0]=> float(0.16519711461402206) }
So to summarize:
- For a direct query the MySQL server sends strings, PHP returns all data as string
- For prepared statements MySQL sends data in binary form and PHP will use a corresponding type
- If the value could only be represented with a potential data loss in PHP it is converted to a string by PHP, even with prepared statements
Now we might expect the same when using PDO. Let's check:
<?php $pdo = new PDO("mysql:host=localhost", "...", "..."); $stmt = $pdo->prepare("SELECT 9223372036854775808, RAND()"); $stmt->execute(); var_dump($stmt->fetch(PDO::FETCH_NUM)); ?> array(2) { [0]=> string(1) "1" [1]=> string(18) "0.3217373297752229" }
This example uses prepared statements, but returns strings!? The reason is that PDO by default doesn't use prepared statements on the network layer but an emulation within PHP. This means PHP will replace potential placeholders and then run a direct query. As mentioned above, with a direct query the MySQL server will send strings, thus PHP will represent all data as strings. However, we can easily ask PDO to disable the emulation:
<?php $pdo = new PDO("mysql:host=localhost", "...", "..."); $pdo->setAttribute(PDO::ATTR_EMULATE_PREPARES, false); $stmt = $pdo->prepare("SELECT 1, RAND()"); $stmt->execute(); var_dump($stmt->fetch(PDO::FETCH_NUM)); ?> array(2) { [0]=> int(1) [1]=> float(0.24252333421495) }
This leaves the question whether you should disable the emulation in order to get the correct types. Doing this has some impact on performance characteristics: with native prepared statements there will be a client-server round-trip during the prepare and another round-trip for the execute; with emulation, only during the execute. Native prepared statements also require some server resources to store the handle. However, if a single statement is executed multiple times there might be some savings. Also, the type representation means that different type conversions happen and a different amount of data is transferred. For most cases this shouldn't have notable impact, but in the end only a benchmark will tell.
Hope this helps to give a better understanding, or more confusion.
Jan 24: MySQL Tour Munich


In February my colleague Morgan Tocker will travel through Europe and talk about MySQL 5.7, which recently went GA.
In Munich we're having a meeting for PHPers and other Web developers, where Morgan will talk about MySQL and JSON and I'll also be around.
Summary of the talk:
With the newly added JSON support in MySQL, you can combine the flexibility of NoSQL with the strength of a relational database. In this session, Morgan will explain the new JSON datatype, and the new set of functions for handling JSON documents, available storage and indexing options. In addition, he will present benchmarks showing how much one can expect from the MySQL server in different use case scenarios.
About Morgan: Morgan rejoined the MySQL team at Oracle in 2013 as MySQL Community Manager, having previously worked for MySQL and Sun Microsystems. He blogs from tocker.ca, and especially likes the new features in MySQL that make operations life easier.
The meeting will be held at Oracle's Munich Office, Meeting Room Van Gogh, Ground floor building B, Riesstr. 25 (close to underground station Olympiaeinkaufszentrum) on February 4th at 18:30. Open for anybody interested.
Feb 24: On rumors of "PHP dropping MySQL"


Over the last few days different people asked me for comments about PHP dropping MySQL support. These questions confused me, but meanwhile I figured out where these rumors come from and what they mean.
The simple facts are: No, PHP is not dropping MySQL support and we, Oracle's MySQL team, continue working with the PHP community.
For the long story we first have to remember what "PHP's MySQL support" includes. The key parts are four extensions which live in the main PHP tree:
- ext/mysql
- ext/mysqli
- ext/pdo_mysql
- ext/mysqlnd
The first one, ext/mysql, provides the mysql_* functions. This is the classic interface taught in many (old) books and used by lots of (old) software. mysqli is "mysql improved"; this is a younger extension providing access to all MySQL features. pdo_mysql contains the driver for PDO, PHP's database API abstraction layer. mysqlnd is the MySQL native driver; this module goes mostly unseen and provides a common set of functionality for the three other modules in order to talk to the MySQL Server.
Additionally we maintain a bunch of PECL extensions plugging into mysqlnd, which I won't mention in detail, the PHP documentation has info on most.
Now that's a lot of code, so let's look at what is actually happening:
The old mysql extension, ext/mysql, is old. The code goes back to the early days of PHP and MySQL and carries some bad design decisions. For example, if no explicit connection resource is passed, all functions will try to use the last connection that was used. So take a simple example like this:
<?php mysql_connect("mysql.eample.com", "user", "password"); mysql_select_db("test"); $result = mysql_query("DELETE FROM t WHERE id = 23"); ?>
This might do weird things. Let's assume the connect fails; as the error is not handled, the script continues to run and will call mysql_select_db(). This won't directly fail but will guess and try to connect to a server (most likely on localhost). If that fails, the script still won't terminate, but mysql_query() will again guess and try to connect. If all things come together this will suddenly work and the script will operate on a completely different database than expected, which can have really bad consequences.
But that's not all. As said, the code goes way back; it has grown with PHP and MySQL and tries to be compatible with all versions of MySQL since at least 3.23. All this makes the code hard to maintain.
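For contrast, mysqli can be told to fail loudly instead of silently guessing a connection; a minimal sketch with exception-based error reporting enabled:

<?php
mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

// a failed connect now throws a mysqli_sql_exception instead of
// letting later calls guess which server to talk to
$mysqli = new mysqli("mysql.example.com", "user", "password", "test");
$mysqli->query("DELETE FROM t WHERE id = 23");
?>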
When PHP 5.0, which added mysqli, came along in 2004, it was decided that maintaining this is troublesome and that new features would only go into mysqli (as well as into pdo_mysql, which came along a bit later in PHP 5.1, as long as it makes sense there). We also started to advertise these newer extensions over the old one.
So we lived on for a while, added features to mysqli, fixed a few bugs in mysql, normal operations. Over time we noticed that people still use the old extension even for new projects, cutting themselves off from features (i.e. prepared statements or support for multiple result sets as needed for stored procedures etc.), but we also knew that we can't simply deprecate and remove the extension as it is way too commonly used. So in 2012 we started a "soft deprecation" process, which meant adding deprecation warnings to the documentation and suggesting alternatives using mysqli or PDO.
A bit later, with PHP 5.5, which was released in June 2013, it was decided to add such a deprecation notice to the code, so each time a script connects to a MySQL server using the mysql extension a deprecation notice is triggered.
That's the state we are in, and there is no date by which the old mysql extension will be removed from PHP. At some point in the future it will be - certainly not in the upcoming PHP 5.6, though.
Why not? - Since we are aware of many projects with a long history which can't simply swap this out. One of these projects is Wordpress. And this brings us to the current discussion:
Wordpress is an old project, going back to the days of PHP 4 when there was only the old mysql extension and nothing else. Wordpress also doesn't live on its own but with tons of plugins extending all kinds of features. Some of these go equally long back, many need database access, so many make more or less direct use of ext/mysql. After quite some discussion and heated debate in different channels the Wordpress developers have now decided to do the switch. As they are aware of the trouble this causes for the plugin ecosystem they are careful, though - they actually allow switching between both extensions, mysql and mysqli.
As always with such major changes the debate becomes heated, imprecise statements lose their context, and thus wrong messages circulate. So: nothing to worry about, though I'd like to encourage all users of the old mysql extension to follow Wordpress and other examples and do the switch.
I hope this helped to clear things up!
Oct 9: Sharding PHP with MySQL Fabric


PHP users who attended any of my recent PHP&MySQL related talks or read Ulf's blog will know our mysqlnd_ms plugin. This plugin hooks into PHP's mysqlnd library and provides transparent support for replication and load-balancing features. Without changing your application you get transparent load-balancing and read-write splitting, so all your reading queries will be sent to a slave while the writes go to a master server. The exact strategies for that can be defined in mysqlnd_ms's configuration, so quite often no, or only few, application changes are needed. But we have one limitation: the MySQL servers have to be listed in the configuration on all your PHP servers, which can be annoying when you're changing your environment, like adding a new slave or promoting a machine to master in case the original master fails. But there's help coming!
At this year's MySQL Connect conference we've announced the initial labs release of MySQL Fabric. MySQL Fabric aims to be "an integrated environment for managing a farm of MySQL servers supporting high-availability and sharding." Part of this framework is an RPC interface to query the available servers managed by MySQL Fabric, which delivers the missing piece for mysqlnd_ms.
As this release of Fabric puts the focus on sharding, this is what I want to show here, too. A general introduction to MySQL Fabric and its sharding features can be found on VN's blog, so I'll be quite fast in some areas; for details please refer to the documentation and the mentioned blogs.
The first thing we need is the MySQL Utilities package with Fabric support which is available from labs.mysql.com. After installing this package you have to locate the main.cfg configuration file and configure the storage instance.
[storage]
database = fabric
user = fabric
address = localhost:3300
connection_timeout = 6
password = i do no now
This is a MySQL server where Fabric will store its configuration and such. After that we can initialize Fabric and start the daemon.
$ mysqlfabric setup
$ mysqlfabric start
The setup step creates tables in the configured database and start launches the daemon process. Now we can go and configure our server groups. A server group contains a master server, where the group's data is written to, and a number of slaves to which MySQL will replicate the data. For our sample sharding setup I plan to create two shards and a global group. The purpose of the global group is to hold table definitions and data which Fabric will make available on all systems in our environment. Each of these groups will, in this example, have one master and one slave. This means we need six MySQL server instances running, all on MySQL 5.6. Apart from having binary logging enabled and different server ids, there is no replication configuration needed before running these commands. In my example setup I'm running all of those on one machine; obviously that's only useful for tests:
$ mysqlfabric group create global
$ mysqlfabric group add global 127.0.0.1:3301 root secret
$ mysqlfabric group promote global
$ mysqlfabric group add global 127.0.0.1:3302 root secret
$ mysqlfabric group create shard1
$ mysqlfabric group add shard1 127.0.0.1:3303 root secret
$ mysqlfabric group promote shard1
$ mysqlfabric group add shard1 127.0.0.1:3304 root secret
$ mysqlfabric group create shard2
$ mysqlfabric group add shard2 127.0.0.1:3305 root secret
$ mysqlfabric group promote shard2
$ mysqlfabric group add shard2 127.0.0.1:3306 root secret
So this creates the three groups and configures replication between the servers as needed. With this setup the server on port 3301 will be the global master. 3302, 3303 and 3305 will be 3301's direct slaves, and 3304 will be configured as a slave of 3303, as will 3306 of 3305.
Now we can define our sharding rules. I'm going to use range-based sharding with two shards. The first shard, which is assigned to the server group shard1 created above, will hold shard keys 1 to 9999, and the second shard, in group shard2, will hold data for shard key values 10000 and up. We also define the table fabrictest in the test schema as our sharding table and id as the shard column.
$ mysqlfabric sharding define RANGE global
$ mysqlfabric sharding add_mapping 1 test.fabrictest id
$ mysqlfabric sharding add_shard 1 shard1 ENABLED 1
$ mysqlfabric sharding add_shard 1 shard2 ENABLED 10000
Note that for range-based sharding we don't have to define the upper bound as that is defined by the lower bound of the next shard.
Now we have MySQL Fabric and our MySQL servers configured and can turn to PHP. As mentioned in the beginning we need mysqlnd_ms, to be precise the 1.6.0 alpha release, which we can install using pecl:
$ sudo pecl install mysqlnd_ms-alpha
To configure PHP we first need a mysqlnd_ms configuration file. mysqlnd_ms uses JSON, and a simple configuration using Fabric might look like this:
fabric.json:
{ "test" : { "fabric":{ "hosts": [ { "host": "localhost", "port": 8080 } ] } } }
This configures the application test to use a MySQL Fabric based setup where MySQL Fabric's RPC daemon runs on the local machine. Again: we put everything on one machine for a test, which is not what one would do in a production setup.
Next we locate our system's php.ini file and enable mysqlnd_ms to use our config.
php.ini:

extension=mysqlnd_ms.so
mysqlnd_ms.enable=1
mysqlnd_ms.config_file=/path/to/fabric.json
And now we are finally done and can run a test script.
<?php $c = new mysqli("test", "root", "", "test"); echo "Creating global table:\n"; mysqlnd_ms_fabric_select_global($c, "test.fabrictest"); var_dump($c->query("CREATE TABLE fabrictest (id INT NOT NULL)")); echo "Server used: ".mysqlnd_ms_get_last_used_connection($c)['scheme']."\n\n"; echo "Inserting with ID 10:\n"; mysqlnd_ms_fabric_select_shard($c, "test.fabrictest", 10); var_dump($c->query("INSERT INTO fabrictest VALUES (10)")); echo "Server used: ".mysqlnd_ms_get_last_used_connection($c)['scheme']."\n\n"; echo "Trying to read id 10 from 10010:\n"; mysqlnd_ms_fabric_select_shard($c, "test.fabrictest", 10010); $r = $c->query("SELECT * FROM fabrictest WHERE id = 10"); var_dump($r->fetch_row()); echo "Server used: ".mysqlnd_ms_get_last_used_connection($c)['scheme']."\n\n"; ?>
With this script we do a few things. The first observation is the hostname test: the mysqlnd_ms plugin will recognize that as an application name and refer to its configuration. Second are the mysqlnd_ms_* functions. First we pick the global group and execute a CREATE TABLE operation there. mysqlnd_ms will detect that this is a write operation and therefore connect to the global group's master. This should be 127.0.0.1:3301, which hopefully is printed by the echo call. Then we select the shard responsible for id 10 in the fabrictest table and insert data. mysqlnd_ms will, again, detect that this is a write operation and will therefore figure out where writes to that shard have to go, which is 127.0.0.1:3303. Finally we do an operation which will not really succeed: we select the servers for shard 10010, which is shard2 in our setup, and then query for id 10, the data we stored in shard1. This will query 127.0.0.1:3306 (the slave of 3305 in the shard2 group), which will return an empty result set.
I hope this is enough to get you started; you can now add shards, migrate them, or take servers down and promote slaves to masters etc. and see how the system reacts.
In a future post we will combine this with Doctrine. Stay tuned!
Note: This blog post features labs/alpha releases, which aim at demonstrating functionality. They are not for production use. There might be stability issues, and there certainly are performance restrictions we're working on. We'd love, however, to receive feedback.
Oct 12: Analysing WHERE-clauses in INFORMATION_SCHEMA table implementations

The MySQL Server has a quite simple interface for plugins to create tables inside INFORMATION_SCHEMA. A minimal plugin for creating a table with nothing but a counter might look like this:
static int counter_fill_table(THD *thd, TABLE_LIST *tables, Item *cond)
{
  TABLE *table= tables->table;
  ulonglong value= 0;
  while (1)
  {
    table->field[0]->store(value++, true);
    schema_table_store_record(thd, table); /* hand the row over to the server */
  }
  return 0;
}

static ST_FIELD_INFO counter_table_fields[]=
{
  {"COUNT", 20, MYSQL_TYPE_LONGLONG, 0, MY_I_S_UNSIGNED, 0, 0},
  {0, 0, MYSQL_TYPE_NULL, 0, 0, 0, 0}
};

static int counter_table_init(void *ptr)
{
  ST_SCHEMA_TABLE *schema_table= (ST_SCHEMA_TABLE*)ptr;
  schema_table->fields_info= counter_table_fields;
  schema_table->fill_table= counter_fill_table;
  return 0;
}

static struct st_mysql_information_schema counter_table_info =
{ MYSQL_INFORMATION_SCHEMA_INTERFACE_VERSION };

mysql_declare_plugin(counter)
{
  MYSQL_INFORMATION_SCHEMA_PLUGIN,
  &counter_table_info,           /* type-specific descriptor */
  "COUNTER",                     /* plugin and table name */
  "My Name",                     /* author */
  "An I_S table with a counter", /* description */
  PLUGIN_LICENSE_GPL,            /* license type */
  counter_table_init,            /* init function */
  NULL,                          /* deinit function */
  0x10,                          /* version */
  NULL,                          /* no status variables */
  NULL,                          /* no system variables */
  NULL,                          /* no reserved information */
  0                              /* no flags */
}
mysql_declare_plugin_end;
This is quite straightforward and documented in the MySQL documentation. It also has an obvious issue: it will run forever (at least if we assume we don't run into an out-of-memory situation). Luckily we might have a user who foresees this issue and adds a WHERE clause like here:
SELECT COUNT FROM INFORMATION_SCHEMA.COUNTER WHERE COUNT < 10
Sep 30: MySQL, Memcache, PHP revised


Some time ago I already wrote about the InnoDB memcache daemon plugin for the MySQL server. Back then we had a labs release with a little preview only. Meanwhile quite some time has passed and new developments were made - just in time for the MySQL 5.6 RC announced this weekend by Tomas.
The innodb_memcache daemon plugin is a plugin for the MySQL Server and contains an embedded memcached. This embedded memcached is configured to use MySQL's InnoDB engine as its storage backend. By using this, data stored inside an InnoDB table can be accessed using memcache's key-value protocol. Back in the times of the previous blog post this was limited to data from a single table, which maps easily to the key-value nature of memcache but is a clear limitation. The InnoDB team obviously knew that and improved it:
A user may now define multiple configurations at the same time and can therefore access different tables at the same time, or the same table using different key columns as the memcache key. For accessing the data, the memcache key names are then prefixed with specially formatted configuration names.
Let's take a look at a simple example. Assume we have this configuration inside innodb_memcache.containers:
Name: prefix_test
schema: test
table: test
key column: id
key name: PRIMARY KEY
value columns: lastname
We can then use the memcache configuration using a key like this:
set @@prefix_test.1 Schlüter
get @@prefix_test.1
The first call will store my last name with id=1 in the test.test table. For accessing multiple configurations we simply add entries to the containers list.
Of course we can still access multiple columns, as in the previous version:
Name: test_first_last
schema: test
table: test
key columns: id
key name: PRIMARY
value columns: firstname,lastname
And then we add my firstname:
set @@test_first_last.1 Johannes,Schlüter
get @@test_first_last.1
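From PHP the same data can be read and written with pecl/memcached pointed at the embedded daemon, which by default listens on the usual memcached port 11211; a small sketch:

<?php
$m = new Memcached();
$m->addServer("127.0.0.1", 11211);

// writes go through the InnoDB table behind the container
$m->set("@@prefix_test.1", "Schlüter");
var_dump($m->get("@@test_first_last.1"));
?>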
The configurations above are, obviously, just a short introduction. For full information please check the documentation.
Now this blog entry is tagged as PHP. How is that coming into play? - Well, on the one side we have this fast memcache interface, which allows access to almost arbitrary data from the database. On the other side we have our PHP mysqlnd plugin interface where we can add special features, like query caching or load balancing, transparently to any PHP application. Why not combine those two things? - Good question. That's what we have done in the PECL mysqlnd_memcache PHP extension. This PHP extension is a plugin for mysqlnd intercepting queries sent to the server. In a quick analysis it tries to identify whether an SQL statement can - transparently - be converted into memcache requests. We thereby spend some computing power on the PHP server and gain more performance from the MySQL server. As SQL is a rather complex language and memcache is a quite limited key-value protocol, this will only work for a limited subset of common queries, though.
So let's take a look at some PHP code:
<?php $mysqli = new mysqli("localhost", "usr", "pass", "test"); $memcache = new memcached(); $memcache->addServer("localhost", 11211); mysqlnd_memcache_set($mysqli, $memcache); ?>
Here we create a MySQL connection using mysqli as well as a memcache connection using the pecl/memcached extension. Instead of mysqli we could, as with any mysqlnd plugin, use ext/mysql or the MySQL PDO driver. We then associate the MySQL connection with the memcache connection. As a consequence of this code the mysqlnd_memcache plugin will query the MySQL server for the current memcache configuration. Subsequently it will analyse SQL queries sent to the server:
<?php
$q1 = $mysqli->query("SELECT firstname, lastname FROM test WHERE id = 1");
$q2 = $mysqli->query("SELECT * FROM test");
?>
These are two normal queries, nothing special at first sight. In case there's no error, $q1 and $q2 will hold mysqli_result instances whose rows can be read using fetch_row() or in the other provided ways. But there are things going on behind the scenes: the extension will see that the first query can be translated into a memcache request and will fetch the data using this shortcut. The second query tries to read all data from the table; the memcache protocol provides no way of doing that, so this query will use the "classic" way of sending the SQL to the MySQL server.
In order to be fast and limit the overhead - mind: we have to check every query - we didn't add a full SQL parser to this plugin; the check is done using a regular expression which is enriched with data collected from the MySQL Server. In case this regular expression causes trouble it can be overridden when the initial association is established. There are a few other caveats in the initial 1.0.0-beta release available from PECL, therefore we'd love to hear from you to see what you need and how we can improve your MySQL experience!
Jan 12: Upcoming talks


Over the last few weeks I had been quite silent, but that's about to change: Over the next few weeks I'll give a few presentations. Feel free to join any of those.
- January, 18th: Erstellung hochperformanter PHP-Anwendungen mit MySQL (German)
  MySQL Webinar, Online
- February, 9th: MySQL Konnektoren (German)
  OTN Developer Day: MySQL, Frankfurt, Germany
- February 24th/25th: PHP under the hood (English)
  PHP UK Conference, London, UK
Nov 17: High Performance PHP Session Storage on Scale


One of the great things about the HTTP protocol, besides status code 418, is that it's stateless. A web server therefore is not required to store any information on the user or allocate resources for a user after the individual request is done. Thereby a single web server can handle many, many different users easily, and if it can't anymore, one can add a new server, put a simple load balancer in front and scale out. Each of those web servers then handles its requests without the need for communication, which leads to linear scaling (assuming the network provides enough bandwidth etc.).
Now the Web isn't used for serving static documents only anymore, but for all these fancy web apps, and those applications often need state. The most trivial piece of information they need is the current user. HTTP is a great protocol and provides a way to do authentication which works well with its stateless nature - unfortunately this authentication is implemented badly in current clients: ugly popups, no logout button, ... I don't have to tell you more, I think. For nicer login systems people want web forms.

Now the stateless nature of HTTP is a problem: the user may log in and then browse around, and on later requests it should still be known who that user is - with a custom HTML-form-based login alone this is not possible. A solution might be cookies. At least one might think so for a second. But setting a cookie saying "this is an authorized user" alone doesn't make sense, as it could easily be faked. Better is to store only a random identifier in a cookie and keep the state information on the server. Then all session data is protected and only the user who knows this random identifier is authenticated. If this identifier is wisely chosen and hard to guess, this works quite well. Luckily this is a mostly PHP- and MySQL-focused blog, and as PHP is a system for building web applications, this functionality is part of the core language: the PHP session module.
The session module, which was introduced in PHP 4, partly based on work on the famous phplib library, is quite a fascinating piece of code. It is open and extendable in so many directions, but still so simple to use that everybody uses it; newcomers often learn about it on their first day in PHP land. Of course you can not only store the information whether the user is logged in, but also cache some user-specific data or keep the state of some transactions by the user, like multi-page forms or such.
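Using it takes almost no code; a minimal sketch:

<?php
session_start();                      // create or resume the session

if (!isset($_SESSION['user_id'])) {
    /* not logged in yet, show the login form ... */
}

$_SESSION['user_id'] = 42;            // kept across requests via the session cookie
?>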
In its default configuration, session state is stored on the web server's file system, each session's data in its own file in serialized form. If the filesystem does some caching, or one uses a ramdisk or similar, this can be quite efficient. But as we suddenly have state on the web server, we can't scale as easily as before: if we add a new server and then route a user with an existing session to it, the session data won't be there. That is bad. This is often solved by configuring the load balancer to route all requests from the same user to the same web server ("sticky sessions"). In some cases this works quite well, but it can cause problems. Let's assume you want to take a machine down for maintenance: all sessions there will die. Or imagine there's a bunch of users doing complex and expensive tasks - then one of your servers will have a hard time, giving these users bad response times, which feels like bad service, even though your other systems are mostly idle.
A nice solution for this would be to store the sessions in a central repository which can be accessed from all web servers.
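The post continues from here; as a rough, hedged illustration of the direction (not the post's actual implementation), this is what a MySQL-backed session handler could look like, written with today's PHP 8 syntax and the SessionHandlerInterface added in PHP 5.4. The sessions table layout and all names are assumptions made for this example:

<?php
// Assumed table:
//   CREATE TABLE sessions (
//     id      VARBINARY(128) PRIMARY KEY,
//     data    BLOB NOT NULL,
//     updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP
//             ON UPDATE CURRENT_TIMESTAMP
//   );
class MySQLSessionHandler implements SessionHandlerInterface
{
    public function __construct(private mysqli $db) {}

    public function open(string $path, string $name): bool { return true; }
    public function close(): bool { return true; }

    public function read(string $id): string|false
    {
        $stmt = $this->db->prepare("SELECT data FROM sessions WHERE id = ?");
        $stmt->bind_param("s", $id);
        $stmt->execute();
        $row = $stmt->get_result()->fetch_row(); // get_result() needs mysqlnd
        return $row ? $row[0] : "";
    }

    public function write(string $id, string $data): bool
    {
        $stmt = $this->db->prepare(
            "REPLACE INTO sessions (id, data) VALUES (?, ?)");
        $stmt->bind_param("ss", $id, $data);
        return $stmt->execute();
    }

    public function destroy(string $id): bool
    {
        $stmt = $this->db->prepare("DELETE FROM sessions WHERE id = ?");
        $stmt->bind_param("s", $id);
        return $stmt->execute();
    }

    public function gc(int $max_lifetime): int|false
    {
        $this->db->query(sprintf(
            "DELETE FROM sessions WHERE updated < NOW() - INTERVAL %d SECOND",
            $max_lifetime));
        return $this->db->affected_rows;
    }
}

session_set_save_handler(new MySQLSessionHandler($mysqli), true);
session_start();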
Nov 14: mysqli_result iterations


Over the last few months I wrote quite a few MySQL blog posts but nothing from my "new features in PHP [trunk|5.4]" series. This article is a bridge between the two. The PHP 5.4 NEWS file has a small entry:
MySQLi: Added iterator support in MySQLi. mysqli_result implements Traversable. (Andrey, Johannes)
From the outside it is a really small change and easy to miss. The mentioned class, mysqli_result, implements an interface which adds no new methods. What one can't see is that this relates to some internal C-level functions which can be called by the engine when doing a foreach iteration over objects of this class. So with PHP 5.4 you don't have to use an "ugly" while construct anymore to fetch rows from a mysqli result, but can simply use foreach:
mysqli_report(MYSQLI_REPORT_STRICT);
try {
    $mysqli = new mysqli(/* ... */);
    foreach ($mysqli->query("SELECT a, b, c FROM t") as $row) {
        /* Process $row, which is an associative array */
    }
} catch (mysqli_sql_exception $e) {
    /* an error happened ... */
}
I'm configuring mysqli to throw exceptions on error. This is useful here because mysqli::query() returns false in the case of an error, and passing false to foreach triggers a warning; without exceptions I'd need a temporary variable and a check in front of the foreach loop, whereas with exceptions I simply do the error handling in the catch block.
One thing to note is that mysqli is using buffered results ("store result") by default. If you want to use unbuffered result sets ("use result") you can easily do that by setting the flag accordingly:
foreach ($mysqli->query("SELECT a, b, c FROM t", MYSQLI_USE_RESULT) as $row) { /* ... */ }
People who are advanced with iterators in PHP might ask "Why did you implement only Traversable, not Iterator?" - the main reason is that we simply didn't want to: the mysqli_result class already has quite a few methods, and we didn't want to make the interface confusing. If you need an Iterator instance for some purpose, you can simply wrap mysqli_result in an IteratorIterator.
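For example, a small sketch of that wrapping:

// mysqli_result itself only implements Traversable; wrapping it gives
// you the explicit Iterator methods when an API demands them:
$it = new IteratorIterator($mysqli->query("SELECT a, b, c FROM t"));
$it->rewind();
while ($it->valid()) {
    $row = $it->current();
    /* ... */
    $it->next();
}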
Oct 7: mysqlnd_qc and Symfony2


Previously I wrote about combining Symfony2 and mysqlnd to get more statistics, via the Symfony2 profiler, on what is going on below the surface of the database communication in a Symfony2 application. That's not all that can be done, though, and I gave some ideas for extending it. One idea was adding support for mysqlnd_qc, the client-side query cache plugin for mysqlnd, which caches query results on the client transparently, without changing the application.
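If I recall the plugin's defaults correctly, caching is opt-in per query via an SQL hint, roughly like this (query and table are made up for the example):

// With mysqlnd_qc loaded, this result may be served from the
// client-side cache on repeated execution; the hint has to be the
// very first thing in the query string:
$res = $mysqli->query("/*qc=on*/SELECT id, title FROM posts");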
A nice thing about this plugin, in this context, is the function mysqlnd_qc_get_query_trace_log(), which provides information about each query being executed: not only the query string but also some timing (execution time, result storage time) and a stack trace, so you can see where in the code a query was executed. I've added this functionality to the JSMysqlndBundle, as you can see in the screenshot. I won't show a screenshot of what happens when you click the stack trace link, as that currently breaks the layout a bit - but maybe somebody wants to make this nicer? Or maybe somebody even feels motivated to make it better using mysqlnd_uh (which, as of today, has docs - thanks to Ulf). Feel free to contact me to talk about ideas!
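A hedged sketch of reading that trace log yourself - the entry keys are the ones I remember from the extension docs, so verify them before relying on this; collecting the trace has to be enabled in php.ini (mysqlnd_qc.collect_query_trace=1):

foreach (mysqlnd_qc_get_query_trace_log() as $entry) {
    printf("%s\n  run time: %d  store time: %d\n%s\n\n",
        $entry['query'], $entry['run_time'], $entry['store_time'],
        $entry['origin']); // origin is the recorded stack trace
}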
Oct 2: Symfony 2 and mysqlnd


In a previous blog post I mentioned that I'm working on a small hobby PHP project. As I'm using this project to bring myself up to date with current frameworks, I've decided to use Symfony2. Symfony provides a nice feature, the Symfony Profiler, an extensive logging and reporting system that helps Symfony2 developers understand what's going on. Part of it is the Doctrine query logger, which lists all database queries executed by Doctrine and their execution times.
This is nice, but when we're using mysqlnd in our PHP build we have more information available. "So why not use that information," I thought, and built a new bundle for Symfony2 doing exactly that. The JSMysqlndBundle takes all the 150 or so statistic values mysqlnd collects and makes them visible in the profiler.
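Those counters come from mysqlnd itself, so you can inspect them even without the bundle - mysqli_get_connection_stats() requires a mysqlnd-based build, and the key below is one example counter:

$stats = mysqli_get_connection_stats($mysqli); // per-connection counters
// or mysqli_get_client_stats() for process-wide numbers
echo $stats['rows_fetched_from_server_normal'], " rows fetched\n";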
As this is an initial version - a quick Sunday morning hack - it doesn't have all the features I can imagine. Things one could do include:
- Provide information on caching decisions and behavior when mysqlnd_qc is used
- Provide replication-related decisions when the new mysqlnd replication and load balancing plugin is used
- Take David's mysqlnd_uh-based query logging ideas and provide more information on any executed query
- ....
Sep 30: MySQL Query Analyzer and PHP


Today we've released a few things for PHP and MySQL users: one is the first (and probably only) beta of the mysqlnd_ms plugin for MySQL replication and load balancing support, the other is the first GA version of a PHP plugin for the Query Analyzer of the MySQL Enterprise Monitor.
Ulf blogged a lot about mysqlnd_ms, so I don't have to repeat him here. What I want to talk about is the other one. So what is that about?
When running a PHP-based application with MySQL, it is often quite interesting to see what actually happens on the database server. Besides monitoring the system load etc., it is often interesting to see which queries are actually executed and which of them are expensive. Part of the MySQL Enterprise Monitor is the MySQL Query Analyzer, which helps answer these questions.
Traditionally the MySQL Query Analyzer was based on MySQL Proxy which is configured to sit between the application and the MySQL server and collects all relevant data.
Now in the new 2.3.7 release of the MySQL Enterprise Monitor we have enhanced this for PHP users: We now provide a plugin which can be loaded in PHP and which will provide data for the Query Analyzer directly from within PHP.
This not only reduces the latency of the data collection, but also lets us provide more information about the current environment.
In the query detail window you now see not only general query statistics but also a stack trace from the application, so you can immediately identify the part of the application that should be improved. Above you can see a few screenshots I took from this server while testing the plugin, showing some insights into this blog.
If you want to learn more, check out the documentation and product pages. Hope you like it!