May 14: Firefox Add-ons
Modular software can be a pain: you end up installing tons of add-ons. I recently set up a new desktop box and had to reconfigure Firefox. Here's the list of extensions I installed; I'd be interested in comments about better or missing add-ons.
- Adblock Plus
This is a tricky one. I get content for free in exchange for looking at some ads. That might be a fair deal, but often it's too much. On some sites I allow ads, on others not.
- BetterPrivacy
Prevents Flash cookies from tracking me.
- Brief
An RSS/Atom/... feed reader.
- Certificate Patrol
Notifies me when sites change their SSL certificates, which might prevent fraud.
- Change Referer Button
I don't like being tracked. At least not always.
- CookieSafe
Firefox's cookie handling is damn limited. With this add-on I can edit/delete single cookies way more simply and easily change the per-site settings.
- Firebug
Good for fixing broken sites.
- FoxyProxy Standard
When switching between the company VPN and my local net I need different proxy configurations. With this extension I don't have to go through the settings.
- Greasemonkey
Improves websites according to my interests.
- JSON View
Renders JSON more nicely. Good when working with JSON-based protocols.
- NoScript
Per-site configuration of JavaScript and stuff. Some websites have nice JS-free versions.
- Open in Browser
Adds a new option to the download dialog to open a file in the browser as a web page or plain text. Useful for files a server sends with a "wrong" header.
- Tab Kit
The wider the screen, the more I want to have the tabs on the side. This add-on shows tabs in a sidebar with a tree structure, different colors and such.
- Tamper Data
Edits HTTP requests before the browser sends them. Initially I got this for security research; nowadays I use it to work around issues in bad web UIs.
- User Agent Switcher
Maybe switching the user agent every now and then makes tracking a bit harder; certainly it can be nice to get a different view. Many sites have a mobile version which is focused on the content way more than the regular site.
- Web Developer
Nice tool for working with HTML stuff.
Apr 14: Not only SQL - memcache and MySQL 5.6
This week there are two big events for the MySQL community: The O'Reilly MySQL Conference and Oracle Collaborate run by the IOUG. At these events our Engineering VP, Tomas Ulin, announced the latest milestone releases for our main products, MySQL 5.6 and MySQL Cluster 7.1, as well as our new Windows Installer. There's lots of cool stuff in there but one feature really excited me: MySQL 5.6 contains a memcache interface for accessing InnoDB tables. This means you can access data stored in MySQL not only using SQL statements but also by using a well-established and well-known NoSQL protocol.
This works by running the memcache daemon as a plugin inside the MySQL server. This daemon can then be configured in three ways: Either
- to do what memcached always did - use an in memory hash table to store its data - or
- to access an InnoDB table to store and read data from or
- to use its own hash table in memory and fall back to InnoDB if data is not found directly in memcache.
This combines the power of MySQL and InnoDB's persistent storage with the lightweight protocol memcache uses, which has faster connecting times (no authorization handshake etc.) and faster data access (no SQL parsing, optimization etc.) while you're still able to query the data using SQL when you're doing more complex operations.
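To make the "lightweight protocol" point a bit more concrete, here is a minimal sketch in C of what memcached text-protocol requests look like on the wire. The function names format_set_request and format_get_request are made up for this illustration, and flags and expiry time are simply assumed to be 0; real clients such as the PHP modules do considerably more.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a memcached text-protocol "set" request.
 * Wire layout: set <key> <flags> <exptime> <bytes>\r\n<data>\r\n
 * Flags and exptime are assumed to be 0 in this sketch.
 * Returns the request length, or -1 if buf is too small. */
int format_set_request(char *buf, size_t buf_size,
                       const char *key, const char *value)
{
    int n;
    assert(buf && key && value);
    n = snprintf(buf, buf_size, "set %s 0 0 %zu\r\n%s\r\n",
                 key, strlen(value), value);
    return (n < 0 || (size_t)n >= buf_size) ? -1 : n;
}

/* Hypothetical helper: format a "get" request (get <key>\r\n). */
int format_get_request(char *buf, size_t buf_size, const char *key)
{
    int n;
    assert(buf && key);
    n = snprintf(buf, buf_size, "get %s\r\n", key);
    return (n < 0 || (size_t)n >= buf_size) ? -1 : n;
}
```

There is no statement to parse and no login handshake: a request is one short line, plus the payload for a set.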
Of course I had to give it a run with PHP.
The first step is fetching the MySQL preview release and configuring it accordingly. My colleague Jimmy Yang from the InnoDB team has a nice blog posting showing these first steps. After that we have to configure PHP, where we have two choices: We can use the somewhat older memcache module or the newer memcached module. I've chosen the first one as that was already configured on my system. On most systems the installation should be as easy as querying your package manager or using PECL:
# pecl install memcache
or
# pecl install memcached
And then adding the corresponding entry (extension=memcache[d].so) to your php.ini file.
So let's do a first test from command line:
$ php -r '$m = memcache_connect("localhost", 11211);
          $m->add("key", "value"); var_dump($m->get("key"));'
string(5) "value"
So we store a value in memcache and then load it again to see if it was stored properly. Now we verify the results directly in MySQL:
mysql> SELECT * FROM demo_test WHERE c1 = 'key';
Empty set (0.00 sec)
Uh, what's wrong? - Oh, simple: We didn't read Jimmy's article properly:
If you would like to take a look at what’s in the “demo_test” table, please remember we had batched the commits (32 ops by default) by default. So you will need to do “read uncommitted” select to find the just inserted rows
So we can apply that knowledge and query again:
mysql> set session TRANSACTION ISOLATION LEVEL read uncommitted;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT * FROM demo_test WHERE c1 = 'key';
+------+------+------+------+-------+------+------+------+------+------+------+
| cx   | cy   | c1   | cz   | c2    | ca   | CB   | c3   | cu   | c4   | C5   |
+------+------+------+------+-------+------+------+------+------+------+------+
| NULL | NULL | key  | NULL | value | NULL | NULL |    0 | NULL |    1 | NULL |
+------+------+------+------+-------+------+------+------+------+------+------+
1 row in set (0.00 sec)
And yay! - We see our value in between the other columns for meta-data and other things.
Both PHP modules provide a session handler so you can easily store your session data in memcache/InnoDB. For configuring this we first need to add two entries to our php.ini file:
; when using the "memcache" extension:
session.save_handler=memcache
; when using the "memcached" extension:
; session.save_handler=memcached
session.save_path="tcp://localhost:11211"
After restarting the web server, so that it reads the new configuration, we can test it with a simple script:
<?php
session_start();
echo "<pre>Session ID: ".session_id()."\n";
var_dump($_SESSION);
$_SESSION['foo'] = 'bar';
?>
When first requesting this we will receive an output like
m1h4iqmp6hc7e4l85qlld0gtd1
array(0) {
}
Then we reload the page and see:
m1h4iqmp6hc7e4l85qlld0gtd1
array(1) {
["foo"]=>
string(3) "bar"
}
After that we can, again, look directly into MySQL:
mysql> select * from demo_test where c1 = 'm1h4iqmp6hc7e4l85qlld0gtd1';
+------+------+----------------------------+------+----------------+------+------+------+------+------+------+
| cx   | cy   | c1                         | cz   | c2             | ca   | CB   | c3   | cu   | c4   | C5   |
+------+------+----------------------------+------+----------------+------+------+------+------+------+------+
| NULL | NULL | m1h4iqmp6hc7e4l85qlld0gtd1 | NULL | foo|s:3:"bar"; | NULL | NULL |    0 | NULL |    4 | NULL |
+------+------+----------------------------+------+----------------+------+------+------+------+------+------+
1 row in set (0.00 sec)
I hope this helps you to get started. If you'd like to learn more about MySQL 5.6, MySQL Cluster 7.1 (which, btw, can also be accessed using memcache!), our new installer or such, you can watch a recording of Tomas' keynote or visit dev.mysql.com.
Dec 3: Upload Progress in PHP trunk
File uploads via HTTP are an annoyance. Web browsers know quite a lot about an upload but still give little feedback to the user: some a bit more, most close to none. Over the years this has led to many unhappy users. Over the last year, with all these AJAX things, solutions emerged where one periodically polls the web server on a second connection for the status. Implementing this runs into an architectural problem: PHP implements, for very good reasons, a shared-nothing architecture. So a request on one connection has no insight into another request/connection, but exactly that is needed for upload progress. Different people thought about this and implemented solutions. One of the first was implemented in APC, another one in a special uploadprogress extension. They are nice and found quite some adoption, but they have two problems. For one, they are not fully native to PHP, so they have to be installed additionally; for another, they use local storage to transport the status. APC uses the system's shared memory, uploadprogress the file system (yes, file systems could be shared, I do know that). That is not very satisfying for PHP as the language for solving the web problem.
The obvious solution, of course, would be to use PHP's session handling system for this. The PHP session system is an integral part of PHP and can be configured to use different storage handlers, like the local file system or memcache, which can be useful to share session data in a load-balanced cluster. Now there were some technical issues why this wasn't done at first ... but then Arnaud Le Blanc sat down and created a proper implementation of an upload progress storage handler which has been committed to PHP trunk.
Long story short: In the next version of PHP (5.4?) you will, most likely, have an upload progress mechanism built in.
Arnaud wrote a nice RFC explaining this functionality. So we configure our PHP to enable this feature, by making sure we have the default values:
session.upload_progress.enabled = 1
session.upload_progress.prefix = upload_progress_
session.upload_progress.name = PHP_SESSION_UPLOAD_PROGRESS
And we are set to go. Then we obviously need an HTML file upload form:
<form action="upload.php" method="POST" enctype="multipart/form-data">
  <input type="hidden" name="<?php echo ini_get("session.upload_progress.name"); ?>" value="johannesupload" />
  <input type="file" name="file1" />
  <input type="file" name="file2" />
  <input type="submit" />
</form>
And we can upload a file. If the file is big enough (and the connection slow enough) we can then periodically poll the server and read data about the progress from the $_SESSION["upload_progress_johannesupload"] variable. The full contents are described in the RFC, so I won't quote them here.
So far so good. But how to poll? Well, take the sample by Rasmus about the above-mentioned APC-based solution and adapt it. I leave this as an exercise to the experienced reader.
Disclaimer: This article describes features in unreleased software. Things may change without notice until it is released. Feel free to read other blog postings from my series on new features!
P.S. I still think the browser should give better feedback by itself. It should also offer better integration of HTTP-Auth, like a simple logout button ...
Nov 23: For our safety!
About time:
In light of the terror threat, the government is considering an upgrade of the intelligence services. The Bund Deutscher Kriminalbeamter (Federation of German Criminal Police Officers) demands Bundeswehr support for domestic operations.
(Source)
Then at least the people with the submachine guns at the train station will know how to handle them when shots are fired wildly into the crowd.
The use of these weapons is significant mainly against group targets at short range, since a high density of fire is achieved
(Source)
Oh well ... propaganda threats are such a great thing!
Nov 23: More on scalar type hints in PHP trunk
Some time ago I wrote an article about the implementation of type hints for non-object types in PHP. Meanwhile many things have happened and that implementation was replaced by a different one. Readers of my previous post might know that I have doubts about type hints in PHP. People who met me in person and asked me about it know for sure.
So what's the status now? - Well, type hints for non-object types exist, and they don't. There is a valid syntax which looks like this:
function foo(int $i) {
    echo $i;
}
What's the consequence of this code? - Well the type hint is simply ignored. This means that
foo("Hello world");
will run without any error and print Hello world. So there is a syntax that looks like another part of the language, one which throws errors, but behaves completely differently:
function foo(bar $b) { }
foo("bar");
Catchable fatal error: Argument 1 passed to foo() must be an instance of bar, string given [...]
The int hint is just one of them, there are a few more:
- bool, boolean
- string, binary
- scalar
- numeric
- int, integer, long
- real, double, float
- resource
- object
Nov 22: Changes in PHP trunk: No more extension for sqlite version 2
I plan to continue my series about new stuff in PHP's trunk, but for now just a short note about something which was removed: PHP 5.3 has different ways to access SQLite databases of all kinds. Two of them are provided by the sqlite extension: the sqlite_ group of functions and the pdo_sqlite2 driver. The issue is that this extension depends on the SQLite 2 library, which hasn't been supported by upstream for a few years. It was therefore a logical step to remove this extension from PHP trunk. Support for the sqlite3 extension and the PDO_sqlite driver (same link as above, read it carefully), which use version 3 of the library, continues. Please note that the disk format changed from version 2 to 3, so you might not only have to change the application but also recreate the database file. This change will most likely appear in the next feature release of PHP, which will most likely be called PHP 5.4.
Nov 6: mysqlnd plugins for PHP in practice
If you follow my blog or Twitter stream you might know I've recently been in Barcelona to attend the PHP Barcelona conference. Conferences are great for exchanging ideas. One of the ideas I discussed was with Combell's Thijs Feryn: They are a hosting company providing managed MySQL instances to their customers. As such they run multiple MySQL servers, and each server serves a few of their customers. Now they have to provide every customer with database credentials, including a host name to connect to. The issue is that a fixed hostname takes flexibility out of the setup. Say you have db1.example.com and db2.example.com, and over time you figure out that there are two high-load customers on db1 while db2 is mostly idle. You might want to move the data of one customer over to db2 to share the load. This means you have to ask the customer to change his application configuration at the time you're moving the data. Quite an annoying task.
Now there's a solution: MySQL Proxy. The proxy is a daemon sitting between the application/web servers and MySQL, something like in the picture below.

The proxy can be scripted using Lua, so it is not too hard to implement a feature which chooses the database server to actually connect to. The customer is then told to connect to the proxy and, depending on the username given, he is redirected to a specific system. All the magic happens transparently in the background. This is nice but not without issues: There is one more daemon to monitor, the proxy sitting in between adds latency, and so on.
In case you attended a recent talk by Ulf or me you certainly learned about mysqlnd plugins. We always compare mysqlnd plugins with the MySQL Proxy, so let's take a closer look: The plugins are PHP extensions, usually written in C, hooking into mysqlnd, the native driver for PHP, and overriding parts of mysqlnd's internals. mysqlnd, introduced in PHP 5.3, is the implementation of the MySQL client-server protocol sitting invisibly below the PHP extensions ext/mysql, mysqli and PDO_mysql. This means any plugin to mysqlnd can transparently change the behavior without any changes to the actual application.
Now with this plugin facility we can move the code for the server selection from the proxy directly into PHP. By doing this we will have almost no overhead and, due to the deep integration, less monitoring work and no additional point of failure.

So let's look at the implementation of such a simple plugin: The goal is an extension which overrides the server name given by the user with one set in a special configuration file, so the user is transparently redirected. The configuration file format used is an INI file. As said above, a mysqlnd plugin is a regular PHP extension, even though we usually won't export functions to PHP userland. A quick note before we really start: I won't discuss all parts of the PHP API in detail; please see the resources linked below for more on that.
The first thing PHP looks at while loading an extension is a module entry. In our case there is one special thing: We add a dependency to mysqlnd, to make sure mysqlnd was initialised before this extension is initialised. You can also see that I have chosen the name mysqlnd_server_locator.
static const zend_module_dep mysqlnd_server_locator_deps[] = {
    ZEND_MOD_REQUIRED("mysqlnd")
    {NULL, NULL, NULL}
};

zend_module_entry mysqlnd_server_locator_module_entry = {
    STANDARD_MODULE_HEADER_EX,
    NULL,
    mysqlnd_server_locator_deps,
    "mysqlnd_server_locator",
    NULL,
    PHP_MINIT(mysqlnd_server_locator),
    PHP_MSHUTDOWN(mysqlnd_server_locator),
    NULL,
    NULL,
    NULL,
    "0.1",
    STANDARD_MODULE_PROPERTIES
};
On PHP startup the module initializer, MINIT, is being called. We want to override the connect method from mysqlnd's connection related functions. Additionally I initialize a HashTable which will hold the translation table.
static int plugin_id;
static func_mysqlnd_conn__connect orig_mysqlnd_conn_connect_method;
static HashTable server_list;
static int server_list_init = 0;

PHP_MINIT_FUNCTION(mysqlnd_server_locator)
{
    struct st_mysqlnd_conn_methods *conn_methods;

    /* tell mysqlnd that a plugin is around */
    plugin_id = mysqlnd_plugin_register();

    /* remember the original connect method and install our own */
    conn_methods = mysqlnd_conn_get_methods();
    orig_mysqlnd_conn_connect_method = conn_methods->connect;
    conn_methods->connect = MYSQLND_METHOD(mysqlnd_server_locator, connect);

    if (zend_hash_init(&server_list, 10, NULL, free, 1) == FAILURE) {
        php_error_docref(NULL TSRMLS_CC, E_WARNING, "Failed to init server_list table");
        return FAILURE;
    }
    return SUCCESS;
}
One thing to note here is that I don't actually load the translation table yet. This is due to issues I had while using the ini scanner during PHP's initialization phase, and having the mechanism to load it later has the benefit of being able to update the table without having to restart PHP. Anyway, the above function should be relatively clear: We tell mysqlnd that a plugin is around, store the connection method pointer in a safe place, set our own connection method and then init the HashTable.
During PHP shutdown we will free this table again:
PHP_MSHUTDOWN_FUNCTION(mysqlnd_server_locator)
{
    zend_hash_destroy(&server_list);
    return SUCCESS;
}
Now let's look at the implementation of the overridden connect method. At first this looks complex as it takes tons of parameters, but we simply pass them through and don't have to care about them. All we care about are two things: First we make sure the translation table was initialised, then we look for the username in the table. If the user exists in the table we take the hostname given in the table, else we connect to the host requested by the user.
static enum_func_status MYSQLND_METHOD(mysqlnd_server_locator, connect)(MYSQLND * conn,
        const char *host, const char *user,
        const char *passwd, unsigned int passwd_len,
        const char *db, unsigned int db_len,
        unsigned int port,
        const char * socket_or_pipe,
        unsigned int mysql_flags
        TSRMLS_DC)
{
    char **new_host;
    const char *actual_host = host; /* const, as we never modify the host string */

    /* lazily load the translation table on the first connect */
    if (!server_list_init) {
        mysqlnd_server_locator_init_server_list(TSRMLS_C);
        server_list_init = 1;
    }

    /* known user: use the host from the table, else keep the requested one */
    if (zend_hash_find(&server_list, user, strlen(user) + 1, (void**)&new_host) == SUCCESS) {
        actual_host = *new_host;
    }

    return orig_mysqlnd_conn_connect_method(conn, actual_host, user, passwd, passwd_len, db, db_len, port, socket_or_pipe, mysql_flags TSRMLS_CC);
}
Please note that this method is not thread-safe and should, in this form, only be used in non-threaded environments. This is fixed in a version linked below, which also does one more thing: It will always check whether the ini file was modified since we read it, but let's keep it simple here. As said, the configuration is an ini file which simply consists of username=host pairs:
johannes=db1.example.com
guybrush=db1.example.com
sam=db2.example.com
max=db2.example.com
bernard=db1.example.com
Such files can be parsed by PHP, I won't go into the details of the implementation here.
static void mysqlnd_server_locator_ini_parser_cb(zval *arg1, zval *arg2, zval *arg3, int callback_type, void *list_v TSRMLS_DC)
{
    HashTable *list = (HashTable*)list_v;
    char *hostname;

    if (!arg1 || !arg2) {
        return;
    }

    switch (callback_type)
    {
    case ZEND_INI_PARSER_ENTRY:
        /* plain "username=host" line: store a persistent copy of the host */
        hostname = pestrndup(Z_STRVAL_P(arg2), Z_STRLEN_P(arg2), 1);
        zend_hash_update(list, Z_STRVAL_P(arg1), Z_STRLEN_P(arg1) + 1, &hostname, sizeof(char *), NULL);
        break;
    case ZEND_INI_PARSER_SECTION:
        break;
    case ZEND_INI_PARSER_POP_ENTRY:
        php_error_docref(NULL TSRMLS_CC, E_NOTICE, "Array syntax not allowed in ini file");
        break;
    default:
        php_error_docref(NULL TSRMLS_CC, E_NOTICE, "Unexpected callback_type while parsing server list ini file");
        break;
    }
}

static int mysqlnd_server_locator_init_server_list(TSRMLS_D)
{
    zend_file_handle fh;

    memset(&fh, 0, sizeof(fh));
    fh.filename = "/tmp/server.ini";
    fh.type = ZEND_HANDLE_FILENAME;

    if (zend_parse_ini_file(&fh, 0, ZEND_INI_SCANNER_NORMAL, mysqlnd_server_locator_ini_parser_cb, &server_list TSRMLS_CC) == FAILURE) {
        php_error_docref(NULL TSRMLS_CC, E_WARNING, "Failed to parse server list ini file");
        return FAILURE;
    }
    return SUCCESS;
}
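For readers without the PHP headers at hand, the same flow (read username=host pairs, then pick the host by username with a fallback to the requested host) can be sketched in plain C. This is only an illustration: server_table, server_table_parse and resolve_host are hypothetical names, the parser is a hand-written stand-in for PHP's ini scanner, and a linear scan replaces the HashTable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 16   /* arbitrary limits for this sketch */
#define MAX_FIELD   64

typedef struct {
    char user[MAX_FIELD];
    char host[MAX_FIELD];
} server_entry;

typedef struct {
    server_entry entries[MAX_ENTRIES];
    size_t count;
} server_table;

/* Parse newline-separated "username=host" pairs; lines without '=',
 * with an empty side, or with over-long fields are skipped.
 * Returns the number of entries stored. */
size_t server_table_parse(server_table *t, const char *ini_text)
{
    assert(t && ini_text);
    t->count = 0;
    while (*ini_text && t->count < MAX_ENTRIES) {
        size_t line_len = strcspn(ini_text, "\n");
        const char *eq = memchr(ini_text, '=', line_len);
        if (eq && eq > ini_text) {
            size_t ulen = (size_t)(eq - ini_text);
            size_t hlen = line_len - ulen - 1;
            if (ulen < MAX_FIELD && hlen > 0 && hlen < MAX_FIELD) {
                server_entry *e = &t->entries[t->count++];
                memcpy(e->user, ini_text, ulen);
                e->user[ulen] = '\0';
                memcpy(e->host, eq + 1, hlen);
                e->host[hlen] = '\0';
            }
        }
        ini_text += line_len;
        if (*ini_text == '\n') {
            ini_text++;
        }
    }
    return t->count;
}

/* Host from the table if the user is listed, else the requested host. */
const char *resolve_host(const server_table *t, const char *user,
                         const char *requested_host)
{
    size_t i;
    assert(t && user && requested_host);
    for (i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].user, user) == 0) {
            return t->entries[i].host;
        }
    }
    return requested_host;
}
```

With the pairs from the sample file loaded, resolve_host(&t, "johannes", "localhost") yields db1.example.com, while an unlisted user simply keeps the host he asked for.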
And that's it. Now let's have a look at some PHP code running while this extension is loaded:
$ php -r 'mysql_connect("loalhost", "johannes", "supersecretpasswordforthis");'
Warning: mysql_connect(): php_network_getaddresses: getaddrinfo failed: node name or
service name not known in Command line code on line 1
Warning: mysql_connect(): [2002] php_network_getaddresses: getaddrinfo failed: node
name or servi (trying to connect via tcp://db1.example.com:3306) in Command line code on line 1
Neat, isn't it? - I also packaged this code in a slightly improved version. This version uses a php.ini setting for configuring the location of the extension's ini file, solves the threading issue mentioned above and automatically reloads the configuration file in case it was changed. Note that this code is provided as-is for educational purposes only and I take no responsibility of any form.
This won't solve all problems in the case of Combell, as they want to provide external access or access from other applications, too. But I could imagine a solution using such a plugin for PHP, as the overhead is minimal (in the version above one hash lookup, in the download version one hash lookup and a, well cached, stat call during connect, both of which are negligible), and a proxy-based solution for other systems.
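The "stat call during connect" can be pictured roughly like this. It is a minimal sketch under assumptions, not the code from the download version: config_changed is a hypothetical helper which compares the file's modification time against the one remembered from the last load.

```c
#include <assert.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: returns 1 if the file's modification time differs
 * from *last_mtime (and remembers the new one), 0 if it is unchanged,
 * -1 if stat() fails, e.g. because the file does not exist. */
int config_changed(const char *path, time_t *last_mtime)
{
    struct stat st;
    assert(path && last_mtime);
    if (stat(path, &st) != 0) {
        return -1;
    }
    if (st.st_mtime == *last_mtime) {
        return 0;
    }
    *last_mtime = st.st_mtime;
    return 1;
}
```

A connect hook would call this first and re-parse the ini file only when it returns 1, so the common case costs one stat call and one hash lookup.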
Some more resources:
Oct 31: Slides from IPC and PHP Barcelona
Sep 2: iSCSI devices in VirtualBox
While playing with my disc-less environment, booting via gPXE from an iSCSI device, I found out that VirtualBox has direct support for iSCSI, although this isn't shown in the GUI. To register an iSCSI device as a virtual disc one has to use VBoxManage on the command line:
$ VBoxManage addiscsidisk \
--server iscsi.home.schlueters.de \
--target iqn.1986-03.com.sun:02:f4635770-f010-61be-e8f7-83403206bec5
The result can be checked:
$ VBoxManage list hdds
[...]
UUID:         292b93ef-0fef-41c4-ad9c-700c38d5537e
Parent UUID:  base
Format:       iSCSI
Location:     iscsi.home.schlueters.de|iqn.1986-03.com.sun:02:f4635770-f010-61be-e8f7-83403206bec5
State:        created
Type:         normal
Usage:        test (UUID: 0e9dac71-1a33-42f3-8320-1f157582e931)
Once the disc is registered it can be selected in the GUI (or from command line) while configuring a VM. In my test I once lost the network connection to the iSCSI target, VBox handled that in a safe way by pausing the VM. Quite handy.
Aug 22: HashTables
In case you ever heard me talking about PHP internals, I certainly mentioned something along the lines of "Everything in PHP is a HashTable". That's not true, but next to the zval the HashTable is one of the most important structures in PHP. While preparing my "PHP Under The Hood" talk for the Dutch PHP Conference there was a question on IRC about extension_loaded() being faster than function_exists(), which might seem strange as both of them are simple hash lookups and a hash lookup is said to be O(1). I started to write some slides for it but figured out that I wouldn't have the time to go through it during that presentation, so I'm doing this now, here:
You all should know PHP arrays. They allow you to create a list of elements where every element may be identified by a key. This key may either be an integer or a string. Now we need a way to store this association in an effective way in memory. An efficient way to store a collection of data in memory is a "real" array, an array of elements indexed by a sequence of numbers, starting at 0, up to n. As memory is essentially a sequence of numbered storage blocks (this obviously is simplified, there might be segments and offsets, there might be virtual pages, etc.) we can efficiently access an element by knowing the address of the first element, the size of the elements (we assume all have the same size, which is the case here), and the index: The address of the n-th element is start_address + (n * size_of_element). That's basically what C arrays are.
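The address formula can be written down directly in C. A minimal sketch; element_address is a name made up for this illustration, and it does by hand what the compiler does for values[n]:

```c
#include <assert.h>
#include <stddef.h>

/* Address of the n-th element: start_address + (n * size_of_element).
 * The cast to char * makes the arithmetic count in bytes. */
void *element_address(void *start, size_t n, size_t elem_size)
{
    assert(start != NULL);
    return (char *)start + n * elem_size;
}
```

For an int array named values, element_address(values, 2, sizeof(int)) is the same address as &values[2].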
Now we're not only dealing with C arrays but also want to use string offsets, so we need a function to calculate a numeric value, a hash, for each key. A hash function you most likely know is MD5. MD5 creates a 128 bit numeric value which is often represented using 32 hexadecimal characters. For our purpose 128 bit is a bit much and MD5 is too slow, so the PHP developers have chosen the "DJBX33A (Daniel J. Bernstein, Times 33 with Addition)" algorithm. This hash function is relatively fast and gives us an integer. The trouble with this algorithm is that it is more likely to produce conflicts, that is, two string values which create the same numeric value.
Now back to our C array: To be able to safely read any element, to see whether it is used, we would need to pre-allocate the entire array with space for all possible elements. Given that our hash function returns a system-dependent (32 bit or 64 bit) integer this is quite a lot (size of an element multiplied with the max int value), so PHP does another trick: It throws some digits away. For this a table mask is calculated. The table mask is a power of two minus one, ideally higher than the number of elements in the hash table. In binary representation this is a number where all low bits are set. A binary AND of our hash value and the table mask gives us a number which is never larger than the table mask. Let's look at an example: The hash value of "foobar" is, in decimal digits, 3127933054. We assume a table mask of 2047 (2¹¹-1).
  3127933054   10111010011100000111100001111110
&       2047   00000000000000000000011111111111
=        126   00000000000000000000000001111110
Wonderful - we have an array index, 126, for this string and can set the value in memory!
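The calculation can be replayed with a few lines of C. This is a plain-loop sketch, not the engine's code (the real implementation is unrolled), and it rests on two assumptions: a 32 bit hash value, and a key length which counts the terminating NUL byte of the string. With the NUL byte included the loop reproduces the number quoted above for "foobar".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* DJBX33A, "times 33 with addition": start at 5381, then for every
 * byte of the key: hash = hash * 33 + byte.
 * uint32_t models a 32 bit build; overflow simply wraps around. */
uint32_t djbx33a(const char *key, size_t len)
{
    uint32_t hash = 5381;
    size_t i;
    assert(key != NULL);
    for (i = 0; i < len; i++) {
        hash = hash * 33 + (unsigned char)key[i];
    }
    return hash;
}

/* Throw the upper bits away: binary AND with the table mask. */
uint32_t slot_for(uint32_t hash, uint32_t table_mask)
{
    return hash & table_mask;
}
```

Here djbx33a("foobar", 7), where 7 is six characters plus the NUL byte, returns 3127933054, and slot_for(3127933054u, 2047) returns 126.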
If only life were that easy. Remember: We used a hashing function which is by far not collision free and then dropped almost two thirds of the binary digits by using the table mask. This makes it rather likely that some collisions appear. These collisions are handled by storing values with the same hash in a linked list. So for accessing an element one has to
- Calculate the hash
- Apply the table mask
- locate the memory offset
- check whether this is the element we need, traverse the linked list.
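The four steps can be sketched as a tiny chained hash table in C. This is a toy under assumptions, not PHP's HashTable: the type and function names are made up, the table mask is fixed at 7 so that collisions show up quickly, the hash stops before the terminating NUL byte, and the buckets are owned by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TABLE_MASK 7u   /* 2^3 - 1: deliberately tiny to force collisions */

typedef struct bucket {
    const char *key;
    int value;
    struct bucket *next;   /* next entry which landed in the same slot */
} bucket;

typedef struct {
    bucket *slots[TABLE_MASK + 1];
} hash_table;

/* Step 1: calculate the hash (DJBX33A over the characters of the key). */
static uint32_t hash_key(const char *key)
{
    uint32_t hash = 5381;
    for (; *key; key++) {
        hash = hash * 33 + (unsigned char)*key;
    }
    return hash;
}

/* Link a caller-owned bucket at the head of its slot's list. */
void table_insert(hash_table *t, bucket *b)
{
    uint32_t slot;
    assert(t && b && b->key);
    slot = hash_key(b->key) & TABLE_MASK;
    b->next = t->slots[slot];
    t->slots[slot] = b;
}

/* Returns the bucket for key or NULL; *steps counts list entries visited. */
bucket *table_find(const hash_table *t, const char *key, int *steps)
{
    bucket *b;
    assert(t && key && steps);
    *steps = 0;
    /* Steps 2 and 3: apply the table mask and locate the slot. */
    b = t->slots[hash_key(key) & TABLE_MASK];
    /* Step 4: compare the keys while traversing the linked list. */
    for (; b; b = b->next) {
        (*steps)++;
        if (strcmp(b->key, key) == 0) {
            return b;
        }
    }
    return NULL;
}
```

Putting nine keys into these eight slots guarantees at least one collision, and looking up a key which sits further down its list takes more than one step. That is the kind of difference the original question was about.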
Now the question initially asked was why extension_loaded() might be faster than function_exists(), and we can tell: for many random reasons, and because you have probably chosen a value which collides in one table but not in the other. So the question is how often such collisions happen. For this I did a quick analysis: Given the function table of a random PHP build of mine with 1106 functions listed, I have 634 unique hash values and 210 hash values calculated from different functions. Worst is the value of 471, which represents 6 functions.
Full results are online, but please mind: These results are very specific to my environment. Also mind that the code actually works on a copy of my function table, so the table mask might be different, which changes the results. Also note that the given PHP code won't work for you, as I added special functions exporting the table mask and hash function to my version of PHP. And then yet another note: This level is by far the worst place for doing performance optimizations, as too many unknown factors go in, and you don't get any measurable performance win. Mind readability. Clear code is worth way more than the 2 CPU cycles you probably gain! But you may have learned how hash tables work.
Aug 20: References and foreach
References in PHP are bad, as I mentioned before, and you certainly should avoid using them. Now there is one use case which leads to a behavior that is unexpected at first. I didn't see it as a real-life issue when I first stumbled over it, but then there were a few bug reports about it and recently a friend asked me about it ... so here it goes:
What is the output of this code:
<?php
$a = array('a', 'b', 'c', 'd');
foreach ($a as &$v) { }
foreach ($a as $v) { }
print_r($a);
?>
We are iterating two times over an array without doing anything. So the result should be no change. Right? - Wrong! The actual result looks like this:
Array
(
[0] => a
[1] => b
[2] => c
[3] => c
)
For understanding why this happens let's take a step back and look at the way PHP variables are implemented and what references are:
A PHP variable basically consists of two things: a "label" and a "container". The label is an entry in a hash table (there are a few optimizations in the engine, so it is not always in a hash table, but well) which may represent the symbol table of a function, an array or an object's property table. So we have a name and a pointer to the container. The container, internally called "zval", stores the value and some meta information; this container can also be a new hash table with a set of labels pointing to other containers. If we now create a reference, this causes a second label to point to the same container as another label. From then on both labels have the same power over the container.
Now let's look at the situation from above. In a picture it looks like this:

So we have six containers (the global symbol table on the top, a container holding the array called $a on the left and one container for each element on the right). Now we start the first iteration: The global symbol table gets a new entry for $v, and $v is made a reference to the container of the first array element.

So a change to either $a[0] or $v goes to the same container and therefore has an effect on the other. When the iteration continues, the reference is broken and $v is made a reference to the next element in turn. So after the iteration ends $v is a reference to the last element.

Remember: $v being a reference means that any change to $v affects the other references, in this situation $a[3]. Up till now nothing special happened, but now the second iteration begins. This one assigns the value of the current element to $v at each step. Now $v is a reference to the same container as $a[3], so by assigning a value to $v, $a[3] is changed, too:

This continues for the next steps, too.


And now we can easily guess what will happen at the last step: $v is assigned the value of the last element, $a[3], and as $v is a reference to $a[3] the element is assigned to itself, so effectively nothing happens.

And this is the result we saw above.
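If you prefer code over pictures: the same sequence can be replayed in C, where a pointer plays the role of the reference. This is only a model under the label/container picture from above, and replay_foreach is a name made up for this sketch; PHP's engine works on zvals, not on char pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Replays the two loops on a C array; `v` plays the role of PHP's $v. */
void replay_foreach(char *a, size_t n)
{
    char *v = NULL;
    size_t i;
    assert(a != NULL && n > 0);

    /* foreach ($a as &$v) { } : $v is bound to each element in turn */
    for (i = 0; i < n; i++) {
        v = &a[i];
    }
    /* v still points at the last element, like $v staying bound to $a[3] */

    /* foreach ($a as $v) { } : each value is written through $v */
    for (i = 0; i < n; i++) {
        *v = a[i];
    }
}
```

Starting from a, b, c, d this leaves a, b, c, c in the array, just like the PHP script. In PHP the usual way to avoid the surprise is to call unset($v) after the first loop, which removes the label without touching the container.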
So to make this story full of pictures short: Be careful about references! They can have really strange effects.
Aug 19: MySQL at FrOSCon
Oh, time is flying! - This weekend it is already time for FrOSCon, the Free and Open Source Conference in St. Augustin, close to West Germany's former capital Bonn. The conference consists of a main track and different side tracks, like the PHP developer room and the OpenSQL sub-conference.
In the PHP developer room I will give an overview of things that happened at MySQL in recent times, especially with regard to PHP. My colleague Ulf Wendel will then go and talk about plugins to mysqlnd - the MySQL native driver for PHP - in detail.
In the OpenSQL Camp track you can find other interesting MySQL related talks which will, unfortunately, not leave you with enough time to watch all the interesting talks of the other tracks. And that all for an entrance fee of just 5€! So if you have a chance: Go there and say hi!
Aug 7: Scalar type hints in PHP trunk
So in my blog series I try to cover all additions to PHP trunk so I have to mention scalar type hints.
<?php
function print_float(float $f) {
    echo $f."\n";
}

for ($i = 1; $i < 5; $i++) {
    print_float( $i / 3 );
}
?>
0.33333333333333
0.66666666666667
Catchable fatal error: Argument 1 passed to print_float() must be of the type double, integer given, called in typehints.php on line 7 and defined in typehints.php on line 2
This is the expected behavior in PHP's trunk. If you want such a thing to work, please use the numeric type hint.
In case that wasn't enough fun: There's more!
<?php
function handle_result(int $i) {
    echo $i."\n";
}

$pdo = new PDO("mysql:host=localhost;dbname=test", "user", "pass");
$pdo->setAttribute(PDO::MYSQL_ATTR_DIRECT_QUERY, false);
$result = $pdo->query("SELECT 42 AS id");
$row = $result->fetch();
handle_result($row['id']);

$pdo = new PDO("mysql:host=localhost;dbname=test", "user", "pass");
$pdo->setAttribute(PDO::MYSQL_ATTR_DIRECT_QUERY, true);
$result = $pdo->query("SELECT 42 AS id");
$row = $result->fetch();
handle_result($row['id']);
?>
42
Catchable fatal error: Argument 1 passed to handle_result() must be of the type integer, string given, called in typehints.php on line 16 and defined in typehints.php on line 2
So what happens here? Depending on the PDO::MYSQL_ATTR_DIRECT_QUERY option PDO will either emulate prepared statements (which would be irrelevant here, but that's another story) or use native prepared statements. When using native prepared statements MySQL switches over to its "binary" protocol, which returns the native types; otherwise it always returns strings. When PDO_mysql is linked against mysqlnd the native type is passed on to PHP. If I used libmysql, a string would be returned both times. In other words: the behavior is system dependent. The default behavior for other drivers isn't defined either. As PHP is a dynamically typed language this shouldn't matter, so the results vary depending on your driver. Great. Oh, and did I mention that the types aren't specified, so a later version of a driver might change them? And PDO is just a simple example for this ...
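One defensive option, sketched here under the assumption that you control the calling code, is to normalize the value yourself before it reaches a type-hinted function. The helper name to_int_id() and the simulated rows are made up for illustration, so the sketch runs without a database:

```php
<?php
// Hypothetical helper: force a column value to an integer, no matter
// whether the driver handed back the string "42" or the integer 42.
// Note that is_numeric() also accepts values like "42.5", which the
// cast below truncates to 42.
function to_int_id($value) {
    if (!is_numeric($value)) {
        throw new InvalidArgumentException("Not a numeric id");
    }
    return (int) $value;
}

// Simulated result rows for the two cases described above.
$row_with_string = array('id' => '42'); // driver returned a string
$row_with_native = array('id' => 42);   // driver returned the native type

$first = to_int_id($row_with_string['id']);
$second = to_int_id($row_with_native['id']);

var_dump($first === $second); // bool(true)
```

This keeps the dependency on the driver's return type in one place instead of spreading it over every call site.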
The key point of all this is that as soon as you mix strong typing - by using type hints - with PHP's weak type system you open a can of worms.
And to be overly clear: I can understand the need for strict type systems and see where such a thing helps. But adding a strong type system, stricter than most other languages (hey, you can't pass an integer where a float is expected!), makes little sense to me. But that's just me.
As in my previous posts in this series about features in PHP trunk: please try the snapshots. Mind that these features may or may not end up in the next release of PHP; feedback is welcome.
Jul 31: Features in PHP trunk: Array dereferencing
I was writing about new features in the upcoming PHP version (5.4, 6.0?) before. Today's topic reads like this in the NEWS file:
- Added array dereferencing support. (Felipe)
Now you might wonder what this typically short entry means. When doing OO-style PHP you might make use of a syntax feature which one might call "object dereferencing", which looks like this:
<?php
class Foo {
    public function bar() { }
}
function func() {
    return new Foo();
}
func()->bar();
?>
So one can chain method calls or property access. Now for a long time people requested the same thing for array offset access. This was often rejected due to uncertainties about memory issues, as we don't like memory leaks. But after proper evaluation Felipe committed the patch which allows you to do things like
<?php
function foo() {
    return array(1, 2, 3);
}
echo foo()[2]; // prints 3
?>
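For comparison, here is a small sketch of the workarounds this replaces, assuming a PHP build that includes the feature. The function get_numbers() is a hypothetical stand-in for any function returning an array:

```php
<?php
// Hypothetical function returning an array.
function get_numbers() {
    return array(1, 2, 3);
}

// Without array dereferencing: park the result in a temporary variable ...
$tmp = get_numbers();
$old_style = $tmp[2];

// ... or pick the element out with list(), skipping the first two.
list(, , $via_list) = get_numbers();

// With array dereferencing the temporary goes away.
$new_style = get_numbers()[2];

var_dump($old_style === $new_style && $via_list === $new_style); // bool(true)
```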
Of course this also works with closures:
<?php
$func = function() { return array('a', 'b', 'c'); };
echo $func()[0]; // prints a
?>
And even though the following example is stupid I might accept this feature as one of the few places where it is ok to use references in PHP:
<?php
$data = array('me', 'myself', 'you');
function &get_data() {
    return $GLOBALS['data'];
}
get_data()[2] = 'I'; // $data will now contain 'me', 'myself' and 'I'
?>
Wonderful, isn't it? If you want to test it please take a look at the recent snapshots for PHP trunk and send us your feedback! Please mind that all features in PHP trunk may or may not appear in the next major PHP release.
Jul 25: ZFS and VirtualBox
With ZFS you may not only do "file system" stuff; ZFS may also provide raw block devices ("zvols") which benefit from ZFS space accounting, snapshotting, checksumming, etc. One purpose of these zvols is to export them via iSCSI or to give them to applications which can store data on them. One application I'm using them with is VirtualBox, and as I always forget the exact commands needed to create a zvol and make it available to VBox I decided to write them down.
My reasons for using zvols instead of regular VBox disks are that I can easily snapshot them individually (every 15 minutes a snapshot ...), can easily clone them (around one second and barely any disk space needed to get a clone of a VM to do some experimental stuff...) and can do incremental backups using snapshots and zfs send. That said, there's at least one - possibly - negative factor: a regular virtual disk file can be shared with other people and other operating systems, while a zvol has to be dumped into a regular vmdk first.
Anyways here are the steps needed:
# zfs create -V 10G tank/some_name
# chown johannes /dev/zvol/rdsk/tank/some_name
# VBoxManage internalcommands createrawvmdk \
    -filename /home/johannes/VBoxdisks/some_name.vmdk \
    -rawdisk /dev/zvol/rdsk/tank/some_name
# VBoxManage registerimage disk /home/johannes/VBoxdisks/some_name.vmdk
So first we create the zvol with a size of 10G. This won't be allocated, but everybody asking for the size of the device will get this number back, and it is the maximum size that will be used; as one can use compression and dedup there, the actual usage often might be way less. Then, as I'm running VBox under my user account, I give my user all the rights needed by making it the owner of the device. The third step creates a vmdk file pointing to the raw device location, and the last step registers that file with VirtualBox so a VM can be configured to use it.
Works nicely.

