Dec 30: MySQL Storage Engine based on PHP


Sometimes one has weird ideas, or am I the only one? - This specific one is at least a year old; now, during the Christmas days, waiting for New Year's Eve, I had the time and mood to finally try it out: MySQL 5.1 has a plugin interface to easily add storage engines. PHP can easily be embedded into other applications. So why not combine these two things? - Writing a MySQL Storage Engine which reads data by calling a PHP script.
Let's start with a simple example first:
<?php
function create_table($table, $data) {
    return true;
}
function open_table($table) {
    return new ArrayIterator(array(
        array('id' => 1, 'dat' => 'foo'),
        array('id' => 2, 'a' => 'bar')
    ));
}
?>
This is the bare minimum storage engine my plugin supports. create_table() is called for creating the table, open_table() to access it; the latter then returns an iterator which is used for a full table scan. This example uses an ArrayIterator, which implements the SeekableIterator and the Countable interfaces. The first provides a seek() method, which is called to read specific rows, after sorting for instance; the latter provides a count() method, which gives the optimizer a hint.
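To make the two interfaces a bit more tangible, here is a small sketch that pokes at an ArrayIterator directly from plain PHP. It only exercises the standard SPL methods; how and when the plugin itself calls them is not shown here, and the variable names are just for illustration.

```php
<?php
// The same rows the example open_table() returns.
$rows = new ArrayIterator(array(
    array('id' => 1, 'dat' => 'foo'),
    array('id' => 2, 'a' => 'bar'),
));

// Countable: the number of rows, available without iterating.
$row_count = count($rows);

// SeekableIterator: jump straight to the row at position 1.
$rows->seek(1);
$second = $rows->current();

// A plain full scan; foreach rewinds the iterator first.
$ids = array();
foreach ($rows as $row) {
    $ids[] = $row['id'];
}
```

Any iterator you write yourself can implement the same two interfaces if it is able to jump to a position and knows its size.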
Let's use this table:
mysql> CREATE TABLE php_test (id int, val CHAR(3)) ENGINE=PHP;
Query OK, 0 rows affected (0.04 sec)
mysql> SELECT * FROM php_test;
+------+------+
| id   | val  |
+------+------+
|    1 | foo  |
|    2 | bar  |
+------+------+
2 rows in set (0.00 sec)
Ok, of course that's nice and shiny but well, it's read only. To solve that you can implement a few interfaces provided by the plugin to handle writes:
<?php
class Test extends ArrayIterator
implements MySQLStorage_Writable, MySQLStorage_Updatable, MySQLStorage_Deletable {
    public function write($data) {
        $this[] = $data;
    }
    public function update($data) {
        $this[$this->key()] = $data;
    }
    public function delete() {
        unset($this[$this->key()]);
    }
}
function create_table($table, $data) {
    return true;
}
function open_table($table) {
    return new Test(array(
        array('id' => 1, 'dat' => 'foo'),
        array('id' => 2, 'a' => 'bar')
    ));
}
?>
Again, we can test it:
mysql> UPDATE php_test SET val = 'baz' WHERE id = 1;
Query OK, 1 row affected (0.02 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> DELETE FROM php_test WHERE id = 2;
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO php_test VALUES(3, 'bar');
Query OK, 1 row affected (0.00 sec)
mysql> SELECT * FROM php_test;
+------+------+
| id   | val  |
+------+------+
|    1 | baz  |
|    3 | bar  |
+------+------+
2 rows in set (0.00 sec)
As a reminder: This is calling PHP for all these operations.
So what might real life use cases be, once the major issues in the code are fixed? I have a few ideas like
- A live-logfile query tool, not sure that's really needed; see the primitive Apache httpd access_log parser which is provided with the code as an example
- Combine it with the embedded MySQL server and use this storage engine for your unit tests, "mock tables" ...
Any other ideas? - Leave a comment
Oh, and like most MySQL stuff nowadays: There's a launchpad project for this plugin.
Dec 25: Goto your Christmas presents with PHP 5.3

Over the last few days I already mentioned a few hidden gems from PHP 5.3. Now at Christmas I wanted to take a look at a new language feature of the upcoming PHP version:
Added "jump label" operator (limited "goto"). (Dmitry, Sara)
The entry is a bit imprecise on purpose, since it's not to be advertised too much to not be abused too much, but well, you're reading this on Christmas instead of spending time with your family, so I guess you're a geek and know already: PHP 5.3 introduces not only namespaces but also goto. Yes, it's a "goto label;". The limitations mentioned in the NEWS entry are: You can only jump within the same execution unit (a function or the global part of the same file) and you can't jump into loops.
When you know about goto I'm sure you know it's bad, so why did we add it? - Well, there's a very limited set of problems where it's ok. One is generated code: a code generator using goto can be written way better than without goto, and nobody is supposed to read that code anyway. The second situation is a longer piece of code where you might have to cancel execution in the middle but want to do some cleanup nonetheless. A short pseudo-code example:
<?php
function process_file($filename) {
    $fp = fopen($filename, "r");
    if (!$fp) {
        goto cleanup;
    }
    $row = fread($fp, 1024);
    // do something with the row
    // ($error_while_processing is a placeholder for your own error check)
    if ($error_while_processing) {
        goto cleanup;
    }
    $a_few_bytes = fread($fp, 4);
    // do something again ...
    if ($error_while_processing) {
        goto cleanup;
    }
    /* ... */
cleanup:
    // only close the handle if fopen() succeeded
    if ($fp) {
        fclose($fp);
    }
}
?>
There are alternatives available, like wrapping this all in a loop and using break, or wrapping it in a try {} catch block and throwing exceptions, but goto can be cleaner. So have fun and use it with care!
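For comparison, here is a minimal sketch of the loop-and-break alternative, so you can judge for yourself which one reads better. The function read_record() and its toy file layout (a 4 byte header followed by a body) are made up for this illustration.

```php
<?php
// Hypothetical helper: reads a 4 byte header followed by a body.
// Returns the body on success, or null on any error.
function read_record($filename) {
    $result = null;
    $fp = @fopen($filename, "r");
    // The loop body runs exactly once; break takes the place of goto cleanup.
    do {
        if (!$fp) {
            break;              // nothing was opened
        }
        $header = fread($fp, 4);
        if ($header === false || strlen($header) < 4) {
            break;              // truncated file
        }
        $body = fread($fp, 1024);
        if ($body === false || $body === '') {
            break;              // no body after the header
        }
        $result = $body;
    } while (false);

    // cleanup, reached on every path
    if ($fp) {
        fclose($fp);
    }
    return $result;
}

// Try it with a temporary file: once complete, once truncated.
$tmp = tempnam(sys_get_temp_dir(), 'rec');
file_put_contents($tmp, "HEADpayload");
$ok = read_record($tmp);
file_put_contents($tmp, "HE");
$short = read_record($tmp);
unlink($tmp);
```

It works, but the do { } while (false) wrapper is there only to have something to break out of, which is exactly the kind of noise a cleanup label avoids.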
Dec 22: Improved getopt() in PHP 5.3

So PHP 5.3 has lots of new stuff to offer, so let's take a look at one change:
Added long-option feature to getopt() and made getopt() available also on win32 systems by adding a common getopt implementation into core. (David Soria Parra, Jani)
PHP's focus is clearly on the Web, but using the CLI SAPI you can use it for command line scripts, too. A task common to every command line script is reading parameters from the command line. PHP has offered a getopt() function for some time, which is based on the getopt() C function provided by the operating system. The problem there is that all of these behave a little bit differently and some systems, like Windows, don't offer getopt() at all, so PHP's function is disabled there, too. David now sat down and took an implementation of getopt, which was actually already used by the CLI SAPI itself, and implemented PHP's userland getopt() on top of it. The benefits are stated in the above NEWS entry. Examples can, like in the previous articles of this series, be found in the fine manual.
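As a quick taste, here is a minimal sketch of a script that accepts both short and long options. The option names and the output_target() helper are invented for this example; only getopt() itself and its colon syntax (one colon for a required value, two for an optional one) come from PHP.

```php
<?php
// Short options: -v (flag), -o <file> (required value).
// Long options: --verbose (flag), --output=<file> (required value),
// --level[=<n>] (optional value).
$options = getopt("vo:", array("verbose", "output:", "level::"));

// Flags without a value show up as keys with the value false,
// so isset() is the right check for them.
$verbose = isset($options['v']) || isset($options['verbose']);

// Hypothetical helper: pick the output file from either spelling,
// falling back to stdout.
function output_target($options) {
    if (isset($options['o'])) {
        return $options['o'];
    }
    if (isset($options['output'])) {
        return $options['output'];
    }
    return 'php://stdout';
}

$target = output_target($options);
```

You would call such a script with something like php script.php -v --output=result.txt.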
Dec 19: Data structures in PHP 5.3

In the programming world there are quite a few well understood and explored data structures, which are commonly used in tons of applications. Still, the only things PHP offered till 5.3 in regards to structuring data were arrays (more precisely: hash tables) and objects. So people had to either abuse them for other structures or implement the structures themselves on top of these. Thanks to Etienne things are changing now, and PHP's Standard PHP Library (SPL) extension will offer quite a few standard implementations of data structures. The data structures are provided by these classes:
- SplDoublyLinkedList
- SplStack
- SplQueue (SplPriorityQueue)
- SplHeap (SplMinHeap, SplMaxHeap)
- SplFixedArray
Except for SplFixedArray the ideas behind these classes should be clear. If not, you can read about them over at Wikipedia or in basically every general introduction to programming. For the exact naming and details of the implementation chosen for PHP you can read the fine manual. So why should you use these instead of implementing your own? - There are a few reasons; the important ones are:
- Why do work others have done before again and again? - Code used in more places is better tested and has fewer bugs.
- Having code implemented natively in C is faster than doing this in PHP userland.
- Using default data structures makes the code more maintainable
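To give an impression of how these classes feel in use, here is a minimal sketch touching a few of them. The values are arbitrary; the methods are the ones documented in the manual.

```php
<?php
// SplStack: last in, first out.
$stack = new SplStack();
$stack->push('a');
$stack->push('b');
$top = $stack->pop();

// SplQueue: first in, first out.
$queue = new SplQueue();
$queue->enqueue('first');
$queue->enqueue('second');
$head = $queue->dequeue();

// SplMinHeap: extract() always yields the smallest element.
$heap = new SplMinHeap();
foreach (array(5, 1, 3) as $n) {
    $heap->insert($n);
}
$smallest = $heap->extract();

// SplPriorityQueue: the element with the highest priority comes out first.
$pq = new SplPriorityQueue();
$pq->insert('low', 1);
$pq->insert('high', 10);
$urgent = $pq->extract();
```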
The only question left there might be: "Ok, that's nice and cool and I could use them, but I have to support older PHP versions for at least some time. What can I do?" - The solution there is easy as well: Most stuff in SPL doesn't do any special trick that requires an implementation in C but can also be implemented in userland. In fact most stuff is tested as userland code first to design the interfaces nicely. Others are implemented in PHP after being implemented in C as kind of documentation. These implementations are distributed with PHP source releases or via CVS, so you can simply take that code and use it.
But now let's get back to the one class which might not be clear: SplFixedArray. SplFixedArray is an implementation of a data structure similar to PHP's arrays with a big limitation: It only supports numeric indexes in a predefined range (starting at 0 and going up). Like the ArrayObject from 5.2 it's class based, so it can't be used with all the thousands of array functions PHP offers, but within its domain and functionality it's very fast and uses less memory than classic PHP arrays. So if you have a big array where numeric indexing is enough: Use it!
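A minimal sketch of what that looks like in practice; the size and the values are arbitrary, and toArray() is the escape hatch for the moments when you do need a regular array function.

```php
<?php
$size = 5;
$squares = new SplFixedArray($size);
for ($i = 0; $i < $size; $i++) {
    $squares[$i] = $i * $i;     // array syntax works via ArrayAccess
}

// SplFixedArray implements Countable.
$count = count($squares);

// Array functions want a real array, so convert first.
$sum = array_sum($squares->toArray());

// Indexes outside 0 .. size-1 are rejected with an exception.
$out_of_range = false;
try {
    $squares[$size] = 25;
} catch (RuntimeException $e) {
    $out_of_range = true;
}
```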
Dec 17: NetBeans plugin for running phpt tests


One of the things I do quite often is running PHP's regression test suite which is using a custom test format called .phpt. The PHP source distribution, and CVS checkouts, include a nice script for running them called run-tests.php.
run-tests.php gives a summary of failed tests in the end. As a developer I'm now interested in the reasons for the failures. The test system therefore produces a bunch of files; the relevant ones are a file containing the expected output, one containing the actual output and a diff between these. The problem there is that the diff, for being portable, is using a quite simple mechanism which only shows the lines which differ, without any context. This makes it quite hard to read. Therefore I usually diff the .exp and .out files myself. For doing that I have a few simple shell scripts which I call with the test name and then get a proper diff.
Lately I've changed my way of working and use vim less. I still use it, but I use NetBeans as an IDE more and more. So I thought a bit about that test issue, searched my brain for my Java skills and started playing around to see whether I could manage to write a NetBeans plugin which can run the tests and report the results in a usable way. Over time it looked quite promising, so I registered a launchpad project and imported the code to a repository there. Now I think I've reached a true milestone which should basically work for interested folks, and released a version 0.6.0.
The module registers a menu entry and a toolbar icon to start a wizard. This wizard asks you for the PHP binaries to use and the test directory to run. It will then run the tests and open a window with the results of each test. By double clicking or using a context menu you can then get the actual test code and a diff. If you have PHP support installed in NetBeans the code will also be highlighted.
There are still a few issues: most importantly you need a special version of run-tests.php, which is bundled, and not all things, like redirect tests, are supported. There are certainly other issues, but the aim is mostly to help me. If you're interested feel free to report a bug and maybe I'll look into it. If you're brave you can also fetch the code and start hacking it - but keep in mind I'm no Java developer and had no idea what I was aiming at and how to write NetBeans modules when I started; you can see that in the code.
This code requires Java 6 (since I make use of SwingWorker) and I didn't test older NetBeans versions than 6.5 since that version has proper PHP support and there's little sense in having older versions installed.
P.S. The screenshot is a bit faked: The inner windows don't appear exactly like that, yet, but that's the aim, and you can easily use Drag'n'Drop inside the IDE to arrange them until I find out how to do that from within the code.
Dec 17: A hidden gem in PHP 5.3: fileinfo

PHP 5.3.0 Alpha 3 has recently been released and marks the feature-complete set of stuff the upcoming release will offer. There was lots of talk about namespaces, so other stuff could easily get lost in the long NEWS file. So I plan to present some of the more hidden gems, sometimes just a new function, sometimes new language constructs. This series is not meant to be complete but a personal choice; these blog postings are also no replacement for documentation, just pointers. My goal is that you try out 5.3 right now so we can fix bugs as soon as possible before releasing it.
The NEWS file has a quite short entry for my first subject:
Added fileinfo extension as replacement for mime_magic extension. (Derick)
So what's fileinfo? - Fileinfo is a solution for a quite common problem: Consider you offer downloads and want to set the correct Content-Type HTTP header. How do you get the mime type to use? Many applications go by the file extension, keep a list of them and then set the header accordingly. Problem there: You have just a very limited set of extensions in your list and so chances are high your file isn't there. Other workarounds include calling "file" using exec() or mime_content_type() which has been deprecated for ages.
So why is it better than the old mime_content_type()? The biggest benefit is simpler configuration: for using the old function you had to configure a magic file somewhere; fileinfo includes that file, so there's no need for special configuration. Additionally, fileinfo can work not only on files in the filesystem but, for instance, also on strings in memory. So if you have your data in memory, finfo_buffer() will tell you the mime type of some variable's content, or you can use PHP stream wrappers.
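A minimal sketch of both variants, assuming the fileinfo extension is enabled: one lookup on a string in memory and one on a file on disk. The sample contents and the temporary file are made up for this example.

```php
<?php
// FILEINFO_MIME_TYPE asks for the bare mime type, without the charset.
$finfo = finfo_open(FILEINFO_MIME_TYPE);

// Data that only exists in memory.
$from_string = finfo_buffer($finfo, "Hello, this is plain text.\n");

// A file on disk (a temporary file here, just for illustration).
$tmp = tempnam(sys_get_temp_dir(), 'fi');
file_put_contents($tmp, "%PDF-1.4\n");
$from_file = finfo_file($finfo, $tmp);
unlink($tmp);

// In a download script the result would typically end up in a header:
// header('Content-Type: ' . $from_file);
```

Note that the file name plays no role here; the type is detected from the content alone.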
For people with legacy code there's a nice change included: mime_content_type() is now using fileinfo, too, internally so people using it benefit from the improvements in fileinfo without any trouble. For getting started: Take a look at the fine manual which includes examples to get you started.
Dec 15: OpenSolaris ....
Dec 8: magic_quotes
A German posting about PHP for once: In the context of the "away with magic_quotes" discussion I simply have to quote Karl Valentin:
I would have liked to want to, but I didn't dare to be allowed to. ("Mögen hätt ich schon wollen, aber dürfen hab ich mich nicht getraut.")