There is a lot I don't understand, and this is part of it: Why do people hand all their mail, their contact data, their calendar entries and their documents to a company that, thanks to advertising, knows which websites you hang around on and what interests you? Why buy a phone that also tells that company your exact position, processes speech recognition on central servers and transmits all settings to the manufacturer? Why use a browser whose main job is to collect data and deliver it to its maker? No, I don't understand that.
But what I especially don't understand: Why are people bothered by something like data retention (Vorratsdatenspeicherung), but not by the fact that a company which sits in a country with weaker data protection rules and specializes in collecting and analyzing data gathers far more data? - No, I really don't understand that.
Of course it is nice when data synchronizes between phone book and mail application without a special tool, of course it is nice to have fancy analytics about website users, of course, ... but at what price?
Last year I spoke at eight conferences and attended a few more. At most of them I found myself in discussions about references in PHP, as many users seem to have a wrong understanding of them. Before going too deep into the subject let's start with a quick reminder of what references are and clear up some confusion about objects which are "passed by reference."
References are a way to have multiple variables referencing the same variable container using different names -- so whatever name you're using, an operation on that variable will always have an effect on the others.
Let's look into it with some code to make this all clearer. For a start we simply do a regular assignment from one variable to the other and change it:
<?php
$a = 23;
$b = $a;
$b = 42;
var_dump($a);
var_dump($b);
?>
This script will tell us that $a still is 23 and $b equals 42. So what happened here is that we got a copy (more on what actually happened later...). Now let's do the same with a reference:
<?php
$a = 23;
$b = &$a;
$b = 42;
var_dump($a);
var_dump($b);
?>
Now suddenly $a changes to 42, too. In fact there is no difference between $a and $b and both are using the same internal variable container (aka. zval). The only way to separate these two is by invalidating one of the variables using unset().
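To make the unset() remark concrete, here is a minimal sketch (the variable names are just the ones from the snippets above). It shows that unset() removes only one name and leaves the other variable and its value alone:

```php
<?php
$a = 23;
$b = &$a;      // $a and $b now name the same variable container

unset($b);     // removes the name $b only; $a and its value survive
$b = 42;       // creates a brand-new, independent variable

var_dump($a);  // int(23)
var_dump($b);  // int(42)
```

After the unset() the two names are separated again, so the later assignment to $b no longer touches $a.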
References in PHP can be created not only in regular assignments but also for function parameters and return values:
<?php
function &foo(&$param) {
    $param = 42;
    return $param;
}
$a = 23;
echo "\$a before calling foo(): $a\n";
$b = foo($a);
echo "\$a after the call to foo(): $a\n";
$b = 23;
echo "\$a after touching the returned variable: $a\n";
?>
The result from this is, well what do you expect? Right - it looks like this:
$a before calling foo(): 23
$a after the call to foo(): 42
$a after touching the returned variable: 42
So we initialize a variable and pass it to a function as a reference parameter. The function changes it and it has the new value. The function returns the same variable, we change the returned variable and the original value ... wait, it didn't change!? - Yes, references are mean. What happened is the following: The function returned a reference to the same zval as $a, but the plain = assignment operator created a copy of it.
To fix this we have to add one more &:
$b = &foo($a);
Then the result is what one would expect:
$a before calling foo(): 23
$a after the call to foo(): 42
$a after touching the returned variable: 23
Summary so far: PHP references are aliases for the same variable and using them properly can be hard. For details on the reference counting, which is the basis for this, check the corresponding section in the manual.
When PHP 5 came to life one of the big changes was how objects were handled. The general explanation is something like this:
In PHP 4 objects are treated like other variables so when using them as function parameters or doing assignments they are copied. In PHP 5 they are always passed by reference.
Which isn't entirely correct. The issue to solve was about object-oriented patterns: Objects are passed as parameters to some function or method, this function sends a signal to the object (aka. calls a method) which then might change the object's state (aka. its properties). For this to work the object has to be the same. PHP 4 OO users therefore always passed explicit references, which is, as we saw above, tricky to do correctly. To make this nicer, PHP 5 introduced an object storage which is independent from the variable container. So inside the variable we don't store the whole object anymore (which basically means the properties table plus class information) but a reference to an object inside the object storage. If we create a copy of the variable we don't copy the object but this reference (or: handle), so it feels like a reference. But be aware that it is no reference but a different concept. The difference can be seen by directly changing the variable:
<?php
// create an object and a copy as well as a reference to the variable
$a = new stdclass;
$b = $a;
$c = &$a;
// Do something with the object
$a->foo = 42;
var_dump($a->foo);
var_dump($b->foo);
var_dump($c->foo);
// Now change the variable itself
$a = 42;
var_dump($a);
var_dump($b);
var_dump($c);
?>
When running this you can see that the access to the property really affects the copy, too, but in the last assignment you can see the difference to a reference, as $b is not affected by it. This is the behavior most (all?) people with OO experience expect.
So OO was one valid reason for using references, but as PHP 4 has been dead for over a year now, old code using this should really be cleaned up!
Another reason people use references is that they think it makes the code faster. But this is wrong. It is even worse: References mostly make the code slower!
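The same difference shows up with function parameters. The following is only a sketch with made-up function names (rename_item() and replace_item() are not from any library): a plain parameter receives a copy of the handle, so property changes are visible to the caller while rebinding the parameter is not. Only a real reference parameter can rebind the caller's variable.

```php
<?php
// Hypothetical helper: takes the object handle by value.
function rename_item($item) {
    $item->name = 'changed';   // goes through the handle: the caller sees this
    $item = new stdClass;      // rebinds the local variable only
    $item->name = 'lost';
}

// Hypothetical helper: takes the caller's variable by reference.
function replace_item(&$item) {
    $item = new stdClass;      // real reference: the caller's variable is rebound
    $item->name = 'replaced';
}

$obj = new stdClass;
$obj->name = 'original';

rename_item($obj);
var_dump($obj->name); // string(7) "changed"

replace_item($obj);
var_dump($obj->name); // string(8) "replaced"
```

The second helper is shown for contrast only; for normal OO code the by-value handle is all that is needed.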
Yes, references often make the code slower - Sorry, I just had to repeat this to make it clear.
When coming from other languages people have read in style guides that passing copies of large structures or strings should be avoided, as creating a copy takes time. In some environments complex structures have to be passed as pointers, which is a fundamentally different model from references, and people carry this over to PHP references. But PHP is not that other language but PHP with PHP's runtime, and in PHP we do copy-on-write.
With copy-on-write we don't copy on an assignment or function call but just note that there are multiple independent variables pointing at one and the same variable container. Only when there is a write operation do we separate the variable which is written to from the others. This means that even though a variable looks like a copy it is in fact no copy, and the function call takes no penalty due to big parameters. The problem with references now is that they disable the copy-on-write mechanism, so any following non-reference assignment using this variable will create an immediate copy. This in itself wouldn't be bad - you could simply use references everywhere. Well, not really: PHP is built around the availability of copy-on-write, so most internal functions expect copies.
Somewhere I found code which looks something like this:
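A rough way to watch copy-on-write happen is to look at memory usage around an assignment and around the first write. This is just a sketch: the exact byte counts depend on the PHP version and memory manager, so no concrete numbers are claimed here, only that the assignment is cheap and the first write pays for the copy.

```php
<?php
$original = str_repeat('x', 1000000);    // roughly 1 MB of string data

$before = memory_get_usage();
$copy = $original;                        // no data copied yet, both share one buffer
$afterAssign = memory_get_usage();

$copy[0] = 'y';                           // first write: now the string is duplicated
$afterWrite = memory_get_usage();

printf("cost of the assignment:  %d bytes\n", $afterAssign - $before);
printf("cost of the first write: %d bytes\n", $afterWrite - $afterAssign);

var_dump($original[0]);                   // string(1) "x", the original is untouched
```

The assignment should show up as (close to) zero bytes, while the write costs about as much as the string itself.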
<?php
function foo(&$data) {
    for ($i = 0; $i < strlen($data); $i++) {
        do_something($data[$i]);
    }
}
$string = "... looooong string with lots of data .....";
foo($string);
?>
Now the first issue with this code is obvious: It calls strlen() on every loop iteration although the length never changes. So that's strlen($data) function calls where a single one would be enough. With strlen() this isn't too bad as, unlike in a language like C, strings in PHP directly carry their length, so in general no calculation is needed. But in this case the developer tried to be smart and save time by passing a reference. Well, strlen() expects a copy. Copy-on-write can't be done on references, so $data will be copied for calling strlen(), strlen() will do an absolutely simple operation - in fact strlen() is one of the most trivial functions in PHP - and the copy will be destroyed immediately.
If no reference is used, no copy is needed, which makes the code way faster. And even if strlen() took the reference you wouldn't have won anything.
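For comparison, here is one way the loop could be written without the reference and with the length computed once. It is a sketch only: count_spaces() is a made-up stand-in for whatever per-character work do_something() was doing.

```php
<?php
// Hypothetical example function: plain by-value parameter, no &.
function count_spaces($data) {
    $count = 0;
    $length = strlen($data);            // computed once, outside the loop
    for ($i = 0; $i < $length; $i++) {
        if ($data[$i] === ' ') {
            $count++;
        }
    }
    return $count;
}

$string = "a long string with lots of data";
var_dump(count_spaces($string)); // int(6)
```

Since the function only reads $data, copy-on-write means the string is never duplicated, no matter how long it is.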
Summary so far:
Now a third thing which is done with references is bad API design: returning values via reference parameters. The issue here is, again, that people forget that PHP is PHP and not another language.
In PHP you can return multiple types from the same function - so if the function was successful you could return a string, and a boolean false in case of an error. PHP also allows returning complex structures like arrays and objects, so if multiple things are to be returned they can be packed together. Additionally there are exceptions as a way to return from a function.
Using reference parameters is a bad thing: in addition to the fact that references are bad and cause performance penalties, using references this way makes code hard to maintain. Take a function call like this:
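As a small sketch of these options: parse_dimensions() below is a made-up function that hands back two values packed into an array and signals failure with an exception, so no reference parameter is needed.

```php
<?php
// Hypothetical example function, not part of PHP.
function parse_dimensions($text) {
    $parts = explode('x', $text);
    if (count($parts) !== 2 || !ctype_digit($parts[0]) || !ctype_digit($parts[1])) {
        throw new InvalidArgumentException("Not a dimension: $text");
    }
    // Two results packed into one return value.
    return array('width' => (int)$parts[0], 'height' => (int)$parts[1]);
}

$size = parse_dimensions('640x480');
var_dump($size['width'], $size['height']); // int(640) int(480)

try {
    parse_dimensions('huge');
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n"; // Not a dimension: huge
}
```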
do_something($var);
Would you expect that $var will change? - No. But if do_something() takes it as a reference it could happen.
Another problem with such APIs is that function calls can't be nested; you always have to use a temporary variable. Now, nesting function calls can also reduce readability, but there are enough situations where nesting makes the code clearer.
My personal favorite example for a bad design decision in regards to references is PHP's own sort() function. sort() takes an array as a reference parameter, which will be returned in sorted order by reference. It would be way nicer to return the sorted array as a regular return value. The reason for this is history: sort() is older than copy-on-write. Copy-on-write was introduced with PHP 4, while sort() is way older and comes from times when PHP wasn't really its own language yet but a shortcut to do some things on the Web.
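What such a return-value API could look like is easy to sketch in userland. sorted() here is a hypothetical wrapper, not a PHP function; it relies on the by-value parameter being a separate copy once sort() writes to it.

```php
<?php
// Hypothetical helper, not part of PHP.
function sorted(array $values) {
    sort($values);      // sorts the local copy; the caller's array is untouched
    return $values;
}

$numbers = array(3, 1, 2);
echo implode(',', sorted($numbers)), "\n";                 // 1,2,3
echo implode(',', array_reverse(sorted($numbers))), "\n";  // 3,2,1 (nested, no temporary)
echo implode(',', $numbers), "\n";                         // 3,1,2
```

With a wrapper like this the call can be nested inside other calls, which is exactly what the reference-based sort() prevents.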
To sum it up: References in PHP are bad. Do not use them. They hurt and will just mess with things and do not expect to be able to outsmart the engine with references!
Two years ago at some conference I had a conversation with Sebastian about the need for a way to overload the new operator in PHP so that (possibly badly designed) code can be tested more easily by replacing specific classes with mocks. On the train ride home -- I like coding on a train without the disturbing Internet -- I came up with a proof-of-concept extension for PHP and sent it to Sebastian.
Then we more or less forgot about it or did other things until a few days ago, when both Sebastian and I independently remembered it. Sebastian then pushed the code to GitHub as part of a new test_helpers extension with some documentation, and I fixed some bugs in it. The aim of the extension is to collect functionality which might be beneficial for PHPUnit and other test scenarios but which should never reach a production environment.
Currently only the overloading of new is part of this extension. A simple example might look like this:
<?php
class Foo {}
class Bar {}
function callback($className) {
    if ($className == 'Foo') {
        $className = 'Bar';
    }
    return $className;
}
var_dump(get_class(new Foo));
set_new_overload('callback');
var_dump(get_class(new Foo));
?>
Which will print
string(3) "Foo"
string(3) "Bar"
Today Sebastian was brave and released it as 1.0.0 on PHPUnit's PEAR channel. Please refer to the README for further information.
Ok, so this site (and some other stuff) is now running on OpenSolaris. The previous article was mostly a test entry for me to see whether the DNS update was through, but as some people wonder why I'm using this system that "fails while trying to copy Linux" I decided to discuss some of the reasons in more detail.
Some people already know that my main system meanwhile runs OpenSolaris. The reason there is DTrace - a great way to see what the system is doing, from the kernel, over userspace programs, into a VM like the JVM or PHP's Zend VM, which is a big help while debugging and developing applications. Even though DTrace is meant for doing such analysis on live machines, this wasn't the main reason for this choice on the server. For the server I actually didn't plan a change. Ok, the old Linux box wasn't maintained well, but it worked well enough for the few things it does. But then David came along and had the idea to share a server, so I started thinking about dropping the old contract and getting a new machine for both of us - and possibly some other friends. And there we find the actual reasons for the OS choice:
So we were planning to share a box. As both of us are doing Web/PHP-related stuff it was clear that both of us would likely need special versions and configurations of some software components, which would then conflict with each other. Additionally I want to be able to do a killall apache in case I configured something wrong, and I don't want the others to be affected too much while I configure my web servers as I need/want them. The obvious solution these days? - Virtualization.
Now virtualization comes in many flavors. The simple one most people know is Desktop Virtualization: you take software like VirtualBox, which runs as a regular userspace application and holds a complete operating system stack. In there one has the kernel of the virtualized system, which thinks it's running directly on physical hardware. The big benefit is that one can run any operating system in the VM, but this also has negative effects in areas like disk buffers (the virtualized and the host kernel buffer independently), overall process scheduling (the VM is scheduled by the host and then schedules itself again...) or syscalls (an application running in the VM does a syscall to the VM's kernel, which then calls a hypervisor-provided hardware emulation function, which then triggers a syscall on the host system).
Another approach is Operating System Virtualization like Solaris Zones. Here the operating system handles the virtualization. With zones this works in a way where one has a single kernel and multiple userland instances. This way one has one kernel with one scheduler (ok, Solaris allows using different schedulers and so on - let's ignore this and look at the default) and one disk IO layer. Inside a Zone one has a Zone-specific userland with service management, its own network device (see more on this below), its own user database (/etc/passwd, LDAP, ...) and so on. But as far as the syscall interface is concerned it all runs on one kernel, which also means that all processes are handled equally by the kernel (unless configured otherwise).
The result of using Solaris Zones is that one has a lightweight isolation of independent userland environments. Now, as said, the virtualization has one boundary at the syscall layer, so the userland has to be Solaris - one thinks. But that's not true: There are Branded Zones which emulate another syscall interface. With that one can run a Linux userland on a Solaris kernel, so Linux-only apps benefit from stuff like ZFS and DTrace - but that's not relevant for me here.
So to summarize: Zones are great for lightweight isolation (and other stuff).
Now I was mentioning that each Zone can have its own network interface assigned. This is nice if you have a box with many network devices - but a typical server you get as a root server for little money usually has just one. What you traditionally can do is assign multiple IPs to that device and then share the single device across multiple zones. That works but is inconvenient, as you can't really check the status (which device/zone is producing how much traffic?) or add bandwidth limitations (I want to be able to reduce one zone's bandwidth in case an article is slashdotted, without going too deep into everything, to keep other parts of the system running). Additionally IP addresses are limited and I don't want all zones to be publicly accessible - for instance my MySQL zone can't be reached from the outside.
Now Crossbow - that's the name of the Solaris network virtualization layer introduced with OpenSolaris 2009.06 - was always a "so what" thing for me till I started using it. Well yes, you can create virtual switches and virtual network interfaces. So what? Well, combined with zones I can achieve what I described in the above paragraph.
So let's build a network:
dladm create-etherstub mystub0
dladm create-vnic -l mystub0 vnic0
dladm create-vnic -l mystub0 vnic1
That's all that's needed to create an internal ethernet with two devices. The next step is to assign them to zones and configure IP for this network. In my current setup I have a zone for this web site and one zone for the MySQL server. The MySQL zone has a vnic for an internal network. The web zone has two vnics - one is used for the internal network and the second is configured to work on top of the physical networking device so it can talk to the outside using its own public IP address. For limiting resources and stuff there's the flowadm tool for simple control of network resource limits or service priorities (ssh connections have higher priorities so the system can be controlled in case the network is busy).
And even for me, who tries to stay above the TCP layer, this is quite trivial to set up.
Now one of the most cited features of Solaris is the zfs filesystem. zfs is more than just a filesystem - it's a combination of volume manager, raid controller and other related things. The key feature there for me is snapshotting: zfs uses a copy-on-write mechanism, so creating a snapshot in itself has barely any cost. Only if data is changed is a new block written, while the old one is kept untouched, so a snapshot costs only the space the difference needs. Additionally this allows clones, so one gets a copy of a directory and it will cost space only if data is changed - that's of special interest with zones. As said, each zone is its own userspace system. By using zfs clones they share the same blocks on disk. Really useful. In the next version this will be even better thanks to deduplication in zfs ...
Coming from Linux there are - of course - different problems. As I've been using OpenSolaris on other boxes for some time now I'm used to many administration tools, but I learn new things every time I work on the system.
A bit more problematic is that the main OpenSolaris package repository doesn't offer as much software as typical Linux distributions, but most software packages can be found in other repositories, too. This is a bit annoying, but as one can see the growth and has access to the above-mentioned features this is no big problem - especially on a server where most of the tools exist for Solaris, too.
Oh, and for the German speakers: David and I discussed some of our experience while installing the server in the latest HELDENFunk podcast.
So, this website moved. It isn't the citizen of a Linux box anymore but is running inside a zone on an OpenSolaris host. The only non-default software powering this server that I compiled myself is a current svn snapshot of PHP 5.3.2-dev. Let's see if I can keep this system clean or whether it becomes such a mess as the old Linux box. For now I'm happy about the isolation using zones, snapshots with ZFS before playing around and DTrace in case something goes wrong.
Just a quick heads-up: After quite some time since RC1, PHP 5.3.1RC2 has finally been packaged and released. The PHP bug tracker welcomes reports about issues; I also welcome positive feedback.
Downloads:
(This release candidate is not meant to be used in production systems, wait for the final release for that but please test this version)
... at least I hope so.
This Sunday is the Bundestag election: go there, make two crosses, done. (Well, OK, in Brandenburg and Schleswig-Holstein you get to make a few more crosses.) Keep in mind: There are more than 5 parties, so even if you don't like the "established" parties there is someone you can vote for. Every vote counts! - Whoever doesn't vote shouldn't complain in the end.
For some time now I've had my own election poll via Facebook. Very interesting results there, but there too I'm still hoping for more tips to make the forecast more accurate.
To shed some light on this and get a different perspective I recently built a small Facebook application, the Wahltipp. Unlike the "Sonntagsfrage" (the classic "who would you vote for this Sunday" poll) I don't ask for your own vote but let people guess the result, hoping that collectively a good forecast comes out. In the end a few results come out as tables and charts. I'm curious how well that will match in the end. I'm happy about everyone who takes part. Yes, you have to be a Facebook user; that way I only have to store a user ID generated by Facebook, can do without my own user management and have to store far less personal data...
To learn a bit about PHP Gtk I'm working on a small GUI application to read the PHP manual. It's quite rough and its sole purpose is to play with PHP Gtk. People who know me know that I really love iterators in PHP, so obviously this app is using iterators, too. In this post I want to share an example where iterators are really useful:
As said, the app is about browsing the PHP manual. The manual is provided to the app as a tar.gz and I wanted to have a fulltext search. For accessing the tar.gz content I'm using phar. Yes, phar is not only for phar files but can work on different kinds of archives (tar.gz, tar.bz2, zip), too.
So that's my search implementation:
class FullTextSearch extends FilterIterator {
    protected $needle;

    public function __construct(PharData $archive, $needle) {
        $flags = RecursiveIteratorIterator::LEAVES_ONLY;
        $it = new RecursiveIteratorIterator($archive, $flags);
        parent::__construct($it);
        $this->needle = $needle;
    }

    public function accept() {
        $current = $this->current();
        // This is not 100% perfect but should be good enough for this case:
        if (strpos($current->getFilename(), '.htm') === false) {
            return false;
        }
        // This is bad for larger files ...
        $content = file_get_contents($current->getPathname());
        return strpos($content, $this->needle) !== false;
    }
}

$needle = 'search';
$archive = new PharData('php_manual_en.tar.gz');
$search = new FullTextSearch($archive, $needle);
foreach ($search as $filename => $fileobject) {
    echo "Found in $filename.\n";
}
The code has some places marked which might need some improvement for general-purpose use, but it shows how nice iterator-based solutions can be.
If you aren't used to iterators you most likely wonder what's going on, so let's look into it:
First we need some basic knowledge. An Iterator in PHP is, basically, an object that can be used in foreach statements and does something - an ArrayIterator, for instance, walks over an array returning all the array elements. Now PharData objects are RecursiveDirectoryIterators. This means you can put the phar data object into a foreach statement and you'll get a list of all files - oh wait, it's not that easy. Actually you will receive only the root elements. Confused? - On the one hand I said it's recursive but on the other hand it only returns the root elements? - Well, having a RecursiveIterator means that the object provides methods to check whether the current element has children and can provide an Iterator to iterate over these children. foreach won't call these methods - that's the job of the RecursiveIteratorIterator (RII). The RII is a so-called outer iterator, which means it iterates over the elements of another iterator. In this case it will walk over the files in the archive and will, for every entry, check whether it is a directory. In case the current entry is a directory it will work, recursively, through that directory till all files have been returned.
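The difference between a plain foreach and the RII can be tried without any archive. The sketch below uses a RecursiveArrayIterator over a nested array as a stand-in for the directory tree; the file names are made up.

```php
<?php
// Stand-in for an archive: one file at the root plus a "docs" directory.
$tree = new RecursiveArrayIterator(array(
    'index.html',
    'docs' => array('install.html', 'faq.html'),
));

// Plain foreach: only the root level is visited.
$rootKeys = array();
foreach ($tree as $key => $value) {
    $rootKeys[] = $key;
}
echo implode(', ', $rootKeys), "\n"; // 0, docs

// RecursiveIteratorIterator: descends into children and yields the leaves.
$leaves = array();
foreach (new RecursiveIteratorIterator($tree) as $value) {
    $leaves[] = $value;
}
echo implode(', ', $leaves), "\n"; // index.html, install.html, faq.html
```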
Having this basic knowledge we could write code like this:
$it = new RecursiveIteratorIterator($archive);
foreach ($it as $filename => $fileobject) {
    echo $filename."\n";
}
This gives us a list of all files in the archive (without a second argument the RII runs in its default LEAVES_ONLY mode, so the directory entries themselves are not returned). The next thing I'm having in the first snippet is a FilterIterator. A FilterIterator is - again - an outer iterator doing exactly what the name says: It filters the elements from its inner iterator. For that it calls the accept() method when stepping to the next element. If accept() returns true the element is given to the caller (being the foreach statement or another iterator); if it returns false the element is ignored and the FilterIterator checks the next element. So in this case I only care about elements containing the needle from my search.
With all this there's just one little thing left in the code: Treating directories like files and searching through them won't work, and in this case I absolutely don't care about the directories themselves, so I explicitly ask the RecursiveIteratorIterator to step over the directory handles and go directly to the children.
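The filtering part can also be looked at in isolation. This is a minimal sketch, assuming PHP 7 or later for the return type; SuffixFilter is a made-up class and the file names are invented. It uses the same accept() mechanism as the search above, only on a plain ArrayIterator.

```php
<?php
// Hypothetical filter: keeps only elements ending in a given suffix.
class SuffixFilter extends FilterIterator
{
    private $suffix;

    public function __construct(Iterator $inner, $suffix)
    {
        parent::__construct($inner);
        $this->suffix = $suffix;
    }

    public function accept(): bool
    {
        // current() is the element of the inner iterator under inspection.
        return substr($this->current(), -strlen($this->suffix)) === $this->suffix;
    }
}

$files = new ArrayIterator(array('a.html', 'b.png', 'c.html'));
foreach (new SuffixFilter($files, '.html') as $name) {
    echo $name, "\n"; // prints a.html, then c.html
}
```

Because the filter is just another iterator, it can be stacked on top of a RecursiveIteratorIterator in exactly the same way.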
In a previous blog post I mentioned the possibility of using non-ASCII characters as part of identifiers. Nils Langner, who runs the German "PHP hates me (but that's ok)" blog, then wondered whether it makes sense to use German (in his case) terms for identifiers in code. In the comments it was also discussed which language is best used in code comments.
The argument for using only English is simple: Every developer should know at least basic English, and using English helps when outsourcing, open sourcing or reaching new markets.
Now there are arguments against English, too. One is that not all developers are capable of using English properly, so reading and writing comments is more difficult for them - now you don't have to write long prosaic texts in comments, but still ...
The other argument is about domain-specific terminology. In some cases - for instance when bound to legal environments - such terms can't be translated properly. For instance when dealing with financial data, accounting by US-GAAP differs in many parts from the German accounting rules: For some positions you find translations which have the same meaning but are calculated slightly differently, so using the English term would imply the values were calculated by the US rules (sorry British folks for this example - I have no idea about the British terminology and accounting rules, but have a basic understanding of German rules and US-GAAP). Ignoring this problem you still have to consider that you'd still talk German to your customer, so the customer would use German terminology and you'd have to keep a dictionary for these terms, which is a pain for everybody involved.
So now one could use mostly English with some local terms, which gives a nice mixture ("function getEigenkapitalrendite()"), or invent completely new terminology ("function erhalteEigenkapitalrendite()"). The mixture looks stupid in comments, too ("The Eigenkapitalrendite is depending on the Gewinn and the Eigenkapital") ...
Now I'm in the "lucky" position that most things I do these days are either done for some American company or for a popular, worldwide-used open source project, and are mostly about stuff where the typical terminology is English, so I can use my bad English everywhere - but I'm curious what others are doing and what others think.
When I see people talking about Unicode and PHP 6 I often see them mentioning one fact as a big change: PHP 6 allows (mostly any) arbitrary Unicode character as (part of an) identifier. So you can have code like this:
function 新日本石油() {
    echo "Let's hope this isn't an offensive function name... ";
    echo "it's copied from some news site";
}
新日本石油();
Well yes, that's funny at first, but it serves a purpose: Consider you have an application tied to an environment with a special terminology. Then translating these terms to English might be extremely confusing (especially as programmers often don't really know the correct terminology of that domain) and it's good to call the thing by its name. That can be quite complicated, too: in a previous job we had such a case and often used the German terms, which produced quite funny names for getters and setters which didn't satisfy us ... but that's not what I wanted to talk about.
The purpose of this post is some bad news: That's nothing new. The relevant scanner rule hasn't changed since 4.0 - the only change is that PHP 6 doesn't treat an identifier as a random set of bytes anymore but knows about Unicode code points and interprets it as such.
Out of interest I did some little digging into the PHP repository's history:
$ svn annotate trunk/Zend/zend_language_scanner.l
...
34779 zeev LABEL [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
...
$ svn log -r 34779
------------------------------------------------------------------------
r34779 | zeev | 2000-10-29 15:35:34 +0100 (Sun, 29 Oct 2000) | 2 lines
Unify the names of these last 3 files...
------------------------------------------------------------------------
The result? - The rule as it is in PHP 6 hasn't changed since 2000, so there's really nothing new with PHP 6, and even back then the only change was that the scanner file was renamed...
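To see why the byte-based rule already accepts such names, the LABEL pattern from the annotate output can be fed to preg_match() as a plain byte-wise regular expression. This is only a sketch: is_valid_label() is a made-up helper, and it assumes the source file is saved as UTF-8.

```php
<?php
// Hypothetical helper; the character classes are copied from the LABEL rule.
function is_valid_label($name) {
    // No /u modifier: the pattern works on raw bytes, just like the scanner.
    return preg_match('/^[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*$/D', $name) === 1;
}

var_dump(is_valid_label('foo_bar'));    // bool(true)
var_dump(is_valid_label('新日本石油'));  // bool(true): every UTF-8 byte here is >= 0x80
var_dump(is_valid_label('9lives'));     // bool(false): must not start with a digit
```

Every byte of a multi-byte UTF-8 character falls into the \x7f-\xff range, which is why the scanner has accepted such identifiers all along.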
In my blog post for the 5.3 release I already mentioned it and some others posted it, too: We're planning a release party in Munich, and currently around 60 people have registered. This is the official announcement:
We would like to invite you to the PHP 5.3 release party on Friday, the 17th of July in Munich. The release party offers a chance to talk to other PHP enthusiasts and enjoy that PHP is alive and kicking. If you just can't resist a decent barbecue together with some beer and other drinks you are also welcome. The event will take place at the Waldwirtschaft beer garden irrespective of the weather. We will meet at 19:00 - open end.
The location is famous for its huge beer garden (2,500 available seats, a children’s playground) and its typical Bavarian but also international food. In sunny weather you may even enjoy live music and listen to the sounds of Jazz, Blues, Swing or Dixie.
Catering will be provided and as a special delicacy you may enjoy a suckling pig!
If you would like to join the event please register at the PHPUG-Munich Wiki and follow it for updates.
Alternatively you may register at Facebook as well and follow this for updates.
For any questions please visit IRC channel: #phprp on irc.uni-erlangen.de.
The PHP 5.3. BBQ release party is sponsored by:
Supporters for the PHP 5.3 BBQ release party are:
PHP 5.3 is released and now that the release stress is over my mind is open for new ideas. While relaxing yesterday I thought about many things, among them the Resultset iterator I recently discussed.
Now I wondered where to go next with this and had the idea that an individual Resultset is a child of the whole result, and this might be wrapped in a RecursiveIterator. For doing so we don't implement the Iterator interface but RecursiveIterator. RecursiveIterator extends a typical Iterator with two methods: hasChildren() and getChildren(). But now we have a problem: The Iterator returned by getChildren() has to be a RecursiveIterator, too, which makes sense in general. But I want to return a MySQLi Resultset, which isn't recursive - so making this a RecursiveIterator is wrong. My solution now is to introduce yet another Iterator, which goes by the name of MySQLi_PseudoRecursiveResultIterator. It is implemented by extending IteratorIterator, which wraps the MySQLi_Result, and implements RecursiveIterator, telling the caller that there are no children.
As a sidenote: In our experimental tree Andrey made MySQLi_Result an iterator, but that's not yet in php.net's CVS (it might need some more testing, and we might still change the design there...), so I'm emulating this with MySQLi_Result::fetch_all() combined with an ArrayIterator. With the experimental code the constructor can be dropped.
So let's finally look at the code of these two classes:
<?php
class MySQLi_ResultsetIterator implements RecursiveIterator
{
    private $mysqli;
    private $counter = 0;
    private $current = null;
    private $rewinded = false;

    public function __construct(mysqli $mysqli) {
        $this->mysqli = $mysqli;
    }

    private function freeCurrent() {
        if ($this->current) {
            $this->current->free();
            $this->current = null;
        }
    }

    public function rewind() {
        if ($this->rewinded) {
            throw new Exception("Already rewinded");
        }
        $this->freeCurrent();
        $this->counter = 0;
        $this->rewinded = true;
    }

    public function valid() {
        $this->current = $this->mysqli->store_result();
        return (bool)$this->current;
    }

    public function next() {
        $this->freeCurrent();
        $this->counter++;
        $this->mysqli->next_result();
    }

    public function key() {
        return $this->counter;
    }

    public function current() {
        if (!$this->current) {
            throw new Exception("valid() not called");
        }
        return $this->current;
    }

    public function hasChildren() {
        return true;
    }

    public function getChildren() {
        return new MySQLi_PseudoRecursiveResultIterator($this->current);
    }
}

class MySQLi_PseudoRecursiveResultIterator
    extends IteratorIterator
    implements RecursiveIterator
{
    public function __construct(MySQLi_Result $result) {
        // This ctor can be dropped with the experimental bzr sources
        // as IteratorIterator::__construct() directly works with
        // MySQLi_Result
        parent::__construct(new ArrayIterator($result->fetch_all()));
    }

    public function hasChildren() {
        return false;
    }

    public function getChildren() {
        throw new Exception("This should never be called");
    }
}
?>
Now we can use this code. To use a RecursiveIterator properly one should wrap it in a RecursiveIteratorIterator (RII). To get some nice labels I'm extending the RII, and then a single foreach is enough:
<?php
class MyRecursive_IteratorIterator
    extends RecursiveIteratorIterator
{
    public function __construct(MySQLi $mysqli, $flags = 0) {
        // The second constructor argument is the mode, the third the flags
        parent::__construct(
            new MySQLi_ResultsetIterator($mysqli),
            RecursiveIteratorIterator::LEAVES_ONLY,
            $flags);
    }

    // Called by the RII each time it descends into a child iterator,
    // i.e. once per result set
    public function beginChildren() {
        echo "Next ResultSet:\n";
    }
}

$mysqli = new MySQLi("localhost", "root", "", "test");
$query = "SELECT 1,2 UNION SELECT 3, 4;".
         "SELECT 'hi world' UNION SELECT 'foobar'";
if ($mysqli->multi_query($query)) {
    foreach (new MyRecursive_IteratorIterator($mysqli) as $key => $row) {
        printf(" %s\n", $row[0]);
    }
}
?>
Now calling this code gives us a result similar to the following:
Next ResultSet:
1
3
Next ResultSet:
hi world
foobar
Isn't that nice? - I think that's a cool API! What do you think? Do you have use cases for such an API? Should we implement this in C and bundle it with PHP? Any feedback welcome!
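If you want to play with the labelling trick without a MySQL server at hand, here is a minimal sketch of the same mechanism using the built-in RecursiveArrayIterator. The nested array and the class name LabelledIteratorIterator are made up for illustration: each inner array merely stands in for one result set and each value for the first column of a row. The `: void` return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical stand-in data: each inner array plays the role of one
// result set, each value the first column of one row.
$resultSets = array(
    array(1, 3),
    array('hi world', 'foobar'),
);

// Same idea as the RII subclass above: beginChildren() fires whenever
// the RecursiveIteratorIterator descends into a child iterator.
class LabelledIteratorIterator extends RecursiveIteratorIterator
{
    public $lines = array();

    public function beginChildren(): void {
        $this->lines[] = "Next ResultSet:";
    }
}

$it = new LabelledIteratorIterator(
    new RecursiveArrayIterator($resultSets),
    RecursiveIteratorIterator::LEAVES_ONLY
);

// In LEAVES_ONLY mode only the innermost values are handed to foreach
foreach ($it as $value) {
    $it->lines[] = " " . $value;
}

echo implode("\n", $it->lines), "\n";
```

The label lines appear because the RII calls beginChildren() each time it steps into an inner array, which is the same hook that fires once per result set in the MySQLi version.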
It was a long run and I'm sure it felt like an eternity for many. For me it certainly did. PHP 5.3 was branched off over two years ago and is finally ready to be called 5.3.0.
The php.net website and many other blogs discuss the features, from the often loved closures to the well discussed namespaces to the sometimes hated goto, so I don't think I have to do that here and can instead focus on what really matters:
So with that: in case you haven't done so already, browse over to php.net and grab your copy. It's free!
If you want to celebrate the release and are close to Munich: we're planning a PHP Release Party on July 17th. Details on that will follow.
Over at phpdeveloper.org I was pointed to a blog post talking about MySQLi and stored procedures. That reminded me of a small thing I recently did: when you use MySQLi's multi_query to send queries which return multiple result sets, you have to use a rather unintuitive API which can certainly be improved.
Recently I sat down and cooked up a small improvement for that. Being an iterator fan I, of course, had to use an iterator, and implemented the following class:
<?php
class MySQLi_ResultsetIterator implements Iterator {
    private $mysqli;
    private $counter = 0;
    private $current = null;
    private $rewinded = false;

    public function __construct(mysqli $mysqli) {
        $this->mysqli = $mysqli;
    }

    private function freeCurrent() {
        if ($this->current) {
            $this->current->free();
            $this->current = null;
        }
    }

    public function rewind() {
        if ($this->rewinded) {
            throw new Exception("Already rewinded, rewinding multiple times is not allowed!");
        }
        $this->freeCurrent();
        $this->counter = 0;
        $this->rewinded = true;
    }

    // Fetches the current result set; foreach calls this before current()
    public function valid() {
        $this->current = $this->mysqli->store_result();
        return (bool)$this->current;
    }

    public function next() {
        $this->freeCurrent();
        $this->counter++;
        $this->mysqli->next_result();
    }

    public function key() {
        return $this->counter;
    }

    public function current() {
        if (!$this->current) {
            throw new Exception("valid() not called");
        }
        return $this->current;
    }
}
?>
This iterator wraps everything that's needed and can then be used like this:
<?php
$mysqli = new MySQLi("localhost", "root", "", "test");
$query = "SELECT 1,2 UNION SELECT 3, 4;".
         "SELECT 'hi world' UNION SELECT 'foobar'";
if ($mysqli->multi_query($query)) {
    foreach (new MySQLi_ResultsetIterator($mysqli) as $key => $result) {
        echo 'MySQL Resultset #'.(1+$key).":\n";
        while ($row = $result->fetch_row()) {
            printf(" %s\n", $row[0]);
        }
    }
}
?>
The output will be something like
MySQL Resultset #1:
1
3
MySQL Resultset #2:
hi world
foobar
This is, in my opinion, way nicer than the classical way, which you can see on the multi_query docs page.
The iterator class is the first revision of that idea. I'll try to improve it and port it over to C so that some future version of PHP can include it. As a disclaimer: if you plan on using this class, be aware that a future PHP might bundle a class with that exact name, so use your own name.
Feedback welcome.
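For comparison, the classical way mentioned above looks roughly like the following sketch. It only relies on the documented mysqli methods multi_query(), store_result(), more_results() and next_result(). The function name collectResultSets() is made up for illustration, and the connection parameter is deliberately left without a type hint so the loop can also be tried with a stand-in object.

```php
<?php
// Hypothetical helper: runs a multi-statement query the classical way
// and returns the labelled output lines instead of echoing them.
function collectResultSets($mysqli, $query) {
    $lines = array();
    if (!$mysqli->multi_query($query)) {
        return $lines;
    }

    $n = 0;
    do {
        // store_result() returns false for statements without a result set
        $result = $mysqli->store_result();
        if ($result) {
            $lines[] = 'MySQL Resultset #' . (++$n) . ':';
            while ($row = $result->fetch_row()) {
                $lines[] = ' ' . $row[0];
            }
            $result->free();
        }
        // Only advance while the server reports more pending results
    } while ($mysqli->more_results() && $mysqli->next_result());

    return $lines;
}
```

With a real connection this would be called as collectResultSets($mysqli, $query) using the same $query as above. The do/while bookkeeping is the part the iterator hides behind a plain foreach.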