Yesterday was a quite thrilling day for the PHP development team and led to some imprecise news articles so let's take a look at what happened: Over the last months many of the core contributors agreed that the current approach to bring Unicode into PHP's engine wasn't the right approach and a good thing would be to rethink it from the start. By a provocative move of one contributor the stalled situation got some more movement and Rasmus declared the current implementation to be discontinued to restart.
When the foundation of what should have become PHP 6 was created a decision was made to use UTF-16 as internal encoding for "everything" inside the engine. The choice for UTF-16 was made due to the fact that PHP 6 would use the ICU library which is focused on offering UTF-16 string functions. By using UTF-16 as default encoding we'd have to convert the script code and all data passed from or to the script (request data, database results, output, ...) from another encoding, usually UTF-8, to UTF-16 or back. The need for conversion doesn't only require CPU time and more memory (a UTF-16 string takes double memory of a UTF-8 string in many cases) but makes the implementation rather complex as we always have to figure out which encoding was the right one for a given situation. From the userspace point of view the implementation brought some backwards compatibility breaks which would require manual review of the code. These all are pains for a very small gain for many users where many would be happy about a tighter integration of some mbstring-like functionality. This all led to a situation for many contributors not willing to use "trunk" as their main development tree but either develop using the stable 5.2/5.3 trees or refuse to do development at all.
Yesterday the stagnation created by the situation has been resolved and it was decided that our trunk in svn will be based on 5.3 and we'll merge features from the old trunk and new features there so that 5.3 will be a true stable branch. The EOL for 5.2 has not yet been defined but I suggest you to really migrate over to 5.3, which usually can be done with very little work, as soon as possible.
Right now we're starting different discussions to see what kind of Unicode support we really want. Many contributors react positive on a proposed "string class" which wraps string operations in Unicode and binary forms without going deep in the engine. In my opinion such an approach might also be a way to solve some of the often criticized inconsistencies in PHP's string APIs without the need to break old code. (new code uses the new class, old code the old functions) But that idea is far from a proper proposal or even the implementation, current status is about refocusing the development and get the requirement and design discussions going. By that it's a bit early to decide whether the next version of PHP will be called PHP 5.4, PHP 6 or maybe even PHP 7.