This post is archived and probably outdated.

References and foreach

2010-08-20 13:21:05

References in PHP are bad, as I mentioned before, and you certainly should avoid using them. There is one use case that leads to behavior which is unexpected at first. I didn't see it as a real-life issue when I first stumbled over it, but then there were a few bug reports about it, and recently a friend asked me about it ... so here it goes:

What is the output of this code:

<?php
$a = array('a', 'b', 'c', 'd');

foreach ($a as &$v) { }
foreach ($a as $v) { }

print_r($a);
?>

We are iterating over an array twice without doing anything, so the result should be no change. Right? Wrong! The actual result looks like this:

Array
(
    [0] => a
    [1] => b
    [2] => c
    [3] => c
)

To understand why this happens, let's take a step back and look at how PHP variables are implemented and what references are:

A PHP variable basically consists of two things: a "label" and a "container". The label is an entry in a hash table (there are a few optimizations in the engine, so it is not always in a hash table, but well), which may represent the symbol table of a function, an array, or an object's property table. So we have a name and a pointer to the container. The container, internally called a "zval", stores the value and some meta information. The container can also be a new hash table with a set of labels pointing to other containers. If we now create a reference, this causes a second label to point to the same container as another label. From then on, both labels have the same power over the container.
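The label/container idea can be seen from userland with a plain reference assignment. This is only a minimal sketch; the variable names and values are made up for illustration and it says nothing about the engine internals beyond what is described above.

```php
<?php
$a = 'original';
$b = &$a;        // $b becomes a second label for the container $a points to

$b = 'changed';  // writing through either label changes the shared value
echo $a, "\n";   // prints "changed"

unset($b);       // removes only the label $b; the container lives on via $a
echo $a, "\n";   // still prints "changed"
```

Note that unset() only removes a label. The value stays around as long as at least one other label still points to the container.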

Now let's look at the situation from above.

We have six containers: the global symbol table, a container holding the array called $a, and one container for each of the four elements. Now we start the first iteration: the global symbol table gets a new entry for $v, and $v is made a reference to the container of the first array element.

So a change to either $a[0] or $v goes to the same container and therefore affects the other. As the iteration continues, the reference is broken and $v is made a reference to each following element in turn. So after the iteration ends, $v is a reference to the last element.
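That leftover reference is easy to make visible. The following sketch uses the array from the example above; the value 'X' is just an arbitrary marker chosen for illustration.

```php
<?php
$a = array('a', 'b', 'c', 'd');

foreach ($a as &$v) { }   // after the loop, $v is still bound to $a[3]

$v = 'X';                 // write through the leftover reference
print_r($a);              // the last element is now 'X'
```

Nothing inside the loop touched the array, yet a plain assignment to $v after the loop changes its last element.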

Remember: $v being a reference means that any change to $v affects the other references, in this situation $a[3]. Up to now nothing special has happened, but now the second iteration begins. This one assigns the value of the current element to $v at each step. $v is still a reference to the same container as $a[3], so by assigning a value to $v, $a[3] is changed, too. In the first step $v is assigned 'a', so $a[3] becomes 'a' as well.

This continues for the next steps, too: $a[3] becomes 'b' and then 'c'.

And now we can easily guess what will happen at the last step: $v is assigned the value of the last element, $a[3], and as $a[3] and $v refer to the same container, it assigns itself to itself, so effectively nothing happens and $a[3] stays 'c'.

And this is the result we saw above.
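The steps of the second iteration can also be traced in code. This is a rough sketch: the $trace array and the key variable $k are additions for illustration only, used to record the state of $a after each assignment to $v.

```php
<?php
$a = array('a', 'b', 'c', 'd');
foreach ($a as &$v) { }          // leaves $v bound to $a[3]

$trace = array();
foreach ($a as $k => $v) {
    // $v still aliases $a[3], so each assignment overwrites the last element
    $trace[] = "step $k: " . implode(', ', $a);
}
echo implode("\n", $trace), "\n";
// step 0: a, b, c, a
// step 1: a, b, c, b
// step 2: a, b, c, c
// step 3: a, b, c, c
```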

So, to make a long story short: be careful with references! They can have really strange effects.
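If a by-reference foreach can't be avoided, one common way to defuse the effect is to unset the loop variable right after the loop. A small sketch, using the same array as above:

```php
<?php
$a = array('a', 'b', 'c', 'd');

foreach ($a as &$v) { }
unset($v);                // drop the label; $a[3] is no longer aliased by $v

foreach ($a as $v) { }    // $v is now a fresh, ordinary variable

print_r($a);              // the array is unchanged: a, b, c, d
```

The unset() removes only the label $v from the symbol table, so the second loop creates a new $v instead of writing into the last array element.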