PHP uses a copy-on-modification system.
Does $a = (string) $a;
(where $a is already a string) modify and copy anything?
Specifically, this is my problem:
Parameter 1 is mixed, because I want to allow non-strings to be passed and convert them to strings.
But sometimes these strings are very large, so I want to avoid copying a parameter that is already a string.
Can I use version Foo, or do I have to use version Bar?
class Foo {
    private $_foo;

    public function __construct($foo) {
        $this->_foo = (string) $foo;
    }
}

class Bar {
    private $_bar;

    public function __construct($bar) {
        if (is_string($bar)) {
            $this->_bar = $bar;
        } else {
            $this->_bar = (string) $bar;
        }
    }
}
The answer is that yes, it does copy the string. Sort-of... Not really. Well, it depends on your definition of "copy"...
To see what's happening, let's look at the source. The executor handles a variable cast in 5.5 here.
zend_make_printable_zval(expr, &var_copy, &use_copy);
if (use_copy) {
    ZVAL_COPY_VALUE(result, &var_copy);
    // if optimized out
} else {
    ZVAL_COPY_VALUE(result, expr);
    // if optimized out
    zendi_zval_copy_ctor(*result);
}
As you can see, the call uses zend_make_printable_zval(), which simply short-circuits if the zval is already a string.
So the code that performs the copy is in the else branch, starting with:
ZVAL_COPY_VALUE(result, expr);
Now, let's look at the definition of ZVAL_COPY_VALUE:
#define ZVAL_COPY_VALUE(z, v) \
do { \
(z)->value = (v)->value; \
Z_TYPE_P(z) = Z_TYPE_P(v); \
} while (0)
Note what that's doing. The string itself (which is stored in the ->value block of the zval) is NOT copied. It's just referenced: the pointer remains the same, so both zvals point at the same string data and no copy is made. But it does create a new variable (the zval part that wraps the value).
Now we get to the zendi_zval_copy_ctor call, which does some interesting things of its own. Note:
case IS_STRING:
CHECK_ZVAL_STRING_REL(zvalue);
if (!IS_INTERNED(zvalue->value.str.val)) {
zvalue->value.str.val = (char *) estrndup_rel(zvalue->value.str.val, zvalue->value.str.len);
}
break;
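Basically, that means that if it's an interned string, it won't be copied, but if it's not, it will be. So what's an interned string, and what does that mean?
Interned strings were introduced in 5.4: string literals that appear in the source code are stored once in a shared table instead of being duplicated. In 5.3, interned strings didn't exist, so the string is always copied. That's really the only difference...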
Well, in a case like this:
$a = "foo";
$b = (string) $a;
No copy of the string will happen in 5.4, but in 5.3 a copy will occur.
But in a case like this:
$a = str_repeat("a", 10);
$b = (string) $a;
A copy will occur in all of these versions. That's because in PHP, not all strings are interned: a string built at runtime, like the result of str_repeat(), is not...
Let's try it out in a benchmark: http://3v4l.org/HEelW
$a = "foobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisout";
$b = str_repeat("a", 300);
echo "Static Var\n";
testCopy($a);
echo "Dynamic Var\n";
testCopy($b);
function testCopy($var) {
    echo memory_get_usage() . "\n";
    $var = (string) $var;
    echo memory_get_usage() . "\n";
}
Results:
5.4 to 5.5 alpha 1 (not including other alphas, as the differences are minor enough to not make a fundamental difference):
Static Var
220152
220200
Dynamic Var
220152
220520
So the static var increased by 48 bytes, and the dynamic var increased by 368 bytes.
5.3.11 to 5.3.22:
Static Var
624472
625408
Dynamic Var
624472
624840
The static var increased by 936 bytes while dynamic var increased by 368 bytes.
So notice that in 5.3, both the static and the dynamic variables were copied. So the string is always duplicated.
But in 5.4 with static strings, only the zval structure was copied. Meaning that the string itself, which was interned, remains the same and is not copied...
Another thing to note is that all of the above is moot. You're passing the variable as a parameter to the function and then casting it inside the function, so copy-on-write will be triggered by your line. Running that will always (well, in 99.9% of cases) trigger a variable copy. At best (interned strings) you're talking about a zval duplication and its associated overhead. At worst, you're talking about a string duplication...
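If you want to see which of those two cases applies on your own PHP build, one option is to measure it directly. The following is only a rough sketch: Foo and Bar repeat the two constructors from the question, while constructionGrowth() and $large are hypothetical names introduced here for illustration. The numbers depend on the PHP version and build, so no expected output is shown.

```php
<?php
// Foo and Bar repeat the two constructors from the question.
class Foo {
    private $_foo;

    public function __construct($foo) {
        $this->_foo = (string) $foo;   // always cast
    }
}

class Bar {
    private $_bar;

    public function __construct($bar) {
        if (is_string($bar)) {
            $this->_bar = $bar;            // plain assignment, no cast
        } else {
            $this->_bar = (string) $bar;   // cast only non-strings
        }
    }
}

// Hypothetical helper (not from the question): returns the number of bytes
// by which memory_get_usage() grows while an instance of $class, built
// from $value, is still alive.
function constructionGrowth($class, $value)
{
    $before = memory_get_usage();
    $object = new $class($value);
    $after = memory_get_usage();
    return $after - $before;
}

// Built at runtime, so this is not a string literal from the source.
$large = str_repeat("a", 1000000);

printf("Foo (always cast):    %d bytes\n", constructionGrowth('Foo', $large));
printf("Bar (cast if needed): %d bytes\n", constructionGrowth('Bar', $large));
```

If the two numbers are roughly equal on your version, the extra is_string() branch in Bar buys you nothing. If Foo grows by about the length of the string and Bar doesn't, the branch is worth keeping.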