Since PHP 7 we can use scalar type hints and ask for strict types on a per-file basis. Are there any performance benefits from using these features? If so, how?
Around the interwebs I've only found conceptual benefits, such as:
Today, the use of scalar and strict types in PHP7 does not enhance performance.
PHP7 does not have a JIT compiler.
If at some time in the future PHP does get a JIT compiler, it is not too difficult to imagine optimizations that could be performed with the additional type information.
When it comes to optimizations without a JIT, scalar types are only partly helpful.
Let's take the following code:
<?php
function (int $a, int $b) : int {
    return $a + $b;
}
?>
This is the code generated by Zend for that:
function name: {closure}
L2-4 {closure}() /usr/src/scalar.php - 0x7fd6b30ef100 + 7 ops
L2 #0 RECV 1 $a
L2 #1 RECV 2 $b
L3 #2 ADD $a $b ~0
L3 #3 VERIFY_RETURN_TYPE ~0
L3 #4 RETURN ~0
L4 #5 VERIFY_RETURN_TYPE
L4 #6 RETURN null
ZEND_RECV is the opcode that performs type verification and coercion for the received parameters. The next opcode is ZEND_ADD:
ZEND_VM_HANDLER(1, ZEND_ADD, CONST|TMPVAR|CV, CONST|TMPVAR|CV)
{
    USE_OPLINE
    zend_free_op free_op1, free_op2;
    zval *op1, *op2, *result;
    op1 = GET_OP1_ZVAL_PTR_UNDEF(BP_VAR_R);
    op2 = GET_OP2_ZVAL_PTR_UNDEF(BP_VAR_R);
    if (EXPECTED(Z_TYPE_INFO_P(op1) == IS_LONG)) {
        if (EXPECTED(Z_TYPE_INFO_P(op2) == IS_LONG)) {
            result = EX_VAR(opline->result.var);
            fast_long_add_function(result, op1, op2);
            ZEND_VM_NEXT_OPCODE();
        } else if (EXPECTED(Z_TYPE_INFO_P(op2) == IS_DOUBLE)) {
            result = EX_VAR(opline->result.var);
            ZVAL_DOUBLE(result, ((double)Z_LVAL_P(op1)) + Z_DVAL_P(op2));
            ZEND_VM_NEXT_OPCODE();
        }
    } else if (EXPECTED(Z_TYPE_INFO_P(op1) == IS_DOUBLE)) {
        if (EXPECTED(Z_TYPE_INFO_P(op2) == IS_DOUBLE)) {
            result = EX_VAR(opline->result.var);
            ZVAL_DOUBLE(result, Z_DVAL_P(op1) + Z_DVAL_P(op2));
            ZEND_VM_NEXT_OPCODE();
        } else if (EXPECTED(Z_TYPE_INFO_P(op2) == IS_LONG)) {
            result = EX_VAR(opline->result.var);
            ZVAL_DOUBLE(result, Z_DVAL_P(op1) + ((double)Z_LVAL_P(op2)));
            ZEND_VM_NEXT_OPCODE();
        }
    }
    SAVE_OPLINE();
    if (OP1_TYPE == IS_CV && UNEXPECTED(Z_TYPE_INFO_P(op1) == IS_UNDEF)) {
        op1 = GET_OP1_UNDEF_CV(op1, BP_VAR_R);
    }
    if (OP2_TYPE == IS_CV && UNEXPECTED(Z_TYPE_INFO_P(op2) == IS_UNDEF)) {
        op2 = GET_OP2_UNDEF_CV(op2, BP_VAR_R);
    }
    add_function(EX_VAR(opline->result.var), op1, op2);
    FREE_OP1();
    FREE_OP2();
    ZEND_VM_NEXT_OPCODE_CHECK_EXCEPTION();
}
Without understanding what any of that code does, you can see that it's rather complex.
So the goal would be to omit ZEND_RECV completely and replace ZEND_ADD with ZEND_ADD_INT_INT, which doesn't need to perform any checking (beyond guarding) or branching, because the types of the params are known.
In order to omit those and have a ZEND_ADD_INT_INT, you need to be able to reliably infer the types of $a and $b at compile time. Compile-time inference is sometimes easy, for example when $a and $b are literal integers or constants.
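To make the difference concrete, here is a minimal toy sketch in C. It is not Zend code: toy_val, toy_add and toy_add_int_int are hypothetical names, and the tagged union is only a stand-in for a zval. It contrasts an add that has to dispatch on operand types at run time with one that is only valid once the types have been proven ahead of time.

```c
/* Toy stand-in for a zval: a type tag plus a payload (NOT the real Zend layout). */
typedef enum { T_LONG, T_DOUBLE } toy_type;

typedef struct {
    toy_type type;
    union { long lval; double dval; } u;
} toy_val;

/* Hypothetical helpers to build toy values. */
toy_val toy_long(long v)     { toy_val r; r.type = T_LONG;   r.u.lval = v; return r; }
toy_val toy_double(double v) { toy_val r; r.type = T_DOUBLE; r.u.dval = v; return r; }

/* Generic add: inspects both type tags on every call, in the spirit of ZEND_ADD.
 * Overflow handling is deliberately left out of this sketch. */
toy_val toy_add(toy_val a, toy_val b)
{
    if (a.type == T_LONG && b.type == T_LONG) {
        return toy_long(a.u.lval + b.u.lval);
    }
    /* At least one operand is a double: promote both and add as doubles. */
    double x = (a.type == T_LONG) ? (double)a.u.lval : a.u.dval;
    double y = (b.type == T_LONG) ? (double)b.u.lval : b.u.dval;
    return toy_double(x + y);
}

/* Specialized add: no tag checks and no branches. Only correct if the caller
 * (the compiler or optimizer) has already proven both operands are longs. */
toy_val toy_add_int_int(toy_val a, toy_val b)
{
    return toy_long(a.u.lval + b.u.lval);
}
```

Calling toy_add(toy_long(2), toy_double(0.5)) takes the promotion branch and yields a double, while toy_add_int_int(toy_long(2), toy_long(3)) goes straight to the integer addition. The specialized version is only safe because something else established the operand types first, which is exactly the inference problem described above.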
Literally yesterday, PHP 7.1 got something really similar: there are now type-specific handlers for some high-frequency opcodes like ZEND_ADD. Opcache is able to infer the types of some variables (in some cases even the types of values within an array) and change the generated opcodes from the normal ZEND_ADD to a type-specific handler:
ZEND_VM_TYPE_SPEC_HANDLER(ZEND_ADD, (res_info == MAY_BE_LONG && op1_info == MAY_BE_LONG && op2_info == MAY_BE_LONG), ZEND_ADD_LONG_NO_OVERFLOW, CONST|TMPVARCV, CONST|TMPVARCV, SPEC(NO_CONST_CONST,COMMUTATIVE))
{
    USE_OPLINE
    zval *op1, *op2, *result;
    op1 = GET_OP1_ZVAL_PTR_UNDEF(BP_VAR_R);
    op2 = GET_OP2_ZVAL_PTR_UNDEF(BP_VAR_R);
    result = EX_VAR(opline->result.var);
    ZVAL_LONG(result, Z_LVAL_P(op1) + Z_LVAL_P(op2));
    ZEND_VM_NEXT_OPCODE();
}
Again, without understanding what any of that does, you can tell that this is much simpler to execute.
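The condition in the handler declaration above is what decides when the specialized handler may be used. As a rough illustration only, the selection can be modelled like this; the TOY_* names and toy_pick_add_handler are hypothetical and merely borrow the shape of Opcache's MAY_BE_* type masks, they are not its real API.

```c
/* Toy type-info bitmasks, modelled loosely on Opcache's MAY_BE_* flags.
 * The values and names here are assumptions made for this sketch. */
enum {
    TOY_MAY_BE_LONG   = 1 << 0,
    TOY_MAY_BE_DOUBLE = 1 << 1
};

typedef enum { TOY_ADD, TOY_ADD_LONG_NO_OVERFLOW } toy_handler;

/* Pick the specialized handler only when inference says the result and both
 * operands can be nothing but longs; otherwise keep the generic handler. */
toy_handler toy_pick_add_handler(unsigned res_info, unsigned op1_info, unsigned op2_info)
{
    if (res_info == TOY_MAY_BE_LONG &&
        op1_info == TOY_MAY_BE_LONG &&
        op2_info == TOY_MAY_BE_LONG) {
        return TOY_ADD_LONG_NO_OVERFLOW;
    }
    return TOY_ADD;
}
```

Note that the comparison is an exact match on the mask, not a bit test: an operand that may be a long or a double (TOY_MAY_BE_LONG | TOY_MAY_BE_DOUBLE) falls back to the generic handler, because the specialized one has no way to cope with anything but longs.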
These optimizations are very cool; however, the most effective and most interesting optimizations will come when PHP has a JIT.