Are scalar and strict types in PHP7 a performance enhancing feature?

igorsantos07 picture igorsantos07 · Oct 5, 2015 · Viewed 7.7k times · Source

Since PHP7 we can now use scalar typehint and ask for strict types on a per-file basis. Are there any performance benefits from using these features? If yes, how?

Around the interwebs I've only found conceptual benefits, such as:

  • more precise errors
  • avoiding issues with unwanted type coercion
  • more semantic code, avoiding misunderstandings when using other's code
  • better IDE evaluation of code

Answer

Joe Watkins picture Joe Watkins · Oct 5, 2015

Today, the use of scalar and strict types in PHP7 does not enhance performance.

PHP7 does not have a JIT compiler.

If at some time in the future PHP does get a JIT compiler, it is not too difficult to imagine optimizations that could be performed with the additional type information.

When it comes to optimizations without a JIT, scalar types are only partly helpful.

Let's take the following code:

<?php
function (int $a, int $b) : int {
    return $a + $b;
}
?>

This is the code generated by Zend for that:

function name: {closure}
L2-4 {closure}() /usr/src/scalar.php - 0x7fd6b30ef100 + 7 ops
 L2    #0     RECV                    1                                         $a                  
 L2    #1     RECV                    2                                         $b                  
 L3    #2     ADD                     $a                   $b                   ~0                  
 L3    #3     VERIFY_RETURN_TYPE      ~0                                                            
 L3    #4     RETURN                  ~0                                                            
 L4    #5     VERIFY_RETURN_TYPE                                                                    
 L4    #6     RETURN                  null

ZEND_RECV is the opcode that performs type verification and coercion for the received parameters. The next opcode is ZEND_ADD:

ZEND_VM_HANDLER(1, ZEND_ADD, CONST|TMPVAR|CV, CONST|TMPVAR|CV)
{
    USE_OPLINE
    zend_free_op free_op1, free_op2;
    zval *op1, *op2, *result;

    op1 = GET_OP1_ZVAL_PTR_UNDEF(BP_VAR_R);
    op2 = GET_OP2_ZVAL_PTR_UNDEF(BP_VAR_R);
    if (EXPECTED(Z_TYPE_INFO_P(op1) == IS_LONG)) {
        if (EXPECTED(Z_TYPE_INFO_P(op2) == IS_LONG)) {
            result = EX_VAR(opline->result.var);
            fast_long_add_function(result, op1, op2);
            ZEND_VM_NEXT_OPCODE();
        } else if (EXPECTED(Z_TYPE_INFO_P(op2) == IS_DOUBLE)) {
            result = EX_VAR(opline->result.var);
            ZVAL_DOUBLE(result, ((double)Z_LVAL_P(op1)) + Z_DVAL_P(op2));
            ZEND_VM_NEXT_OPCODE();
        }
    } else if (EXPECTED(Z_TYPE_INFO_P(op1) == IS_DOUBLE)) {
        if (EXPECTED(Z_TYPE_INFO_P(op2) == IS_DOUBLE)) {
            result = EX_VAR(opline->result.var);
            ZVAL_DOUBLE(result, Z_DVAL_P(op1) + Z_DVAL_P(op2));
            ZEND_VM_NEXT_OPCODE();
        } else if (EXPECTED(Z_TYPE_INFO_P(op2) == IS_LONG)) {
            result = EX_VAR(opline->result.var);
            ZVAL_DOUBLE(result, Z_DVAL_P(op1) + ((double)Z_LVAL_P(op2)));
            ZEND_VM_NEXT_OPCODE();
        }
    }

    SAVE_OPLINE();
    if (OP1_TYPE == IS_CV && UNEXPECTED(Z_TYPE_INFO_P(op1) == IS_UNDEF)) {
        op1 = GET_OP1_UNDEF_CV(op1, BP_VAR_R);
    }
    if (OP2_TYPE == IS_CV && UNEXPECTED(Z_TYPE_INFO_P(op2) == IS_UNDEF)) {
        op2 = GET_OP2_UNDEF_CV(op2, BP_VAR_R);
    }
    add_function(EX_VAR(opline->result.var), op1, op2);
    FREE_OP1();
    FREE_OP2();
    ZEND_VM_NEXT_OPCODE_CHECK_EXCEPTION();
}

Without understanding what any of that code does, you can see that it's rather complex.

So the target would be omitting ZEND_RECV completely, and replacing ZEND_ADD with ZEND_ADD_INT_INT which doesn't need to perform any checking (beyond guarding) or branching, because the types of params are known.

In order to omit those, and have a ZEND_ADD_INT_INT you need to be able to reliably infer the types of $a and $b at compile time. Compile time inference is sometimes easy, for example, $a and $b are literal integers, or constants.

Literally yesterday, PHP 7.1 got something really similar: There are now type specific handlers for some high frequency opcodes like ZEND_ADD. Opcache is able to infer the type of some variables, it's even able to infer the types of variables within an array in some cases and change opcodes generated to use the normal ZEND_ADD, to use a type specific handler:

ZEND_VM_TYPE_SPEC_HANDLER(ZEND_ADD, (res_info == MAY_BE_LONG && op1_info == MAY_BE_LONG && op2_info == MAY_BE_LONG), ZEND_ADD_LONG_NO_OVERFLOW, CONST|TMPVARCV, CONST|TMPVARCV, SPEC(NO_CONST_CONST,COMMUTATIVE))
{
    USE_OPLINE
    zval *op1, *op2, *result;

    op1 = GET_OP1_ZVAL_PTR_UNDEF(BP_VAR_R);
    op2 = GET_OP2_ZVAL_PTR_UNDEF(BP_VAR_R);
    result = EX_VAR(opline->result.var);
    ZVAL_LONG(result, Z_LVAL_P(op1) + Z_LVAL_P(op2));
    ZEND_VM_NEXT_OPCODE();
}

Again, without understanding what any of that does, you can tell that this is much simpler to execute.

These optimizations are very cool, however, the most effective, and most interesting optimizations will come when PHP has a JIT.