Guzzle async requests not really async?

Asked by Scalable · Sep 2, 2017 · Viewed 17.7k times

Problem

We are trying to make concurrent asynchronous requests using Guzzle. After going through a few resources, like this and this, we came up with the code shared below. However, it is not working as expected.

It looks like Guzzle is performing these requests synchronously rather than asynchronously.

Expectation

Just for test purposes, we are hitting an internal URL that sleeps for 5 seconds. With a concurrency of 10 we expect that all 10 requests will initially be queued and sent to the server almost simultaneously, wait there for 5 seconds, and then almost all of them will finish at nearly the same time, at which point the Guzzle client would pick up 10 new requests from the iterator, and so on. (A hypothetical stand-in for the test endpoint is sketched below.)
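
The real internal route is not shown here; this is only an assumed stand-in that does nothing but sleep for the requested number of seconds:

    <?php
    // Hypothetical stand-in for the internal /wait/{seconds}/{id} endpoint used
    // in the test. Routing is assumed; only the sleep behaviour matters here.
    $seconds = 5;               // the test always requests a 5 second wait
    sleep($seconds);
    echo "waited {$seconds}s";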

Code

    $iterator = function() {
        $index = 0;
        while (true) {
            $client = new Client(['timeout'=>20]);
            $url = 'http://localhost/wait/5/' . $index++;
            $request = new Request('GET',$url, []);
            echo "Queuing $url @ " . (new Carbon())->format('Y-m-d H:i:s') . PHP_EOL;
            yield $client
                ->sendAsync($request)
                ->then(function(Response $response) use ($request) {
                    return [$request, $response];
                });
        }
    };

    $promise = \GuzzleHttp\Promise\each_limit(
        $iterator(),
        10,  /// concurrency,
        function($result, $index) {
            /** @var GuzzleHttp\Psr7\Request $request */
            list($request, $response) = $result;
            echo (string) $request->getUri() . ' completed '.PHP_EOL;
        },
        function(RequestException $reason, $index) {
            // left empty for brevity
        }
    );
    $promise->wait();

Actual Results

We find that Guzzle never starts the next request until the previous one has finished, and so on:

Queuing http://localhost/wait/5/1 @ 2017-09-01 17:15:28
Queuing http://localhost/wait/5/2 @ 2017-09-01 17:15:28
Queuing http://localhost/wait/5/3 @ 2017-09-01 17:15:28
Queuing http://localhost/wait/5/4 @ 2017-09-01 17:15:28
Queuing http://localhost/wait/5/5 @ 2017-09-01 17:15:28
Queuing http://localhost/wait/5/6 @ 2017-09-01 17:15:28
Queuing http://localhost/wait/5/7 @ 2017-09-01 17:15:28
Queuing http://localhost/wait/5/8 @ 2017-09-01 17:15:28
Queuing http://localhost/wait/5/9 @ 2017-09-01 17:15:28
Queuing http://localhost/wait/5/10 @ 2017-09-01 17:15:28
http://localhost/wait/5/1 completed
Queuing http://localhost/wait/5/11 @ 2017-09-01 17:15:34
http://localhost/wait/5/2 completed
Queuing http://localhost/wait/5/12 @ 2017-09-01 17:15:39
http://localhost/wait/5/3 completed
Queuing http://localhost/wait/5/13 @ 2017-09-01 17:15:45
http://localhost/wait/5/4 completed
Queuing http://localhost/wait/5/14 @ 2017-09-01 17:15:50 

OS / Version information

  • Ubuntu
  • PHP/7.1.3
  • GuzzleHttp/6.2.1
  • curl/7.47.0

The issue could be with \GuzzleHttp\Promise\each_limit, which perhaps does not initiate or resolve the promises fast enough. It may be possible that we have to trick it into ticking externally.
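
For context, \GuzzleHttp\Promise\each_limit appears to be only a thin wrapper around EachPromise; a rough sketch of the function from guzzle/promises (not verified against the exact installed version) looks like this:

    function each_limit($iterable, $concurrency, callable $onFulfilled = null, callable $onRejected = null)
    {
        // EachPromise handles the concurrency slots; each_limit only wires up
        // the callbacks and returns the aggregate promise.
        return (new EachPromise($iterable, [
            'fulfilled'   => $onFulfilled,
            'rejected'    => $onRejected,
            'concurrency' => $concurrency,
        ]))->promise();
    }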

Answer

Answered by Adam Lavin · Sep 3, 2017

In the example code, you're creating a new GuzzleHttp\Client instance for every request you want to make. This might not seem important; however, during instantiation GuzzleHttp\Client will set a default handler if none is provided. (That handler is then used for every request sent through the client, unless it is overridden per request.)

Note: It determines the best handler to use from this function, though it will most likely end up defaulting to curl_multi_exec.

What's the importance of this? The underlying handler is what tracks and executes multiple requests at the same time. Because a new handler is created for every request, none of your requests are grouped up and run together. For some more insight into this, take a gander at the curl_multi_exec docs.
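
To make that concrete, here's a minimal illustrative sketch (the class and option names are Guzzle 6's, but the snippet itself is not from the question): clients only share the same curl_multi "pool" when they share the same handler.

use GuzzleHttp\Client;
use GuzzleHttp\Handler\CurlMultiHandler;
use GuzzleHttp\HandlerStack;

// One curl_multi-backed handler: requests sent through it can run concurrently.
$sharedHandler = HandlerStack::create(new CurlMultiHandler());

$clientA = new Client(['handler' => $sharedHandler]);
$clientB = new Client(['handler' => $sharedHandler]); // same pool as $clientA

// No 'handler' option: this client builds its own handler stack (and therefore
// its own curl_multi handle), so its requests are not grouped with the above.
$clientC = new Client();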

So, you kind of have two ways of dealing with this:

Pass the client through to the iterator:

$client = new GuzzleHttp\Client(['timeout' => 20]);

$iterator = function () use ($client) {
    $index = 0;
    while (true) {
        if ($index === 10) {
            break;
        }

        $url = 'http://localhost/wait/5/' . $index++;
        $request = new Request('GET', $url, []);

        echo "Queuing $url @ " . (new Carbon())->format('Y-m-d H:i:s') . PHP_EOL;

        yield $client
            ->sendAsync($request)
            ->then(function (Response $response) use ($request) {
                return [$request, $response];
            });

    }
};

$promise = \GuzzleHttp\Promise\each_limit(
    $iterator(),
    10,  /// concurrency,
    function ($result, $index) {
        /** @var GuzzleHttp\Psr7\Request $request */
        list($request, $response) = $result;
        echo (string)$request->getUri() . ' completed ' . PHP_EOL;
    }
);
$promise->wait();

Or create the handler elsewhere and pass it to each client (though I'm not sure why you'd do this, but it's there!):

$handler = \GuzzleHttp\HandlerStack::create();

$iterator = function () use ($handler) {
    $index = 0;
    while (true) {
        if ($index === 10) {
            break;
        }

        $client = new Client(['timeout' => 20, 'handler' => $handler]);
        $url = 'http://localhost/wait/5/' . $index++;
        $request = new Request('GET', $url, []);

        echo "Queuing $url @ " . (new Carbon())->format('Y-m-d H:i:s') . PHP_EOL;

        yield $client
            ->sendAsync($request)
            ->then(function (Response $response) use ($request) {
                return [$request, $response];
            });

    }
};

$promise = \GuzzleHttp\Promise\each_limit(
    $iterator(),
    10,  /// concurrency,
    function ($result, $index) {
        /** @var GuzzleHttp\Psr7\Request $request */
        list($request, $response) = $result;
        echo (string)$request->getUri() . ' completed ' . PHP_EOL;
    }
);
$promise->wait();