how to tune the parallelism hint in storm

Shengjie · Dec 4, 2013 · Viewed 11.7k times

"parallelism hint" is used in storm to parallelise a running storm topology. I know there are concepts like worker process, executor and tasks. Would it make sense to make the parallelism hint as big as possible so that your topologies are parallelised as much as possible?

My question is: how do I find the right parallelism hint for my Storm topologies? Does it depend on the scale of my Storm cluster, or is it more of a topology/job-specific setting that varies from one topology to another? Or does it depend on both?

Answer

user2720864 · Dec 5, 2013

Adding to what @Chiron explained:

"parallelism hint" is used in storm to parallelise a running storm topology

Actually, in Storm the term parallelism hint is used to specify the initial number of executors (threads) of a component (spout or bolt), e.g.

    topologyBuilder.setBolt("green-bolt", new GreenBolt(), 2)

The above statement tells Storm to allot 2 executor threads initially (this can be changed at runtime). Again,

    topologyBuilder.setBolt("green-bolt", new GreenBolt(), 2).setNumTasks(4) 

the setNumTasks(4) indicates that 4 associated tasks should run (this stays the same throughout the lifetime of the topology). So in this case each executor will run two tasks. By default, the number of tasks is set to be the same as the number of executors, i.e. Storm runs one task per thread.
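To make the numbers concrete, here is a plain-Java sketch of that arithmetic (no Storm dependency; the class and variable names are just illustrative):

```java
// Sketch of how the parallelism hint and setNumTasks relate; names are
// illustrative and there is no Storm dependency here.
public class ParallelismMath {
    public static void main(String[] args) {
        int parallelismHint = 2; // initial executors, from setBolt("green-bolt", ..., 2)
        int numTasks = 4;        // fixed task count, from setNumTasks(4)
        int tasksPerExecutor = numTasks / parallelismHint;
        System.out.println("tasks per executor = " + tasksPerExecutor); // prints 2
    }
}
```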

Would it make sense to make the parallelism hint as big as possible so that your topologies are parallelised as much as possible

One key thing to note is that running more than one task per executor does not increase the level of parallelism, because an executor uses a single thread to process all of its tasks, i.e. tasks run serially on an executor.
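That serial behaviour can be illustrated with a plain-Java sketch (again, no Storm dependency; a single thread stands in for an executor and drains its task list one item at a time):

```java
import java.util.ArrayList;
import java.util.List;

public class SerialExecutorSketch {
    // Simulate one executor: a single thread drains its task list serially.
    static List<String> runExecutor(List<String> tasks) throws InterruptedException {
        List<String> processed = new ArrayList<>();
        Thread executor = new Thread(() -> {
            for (String task : tasks) {
                // All tasks share this one thread, so they run one after
                // another, never concurrently.
                processed.add(task);
            }
        });
        executor.start();
        executor.join();
        return processed;
    }

    public static void main(String[] args) throws InterruptedException {
        // Two tasks on one executor: processed strictly in order.
        System.out.println(runExecutor(List.of("task-1", "task-2")));
    }
}
```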


The purpose of configuring more than one task per executor is that it makes it possible to change the number of executors (threads) at runtime, via the rebalancing mechanism, while the topology is still running (remember the number of tasks stays the same throughout the life cycle of a topology).
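For example, a rebalance can be triggered from the Storm CLI (the topology name here is illustrative): `-w` is the wait time in seconds, `-n` sets the number of workers, and `-e` sets the number of executors for a component, which cannot exceed that component's task count.

```shell
# Wait 10s, then run the topology with 2 workers and
# 4 executors for "green-bolt" (its task count is 4).
storm rebalance mytopology -w 10 -n 2 -e green-bolt=4
```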

Increasing the number of workers (each responsible for running one or more executors for one or more components) might also give you a performance benefit, but this is also relative, as I found from this discussion where nathanmarz says:

Having more workers might have better performance, depending on where your bottleneck is. Each worker has a single thread that passes tuples on to the 0mq connections for transfer to other workers, so if you're bottlenecked on CPU and each worker is dealing with lots of tuples, more workers will probably net you better throughput.

So basically there is no definite answer to this; you should try different configurations based on your environment and design.