Concurrency in running Oozie workflow: how many and how to throttle

user908645 picture user908645 · Feb 6, 2014 · Viewed 11.1k times · Source

Let us say we have a Oozie workflow that has a copy action node then a Shell action node. Can I start multiple instances of such a OOzie workflow and run them in parallel? How about the concurrency number could spike to thousands and/or even millions level. Is that possible, or even Oozie supports that high level concurrency?

If not, then we will have to consider throttling and enforce a cap on how many concurrent Oozie workflow instances can be. We'd prefer to throttle this on server/Oozie side (basically with any out of box Oozie software functionality), not on client/callee side. For example, we have a huge launch script with lines like this. We want to run that in a single shot, then let Oozie figure out how to throttle all these instances on itself. We don't want to split it into multiple smaller chunks, then kick off one chunk at a time.

oozie job -oozie http://myhost.com:11000/oozie -config job1.properties -run
oozie job -oozie http://myhost.com:11000/oozie -config job2.properties -run
......
oozie job -oozie http://myhost.com:11000/oozie -config job1000000.properties -run

Answer

Jeremy Beard picture Jeremy Beard · Feb 17, 2014

You will not be able to have a higher Oozie workflow concurrency than the number of map slots on your cluster because a Shell action is run by a one-mapper-zero-reducer MR job.

If you have many instances of a workflow to get through then the best mechanism is to use an Oozie coordinator. This will keep track of the completion of each instance and easily manage concurrency. An Oozie coordinator has a <concurrency> tag that controls how many instances of the workflow will execute in parallel, and a <throttle> tag that controls how many instances are brought into a waiting state before there is free concurrency for one to begin.

See: https://oozie.apache.org/docs/3.1.3-incubating/CoordinatorFunctionalSpec.html#a6.3._Synchronous_Coordinator_Application_Definition

Note that the default behavior of an Oozie coordinator is to wait 5 minutes between each polling of whether a new instance should be created. If your workflows run in less than 5 minutes then the process will bottleneck on this interval. You can change this with the oozie.service.CoordMaterializeTriggerService.lookup.interval property (in seconds) in your oozie-site.xml file.