Is there an equivalent in Scala parallel collections to LINQ's WithDegreeOfParallelism, which sets the number of threads that will run a query? I want to run an operation in parallel that needs a fixed number of threads.
With the newest trunk, on JVM 1.6 or newer, use:
collection.parallel.ForkJoinTasks.defaultForkJoinPool.setParallelism(parlevel: Int)
This may be subject to change in the future, though. A more unified approach to configuring all Scala task-parallel APIs is planned for upcoming releases.
Note, however, that while this determines the number of processors the query utilizes, it may not be the actual number of threads involved in running the query. Since parallel collections support nested parallelism, the thread pool implementation may allocate more threads to run the query if it detects that this is necessary.
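To make the distinction between the parallelism level and the actual thread count concrete, here is a minimal sketch. It assumes a plain java.util.concurrent.ForkJoinPool rather than the Scala default pool mentioned above, and the object and method names (PoolInspection, describe) are made up for illustration. getParallelism reports the target you configured, while getPoolSize reports how many worker threads currently exist, and the two need not match.

```scala
import java.util.concurrent.ForkJoinPool

object PoolInspection {
  // Hypothetical helper: returns (target parallelism, current worker thread count).
  def describe(pool: ForkJoinPool): (Int, Int) =
    (pool.getParallelism, pool.getPoolSize)

  def main(args: Array[String]): Unit = {
    val pool = new ForkJoinPool(2) // target parallelism of 2
    try {
      val (target, workers) = describe(pool)
      // The target is fixed at construction; the worker count varies at runtime.
      println(s"target parallelism: $target, worker threads right now: $workers")
    } finally pool.shutdown()
  }
}
```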
EDIT:
Starting from Scala 2.10, the preferred way to set the parallelism level is to set the tasksupport field to a new TaskSupport object, as in the following example:
scala> import scala.collection.parallel._
import scala.collection.parallel._
scala> val pc = mutable.ParArray(1, 2, 3)
pc: scala.collection.parallel.mutable.ParArray[Int] = ParArray(1, 2, 3)
scala> pc.tasksupport = new ForkJoinTaskSupport(new scala.concurrent.forkjoin.ForkJoinPool(2))
pc.tasksupport: scala.collection.parallel.TaskSupport = scala.collection.parallel.ForkJoinTaskSupport@4a5d484a
scala> pc map { _ + 1 }
res0: scala.collection.parallel.mutable.ParArray[Int] = ParArray(2, 3, 4)
When instantiating the ForkJoinTaskSupport object with a fork join pool, the parallelism level of the fork join pool must be set to the desired value (2 in the example).
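For reference, the same idea can be wrapped in a small standalone program. This is only a sketch under a few assumptions: it targets Scala 2.12, where ForkJoinTaskSupport accepts a java.util.concurrent.ForkJoinPool (scala.concurrent.forkjoin is deprecated there and removed later); on Scala 2.13 and newer, .par additionally needs the separate scala-parallel-collections module and import scala.collection.parallel.CollectionConverters._. The object and method names (ParallelismDemo, sumOfSquares) are made up for illustration.

```scala
import java.util.concurrent.ForkJoinPool
import scala.collection.parallel.ForkJoinTaskSupport

object ParallelismDemo {
  // Hypothetical helper: sums the squares of xs on a dedicated pool
  // whose parallelism level is set explicitly.
  def sumOfSquares(xs: Vector[Int], parallelism: Int): Long = {
    val pool = new ForkJoinPool(parallelism)
    try {
      val pc = xs.par
      // Route this collection's operations through our pool
      // instead of the default one.
      pc.tasksupport = new ForkJoinTaskSupport(pool)
      pc.map(x => x.toLong * x).sum
    } finally pool.shutdown() // release the pool's threads once done
  }

  def main(args: Array[String]): Unit = {
    // Sum of squares of 1 to 100 with a parallelism level of 2.
    println(sumOfSquares(Vector.range(1, 101), parallelism = 2)) // prints 338350
  }
}
```

Creating the pool inside the helper and shutting it down in a finally block keeps its threads from lingering; if many collections should share one parallelism level, a single long-lived pool passed to each ForkJoinTaskSupport is probably the better choice.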