How is Heterogeneous Multi-Processing (HMP) scheduling implemented in Linux Kernel (Samsung Exynos5422)?

nico · Aug 26, 2014 · Viewed 12.4k times

Does anybody know how Heterogeneous Multi-Processing (HMP) scheduling is implemented in the Linux kernel scheduler?

This has been implemented in the kernel supplied with the ODROID-XU3 board. (https://github.com/hardkernel/linux.git -b odroidxu3-3.10.y-android)

I roughly know that it calculates the load of a given process and, based on that load, reschedules the process onto a faster or slower CPU. I'm looking for a more detailed explanation and, if possible, the code location of the functions that implement this functionality.

Answer

TheCodeArtist · Aug 26, 2014

Code:

Check out the source code under #ifdef CONFIG_SCHED_HMP, mainly within kernel/sched/core.c (in Linaro-derived HMP trees, much of the migration logic is also found in kernel/sched/fair.c).


A (not so) brief overview:

big.LITTLE CPUs can be configured in 2 modes of operation:

  • IKS - In-Kernel Switcher (also known as CPU Migration)
  • GTS - Global Task Scheduling (also known as big.LITTLE MP)

GTS is the heterogeneous form of operation, i.e. HMP.

At the most abstract level, HMP is currently supported by simply extending DVFS and SMP load balancing. Both are made fully aware of the performance advantage of the big cores (over the LITTLE cores) and schedule high-priority, CPU-intensive, foreground tasks accordingly.

Dynamic voltage and frequency scaling (DVFS) is used to adapt to instantaneous changes in required performance. The migration modes of big.LITTLE extend this concept by enabling a transition to "big" CPU cores above the highest DVFS operating point of the LITTLE cores. The migration takes about 30 microseconds. By contrast, the DVFS driver evaluates the performance of the OS and the individual cores typically every 50 milliseconds, although some implementations sample slightly more frequently, and it takes about 100 microseconds to change voltage and frequency. Because the time taken to migrate a CPU or a cluster is shorter than the DVFS change time and roughly three orders of magnitude shorter than the OS evaluation period for DVFS changes, big.LITTLE transitions let the processors run at lower operating points more often while remaining completely invisible to the user.

DVFS extended to handle big.LITTLE cores
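
As a rough illustration of the idea above, the sketch below treats the big cores as extra rungs on top of the LITTLE DVFS ladder. It is not taken from the kernel: the names (struct opp, opp_ladder, pick_opp, needs_migration) are hypothetical, and the frequencies and capacity figures are invented for the example.

```c
#include <stddef.h>

/* Hypothetical model, not kernel code: all names and numbers are made up. */
enum cluster { CLUSTER_LITTLE, CLUSTER_BIG };

struct opp {
    enum cluster cluster;
    unsigned int freq_mhz;  /* illustrative clock frequency */
    unsigned int capacity;  /* illustrative relative performance, arbitrary units */
};

/* One ladder sorted by ascending capacity: LITTLE points first, then big. */
static const struct opp opp_ladder[] = {
    { CLUSTER_LITTLE,  500, 100 },
    { CLUSTER_LITTLE, 1000, 200 },
    { CLUSTER_LITTLE, 1400, 280 },
    { CLUSTER_BIG,     800, 320 },
    { CLUSTER_BIG,    1400, 560 },
    { CLUSTER_BIG,    2000, 800 },
};

#define OPP_COUNT (sizeof(opp_ladder) / sizeof(opp_ladder[0]))

/* Return the lowest operating point that covers the demand.
 * Demand above the top rung is clamped to the top rung. */
static const struct opp *pick_opp(unsigned int demand)
{
    for (size_t i = 0; i < OPP_COUNT; i++) {
        if (opp_ladder[i].capacity >= demand)
            return &opp_ladder[i];
    }
    return &opp_ladder[OPP_COUNT - 1];
}

/* A cluster migration is needed only when the new point is on the other cluster;
 * otherwise an ordinary frequency change is enough. */
static int needs_migration(const struct opp *cur, const struct opp *next)
{
    return cur->cluster != next->cluster;
}
```

With these made-up numbers, a demand of 150 stays on LITTLE at 1000 MHz, while a demand of 300 exceeds the highest LITTLE point and lands on the lowest big point, which is the case where a migration rather than a plain frequency change takes place.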

In the Global Task Scheduling model, the DVFS mechanisms are still in operation, but the operating system kernel scheduler is aware of the big and LITTLE cores in the system and seeks to load-balance high-performance threads to the high-performance cores, and low-performance or memory-bound threads to the high-efficiency cores. This is similar to today's SMP load balancers, which automatically balance threads across the cores available in the system and idle unused cores. In big.LITTLE Global Task Scheduling, the same mechanism is in operation, but the OS keeps track of the load history of each thread and uses that history plus real-time performance sampling to balance threads appropriately among big and LITTLE cores.
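
The load-history idea can be sketched in a few lines of C. This is a simplified model under stated assumptions, not the code from the ODROID-XU3 kernel: the names (struct task_load, update_load, choose_core), the 3/4 decay factor, and both threshold values are hypothetical, and the real scheduler's load tracking is more involved.

```c
/* Simplified model of per-thread load tracking with up/down thresholds.
 * All names and constants here are assumptions made for illustration. */
#define LOAD_SCALE     1024u  /* load is expressed in the range 0..1024 */
#define UP_THRESHOLD    700u  /* assumed: move to a big core above this */
#define DOWN_THRESHOLD  256u  /* assumed: move to a LITTLE core below this */

enum core_type { CORE_LITTLE, CORE_BIG };

struct task_load {
    unsigned int avg;     /* decayed load history, 0..LOAD_SCALE */
    enum core_type core;  /* cluster the thread currently runs on */
};

/* Fold one sample (how busy the thread was in the last period, 0..LOAD_SCALE)
 * into the history: new_avg = 3/4 * old_avg + 1/4 * sample. */
static void update_load(struct task_load *t, unsigned int sample)
{
    if (sample > LOAD_SCALE)
        sample = LOAD_SCALE;
    t->avg = (3u * t->avg + sample) / 4u;
}

/* Pick the cluster for the thread. The gap between the two thresholds
 * gives hysteresis, so a thread near one threshold does not bounce. */
static enum core_type choose_core(const struct task_load *t)
{
    if (t->core == CORE_LITTLE && t->avg > UP_THRESHOLD)
        return CORE_BIG;
    if (t->core == CORE_BIG && t->avg < DOWN_THRESHOLD)
        return CORE_LITTLE;
    return t->core;
}
```

In this model a thread that starts idle on a LITTLE core and then stays fully busy crosses the up threshold only after several consecutive samples, so a short burst does not trigger a move to a big core. Using two separate thresholds instead of one is the design choice that keeps threads from ping-ponging between clusters.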

Reference: community.arm.com : Ten Things to Know About big.LITTLE