Some programming languages (like haskell) allow cyclic dependencies between modules. Since the compiler needs to know all definitions of all modules imported while compiling one module, it usually has to do some extra work if some modules import each other mutually or any other kind of cycle occurs. In that case, the compiler may not be able to optimize code as much as in modules that have no import cycles, since imported functions may have not yet been analyzed. Usually only one module of a cycle has to be compiled that way, as a binary object has no dependecies. Let's call such a module loop-breaker
Especially if the import cycles are interleaved it is interesting to know, how to minimize the number of loop-breakers when compiling a big project composed of hundreds of modules.
Is there an algorithm that given a set of dependecies outputs a minimal number of modules that need to be compiled as loop-breakers to compile the program successfully?
I try to clarify what I mean in this example.
Consider a project with the four modules A
, B
, C
and D
. This is a list of dependencies between these modules, an entry X y
means y
depends on x
:
A C A D B A C B D B
The same relation visualized as an ASCII-diagram:
D ---> B ^ / ^ | / | | / | | L | A ---> C
There are two cycles in this dependency-graph: ADB
and ACB
. To break these cycles one could for instance compile modules C
and D
as loop-breakers. Obviously, this is not the best approach. Compiling A
as a loop-breaker is completely sufficient to break both loops and you need to compile one less module as a loop-breaker.
This is the NP-hard (and APX-hard) problem known as minimum feedback vertex set. An approximation algorithm due to Demetrescu and Finocchi (pdf, Combinatorial Algorithms for Feedback Problems in Directed Graphs (2003)") works well in practice when there are no long simple cycles, as I would expect for your application.