Difference between INCLUDE and modules in Fortran

Nordico picture Nordico · Mar 27, 2013 · Viewed 14.8k times · Source

What are the practical differences between using modules with the use statement or isolated files with the include statement? I mean, if I have a subroutine that is used a lot throughout a program: when or why should I put it inside a module or just write it in a separate file and include it in every other part of the program where it needs to be used?

Also, would it be good practice to write all subroutines intended to go in a module in separate files and use include inside the module? Specially if the code in the subroutines is long, so as to keep the code better organized (that way all subroutines are packed in the mod, but if I have to edit one I don't need to go though a maze of code).

Answer

IanH picture IanH · Mar 27, 2013

The conceptual differences between the two map through to very significant practical differences.

An INCLUDE line operates at the source level - it accomplishes simple ("dumb") text inclusion. In the absence of any special processor interpretation of the "filename" (no requirement for that actually to be a file) in the include line the complete source could quite easily be manually spliced together by the programmer and fed to the compiler with no difference what-so-ever in the semantics of the source. Included source has no real interpretation in isolation - its meaning is completely dependent on the context in which the include line that references the included source appears.

Modules operate at the much higher entity level of the program, i.e. at the level where the compiler is considering the things that the source actually describes. A module can be compiled in isolation of its downstream users and once it has been compiled the compiler knows exactly what things the module can provide to the program.

Typically what someone using include lines is hoping to do is what modules were actually designed to do.

Example issues:

  • Because entity declarations can be spread over multiple statements the entities described by included source might not be what you expect. Consider the following source to be included:

    INTEGER :: i

    In isolation it looks like this declares the name i as an integer scalar (or perhaps a function? Who knows!). Now consider the following scope that includes the above:

    INCLUDE "source from above"
    DIMENSION :: i(10,10)

    i is now a rank two array! Perhaps you want to make it a POINTER? An ALLOCATABLE? A dummy argument? Perhaps that results in an error, or perhaps it is valid source! Throw implicit typing into the mix to really compound the potential fun.

    An entity defined in a module is "completely" defined by the module. Attributes that are specific to the scope of use can be changed (VOLATILE, accessibility, etc), but the fundamental entity remains the same. Name clashes are explicitly called out and can be easily worked around with a rename clause on the USE statement.

  • Fortran has restrictions on statement ordering (specification statements must go before executable statements, etc.). Included source is also subject to those restrictions, again in the context of the point of inclusion, not the point of source definition.

    Mix well with source ambiguity between statement function definitions (specification part) and assignment statements (executable part) for some completely obtuse error messages or, worse, silent acceptance by the compiler of erroneous code.

    There are requirements on where the USE statement that references a module appears, but the source for the actual module program unit is completely independent of its point of use.

  • Fancy having some global state to be shared across related procedures and you want to use include? Let me introduce you to common blocks and the associated underlying concept of sequence association...

    Sequence association is a unfortunate bleed-through of early underlying Fortran processor implementation that is an error prone, inflexible, anti-optimisation anachronism.

    Module variables make common blocks and their associated evils completely unnecessary.

  • If you were using include lines, then note that you don't actually include the source of a commonly used procedure (the suggestion in your first paragraph is just going to result in a morass of syntax errors from the compiler). What you would typically do is include source that describes the interface of the procedure. For any non-trivial procedure the source that describes the interface is different from the complete source of the procedure - implying that you now need to maintain two source representations of the same thing. This is an error prone maintenance burden.

    As mentioned - the compilers automatically gains knowledge of the interface of a module procedure (the compiler knowledge is "explicit" because it actually saw the procedure's code - hence the term "explicit interface"). No need for the programmer to do anything more.

    A consequence of the above is that external subprograms should not be used at all unless there are very good reasons to the contrary (perhaps the existence of circular or excessively extensive dependencies) - the basic starting point should be to put everything in a module or main program.

Other posters have mentioned the source code organisation benefits of modules - including the ability to group related procedures and other "stuff" into the one package, with control over accessibility of internal implementation details.

I accept there is a valid use of INCLUDE lines as per the second paragraph of the question - where large modules become unwieldy in size. F2008 has addressed this with submodules, which also bring a number of other benefits. Once they become widely supported the include line work-around should be abandoned.

A second valid use is to overcome a lack of support by the language for generic programming techniques (what templates provide in C++) - i.e. where the types of objects involved in an operation may vary, but the token sequence that describes what to do on those objects is essentially the same. It might be another decade or so before the language sorts that out.