Functional, Declarative, and Imperative Programming

curiouskitty picture curiouskitty · Mar 2, 2009 · Viewed 185.3k times · Source

What do the terms functional, declarative, and imperative programming mean?

Answer

Shelby Moore III picture Shelby Moore III · Dec 2, 2011

At the time of writing this, the top voted answers on this page are imprecise and muddled on the declarative vs. imperative definition, including the answer that quotes Wikipedia. Some answers are conflating the terms in different ways.

Refer also to my explanation of why spreadsheet programming is declarative, regardless that the formulas mutate the cells.

Also, several answers claim that functional programming must be a subset of declarative. On that point it depends if we differentiate "function" from "procedure". Lets handle imperative vs. declarative first.

Definition of declarative expression

The only attribute that can possibly differentiate a declarative expression from an imperative expression is the referential transparency (RT) of its sub-expressions. All other attributes are either shared between both types of expressions, or derived from the RT.

A 100% declarative language (i.e. one in which every possible expression is RT) does not (among other RT requirements) allow the mutation of stored values, e.g. HTML and most of Haskell.

Definition of RT expression

RT is often referred to as having "no side-effects". The term effects does not have a precise definition, so some people don't agree that "no side-effects" is the same as RT. RT has a precise definition:

An expression e is referentially transparent if for all programs p every occurrence of e in p can be replaced with the result of evaluating e, without affecting the observable result of p.

Since every sub-expression is conceptually a function call, RT requires that the implementation of a function (i.e. the expression(s) inside the called function) may not access the mutable state that is external to the function (accessing the mutable local state is allowed). Put simply, the function (implementation) should be pure.

Definition of pure function

A pure function is often said to have "no side-effects". The term effects does not have a precise definition, so some people don't agree.

Pure functions have the following attributes.

  • the only observable output is the return value.
  • the only output dependency is the arguments.
  • arguments are fully determined before any output is generated.

Remember that RT applies to expressions (which includes function calls) and purity applies to (implementations of) functions.

An obscure example of impure functions that make RT expressions is concurrency, but this is because the purity is broken at the interrupt abstraction layer. You don't really need to know this. To make RT expressions, you call pure functions.

Derivative attributes of RT

Any other attribute cited for declarative programming, e.g. the citation from 1999 used by Wikipedia, either derives from RT, or is shared with imperative programming. Thus proving that my precise definition is correct.

Note, immutability of external values is a subset of the requirements for RT.

  • Declarative languages don't have looping control structures, e.g. for and while, because due to immutability, the loop condition would never change.

  • Declarative languages don't express control-flow other than nested function order (a.k.a logical dependencies), because due to immutability, other choices of evaluation order do not change the result (see below).

  • Declarative languages express logical "steps" (i.e. the nested RT function call order), but whether each function call is a higher level semantic (i.e. "what to do") is not a requirement of declarative programming. The distinction from imperative is that due to immutability (i.e. more generally RT), these "steps" cannot depend on mutable state, rather only the relational order of the expressed logic (i.e. the order of nesting of the function calls, a.k.a. sub-expressions).

For example, the HTML paragraph <p> cannot be displayed until the sub-expressions (i.e. tags) in the paragraph have been evaluated. There is no mutable state, only an order dependency due to the logical relationship of tag hierarchy (nesting of sub-expressions, which are analogously nested function calls).

  • Thus there is the derivative attribute of immutability (more generally RT), that declarative expressions, express only the logical relationships of the constituent parts (i.e. of the sub-expression function arguments) and not mutable state relationships.

Evaluation order

The choice of evaluation order of sub-expressions can only give a varying result when any of the function calls are not RT (i.e. the function is not pure), e.g. some mutable state external to a function is accessed within the function.

For example, given some nested expressions, e.g. f( g(a, b), h(c, d) ), eager and lazy evaluation of the function arguments will give the same results if the functions f, g, and h are pure.

Whereas, if the functions f, g, and h are not pure, then the choice of evaluation order can give a different result.

Note, nested expressions are conceptually nested functions, since expression operators are just function calls masquerading as unary prefix, unary postfix, or binary infix notation.

Tangentially, if all identifiers, e.g. a, b, c, d, are immutable everywhere, state external to the program cannot be accessed (i.e. I/O), and there is no abstraction layer breakage, then functions are always pure.

By the way, Haskell has a different syntax, f (g a b) (h c d).

Evaluation order details

A function is a state transition (not a mutable stored value) from the input to the output. For RT compositions of calls to pure functions, the order-of-execution of these state transitions is independent. The state transition of each function call is independent of the others, due to lack of side-effects and the principle that an RT function may be replaced by its cached value. To correct a popular misconception, pure monadic composition is always declarative and RT, in spite of the fact that Haskell's IO monad is arguably impure and thus imperative w.r.t. the World state external to the program (but in the sense of the caveat below, the side-effects are isolated).

Eager evaluation means the functions arguments are evaluated before the function is called, and lazy evaluation means the arguments are not evaluated until (and if) they are accessed within the function.

Definition: function parameters are declared at the function definition site, and function arguments are supplied at the function call site. Know the difference between parameter and argument.

Conceptually, all expressions are (a composition of) function calls, e.g. constants are functions without inputs, unary operators are functions with one input, binary infix operators are functions with two inputs, constructors are functions, and even control statements (e.g. if, for, while) can be modeled with functions. The order that these argument functions (do not confuse with nested function call order) are evaluated is not declared by the syntax, e.g. f( g() ) could eagerly evaluate g then f on g's result or it could evaluate f and only lazily evaluate g when its result is needed within f.

Caveat, no Turing complete language (i.e. that allows unbounded recursion) is perfectly declarative, e.g. lazy evaluation introduces memory and time indeterminism. But these side-effects due to the choice of evaluation order are limited to memory consumption, execution time, latency, non-termination, and external hysteresis thus external synchronization.

Functional programming

Because declarative programming cannot have loops, then the only way to iterate is functional recursion. It is in this sense that functional programming is related to declarative programming.

But functional programming is not limited to declarative programming. Functional composition can be contrasted with subtyping, especially with respect to the Expression Problem, where extension can be achieved by either adding subtypes or functional decomposition. Extension can be a mix of both methodologies.

Functional programming usually makes the function a first-class object, meaning the function type can appear in the grammar anywhere any other type may. The upshot is that functions can input and operate on functions, thus providing for separation-of-concerns by emphasizing function composition, i.e. separating the dependencies among the subcomputations of a deterministic computation.

For example, instead of writing a separate function (and employing recursion instead of loops if the function must also be declarative) for each of an infinite number of possible specialized actions that could be applied to each element of a collection, functional programming employs reusable iteration functions, e.g. map, fold, filter. These iteration functions input a first-class specialized action function. These iteration functions iterate the collection and call the input specialized action function for each element. These action functions are more concise because they no longer need to contain the looping statements to iterate the collection.

However, note that if a function is not pure, then it is really a procedure. We can perhaps argue that functional programming that uses impure functions, is really procedural programming. Thus if we agree that declarative expressions are RT, then we can say that procedural programming is not declarative programming, and thus we might argue that functional programming is always RT and must be a subset of declarative programming.

Parallelism

This functional composition with first-class functions can express the depth in the parallelism by separating out the independent function.

Brent’s Principle: computation with work w and depth d can be implemented in a p-processor PRAM in time O(max(w/p, d)).

Both concurrency and parallelism also require declarative programming, i.e. immutability and RT.

So where did this dangerous assumption that Parallelism == Concurrency come from? It’s a natural consequence of languages with side-effects: when your language has side-effects everywhere, then any time you try to do more than one thing at a time you essentially have non-determinism caused by the interleaving of the effects from each operation. So in side-effecty languages, the only way to get parallelism is concurrency; it’s therefore not surprising that we often see the two conflated.

FP evaluation order

Note the evaluation order also impacts the termination and performance side-effects of functional composition.

Eager (CBV) and lazy (CBN) are categorical duels[10], because they have reversed evaluation order, i.e. whether the outer or inner functions respectively are evaluated first. Imagine an upside-down tree, then eager evaluates from function tree branch tips up the branch hierarchy to the top-level function trunk; whereas, lazy evaluates from the trunk down to the branch tips. Eager doesn't have conjunctive products ("and", a/k/a categorical "products") and lazy doesn't have disjunctive coproducts ("or", a/k/a categorical "sums")[11].

Performance

  • Eager

As with non-termination, eager is too eager with conjunctive functional composition, i.e. compositional control structure does unnecessary work that isn't done with lazy. For example, eager eagerly and unnecessarily maps the entire list to booleans, when it is composed with a fold that terminates on the first true element.

This unnecessary work is the cause of the claimed "up to" an extra log n factor in the sequential time complexity of eager versus lazy, both with pure functions. A solution is to use functors (e.g. lists) with lazy constructors (i.e. eager with optional lazy products), because with eager the eagerness incorrectness originates from the inner function. This is because products are constructive types, i.e. inductive types with an initial algebra on an initial fixpoint[11]

  • Lazy

As with non-termination, lazy is too lazy with disjunctive functional composition, i.e. coinductive finality can occur later than necessary, resulting in both unnecessary work and non-determinism of the lateness that isn't the case with eager[10][11]. Examples of finality are state, timing, non-termination, and runtime exceptions. These are imperative side-effects, but even in a pure declarative language (e.g. Haskell), there is state in the imperative IO monad (note: not all monads are imperative!) implicit in space allocation, and timing is state relative to the imperative real world. Using lazy even with optional eager coproducts leaks "laziness" into inner coproducts, because with lazy the laziness incorrectness originates from the outer function (see the example in the Non-termination section, where == is an outer binary operator function). This is because coproducts are bounded by finality, i.e. coinductive types with a final algebra on an final object[11].

Lazy causes indeterminism in the design and debugging of functions for latency and space, the debugging of which is probably beyond the capabilities of the majority of programmers, because of the dissonance between the declared function hierarchy and the runtime order-of-evaluation. Lazy pure functions evaluated with eager, could potentially introduce previously unseen non-termination at runtime. Conversely, eager pure functions evaluated with lazy, could potentially introduce previously unseen space and latency indeterminism at runtime.

Non-termination

At compile-time, due to the Halting problem and mutual recursion in a Turing complete language, functions can't generally be guaranteed to terminate.

  • Eager

With eager but not lazy, for the conjunction of Head "and" Tail, if either Head or Tail doesn't terminate, then respectively either List( Head(), Tail() ).tail == Tail() or List( Head(), Tail() ).head == Head() is not true because the left-side doesn't, and right-side does, terminate.

Whereas, with lazy both sides terminate. Thus eager is too eager with conjunctive products, and non-terminates (including runtime exceptions) in those cases where it isn't necessary.

  • Lazy

With lazy but not eager, for the disjunction of 1 "or" 2, if f doesn't terminate, then List( f ? 1 : 2, 3 ).tail == (f ? List( 1, 3 ) : List( 2, 3 )).tail is not true because the left-side terminates, and right-side doesn't.

Whereas, with eager neither side terminates so the equality test is never reached. Thus lazy is too lazy with disjunctive coproducts, and in those cases fails to terminate (including runtime exceptions) after doing more work than eager would have.

[10] Declarative Continuations and Categorical Duality, Filinski, sections 2.5.4 A comparison of CBV and CBN, and 3.6.1 CBV and CBN in the SCL.

[11] Declarative Continuations and Categorical Duality, Filinski, sections 2.2.1 Products and coproducts, 2.2.2 Terminal and initial objects, 2.5.2 CBV with lazy products, and 2.5.3 CBN with eager coproducts.