Continuing on from ideas in: Are there any provable real-world languages?
I don't know about you, but I'm sick of writing code that I can't guarantee.
After asking the above question and getting a phenomenal response (thanks, all!) I have decided to narrow my search for a provable, pragmatic approach down to Haskell. I chose Haskell because it is actually useful (there are many web frameworks written for it, which seems a good benchmark) AND I think it is strict enough, functionally, that it might be provable, or at least allow the testing of invariants.
Here's what I want (and have been unable to find).
I want a framework that can look at a Haskell function, add:

    add :: Num a => a -> a -> a
    add a b = a + b
- and check whether certain invariants hold over every execution state. I'd prefer a formal proof, but I would settle for something like a model checker.
In this example, the invariant would be that, given values a and b, the return value is always the sum a + b.
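To make the shape of that statement concrete, here is roughly how I could write it today as a QuickCheck property over the add above - I realise randomised testing only samples the input space rather than proving anything, which is exactly why I'm asking:

    import Test.QuickCheck

    -- The invariant stated as a property: for all a and b,
    -- add a b really is the sum a + b (using the add defined above).
    prop_addIsSum :: Int -> Int -> Bool
    prop_addIsSum a b = add a b == a + b

    main :: IO ()
    main = quickCheck prop_addIsSum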
This is a simple example, but I don't think it is an impossibility for a framework like this to exist. There certainly would be an upper limit on the complexity of a function that could be tested (10 string inputs to a function would certainly take a long time!), but this would encourage more careful design of functions, and is no different from using other formal methods. Imagine using Z or B: when you define variables/sets, you make damn sure that you give the variables the smallest possible ranges. If your INT is never going to be above 100, make sure you initialise it as such! Techniques like these, and proper problem decomposition, should - I think - allow for satisfactory checking of a pure functional language like Haskell.
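As a rough sketch of what I mean by keeping ranges small (the names here are my own invention): if a value can never exceed 100, put that in a type containing only those values, and then an invariant can be checked over the whole domain exhaustively.

    -- A hypothetical bounded type: the only intended values are 0..100.
    newtype Small = Small Int deriving (Eq, Show)

    mkSmall :: Int -> Maybe Small
    mkSmall n
      | n >= 0 && n <= 100 = Just (Small n)
      | otherwise          = Nothing

    -- With a finite domain, checking an invariant is a finite job.
    holdsForAllSmall :: (Small -> Bool) -> Bool
    holdsForAllSmall p = all p (map Small [0 .. 100])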
I am not - yet - very experienced with formal methods or Haskell. Let me know whether my idea is a sound one, or whether you think Haskell is not suited. If you suggest a different language, please make sure it passes the "has-a-web-framework" test, and do read the original question :-)
Well, a few things to start with, since you're taking the Haskell route:
Are you familiar with the Curry-Howard correspondence? There are systems used for machine-checked proofs based on this which are, in many ways, simply functional programming languages with very powerful type systems.
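(For a tiny taste of the correspondence in plain Haskell: read the type below as the proposition "implication is transitive", and the total function inhabiting it as the proof. Proof assistants push this idea much, much further.)

    -- Under Curry-Howard, the type is a proposition and the program its proof:
    -- (a -> b) -> (b -> c) -> (a -> c) says implication is transitive.
    implTrans :: (a -> b) -> (b -> c) -> (a -> c)
    implTrans f g = g . f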
Are you familiar with the areas of abstract mathematics that provide useful concepts for analyzing Haskell code? Various flavors of algebra and some bits of category theory come up a lot.
Keep in mind that Haskell, like all Turing-complete languages, always has the possibility of nontermination. In general, it's much harder to prove that something will always be true than to prove that it will either be true or depend on a nonterminating value.
If you're seriously going for proof, not merely testing, these are the sorts of things to keep in mind. The basic rule is this: Make invalid states cause compiler errors. Prevent invalid data from being encoded in the first place, then let the type checker do the tedious part for you.
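A minimal sketch of that rule (modern Haskell's Data.List.NonEmpty provides essentially this): if a list is required to be non-empty, say so in its type, and taking the first element can no longer fail.

    -- The invalid state (an empty list) simply cannot be constructed here,
    -- so this "head" needs no runtime check and cannot crash.
    data NonEmpty a = a :| [a]

    neHead :: NonEmpty a -> a
    neHead (x :| _) = x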
If you want to go even further: if memory serves, the proof assistant Coq has an "extract to Haskell" feature that will let you prove arbitrary properties about critical functions, then turn the proofs into Haskell code.
For doing fancy type system stuff directly in Haskell, Oleg Kiselyov is the Grand Master. You can find examples on his site of neat tricks like higher-rank polymorphic types to encode static proofs of array bounds checking.
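Here's a rough sketch of the flavour of that trick, not his exact code: a rank-2 quantifier "brands" an array, and the only way to obtain an index carrying that brand is to pass a bounds check, so lookups need no further checking.

    {-# LANGUAGE RankNTypes #-}
    import Data.Array

    newtype BArray s a = BArray (Array Int a)   -- an array branded with s
    newtype Index s    = Index Int              -- an index proven in bounds for s

    -- The only way to make an Index s is to pass the bounds check.
    checkIndex :: BArray s a -> Int -> Maybe (Index s)
    checkIndex (BArray arr) i
      | inRange (bounds arr) i = Just (Index i)
      | otherwise              = Nothing

    -- No runtime check needed here: the brand is the static proof.
    (!.) :: BArray s a -> Index s -> a
    (BArray arr) !. (Index i) = arr ! i

    -- The rank-2 type keeps s abstract, so indices can't escape to other arrays.
    withBArray :: Array Int a -> (forall s. BArray s a -> r) -> r
    withBArray arr k = k (BArray arr)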
For more lightweight stuff, you can do things like using a type-level certificate to mark a piece of data as having been checked for correctness. You're still on your own for the correctness check itself, but other code can at least rely on knowing that some data has, in fact, been checked.
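A minimal sketch of that idea, with names of my own choosing: keep the constructor private to one module, so the only way the rest of the program can obtain a Checked value is by going through the check.

    module Checked (Checked, getChecked, validate) where

    -- The Checked constructor is deliberately not exported.
    newtype Checked a = Checked { getChecked :: a }

    -- You still write the correctness test yourself, but any code that
    -- receives a Checked a can rely on the test having been passed.
    validate :: (a -> Bool) -> a -> Maybe (Checked a)
    validate ok x
      | ok x      = Just (Checked x)
      | otherwise = Nothing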
Another step you can take, building off of lightweight verification and fancy type system tricks, is to use the fact that Haskell works well as a host language for embedding domain-specific languages; first construct a carefully restricted sublanguage (ideally, one that isn't Turing-complete) about which you can more easily prove useful properties, then use programs in that DSL to provide key pieces of core functionality in your overall program. For instance, you could prove that a two-argument function is associative in order to justify parallelized reduction of a collection of items using that function (since the ordering of function application doesn't matter, only the ordering of arguments).
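As a small illustration of that last point (a sketch, not a real parallel implementation): a tree-shaped reduction only changes how applications of the combining function are grouped, so an associativity proof is exactly what justifies swapping it in for a left-to-right fold.

    -- Agrees with foldl1 f on non-empty lists only when f is associative:
    -- the argument order is preserved, only the grouping changes.
    -- The z argument handles the empty list.
    reduceTree :: (a -> a -> a) -> a -> [a] -> a
    reduceTree _ z []  = z
    reduceTree _ _ [x] = x
    reduceTree f z xs  = f (reduceTree f z ls) (reduceTree f z rs)
      where (ls, rs) = splitAt (length xs `div` 2) xs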
Oh, one last thing. Some advice on avoiding the pitfalls that Haskell does contain, which can sabotage code that would otherwise be safe-by-construction. Your sworn enemies here are general recursion, the IO monad, and partial functions:
The last is relatively easy to avoid: Don't write them, and don't use them. Make sure that every set of pattern matches handles every possible case, and never use error or undefined. The only tricky part is avoiding standard library functions that may cause errors. Some are obviously unsafe, like fromJust :: Maybe a -> a or head :: [a] -> a, but others may be more subtle. If you find yourself writing a function that really, truly cannot do anything with some input values, then you're allowing invalid states to be encoded by the input type and need to fix that first.
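For instance (a couple of obvious sketches), the total counterparts of those partial operations just move the "missing" case into the result type instead of leaving it as a runtime error:

    safeHead :: [a] -> Maybe a
    safeHead []      = Nothing
    safeHead (x : _) = Just x

    safeDiv :: Integer -> Integer -> Maybe Integer
    safeDiv _ 0 = Nothing
    safeDiv x y = Just (x `div` y)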
The second is easy to avoid on a superficial level by scattering stuff through assorted pure functions that are then used from an IO expression. Better is to, as much as possible, move the entire program out into pure code so that it can be evaluated independently of everything but the actual I/O.
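As a trivial sketch of that separation (the function here is just illustrative): keep the real logic pure so it can be exercised without performing any I/O, and let the IO part be a thin shell around it.

    -- All the actual work is pure and can be tested on its own...
    report :: String -> String
    report = unlines . map reverse . lines

    -- ...while the IO part is mere plumbing.
    main :: IO ()
    main = interact report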
This mostly becomes tricky only when you need recursion that's driven by external input, which brings me to the final item. Words to the wise: well-founded recursion and productive corecursion. Always ensure that recursive functions are either going from some starting point to a known base case, or are generating a series of elements on demand. In pure code, the easiest way to do this is by either recursively collapsing a finite data structure (e.g., instead of a function calling itself directly while incrementing a counter up to some maximum, create a list holding the range of counter values and fold it) or recursively generating a lazy data structure (e.g., a list of progressive approximations to some value), while carefully never mixing the two directly (e.g., don't just "find the first element in the stream meeting some condition"; it might not exist. Instead, take values from the stream up to some maximum depth, then search the finite list, handling the not-found case appropriately).
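A couple of sketches of what that looks like in practice (the functions and the depth limit are just illustrative):

    -- Well-founded recursion: collapse a finite structure instead of
    -- managing a loop counter by hand.
    sumTo :: Int -> Int
    sumTo n = foldl (+) 0 [1 .. n]

    -- Productive corecursion: an infinite stream of successive
    -- approximations (Newton's method for square roots)...
    sqrtApproximations :: Double -> [Double]
    sqrtApproximations x = iterate (\g -> (g + x / g) / 2) 1

    -- ...searched only to a finite depth, with not-found handled explicitly.
    sqrtWithin :: Double -> Double -> Maybe Double
    sqrtWithin eps x =
      case filter close (take 50 (sqrtApproximations x)) of
        (g : _) -> Just g
        []      -> Nothing
      where close g = abs (g * g - x) < eps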
Combining the last two items, for the parts where you really do need IO with general recursion, try to build the program as incremental components, then condense all the awkward bits into a single "driver" function. For instance, you could write a GUI event loop with a pure function like mainLoop :: UIState -> Events -> UIState, an exit test like quitMessage :: Events -> Bool, a function to get pending events like getEvents :: IO Events, and an update function like updateUI :: UIState -> IO (), then actually run the thing with a generalized function like runLoopIO :: (b -> a -> b) -> b -> IO a -> (b -> IO ()) -> IO (). This keeps the complicated parts truly pure, letting you run the whole program with an event script and check the resulting UI state, while isolating the awkward recursive I/O parts into a single, abstract function that's easy to comprehend and, by parametricity, often all but impossible to get wrong.
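Here's a sketch of what such a driver might look like; note that I've added an explicit (a -> Bool) exit test, which the signature above leaves out, so the loop has somewhere to stop:

    -- A generic, nearly logic-free driver: read input, stop if told to,
    -- otherwise update the state, render it, and repeat.
    runLoopIO :: (a -> Bool) -> (b -> a -> b) -> b -> IO a -> (b -> IO ()) -> IO ()
    runLoopIO done step s0 getInput render = go s0
      where
        go s = do
          x <- getInput
          if done x
            then return ()
            else do
              let s' = step s x
              render s'
              go s'

    -- Hypothetical usage with the pieces named above:
    -- main = runLoopIO quitMessage mainLoop initialState getEvents updateUI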