What is a good way to design/structure large functional programs, especially in Haskell?
I've been through a bunch of the tutorials (Write Yourself a Scheme being my favorite, with Real World Haskell a close second) - but most of the programs are relatively small, and single-purpose. Additionally, I don't consider some of them to be particularly elegant (for example, the vast lookup tables in WYAS).
I now want to write larger programs, with more moving parts - acquiring data from a variety of different sources, cleaning it, processing it in various ways, displaying it in user interfaces, persisting it, communicating over networks, etc. How could one best structure such code to be legible, maintainable, and adaptable to changing requirements?
There is quite a large literature addressing these questions for large object-oriented imperative programs. Ideas like MVC, design patterns, etc. are decent prescriptions for realizing broad goals like separation of concerns and reusability in an OO style. Additionally, newer imperative languages lend themselves to a 'design as you grow' style of refactoring to which, in my novice opinion, Haskell appears less well-suited.
Is there an equivalent literature for Haskell? How is the zoo of exotic control structures available in functional programming (monads, arrows, applicative, etc.) best employed for this purpose? What best practices could you recommend?
Thanks!
EDIT (this is a follow-up to Don Stewart's answer):
@dons mentioned: "Monads capture key architectural designs in types."
I guess my question is: how should one think about key architectural designs in a pure functional language?
Consider the example of several data streams, and several processing steps. I can write modular parsers for the data streams to a set of data structures, and I can implement each processing step as a pure function. The processing steps required for one piece of data will depend on its value and others'. Some of the steps should be followed by side-effects like GUI updates or database queries.
What's the 'Right' way to tie the data and the parsing steps in a nice way? One could write a big function which does the right thing for the various data types. Or one could use a monad to keep track of what's been processed so far and have each processing step get whatever it needs next from the monad state. Or one could write largely separate programs and send messages around (I don't much like this option).
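To make the scenario concrete, here is one hypothetical shape it might take, offered as a sketch and not as the answer's recommendation: each processing step stays pure and returns a description of the side effects it wants, and a thin IO layer performs them. Every name here (Sample, Effect, Step, runSteps, perform) is invented for illustration, and printing stands in for real GUI and database calls.

```haskell
-- Hypothetical record parsed from one of the data streams.
data Sample = Sample
  { sampleName  :: String
  , sampleValue :: Int
  } deriving (Eq, Show)

-- Side effects described as plain data, so that steps can stay pure.
data Effect
  = UpdateGui String
  | SaveToDb Sample
  deriving (Eq, Show)

-- A processing step returns the new value plus the effects it requests.
type Step = Sample -> (Sample, [Effect])

-- Clamp negative values to zero; requests no effects.
normalise :: Step
normalise s = (s { sampleValue = max 0 (sampleValue s) }, [])

-- Which effects are requested depends on the value being processed.
flagLarge :: Step
flagLarge s
  | sampleValue s > 100 = (s, [UpdateGui (sampleName s ++ " is large"), SaveToDb s])
  | otherwise           = (s, [SaveToDb s])

-- Run steps left to right, collecting the requested effects in order.
runSteps :: [Step] -> Sample -> (Sample, [Effect])
runSteps steps s0 = foldl apply (s0, []) steps
  where
    apply (s, effs) step =
      let (s', new) = step s
      in (s', effs ++ new)

-- The only impure part: interpret the effect descriptions.
-- Printing is an assumption standing in for real GUI and database calls.
perform :: Effect -> IO ()
perform (UpdateGui msg) = putStrLn ("GUI: " ++ msg)
perform (SaveToDb s)    = putStrLn ("DB: " ++ show s)
```

In GHCi, `runSteps [normalise, flagLarge] (Sample "a" 150)` yields the processed sample together with the list of requested effects, which can then be run with `mapM_ perform`. The pure part can be tested without any GUI or database present.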
The slides he linked have a "Things we Need" bullet: "Idioms for mapping design onto types/functions/classes/monads". What are the idioms? :)
I talk a bit about this in Engineering Large Projects in Haskell and in the Design and Implementation of XMonad. Engineering in the large is about managing complexity. The primary code structuring mechanisms in Haskell for managing complexity are:
The type system
The profiler
Purity
Testing
Monads for Structuring
Type classes and existential types
Concurrency and parallelism
Work `par` into your program to beat the competition with easy, composable parallelism.
Refactor
Use the FFI wisely
Meta programming
Packaging and distribution
Warnings
Use `-Wall` to keep your code clean of smells. You might also look at Agda, Isabelle or Catch for more assurance. For lint-like checking, see the great hlint, which will suggest improvements.
With all these tools you can keep a handle on complexity, removing as many interactions between components as possible. Ideally, you have a very large base of pure code, which is really easy to maintain, since it is compositional. That's not always possible, but it is worth aiming for.
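As a small illustration of the `par` item in the list above, here is a minimal sketch of adding a parallelism hint to pure code. It imports `par` and `pseq` from GHC.Conc in base to stay dependency-free (Control.Parallel in the parallel package exports functions of the same names). The functions fib and parSum are hypothetical, with fib serving only as a stand-in for expensive pure work.

```haskell
import GHC.Conc (par, pseq)

-- A deliberately naive Fibonacci, used only as a stand-in for expensive pure work.
fib :: Int -> Integer
fib n
  | n < 2     = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

-- Sum two expensive results, hinting that the first may be evaluated in parallel.
-- `par` sparks evaluation of x; `pseq` evaluates y before the sum is computed.
parSum :: Int -> Int -> Integer
parSum a b = x `par` (y `pseq` (x + y))
  where
    x = fib a
    y = fib b
```

Because `par` is only a hint, parSum returns the same value as `fib a + fib b`, so the pure code composes exactly as before. For the spark to actually run on another core, the program has to be built with GHC's `-threaded` flag and run with `+RTS -N`.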
In general: decompose the logical units of your system into the smallest referentially transparent components possible, then implement them in modules. Global or local environments for sets of components (or inside components) might be mapped to monads. Use algebraic data types to describe core data structures. Share those definitions widely.
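The advice above might be sketched as follows, under stated assumptions: the core data is described with algebraic data types, the components are small pure functions, and a shared environment is mapped to a monad. All names (Source, Reading, Config, App, clean, clamp, process) are hypothetical, and App is a hand-rolled reader monad standing in for the Reader type from mtl or transformers, so that the sketch needs only base.

```haskell
-- Core data structures as algebraic data types, shared by every component.
data Source = Sensor | Manual
  deriving (Eq, Show)

data Reading = Reading
  { readingSource :: Source
  , readingValue  :: Double
  } deriving (Eq, Show)

-- A local environment shared by a set of components.
newtype Config = Config { maxValue :: Double }

-- A hand-rolled reader monad: a computation that can consult the Config.
newtype App a = App { runApp :: Config -> a }

instance Functor App where
  fmap f (App g) = App (f . g)

instance Applicative App where
  pure x = App (const x)
  App f <*> App g = App (\cfg -> f cfg (g cfg))

instance Monad App where
  App g >>= k = App (\cfg -> runApp (k (g cfg)) cfg)

-- Read one field of the environment.
asks :: (Config -> a) -> App a
asks = App

-- A small, referentially transparent component that needs no environment.
clean :: [Reading] -> [Reading]
clean = filter ((>= 0) . readingValue)

-- A component that depends on the environment, and says so in its type.
clamp :: Reading -> App Reading
clamp r = do
  limit <- asks maxValue
  pure (r { readingValue = min limit (readingValue r) })

-- Components compose into a larger pipeline that is still pure.
process :: [Reading] -> App [Reading]
process = mapM clamp . clean
```

Running `runApp (process readings) (Config 10)` drops negative readings and caps the rest at 10. The types show which components touch the environment: clean cannot, clamp can.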