Basically, I know how to create graph data structures and use Dijkstra's algorithm in programming languages where side effects are allowed. Typically, graph algorithms use a structure to mark certain nodes as 'visited', but this has side effects, which I'm trying to avoid.
I can think of one way to implement this in a functional language, but it basically requires passing around large amounts of state to different functions, and I'm wondering if there is a more space-efficient solution.
You might check out how Martin Erwig's Haskell functional graph library does things. For instance, its shortest-path functions are all pure, and you can see the source code for how it's implemented.
Another option, like fmark mentioned, is to use an abstraction which allows you to implement pure functions in terms of state. He mentions the State monad (which is available in both lazy and strict varieties). Another option, if you're working in the GHC Haskell compiler/interpreter (or, I think, any Haskell implementation which supports rank-2 types), another option is the ST monad, which allows you to write pure functions which deal with mutable variables internally.