Common programming mistakes for Clojure developers to avoid

fogus picture fogus · Jan 7, 2010 · Viewed 21.3k times · Source

What are some common mistakes made by Clojure developers, and how can we avoid them?

For example; newcomers to Clojure think that the contains? function works the same as java.util.Collection#contains. However, contains? will only work similarly when used with indexed collections like maps and sets and you're looking for a given key:

(contains? {:a 1 :b 2} :b)
;=> true
(contains? {:a 1 :b 2} 2)
;=> false
(contains? #{:a 1 :b 2} :b)
;=> true

When used with numerically indexed collections (vectors, arrays) contains? only checks that the given element is within the valid range of indexes (zero-based):

(contains? [1 2 3 4] 4)
;=> false
(contains? [1 2 3 4] 0)
;=> true

If given a list, contains? will never return true.

Answer

Robert Campbell picture Robert Campbell · Jan 7, 2010

Literal Octals

At one point I was reading in a matrix which used leading zeros to maintain proper rows and columns. Mathematically this is correct, since leading zero obviously don't alter the underlying value. Attempts to define a var with this matrix, however, would fail mysteriously with:

java.lang.NumberFormatException: Invalid number: 08

which totally baffled me. The reason is that Clojure treats literal integer values with leading zeros as octals, and there is no number 08 in octal.

I should also mention that Clojure supports traditional Java hexadecimal values via the 0x prefix. You can also use any base between 2 and 36 by using the "base+r+value" notation, such as 2r101010 or 36r16 which are 42 base ten.


Trying to return literals in an anonymous function literal

This works:

user> (defn foo [key val]
    {key val})
#'user/foo
user> (foo :a 1)
{:a 1}

so I believed this would also work:

(#({%1 %2}) :a 1)

but it fails with:

java.lang.IllegalArgumentException: Wrong number of args passed to: PersistentArrayMap

because the #() reader macro gets expanded to

(fn [%1 %2] ({%1 %2}))  

with the map literal wrapped in parenthesis. Since it's the first element, it's treated as a function (which a literal map actually is), but no required arguments (such as a key) are provided. In summary, the anonymous function literal does not expand to

(fn [%1 %2] {%1 %2})  ; notice the lack of parenthesis

and so you can't have any literal value ([], :a, 4, %) as the body of the anonymous function.

Two solutions have been given in the comments. Brian Carper suggests using sequence implementation constructors (array-map, hash-set, vector) like so:

(#(array-map %1 %2) :a 1)

while Dan shows that you can use the identity function to unwrap the outer parenthesis:

(#(identity {%1 %2}) :a 1)

Brian's suggestion actually brings me to my next mistake...


Thinking that hash-map or array-map determine the unchanging concrete map implementation

Consider the following:

user> (class (hash-map))
clojure.lang.PersistentArrayMap
user> (class (hash-map :a 1))
clojure.lang.PersistentHashMap
user> (class (assoc (apply array-map (range 2000)) :a :1))
clojure.lang.PersistentHashMap

While you generally won't have to worry about the concrete implementation of a Clojure map, you should know that functions which grow a map - like assoc or conj - can take a PersistentArrayMap and return a PersistentHashMap, which performs faster for larger maps.


Using a function as the recursion point rather than a loop to provide initial bindings

When I started out, I wrote a lot of functions like this:

; Project Euler #3
(defn p3 
  ([] (p3 775147 600851475143 3))
  ([i n times]
    (if (and (divides? i n) (fast-prime? i times)) i
      (recur (dec i) n times))))

When in fact loop would have been more concise and idiomatic for this particular function:

; Elapsed time: 387 msecs
(defn p3 [] {:post [(= % 6857)]}
  (loop [i 775147 n 600851475143 times 3]
    (if (and (divides? i n) (fast-prime? i times)) i
      (recur (dec i) n times))))

Notice that I replaced the empty argument, "default constructor" function body (p3 775147 600851475143 3) with a loop + initial binding. The recur now rebinds the loop bindings (instead of the fn parameters) and jumps back to the recursion point (loop, instead of fn).


Referencing "phantom" vars

I'm speaking about the type of var you might define using the REPL - during your exploratory programming - then unknowingly reference in your source. Everything works fine until you reload the namespace (perhaps by closing your editor) and later discover a bunch of unbound symbols referenced throughout your code. This also happens frequently when you're refactoring, moving a var from one namespace to another.


Treating the for list comprehension like an imperative for loop

Essentially you're creating a lazy list based on existing lists rather than simply performing a controlled loop. Clojure's doseq is actually more analogous to imperative foreach looping constructs.

One example of how they're different is the ability to filter which elements they iterate over using arbitrary predicates:

user> (for [n '(1 2 3 4) :when (even? n)] n)
(2 4)

user> (for [n '(4 3 2 1) :while (even? n)] n)
(4)

Another way they're different is that they can operate on infinite lazy sequences:

user> (take 5 (for [x (iterate inc 0) :when (> (* x x) 3)] (* 2 x)))
(4 6 8 10 12)

They also can handle more than one binding expression, iterating over the rightmost expression first and working its way left:

user> (for [x '(1 2 3) y '(\a \b \c)] (str x y))
("1a" "1b" "1c" "2a" "2b" "2c" "3a" "3b" "3c")

There's also no break or continue to exit prematurely.


Overuse of structs

I come from an OOPish background so when I started Clojure my brain was still thinking in terms of objects. I found myself modeling everything as a struct because its grouping of "members", however loose, made me feel comfortable. In reality, structs should mostly be considered an optimization; Clojure will share the keys and some lookup information to conserve memory. You can further optimize them by defining accessors to speed up the key lookup process.

Overall you don't gain anything from using a struct over a map except for performance, so the added complexity might not be worth it.


Using unsugared BigDecimal constructors

I needed a lot of BigDecimals and was writing ugly code like this:

(let [foo (BigDecimal. "1") bar (BigDecimal. "42.42") baz (BigDecimal. "24.24")]

when in fact Clojure supports BigDecimal literals by appending M to the number:

(= (BigDecimal. "42.42") 42.42M) ; true

Using the sugared version cuts out a lot of the bloat. In the comments, twils mentioned that you can also use the bigdec and bigint functions to be more explicit, yet remain concise.


Using the Java package naming conversions for namespaces

This isn't actually a mistake per se, but rather something that goes against the idiomatic structure and naming of a typical Clojure project. My first substantial Clojure project had namespace declarations - and corresponding folder structures - like this:

(ns com.14clouds.myapp.repository)

which bloated up my fully-qualified function references:

(com.14clouds.myapp.repository/load-by-name "foo")

To complicate things even more, I used a standard Maven directory structure:

|-- src/
|   |-- main/
|   |   |-- java/
|   |   |-- clojure/
|   |   |-- resources/
|   |-- test/
...

which is more complex than the "standard" Clojure structure of:

|-- src/
|-- test/
|-- resources/

which is the default of Leiningen projects and Clojure itself.


Maps utilize Java's equals() rather than Clojure's = for key matching

Originally reported by chouser on IRC, this usage of Java's equals() leads to some unintuitive results:

user> (= (int 1) (long 1))
true
user> ({(int 1) :found} (int 1) :not-found)
:found
user> ({(int 1) :found} (long 1) :not-found)
:not-found

Since both Integer and Long instances of 1 are printed the same by default, it can be difficult to detect why your map isn't returning any values. This is especially true when you pass your key through a function which, perhaps unbeknownst to you, returns a long.

It should be noted that using Java's equals() instead of Clojure's = is essential for maps to conform to the java.util.Map interface.


I'm using Programming Clojure by Stuart Halloway, Practical Clojure by Luke VanderHart, and the help of countless Clojure hackers on IRC and the mailing list to help along my answers.