Accidental sparse sequences

Nothing's going to change my world.

Sequences are absolutely everywhere in Clojure - hiding as functions, they're almost certainly the first thing a beginner encounters. They're a primary abstraction, not least as your code and data. Pick up almost anything, and there will be a sequence of it somehow.

Literal, concrete sequences are so common that we can overlook the important idiom of empty sequences - something we almost never write by hand. I'd argue that a likely culprit for this is the ubiquity of nil in situations where we might otherwise have to define a concrete collection. Destructuring is a common place to see this.

nil, if you aren't familiar with Clojure, is a bit special. It's a real value, can implement protocols, can be used as a key in a map... the list goes on. Even so, it remains a source of much gnashing of teeth and NPEs. I'd love a version of Clojure that doubled down on "nil in, nil out" - but that's an argument for another day. Java interop, we see you.
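A quick sketch of what "a real value" means in practice. The Describe protocol and its describe function are made up for this example; everything else is plain clojure.core.

```clojure
;; Describe / describe are hypothetical names, just for illustration.
(defprotocol Describe
  (describe [x]))

;; nil can be a protocol target like any other type.
(extend-protocol Describe
  nil
  (describe [_] "nothing")
  String
  (describe [s] (str "string: " s)))

(describe nil)          ;=> "nothing"
(get {nil :found} nil)  ;=> :found   nil as a map key

;; Most sequence functions treat nil as an empty sequence.
(first nil)             ;=> nil
(count nil)             ;=> 0
(conj nil 1)            ;=> (1)
```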

Eric Normand has done a fantastic job explaining this "punning" behaviour, so if that is an intriguing topic to you, stop, click, and go read.

With that in mind, what's the big difference between [nil] and [] anyway? Nil is nothing, so we're good, right?

Well, no. As we've hinted above, nil is special, but it will still cause booms, even without getting exotic and trying to upper-case a string and so on.

Check this out:

user=> (defn foo [[a b]] (+ a b))
#'user/foo
user=> (foo [1 2])
3
user=> (foo nil)
Execution error (NullPointerException) at user/foo (REPL:1).

Here we have a perfectly fine, standard piece of code - destructuring a tuple argument. Invoke it with nil, though, and we get an exception every time. There are sadly many other examples. Hand on heart, it's one of the things that gets in my way most with Clojure, tracking down NPEs and similar.
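It's worth seeing where that exception actually comes from. In the sketch below, parts and sum-pair are made-up names, and treating a missing value as 0 is purely an assumption for illustration - whether that's a sensible default depends entirely on your domain.

```clojure
;; Destructuring itself is nil-tolerant: missing positions bind to nil.
(defn parts [[a b]]
  [a b])

(parts [1 2]) ;=> [1 2]
(parts [1])   ;=> [1 nil]
(parts nil)   ;=> [nil nil]

;; So the NPE comes from + receiving nil, not from the destructuring.
;; One possible guard, ASSUMING 0 is an acceptable stand-in:
(defn sum-pair [[a b]]
  (+ (or a 0) (or b 0)))

(sum-pair [1 2]) ;=> 3
(sum-pair nil)   ;=> 0
```

The failure surfaces one step away from where the nil got in, which is a large part of why these are tedious to track down.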

Let's look at a way we can mess up with a sparse sequence, by defining it in a concrete way in our code. Not an unreasonable thing to do: in this contrived example we know it's two items, so...

(let [admins [(user/primary-admin :group) (user/secondary-admin :group)]]
  (doseq [admin admins]
    (send-mail! admin)))

When this code is first written, all may be well. The functions that return the admins could either do that or throw, for example. Let's imagine someone keen comes along later and those functions are refactored such that they return nil instead of throwing.

Suddenly we have a sparse collection, and doseq and its friends will happily chew through the values we provide, and invoke the send-mail! function with nil. Now we have an exception in our mail function, sending us off on a wild goose chase in a completely wrong direction, or worse.
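Here's a small reproduction of that failure. The three functions are hypothetical stand-ins for the ones in the example above (the real ones aren't shown), and I'm assuming the mailer throws when it has no address to send to.

```clojure
;; Hypothetical stand-ins for the functions in the example above.
(defn primary-admin [_group] {:email "ada@example.com"})
(defn secondary-admin [_group] nil) ; post-refactor: nil instead of throwing

(defn send-mail! [admin]
  ;; ASSUMPTION: the mailer needs an address and fails without one.
  (when-not (:email admin)
    (throw (ex-info "No recipient address" {:admin admin})))
  (println "mailing" (:email admin)))

(def admins [(primary-admin :group) (secondary-admin :group)])

(count admins)     ;=> 2, the literal vector always has two slots
(some nil? admins) ;=> true

;; The first admin is mailed, then the nil blows up inside send-mail!.
(try
  (doseq [admin admins]
    (send-mail! admin))
  (catch clojure.lang.ExceptionInfo e
    (println "failed:" (.getMessage e))))
```

Note that the stack trace points at the mailer, not at the vector literal where the nil was let in.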

Better, here, is to do nothing at all when there's nobody to mail, and that's why the concrete collection in this case is arguably a smell.

(let [admins (keep (fn [f] (f :group))
                   [user/primary-admin user/secondary-admin])]
  (doseq [admin admins]
    (send-mail! admin)))

keep is our friend here - pretty often we want to map over something and discard nils. Yes, you can do this with map, filter, remove and so on, but keep is a neat idiom.
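For comparison, a quick sketch of keep next to the alternatives. One detail to be aware of: keep drops only nil, so a false result survives, whereas filter identity would drop it too.

```clojure
(def results [:alice nil :bob false])

(keep identity results)              ;=> (:alice :bob false)
(remove nil? (map identity results)) ;=> (:alice :bob false)  same result, two steps
(filter identity results)            ;=> (:alice :bob)        also drops false

;; All nils in, empty sequence out.
(keep identity [nil nil])            ;=> ()
```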

doseq with an empty collection is effectively a no-op: nothing happens. This is what we want.
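To make that concrete, here's a sketch that records what would have been mailed instead of sending anything. The sent atom, the stubbed send-mail! and the mail-admins! wrapper are all hypothetical, and (constantly nil) stands in for a lookup that finds nobody.

```clojure
(def sent (atom []))

;; Hypothetical stub: record the recipient rather than sending mail.
(defn send-mail! [admin]
  (swap! sent conj admin))

;; Hypothetical wrapper around the keep + doseq shape from above.
(defn mail-admins! [lookups]
  (doseq [admin (keep (fn [f] (f :group)) lookups)]
    (send-mail! admin)))

;; No admins found: doseq sees an empty sequence, send-mail! is never called.
(mail-admins! [(constantly nil) (constantly nil)])
@sent ;=> []

;; One admin found: only that one is mailed.
(mail-admins! [(constantly :ada) (constantly nil)])
@sent ;=> [:ada]
```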

By discarding the concrete collection, we've kept the original behaviour when both admins exist, and have gained resilience and flexibility for free. The semantics do differ when one is missing, but if the intention is to "email all admins or none", then that's probably better expressed as an actual invariant.
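If "all or none" really is the requirement, one way to say so explicitly might look like the sketch below. mail-all-or-none! and record! are made-up names, and silently skipping a partial list is just one choice; throwing an ex-info instead would be equally reasonable.

```clojure
;; Hypothetical: send to every admin, or to nobody if any lookup came back nil.
(defn mail-all-or-none! [send! admins]
  (when (every? some? admins)
    (run! send! admins)))

;; Record recipients instead of sending, so the behaviour is visible.
(def sent (atom []))
(defn record! [admin] (swap! sent conj admin))

(mail-all-or-none! record! [:ada nil])
@sent ;=> []           one admin missing, so nobody is mailed

(mail-all-or-none! record! [:ada :bob])
@sent ;=> [:ada :bob]
```

Here the concrete collection is fine, because the check that makes it safe sits right next to it.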

As always, there are no hard and fast rules - but if you do find yourself pondering a concretely defined collection, it might be worth keeping this kind of thing in mind.

Tags: clojure codestyle

Copyright © 2021 Dan Peddle