I’ve been exploring Clojure over the last year or so, and it’s one of the most joyful languages I’ve ever used. That said, I all too frequently cause silly runtime errors that I know could have been prevented statically in other languages such as Haskell. When I first heard about Ambrose Bonnaire-Sergeant’s work to create a static type system for Clojure, I began to hope that I could get back some of that safety I had been missing without sacrificing the other advantages of Clojure. Ambrose recently announced a production-ready release, so I finally decided to dig in and make an effort to grok it.
Explaining what I know to others has always helped me shore up my own understanding, so I decided to write down some of what I’ve learned so far. I’m going to walk through a series of interesting types I’ve encountered while working with some standard
clojure.core library functions. As we go along, the types will get more complex and show off additional features of the type system. My hope is that this will be a useful tutorial for people who want to learn Typed Clojure.
Some Initial Concepts
Before we get into the types, we should lay down some groundwork.
For those familiar with the type systems of the C-like languages or even that of Haskell, Typed Clojure is quite a departure. The first thing I noticed about it was the pervasive use of ad-hoc type unions, which represent a set of possible types a value is permitted to take.
For example, if a value can be an
Integer or a
String, then its type would be described as
(U Integer String), the union of those two types. You can supply a simple
Integer or a
(U Integer String) is expected.
Importantly, the converse is not true – a
(U Integer String) is itself neither an
Integer or a
String. The type system will require you to resolve the ambiguity before you pass such a value to a function that requires only an
Integer. Fortunately, this is not hard to do, as will be explained below.
Clojure was designed as a dynamically typed language, and type unions are a convenient way to describe the behavior of a dynamic language in static terms.
Typed Clojure has value types, which represent the type of all variables that have a certain literal value. The string literal
"foo" has the type
(Value "foo"). Value types are subtypes of their literals’ types, so
(Value "foo") will typecheck anywhere
String is required.
nil Is Explicit
Unlike in Java, where all non-primitive types are implicitly nullable, types in
core.typed are what they say they are. The value
nil has its own value type, also called
nil, and it can inhabit no other type. This is immensely useful because it forces you to deal with any possible
nils that may arise, and thereby avoid the dreaded
Bring on the Types
That’s enough prerequisites for now, so let’s start checking out some types. We’ll pick up the rest along the way.
The simplest type I’ll show you is just a union of several
java.lang.* boxed integer types. A function written in terms of AnyInteger can accept any of the concrete types mentioned in the union.
(def-alias AnyInteger (U Integer Long clojure.lang.BigInt BigInteger Short Byte))
If you write functions in terms of
AnyInteger, they can be polymorphic with respect to the particular integer type used, and thus more flexible.
(U x nil) a.k.a.
This is an extremely useful type: the union of the
nil type with some other type. It’s often seen under the alias
(Option x), and a great many functions in the standard library mention it. It’s Typed Clojure’s analogue of option types in languages like Haskell, Scala, and Rust, which represent a value that might not be present.
If a function returns
(Option String), you are forced by the type system to handle the possibility of
nil in some way. Thanks to the magic of occurrence typing, a conditional check against
nil is enough to satisfy the checker, and pluck out the
String from the union.
;; let `foo` be a variable of type (Option String) (if foo (...within this branch, `foo` has type String since all Strings are truthy) (...but in the else-branch, occurences of `foo` have type nil))
core.typed is examining your conditional logic and deducing what types can exist in each branch. In the above, if
foo were a
String instance, then the
if would consider it truthy and evaluate the first branch, so the checker is free to assume that it’s not
nil for that branch.
You can even just
(assert foo), and the
nil will be removed from the type union in subsequent occurrences of
foo, since the rest of the function would be unreachable if
nil. (But then of course you might get a runtime assertion exception.)
Occurence typing works for other types besides
nil. Make a conditional check with a runtime predicate like
keyword?, etc. and core.typed will narrow the types in your branches. This is a really elegant mechanism because you don’t have to do any extra annotations or write code with some kind of foreign smell – it just makes you do a runtime check that you should really be doing anyway.
It’s time for our first function.
name is a standard Clojure function to give you the name of something. It’s normally used to turn a keyword or symbol into a string. If you pass in a
String, you get the same
String back, making the function a no-op.
=> (name :foo) "foo" => (name "bar") "bar" => (name nil) NullPointerException ; d'oh!
The notation for function types resembles many other type systems, with the
-> arrow indicating the return type, though in Clojure it must be surrounded in square brackets.
(ann clojure.core/name [(U String Named) -> String])
This type says that
name can take a value that is either a
String or a
Named, and always gives you a non-nil String. The type
clojure.lang.Named, a simple Java interface implemented by Clojure objects that have a concept of a namespaced name. We can seamlessly mix types originating in Java into our Clojure-based types, so in a sense, Clojure’s new type system is a superset of Java’s. Rules about interfaces and base classes are exactly what you’d expect.
str function is a bit more interesting because it’s variadic. You can give it any number of arguments, of any type, and it will turn them all into strings and return their concatenation. It has the following type:
(ann clojure.core/str [Any * -> String])
Any type will match any other type. That is, all types are subtypes of
nil. You could also think of it as the union of all possible types. (Thankfully, it’s not really implemented that way, as infinite sets tend to be a bit expensive on RAM.)
The asterisk here means that our function takes an arbitrary number of arguments, each of type
Any. It often coincides with usage of the rest-parameter operator (
&) in Clojure argument destructuring.
iterate takes a function, an initial value, and produces a lazy sequence of results from successively applying that function.
=> (iterate inc 0) (0 1 2 3 ...)
It’s a higher order function, as it takes another function as an argument. It’s also polymorphic, since
iterate doesn’t care about the particular type of the collection elements it deals with.
(ann clojure.core/iterate (All [x] [[x -> x] x -> (LazySeq x)]))
(All [x]) form is introducing a type variable. This is similar to type parameters for generics in Java and C#, for example the
So the type above says that for any given type
iterate accepts a function
[x -> x] and an initial value also of type
x. It then produces a lazy sequence containing values of the same type
nth accesses a member of a collection by index. It’s our first example of a function with multiple type clauses.
(ann clojure.core/nth (All [x y] (Fn [(Indexed x) AnyInteger -> x] [(Indexed x) AnyInteger y -> (U x y)]))
Functions with multiple clauses are introduced with
(Fn). The clauses are tried in the given order, so you usualy want the most specific types first, and a catch-all general type last. Here, though, we’re simply matching on two different arities. The first case covers invocations that only supply an
Indexed collection and an index value. The second clause handles an extra parameter returned as a default value if the index is not found.
nth allows the default value to be of a different type (
y) than that of the collection elements (
x), and this is also reflected in the return type of
(U x y). Often the default value will be of the same type as
x, so the union
(U x x) will collapse into simply
x. That extra degree of freedom might seem like an odd design decision if you were designing this function with static types in mind, but the Clojure core libraries and cultural idioms were developed in a dynamically typed environment. Typed Clojure takes idiomatic Clojure code as given, and tries to provide as much safety as possible given that constraint.
The simple function
first, which extracts the first element of a sequence, has a really interesting signature.
(ann clojure.core/first (All [x] (Fn [(Option (EmptySeqable x)) -> nil] [(NonEmptySeqable x) -> x] [(Option (Seqable x)) -> (Option x)]))
We have three clauses in the type, even though they are all for the same arity.
The first case,
[(Option (EmptySeqable x)) -> nil], describes the fact that if we pass in a sequence that we know statically is empty, then we will always get a nil output. The
Option means that this case also covers static
[(NonEmptySeqable x) -> x] guarantees that a statically non-empty input sequence will always give us a non-
Finally, we have the general case of
[(Option (Seqable x)) -> (Option x)], which handles normal operation of
first when we don’t have any static information about the argument. You can give it nil or a possibly-empty sequence, and it will give you
nil or a value. In this case, the type system will force us to check against
nil, which is what we want.
The type for
second is even more interesting than that of
(ann clojure.core/second (All [x] (Fn [(Option (I (Seqable x) (CountRange 0 1))) -> nil] [(I (Seqable x) (CountRange 2)) -> x] [(Option (Seqable x)) -> (Option x)]))
Let’s break down each clause here.
[(Option (I (Seqable x) (CountRange 0 1))) -> nil]
This clause is using a type intersection that matches any type that’s both
Seqable (that is, able to produce a
x-typed values) and statically known to contain fewer than 2 elements. As there is no second element to retrieve, you’re guaranteed to get a
nil result. And since the argument type is wrapped in an
Option, a statically-
nil input will always produce a
[(I (Seqable x) (CountRange 2)) -> x]
The second clause is the happy path, when we statically know that we have a non-
nil input collection that has at least 2 elements. For that, we always get our non-
nil result, and don’t have to do a
[(Option (Seqable x)) -> (Option x)]
Finally we have the catch-all case. If we don’t have any static knowledge about this collection, all we can say is that you’ll get either a
nil or your value of type
The last stop on the today’s type tour is the venerable lisp
map function. You can pass any number of sequences to
map, and they will all be passed element-wise to the supplied function. This calls for a type of a form we haven’t seen yet.
(ann clojure.core/map (All [c a b ...] (Fn [[a b ... b -> c] (NonEmptySeqable a) (NonEmptySeqable b) ... b -> (NonEmptyLazySeq c)] [[a b ... b -> c] (U nil (Seqable a)) (U nil (Seqable b)) ... b -> (LazySeq c)]))
This is our first example of using an unbounded number of type variables. The
... in the
All form is how we introduce these.
Each clause starts with
[a b ... b -> c], which describes a function of an arbitrary number of arguments, each of which may have a distinct type. The remaining variadic arguments, such as
(NonEmptySeqable a) (NonEmptySeqable b) ... b in the first clause, represent some number of arguments, with non-empty sequences carrying elements of the same types, in the same order, as the function passed as the first argument. That is, if you pass a function of type
[String AnyInteger Double -> Whatever], then you have to feed it exactly three collections containing values of the corresponding types. That’s a really nice piece of type-safety that I wouldn’t have expected to be able to express.
I’ll admit I find the syntax for specifying the collection arguments here a bit confusing. I suppose I might have expected to see the right side of the ellipsis to be
(NonEmptySeqable b) instead of merely
By now you can probably read these types well enough to discern that the first clause guarantees a non-empty result for non-empty inputs, and that the second clause is the general case for possibly-empty or possibly-nil input sequences.
I found these examples by spelunking through the core.typed source code. I’m nowhere near grokking it all yet, but there’s a lot of fairly accessible stuff in
base_env.clj, which is where I got all of the above types.
There are also some other interesting aspects of the type system I haven’t talked about, such as heterogeneous maps, filters, predicates, type functions, and variance. I am particularly fond of the heterogeneous maps.
As for theory, Typed Clojure borrows liberally from Typed Racket, and Ambrose has mentioned a few papers in particular:
- The Design and Implementation of Typed Scheme
- Practical Variable-Arity Polymorphism
- Logical Types for Untyped Languages
Support Typed Clojure!
There’s still some of time left in the Typed Clojure crowdfunding campaign. If you think this work is valuable, consider chipping in!comments powered by Disqus