What Makes Julia Unique?
A look at how multiple dispatch distinguishes Julia from all other programming languages
Multiple dispatch is the killer feature of the Julia programming language, but unfortunately, few developers have even heard about multiple dispatch. Fewer still know what it is or how it works. That isn't surprising given that very few languages support multiple dispatch and those who do tend to do a good job of tucking it away. Thus, before I can go on a rant about the awesomeness of multiple dispatch, I have to explain what exactly it is. I will give you a clue: It has to do with how functions are called, but let us take a step back to elaborate.
When a program runs and encounters a function call, it has to figure out what code to jump to and execute. In a simple procedural programming languages such as C or Pascal, this process is straightforward. Every function is implemented as a subroutine, located at a unique memory location. Calling the function simply entails jumping to the memory address of the subroutine and executing every instruction until the microprocessor hits a return instruction.
Things get a bit more tricky when dealing with a function pointer. The subroutine we jump to can change during runtime because code is allowed to change what subroutine address is stored in the function pointer. Why am I mentioning these details? Because I want to convey the idea that calling a function and determining what code to execute isn't always a trivial matter.
Want to learn more about Julia? Read my book: Julia as a Second Language
Consider the complexity of invoking a method in object-oriented programming.
warrior.attack(knight)
The member-function attack
doesn't correspond to a subroutine at a specific memory address. When the attack
method is invoked on the warrior
object, a complex process of determining which subroutine to jump to is kicked into gear. We have to determine what kind of warrior is performing the attack. One might imagine a type-hierarchy of different warriors, such as archers, pikemen, and knights.
Because an archer attacks different from a pikeman or knight, each type will implement the attack
method differently. Through a process called single dispatch, we determine which method to call. From a low-level perspective, we are trying to determine which subroutine to jump to when executing the warrior.attack(knight)
statement.
How single dispatch works depends on whether we are dealing with a dynamic or statically typed language. Let us focus on how it works in a dynamically typed language because we are going to compare this process to Julia, which is also a dynamically typed language.
Imagine we have warrior a
attacking a warrior b
. Our first step will be to determine what type a
is. In a dynamically typed language, every object knows what type they are. In Objective-C for instance, each object has a field called isa
pointing to a class object describing the type of that object. In the graphics below, we pretend that warrior a
is an instance of the Archer
class. The Archer
class contains function pointers for every implemented method. To locate the correct method, we do a dictionary lookup on the key "attack!".
The exclamation point at the end of several methods may look peculiar. Don't worry, it is just a naming convention which is popular in Lisp and Julia for mutating functions. It has no semantic meaning.
Strictly speaking, it would be wrong to talk about function pointers in most dynamic languages. In Ruby for instance you don't actually point to any subroutine with machine code but an abstract-syntax tree (AST) produced by parsing the method. The Ruby interpreter interprets the AST to run the code in the method.
What we have just discussed is called single dispatch because we decide what method to call based on a single object. The type of object b
doesn't influence the method lookup process in any way. With multiple dispatch, in contrast, every argument in a function call play a role in determining which method gets picked. I know that sounds funky, so let me give you a motivation for having multiple dispatch by explaining the problem with single dispatch.
What Problems Do Multiple Dispatch Solve?
Let us say that we have a battle!
function written in Julia code. It simulates a battle between two warriors a
and b
by calling the attack!
function and printing out messages to the user depending on the outcome. Most of the following code should be self-explanatory. In Julia, we use ::
to separate variable name from variable type. Thus, in the code example, a::Warrior
, tells Julia that the battle!
function has an argument named a
of type Warrior
.
function battle!(a::Warrior, b::Warrior)
attack!(a, b)
if a.health == 0 && b.health == 0
println(a.name, " and ", b.name, " destroyed each other")
elseif a.health == 0
println(b.name, " defeated ", a.name)
elseif b.health == 0
println(a.name, " defeated ", b.name)
else
println(b.name, " survived attack from ", a.name)
end
end
Look at the code and ask yourself this simple question: Would similar code work in C++ or Java? At first glance, it may seem possible. Both languages allow you to define multiple functions with the same name but different arguments. You could write something similar to the following Julia code:
function attack!(a::Archer, b::Archer)
if a.arrows > 0
shoot!(a)
damage = 6 + rand(1:6)
b.health = max(b.health - damage, 0)
end
a.health, b.health
end
function attack!(a::Archer, b::Knight)
if a.arrows > 0
shoot!(a)
damage = rand(1:6)
if b.mounted
damage += 3
end
b.health = max(b.health - damage, 0)
end
a.health, b.health
end
function attack!(a::Knight, b::Knight)
a.health = max(a.health - rand(1:6), 0)
b.health = max(b.health - rand(1:6), 0)
a.health, b.health
end
The specifics of the code aren't important. What I want you to takeaway from this code example is that we have defined attack!
three times. Each definition accepts arguments of different types. In C++ and Java, we would call this function overloading. At compile time, the compiler will pick the appropriate function to call by examining the type of each input argument at the call site.
Here is the kicker: A C++ compiler cannot possibly guess which attack!
function to call inside the battle!
function because it doesn't know the concrete type of argument a
and b
. All the compiler knows is that both arguments are some subtype of the Warrior
type. Which subtype can only be determined when the code is actually run. That is a bummer because function overloading only works at compile time.
In this case, multiple dispatch can do something neither single dispatchnor function overloading can do: It can select the right code based on the type of both argument a
and b
at runtime.
How Multiple Dispatch Works
Remember how single dispatch works by looking up the correct method at runtime? Multiple dispatch is also about picking the correct method. The attack!
definitions you just saw are not actually function definitions but method definitions. To define an attack!
function in Julia, you would write:
function attack! end
Why no arguments? Because Julia functions don't have arguments, Julia methods have arguments. Unlike object-oriented languages, methods in Julia are attached to functions, rather than classes.
Hence, function invocation in Julia is carried out by first looking up which function is called. On each function, Julia registers a method table. This table is searched from top to bottom to find a method accepting argument types matching the types of the input arguments provided at the function call site.
Julia is a Just-in-Time (JIT) compiled language, so the method source code turns into executable machine code in several steps:
When a Julia file is loaded into memory, the source code of every method is parsed and turned into an abstract-syntax tree (AST).
The AST of each method is stored in the correct method table for the correct function.
At runtime when a method gets located we first get the AST. The AST is turned into machine code by the JIT compiler and cached for later lookups.
The process is actually a lot more complex than I show here. You see, an abstract-syntax tree can be very generic. It could be a calculation defined for number arguments. The calculation performed would be the same whether the arguments are 16-bit unsigned integers or 32-bit signed integers. However, the assembly code for these cases will not look the same. Hence, the same AST can produce several machine code subroutines. Julia will add an entry for every case in the method table. Thus, a method table isn't limited to the number of methods you wrote source code for.
What Makes Julia Multiple Dispatch Unique
Keep reading with a 7-day free trial
Subscribe to Erik Explores to keep reading this post and get 7 days of free access to the full post archives.