Writing Calcutron-33 Assembly Code
The practice of writing and testing assembly code
If you have read my previous stories on Calcutron-33 assembly programming, you will have an idea of what instructions exist and what they do. However, that doesn't mean you have a good sense of how to write assembly programs. Here I will give you an intro to using the Calcutron-33 tools to write and run programs.
Installing Calcutron-33 Assembler and Simulator
Go to my Calcutron-33 repository and download one of the binary releases for your operating system. The downloaded zip file contains an executable named cutron (or cutron.exe on Windows). To use the application, place this executable in a directory that is on your PATH. A simple solution is to place cutron in $HOME/bin or /usr/local/bin. If you are unfamiliar with setting the PATH environment variable to point to a directory containing the cutron executable, I advise you to read my Unix Command Line Crash Course, which covers this and more. If cutron has been correctly installed, you should be able to run it from a terminal window and get the following result:
Installing Example Code
Downloading the cutron executable doesn't give you any example code to run. You can find my example code in the examples/ subdirectory of the Calcutron-33 repository. If you want it locally, you can download the repository or download the source code from the latest release. Each release has Source code (zip) as the second-to-last entry.
Writing and Running Your First Program
One of the simplest programs we can imagine reads pairs of numbers from input, adds each pair and prints out the result. You can supply the inputs from your keyboard or from a file. The INP instruction reads one number from the input and stores it in a register. We read two inputs and store them in registers x1 and x2 respectively. Next, the ADD instruction adds these two numbers and stores the sum in the x3 register. Then the OUT instruction prints the contents of the x3 register. With the JMP instruction, we jump back to the beginning. Here is the program:
loop:
INP x1
INP x2
ADD x3, x1, x2
OUT x3
JMP loop
HLT
You might ask whether the program ever stops. Will it not simply run forever? It would, if not for the fact that Calcutron programs terminate whenever they hit an INP instruction and there is no more input data available.
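To make the termination rule concrete, here is a minimal Python sketch of what the adder program does. It is only a model of the behavior described above, not how the simulator is implemented; the function name add_pairs is made up for illustration, and register overflow is ignored.

```python
def add_pairs(inputs):
    """Mimic the adder loop: read two numbers, output their sum, repeat."""
    queue = list(inputs)
    outputs = []
    while True:
        if not queue:              # first INP finds no input: the program stops
            break
        x1 = queue.pop(0)          # INP x1
        if not queue:              # second INP finds no input: the program stops
            break
        x2 = queue.pop(0)          # INP x2
        outputs.append(x1 + x2)    # ADD x3, x1, x2 followed by OUT x3
    return outputs                 # the JMP is the enclosing while loop


print(add_pairs([3, 2, 12, 34, 42, 100]))
```

The loop never reaches anything after the jump, which is why the HLT at the end of the assembly program is never executed.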
To be able to run this program, we first have to assemble it, meaning we turn it into machine code. Assume you stored the program in the file adder.ct33. You can run the assemble or asm subcommand to produce machine code.
❯ cutron assemble adder.ct33
5109
5209
1312
7309
8000
0000
Since we want to save the machine code, it is better to redirect the output to a file:
❯ cutron assemble adder.ct33 > adder.machine
Now that you have an assembled program to run, you need to supply it with some input data to process. Create a file named inputs.txt (it can be named anything) and write the following numbers:
3
2
12
34
42
100
Again, you can write whatever numbers you like, but it is easier to follow along with my examples if you use the same filenames and numbers as me.
To run the program, you can use the sim or run subcommand. run is just an alias for sim, which is short for simulator.
❯ cutron sim adder.machine < inputs.txt
5
46
142
You may notice that each output is the sum of one pair of input numbers. The simulator reads input from standard input (stdin), so you can provide inputs in numerous ways. In this example, we use the echo command to pipe numbers into the simulator:
❯ echo 2 6 2 3 | cutron run adder.machine
8
5
You could even run it with no inputs. The simulator will ask you to type input numbers if it hits an INP instruction and discovers that no input has been supplied yet.
❯ cutron run adder.machine
5
4
3
2
9
5
You end your input by hitting the return key twice. A blank line indicates we are done with inputting numbers.
Next are some programming challenges to work through. Keep my article detailing all the Calcutron-33 instructions handy in case you need to look up how a particular instruction works.
Doubler Challenge
Based on the first example code, we can make a simple challenge: Write a program which doubles whatever input you give it. Here is an example of assembling and running the doubler:
❯ cutron asm doubler.ct33 > doubler.machine
❯ echo 2 3 9 8 | cutron run doubler.machine
4
6
18
16
If you cannot solve the challenge, have a look at my solution.
Equalizer Challenge
The next challenge requires using a new instruction, BEQ (branch if equal). In this case, you are not adding each pair of numbers but comparing them. If the numbers are equal, you write out the number; otherwise, you discard the pair. Here is an example of running the equalizer:
❯ cutron asm equalizer.ct33 > equalizer.machine
❯ echo 2 3 5 5 4 8 2 2 | cutron run equalizer.machine
5
2
If you want to avoid looking at the solution, I can give you some hints:
- Set up your BEQ comparison so that you skip over a jump back to the beginning of the program.
- Write a label, for instance equal:, on a line before the code you wish to run in case the two inputs are equal.
- If the inputs are equal, write out either one of the registers with OUT. Both OUT x1 and OUT x2 will work equally well.
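If you want something to compare your program's output against, the expected behavior can be written down in a few lines of Python. This is only a reference for the outputs, not a translation of the assembly solution; the function name equalizer is chosen here for illustration.

```python
def equalizer(inputs):
    """Return the value of each input pair whose two numbers are equal."""
    # Pair up inputs as (1st, 2nd), (3rd, 4th), ...; a trailing odd input is ignored.
    pairs = zip(inputs[0::2], inputs[1::2])
    return [first for first, second in pairs if first == second]


print(equalizer([2, 3, 5, 5, 4, 8, 2, 2]))
```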
Maximizer Challenge
This challenge is a bit trickier because, unlike the equalizer challenge, you are not deciding whether or not to write out a number. Instead, you have to choose between two different code paths to write out whichever number in a pair is the larger one. Furthermore, this time we are using the BGT instruction rather than BEQ to compare numbers.
❯ cutron asm maximizer.ct33 > maximizer.machine
❯ echo 1 2 3 4 5 6 | cutron run maximizer.machine
2
4
6
If you would rather not look at my solution, I can give some hints. Just read as many hints as you need to come up with a solution:
- Set up BGT so that it jumps over the code that handles writing out one of the inputs. The code it jumps to should take care of writing out the other input and jumping back to the start.
- To structure your code, it can help to place labels that you don't jump to but that clarify what a section of the code does. Try, for instance, using the labels first and second for the code that treats the first or the second input as the largest.
Countdown Challenge
Read a single number from input and then output every number down to 1. Thus, if you input 3, you should output 3, 2 and 1. Here is an example of running the countdown program.
❯ echo 5 | cutron run examples/countdown.machine
5
4
3
2
1
To solve this problem, it is useful to use the DEC instruction, which decrements a register value by one each time it is executed. You can find the solution here.
The general idea of the program is:
1. Get the number to count down from.
2. Write out the number.
3. Decrement the number.
4. If the number is greater than zero, go to step 2.
You can choose whether you want to support countdowns for multiple numbers or not.
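The general idea above can be sketched in Python to show the control flow before you translate it to assembly. This is a sketch of the four steps for a single input number, with a made-up function name; note that, like the steps, it writes out the number before testing it.

```python
def countdown(start):
    """Follow the four steps: get, write out, decrement, loop while positive."""
    outputs = []
    number = start                 # step 1: get the number to count down from
    while True:
        outputs.append(number)     # step 2: write out the number
        number -= 1                # step 3: decrement (DEC in assembly)
        if not number > 0:         # step 4: go back to step 2 while number > 0
            break
    return outputs


print(countdown(5))
```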
Using the Debugger
The Calcutron debugger lets you step through code one instruction at a time and look at the contents of registers or memory. You can also alter the contents of a register or memory location at any time. It is also possible to run individual assembly code instructions or look at their machine code representation. The debugger is also a good way of learning what base instructions the pseudo instructions translate into.
To invoke the debugger, you use the debug subcommand, which has the alias dbg. You can specify a program to load on the command line. The debugger takes both assembly code in .ct33 files and machine code in .machine files as input. Assembly code files get assembled to machine code when loaded. A benefit of loading assembly code is that the debugger retains the labels.
You can use the help command inside the debugger to get a list of all available commands.
❯ cutron debug examples/maximizer.ct33
caluctron> help
help
inputs
outputs
list
load
status
next
run
print
set
asm
reset
memory
You can go further and use the help command to ask for help about a specific command. Consider this example: we have loaded the maximizer program. It has the following source code:
loop:
INP x1
INP x2
BGT x1, x2, first
second:
OUT x2
JMP loop
first:
OUT x1
JMP loop
HLT
We want to look at how this program gets assembled by the debugger. Let us look at what the help system can tell us about the list command:
caluctron> help list
NAME
list -- list code of loaded program
SYNOPSIS
list
DESCRIPTION
disassembles code in memory and shows it
That functionality is precisely what we need, so let's run the list command to see what our maximizer gets assembled to and what the resulting disassembly looks like.
Let's unpack what we are looking at and why the code we see does not match the original source code we wrote. The first column of yellow numbers gives the address of each instruction in memory. You can look up an individual memory location with the memory command. For instance, let us look up the contents of memory at address 02, which is where the branch instruction is stored:
caluctron> memory 02
02 9123
The first part shown is the address, and the second part, 9123, is the number stored at that address. This number is the machine code for the instruction BGT x1, x2, 3. You can see how the instruction BGT x1, x2, first got translated: the label first is replaced with the number 3. It is a relative address, because the OUT x1 instruction at that label is three instructions forward from the branch. However, if you look at the disassembled code, you will see STOR x1, x0, -1 instead of OUT x1. Why? Where did the OUT instruction go? If you look at the instruction-set overview, you will notice that OUT is a pseudo instruction.
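How a label turns into the relative offset 3 can be illustrated with a small Python sketch. It assumes every instruction occupies exactly one memory address and that a label refers to the address of the next instruction, which matches the addresses 00 to 07 in the listings of this article. The helper name label_addresses is made up, and this is not the assembler's actual code.

```python
SOURCE = """
loop:
    INP x1
    INP x2
    BGT x1, x2, first
second:
    OUT x2
    JMP loop
first:
    OUT x1
    JMP loop
    HLT
"""


def label_addresses(text):
    """Map each label to the address of the instruction that follows it."""
    labels = {}
    address = 0
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = address   # a label takes up no memory itself
        else:
            address += 1                  # assumption: one address per instruction
    return labels


labels = label_addresses(SOURCE)
branch_address = 2                        # BGT is the third instruction, at address 02
offset = labels["first"] - branch_address
print(labels)
print(offset)
```

With the BGT stored at address 02 and the label first at address 05, the difference is the 3 that ends up as the last digit of 9123.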
You can quickly check what pseudo instructions translate to in the debugger by using the asm command:
caluctron> asm ADD x4, x3, x1
1431 ADD x4, x3, x1
caluctron> asm OUT x1
7109 STOR x1, x0, -1
Writing to output on the Calcutron-33 processor is done the same way as on many architectures, such as Motorola 68000 and RISC-V, which both deal with I/O by reading from and writing to specific addresses. In these cases, addresses are mapped to registers on I/O devices rather than actual main memory. When doing sign extension of -1 to four digits, we get the address 9999. If it is unclear how that works, review my story on representing negative numbers on a computer.
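The sign extension can be worked through in Python as a small sketch. It assumes the usual ten's complement convention, where the upper half of the digit range (5 to 9 for a single digit) stands for negative numbers; the function names encode and decode are made up for this example.

```python
def encode(value, digits):
    """Ten's complement encoding of value using the given number of decimal digits."""
    return value % 10**digits


def decode(code, digits):
    """Interpret a ten's complement code as a signed number."""
    half = 10**digits // 2             # assumption: codes from half and up are negative
    return code - 10**digits if code >= half else code


offset = decode(9, 1)                  # the last digit of 7109 read as a signed value
address = encode(offset, 4)            # the same value widened to four digits
print(offset, address)
```

The single digit 9 and the four-digit number 9999 both represent -1, so extending the operand of STOR x1, x0, -1 to a full address gives 9999.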
On Calcutron-33 I am using the high address numbers for I/O, as they are unlikely to be used for storing a program. You will see that the INP instruction reads from the same memory address:
caluctron> asm INP x2
5209 LOAD x2, x0, -1
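Judging from the listings in this article, each four-digit machine word appears to consist of one opcode digit followed by three operand digits. The Python sketch below splits a word along those lines. The opcode table only contains the digits that show up in this article and is inferred from its listings, so treat it as an assumption rather than the full instruction set.

```python
# Opcode digits inferred from the listings in this article (not a complete table).
OPCODES = {"0": "HLT", "1": "ADD", "5": "LOAD", "7": "STOR", "8": "JMP", "9": "BGT"}


def split_word(word):
    """Split a four-digit machine word into (mnemonic, digit, digit, digit)."""
    opcode, a, b, c = word             # word is a string so leading zeros survive
    return OPCODES.get(opcode, "?"), int(a), int(b), int(c)


for word in ["5209", "1431", "9123", "7109"]:
    print(word, split_word(word))
```

The operand digits are shown raw here; a trailing 9, as in 5209 and 7109, is the ten's complement form of the -1 seen in the disassembly.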
Now that you have a sense of how the disassembly works, let us run the program. We use the run command to run the program in memory.
caluctron> run
00 5109 LOAD x1, x0, -1
10 2 11 8
01 5209 LOAD x2, x0, -1
02 9123 BGT x1, x2, 3
05 7109 STOR x1, x0, -1
06 8000 JMP x0, 0
00 5109 LOAD x1, x0, -1
01 5209 LOAD x2, x0, -1
02 9123 BGT x1, x2, 3
05 7109 STOR x1, x0, -1
06 8000 JMP x0, 0
00 5109 LOAD x1, x0, -1
PC: 00 Steps: 10
x1: 0011, x4: 0000, x7: 0000
x2: 0008, x5: 0000, x8: 0000
x3: 0000, x6: 0000, x9: 0000
Inputs: 10, 2, 11, 8
Outputs: 10, 11
If you try this, you will notice that the program blocks upon running the first LOAD instruction because it tries to read from input, and we have supplied no inputs. I type some input numbers, 10 2 11 8, and hit return twice to continue execution. As an alternative, you could set the input directly with the inputs command like this:
caluctron> inputs 10 2 11 8
But let us step back and look at the output. What is the simulator showing us? It shows every instruction executed and the address of that instruction. You will notice that the JMP instruction causes the address of the next instruction executed to change. When execution is done, you see an output of the status of the Calcutron-33 computer. You can see the program counter (PC), the number of instructions executed (Steps) and the contents of every register from x1 to x9. We can also see what outputs got produced.
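To see why the trace visits the addresses in that order, here is a heavily simplified Python model of the run. It is not the real simulator: LOAD and STOR only model the input and output address, the tuple encoding of instructions is made up for this sketch, and stopping when a LOAD finds no input is modeled after the behavior described earlier.

```python
# The disassembled maximizer, one tuple per memory address.
PROGRAM = [
    ("LOAD", 1),        # 00  INP x1
    ("LOAD", 2),        # 01  INP x2
    ("BGT", 1, 2, 3),   # 02  jump 3 forward if x1 > x2
    ("STOR", 2),        # 03  OUT x2
    ("JMP", 0),         # 04  back to address 00
    ("STOR", 1),        # 05  OUT x1
    ("JMP", 0),         # 06  back to address 00
    ("HLT",),           # 07
]


def run(program, inputs):
    """Execute the program and return (pc, trace of addresses, outputs, registers)."""
    regs = [0] * 10
    queue = list(inputs)
    outputs, trace = [], []
    pc = 0
    while True:
        op, *args = program[pc]
        if op == "HLT" or (op == "LOAD" and not queue):
            break                               # halt, or input has run out
        trace.append(pc)                        # one entry per executed instruction
        if op == "LOAD":
            regs[args[0]] = queue.pop(0)
            pc += 1
        elif op == "STOR":
            outputs.append(regs[args[0]])
            pc += 1
        elif op == "BGT":
            a, b, offset = args
            pc += offset if regs[a] > regs[b] else 1
        elif op == "JMP":
            pc = args[0]
    return pc, trace, outputs, regs


pc, trace, outputs, regs = run(PROGRAM, [10, 2, 11, 8])
print(pc, len(trace), trace, outputs, regs[1], regs[2])
```

With the inputs 10 2 11 8, the model ends with the program counter at 00 after 10 steps, x1 holding 11, x2 holding 8 and the outputs 10 and 11, which agrees with the status shown above.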
Stepping through a program
To gain a better understanding of how a program works, it can be useful to step through it one instruction at a time. To do that, we first need to reset our program using the reset command. The next command then executes one instruction at a time.
caluctron> next
00 5109 LOAD x1, x0, -1
x1: 0010
caluctron> next
01 5209 LOAD x2, x0, -1
x2: 0002
caluctron> next
02 9123 BGT x1, x2, 3
PC: 05 Steps: 3
You will notice that next attempts to show you useful status information upon completing each instruction. For instance, it tends to show the registers which got modified. The branch instruction BGT modifies the program counter (PC), so we get to see its new value. At any time, you can look at a specific register with the print command or get a full overview with the status command:
caluctron> print x1
x1: 10
caluctron> print pc
PC: 05
caluctron> status
PC: 05 Steps: 3
x1: 0010, x4: 0000, x7: 0000
x2: 0002, x5: 0000, x8: 0000
x3: 0000, x6: 0000, x9: 0000
Inputs: 10, 2, 11, 8
Outputs:
You could even run an arbitrary instruction. Let us say you want to see what happens if you run the instruction ADDI x1, 30:
caluctron> run ADDI x1, 30
caluctron> print x1
x1: 40
Since the x1 register already contained 10, it is changed to 40 after this addition.
NOTE: You don't have to write out each command fully; the Calcutron-33 debugger supports command completion. You can use the up-arrow to get back a previously issued command, or use the tab key to complete a command you have started typing. To type "next", it is enough to write "n" and press the tab key.
Modifying a program
Machine code is just data in memory, so you can change it like any other data. Say you want the second input instruction to be INP x3 instead. You can use the asm command to figure out what the machine code is and then use the memory command to change the contents of address 01.
caluctron> asm INP x3
5309 LOAD x3, x0, -1
caluctron> memory 01 5309
caluctron> list
00 5109 LOAD x1, x0, -1
01 5309 LOAD x3, x0, -1
02 9123 BGT x1, x2, 3
03 7209 STOR x2, x0, -1
04 8000 JMP x0, 0
05 7109 STOR x1, x0, -1
06 8000 JMP x0, 0
07 0000 HLT
Modifying Registers
We can set registers directly and then, for instance, run an instruction to add the registers together. Next, we can use the print command to look at the contents of the register holding the result.
❯ cutron debug
caluctron> set x1 3
x1: 3
caluctron> set x2 4
x2: 4
caluctron> run ADD x3, x2, x1
caluctron> print x3
x3: 7
Final Challenge
Now that you know the debugger, I will propose a final programming challenge. It is called the digit exploder because it picks out the individual digits of a larger number. Why would something like that be useful, you might ask? It is a useful stepping stone towards writing a program that performs fast multiplication (I will cover this in future stories). Here are a few examples of running the digit exploder program:
❯ echo 2385 | cutron run examples/digitexploder.machine
2
3
8
5
❯ echo 4321 | cutron run examples/digitexploder.machine
4
3
2
1
Before looking at my solution, I will give you some hints:
- Play around with the shift commands LSH and RSH in the debugger and see if you can figure out how they can help you get hold of an individual digit.
- Use a conditional branch with BGT to run a loop that picks off one digit at a time.
- If you don't understand the solution I wrote, try to step through it in the debugger to see what it does.
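To check the output of your digit exploder against expected results, a reference in Python can help. This sketch uses division and remainder by ten, which is not necessarily how an assembly solution built on shift instructions works; the function name explode_digits is chosen here for illustration.

```python
def explode_digits(number):
    """Return the decimal digits of a non-negative number, leading digit first."""
    digits = []
    while True:
        number, digit = divmod(number, 10)   # peel off the last digit
        digits.append(digit)
        if number == 0:
            break
    return digits[::-1]                      # digits were collected last to first


print(explode_digits(2385))
print(explode_digits(4321))
```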
There are a lot more cool and interesting programs you can write with Calcutron-33. The examples/ directory contains many more examples to look at. Expect me to write other challenges in the future.
One of my longer-term goals is to see if I can demonstrate how viruses work with Calcutron-33 assembly. Programming a virus was one of the more challenging things I did when I was learning assembly programming. I am not encouraging anyone to spread a virus, but learning how they work in a safe environment is quite educational in my view, in particular because writing a virus lets you deal with aspects of programming which are often unique to assembly programming.