Unix Commands, Pipes, and Processes
How to combine Unix commands with pipes and manage running programs
What makes the Unix command line so powerful is that the commands are designed to work well together. You can chain commands so that the output from one command becomes the input to another. Here is an example of doing just that.
Downloading pizza sales data and displaying three columns of it.
❯ RDATASETS=https://vincentarelbundock.github.io/Rdatasets/csv
❯ curl -s $RDATASETS/gt/pizzaplace.csv > pizzaplace.csv
❯ cat pizzaplace.csv | tail -10 | cut -d',' -f6-8 |
tr ',' '\t' | tr -d '"'
S classic 12
M supreme 16.5
L chicken 20.75
M chicken 16.75
S supreme 12.5
L veggie 17.95
S classic 12
M chicken 16.75
L veggie 20.25
S chicken 12.75
In this example, we strung together the commands cat, tail, cut and tr to produce the final output. We will cover each of these commands, and how they are strung together, in more detail later. My intention with this story is to give you some practical skills in working with data:
How Unix programs are plugged together with pipes
Hashing data for verification
Managing running processes
Downloading data and processing it
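As a preview of how the stages of the opening pipeline fit together, here is a minimal sketch that builds it up on two made-up rows, so nothing has to be downloaded. The file name sample.csv and the row contents are hypothetical; they only imitate a quoted CSV layout, and the real pizzaplace.csv columns may differ.

```shell
# Two made-up CSV rows (hypothetical data, not taken from pizzaplace.csv)
printf '%s\n' \
  '"1","a","2015-01-01","11:38:36","x","M","classic",13.25' \
  '"2","b","2015-01-01","11:57:40","y","L","veggie",20.75' > sample.csv

# Stage 1: tail keeps only the last row
tail -1 sample.csv

# Stages 1-2: cut splits on commas and keeps fields 6 to 8
tail -1 sample.csv | cut -d',' -f6-8

# Stages 1-4: tr turns commas into tabs, then a second tr deletes the quotes
final=$(tail -1 sample.csv | cut -d',' -f6-8 | tr ',' '\t' | tr -d '"')
echo "$final"
```

Running a pipeline one stage at a time like this is a handy way to see what each command contributes before adding the next one.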
Working with Unix Pipes
Much of the power of the Unix shell comes from the existence of pipes. To understand pipes, we need to understand the following I/O concepts:
stdin - Standard input is where a Unix process reads character input from. By default, this is your keyboard.
stdout - Standard output is where a Unix process prints characters. By default, this is the terminal window you launched the command from.
stderr - Standard error is similar to standard output, but is meant for error messages. It also defaults to the terminal window you ran the command from.
You can think of these as ports or plugs on every Unix process. Every time you launch a program in a Unix-like operating system such as Linux or macOS, you get a process with one input, stdin, and two outputs, stdout and stderr. We will demonstrate with the sha256sum command, which creates SHA256 hashes of input data. Hashes are useful for integrity checks on downloaded files.
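One way to convince yourself that stdout and stderr really are two separate outputs is to send each one to its own file. This is only a sketch: report is a hypothetical shell function written for the demonstration, and out.txt and err.txt are arbitrary file names.

```shell
# Hypothetical helper: writes one line to stdout and one line to stderr
report() {
  echo "result: 42"
  echo "warning: something went wrong" >&2
}

# > redirects stdout to a file, 2> redirects stderr to another file
report > out.txt 2> err.txt

cat out.txt   # holds only the line written to stdout
cat err.txt   # holds only the line written to stderr
```

Without the two redirections, both lines would land in the same terminal window, which is why the two outputs are easy to mistake for one.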
When using a terminal emulator such as Terminal.app on macOS or GNOME Terminal on Linux, your keyboard and terminal window are represented by a device file called /dev/tty. We will run sha256sum without any redirection first. You type input until you signal End-of-Transmission (EOT) with Ctrl-D. In this example, I typed "hello world", pressed Enter, and then pressed Ctrl-D.
❯ sha256sum
hello world
a948904f2f0f479b8f8197694b30184b0d2ed1c1cd2a1ec0fb85d299a192a447 -
Next, we will do as in the second illustration and redirect stdout to a file called checksum.txt.
❯ sha256sum > checksum.txt
hello world
❯ cat checksum.txt
a948904f2f0f479b8f8197694b30184b0d2ed1c1cd2a1ec0fb85d299a192a447 -
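Standard input can be redirected in the same way, using < instead of >. The sketch below, which assumes the hypothetical file names message.txt and message.checksum, feeds sha256sum from a file rather than the keyboard and should produce the same hash as typing "hello world" by hand.

```shell
# Put the message in a file; echo adds the trailing newline,
# just like pressing Enter did when typing the text
echo hello world > message.txt

# < connects stdin to message.txt, > connects stdout to message.checksum
sha256sum < message.txt > message.checksum

cat message.checksum
```

Because sha256sum still reads from stdin here, the file name column shows - just as it did when typing the input.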
Hashes are useful for tons of things. We could use this as a primitive password checking mechanism. In this case, you would have to pretend that "hello world" is a secret password nobody may know. How do you check if somebody knows the password? You can do that with the --check switch, which instructs sha256sum to verify that its input hashes to the same value as the one stored in a file.
❯ sha256sum --check checksum.txt
hello
-: FAILED
sha256sum: WARNING: 1 computed checksum did NOT match
❯ sha256sum --check checksum.txt
hello world
-: OK
Notice that when I didn't type the full text correctly, the check failed. This mechanism is used to verify downloaded files. Imagine you downloaded many gigabytes of data in a zip archive. You want to make sure that there were no errors in the transmission. The site offering the download can provide a checksum file that you verify against.
Typing the input to the sha256sum command by hand is cumbersome. With pipes, we can replace the keyboard as input with another process. In this example, we use the echo command to provide input.
❯ echo foobar | sha256sum
aec070645fe53ee3b3763059376134f058cc337247c978add178b6ccdfb0019f -
Using the pipe symbol |, we conceptually connect two running shell commands: the stdout of the command on the left is plugged into the stdin of the command on the right.
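Since a pipe just plugs one stdout into the next stdin, a pipeline is not limited to two commands. As a small sketch, the echo and sha256sum pair can be extended with cut to keep only the hash and drop the trailing - column. The variable name hash is chosen only for illustration.

```shell
# Stage 1: echo writes "foobar" plus a newline to stdout
# Stage 2: sha256sum hashes whatever arrives on its stdin
# Stage 3: cut keeps the first space-separated field, which is the hash
hash=$(echo foobar | sha256sum | cut -d' ' -f1)
echo "$hash"
```

This kind of trimming is useful when you want to compare two hashes as plain strings.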
We can emulate what somebody uploading a file with a checksum would do by creating a file with the text "hello world", creating a SHA256 hash of it, and then verifying the checksum later. We can then modify the file to "hello mars" to demonstrate that the SHA256 hash catches the change and alerts the user.
# Create text file with "hello world" message
❯ echo hello world > upload.txt
❯ cat upload.txt | sha256sum > upload.checksum
❯ cat upload.txt | sha256sum --check upload.checksum
-: OK
Let us modify the upload.txt file and check whether verifying the SHA256 hash fails.
❯ echo hello mars > upload.txt
❯ cat upload.txt | sha256sum --check upload.checksum
-: FAILED
sha256sum: WARNING: 1 computed checksum did NOT match
In these sha256sum examples, I have deliberately done everything in a clunky manner to demonstrate the use of pipes and redirection. There are much simpler ways to hash and verify: sha256sum accepts file names directly and records each name in the checksum file in place of the - placeholder.
❯ sha256sum upload.txt > upload.checksum
❯ cat upload.checksum
53d58e94e61b1c2a641dc52b402729f76c3832978e37f7f31ad1286ae32a796e upload.txt
❯ sha256sum --check upload.checksum
upload.txt: OK
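The file-based form also covers several files at once, which is closer to what a download site would publish. The following is a sketch under a few assumptions: it uses GNU coreutils sha256sum, the file names a.txt, b.txt and SHA256SUMS are hypothetical, and --status is the GNU option that prints nothing and reports the outcome only through the exit code.

```shell
# Work in a scratch directory so no existing files are touched
cd "$(mktemp -d)"

echo hello world > a.txt
echo hello mars > b.txt

# Record the hashes of both files in one checksum file
sha256sum a.txt b.txt > SHA256SUMS

# Verify both files; prints one line per file
sha256sum --check SHA256SUMS

# Tamper with one file, then verify again using only the exit code
echo hello venus > b.txt
if sha256sum --check --status SHA256SUMS; then
  result=intact
else
  result=changed
fi
echo "$result"
```

Checking the exit code instead of reading the printed lines makes the verification easy to use from a script.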