Notes on The Unix Programming Environment

This document provides notes and comments, as well as my solutions to exercises found in The Unix Programming Environment by Brian W. Kernighan and Rob Pike.

Most Unix systems today fall into a few categories. There are the classic BSD systems, such as OpenBSD, NetBSD, FreeBSD, and DragonFly BSD. There are also the POSIX-compliant, trademarked UNIX® systems, which include a couple of Linux systems, HP-UX, Solaris, AIX, and macOS. There are also many different Linux systems, including commercial systems such as Red Hat Enterprise Linux, Ubuntu, and SUSE, as well as other clones such as Minix. These notes mostly work through the exercises on Slackware64 as installed on a Dell Precision T5400.

A reference to the Unix Programmer's Manual refers to two basic volumes: the man pages and the articles. In the book, this means the manual for the Seventh Edition UNIX system from Bell Labs. The Tenth Edition was the final research release. System III and System V from AT&T (later Unix System Laboratories) were the commercial branch of Unix. For Slackware64, this basically means the man pages provided with the system, and perhaps the POSIX and LSB standard documentation.

Chapter 1

Pg. 3 uses mail. The POSIX standard now specifies mailx(1) (though mail will likely work). This is described in a later section of the book. Recognize that it refers to users emailing each other on the same system, assuming multiple users are logged in to that system at the same time.

RETURN in the book refers to the key that generates the carriage return control character. PC keyboards usually label it Enter now, but it's the same key. This is a carryover from the typewriter and teletype era. Different teletypes behaved differently (a teletype typically used paper tape for input and output, and on some systems punched cards were used for input).

Unix comes from the time-sharing era, when dumb terminals began to replace tape and a secondary store (storage on tape or, later, hard drives, instead of RAM) was used to store files. A run-time system (i.e. the OS) was booted first. Originally, every program you wrote had to include the bootstrap code to load the program.

Every teletype interface had different characters marking a newline, e.g. CRLF, LFCR, CR, LF, RR. Typically, a teletype had to be told to return the print carriage (the carriage return, CR), then to advance the paper one line (the line feed, LF). Unix used only the line feed (LF) in its files, since it was much easier to receive and process a single character on the stream. Whatever characters the teletype needed were sent when the system encountered the line feed in a file.
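
As a small illustration of the single line feed convention, the sketch below dumps the bytes of one line with od(1). It assumes a POSIX od and printf, and show_bytes is a name made up for this example, not something from the book.

```shell
# show_bytes: print each byte of standard input as a visible token.
# (show_bytes is a hypothetical helper name used only in this sketch.)
show_bytes() {
    od -An -c
}

# A Unix text line ends with a single line feed, which od shows as \n.
printf 'one\n' | show_bytes

# The same line ending in a carriage return plus line feed: \r then \n.
printf 'one\r\n' | show_bytes
```

The first dump should show only \n after the letters, which is all a Unix file stores. Sending whatever the terminal needs is left to the terminal driver.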

Dot matrix, bubble jet, and laser printers of today typically have their own language, but some still honor the control characters of ASCII (others like EBCDIC are rarely seen outside IBM mainframes) and ISO-8859 (1-15) or Windows-1252. Modern systems mostly use UTF-8. Some systems, such as Windows, also use UTF-16 (little-endian), and UTF-32 is in use as well.

As a final note, Backspace doesn't delete, and Delete only acts on the character it is on. In effect, the Backspace key means: send the backspace character, then send the delete character. As the book notes, the Delete key may sometimes do things other than delete a character.

Mistakes in typing

Towards the end of section 1.1, "Mistakes in typing" describes shell behavior that is likely to confuse modern users familiar with vi-mode and Emacs-mode command-line editing. Being able to backspace and delete something previously typed is a concept foreign to the first decade of Unix. Bill Joy distributed a tape called the Berkeley Software Distribution (BSD) that included a visual editor (ex and vi) and the C shell (csh), which introduced the idea of being able to edit the command line and correct mistakes. The GNU Project's Bash, which is the default shell on many Unix-like systems today (though older ones still use the Korn shell), defaults to Emacs mode. This provides not only command editing and correction, but also command completion and history paging.

To get the old-fashioned behavior in Bash, use Ctrl-u instead of the @ character for line kill. To erase the previous character, type Ctrl-\? (hold down both the Control/Ctrl key and the backslash key, then type the key with the question mark; the Shift key is not necessary here). However, to accomplish Exercise 1-1, these must be remapped to the @ and # characters, respectively. Typing stty -a, using the GNU Coreutils stty command, will show the current settings. An initial start is as follows:

$ stty erase \#Enter
$ stty kill @Enter

Future examples will not show Enter, but will assume that the printed newline is sufficient. Unfortunately, this setup does not honor the \ character as an escape, so the example's \@ will still delete the line. The following is what the exercise was expected to look like:

$ date@

Notice that the trailing $ is not present, so the book does not expect you to press the Enter key, but instead expects the \@ part of your typing not to print the \ part. As the book indicates, this will typically still put you on a new line with a prompt. If you got command not found, then you pressed Enter after typing date\@, which makes sense: there's no date\@ command, only date, and the @ character is not being interpreted as a line kill, nor is the \ escaping it.

For now, that's as far as I got. With some more persistence, I'm sure I could identify why the \ character is not escaping the control characters defined by the stty command, and also work out an easy way to set everything back the way it was. For now, you may have to log out and then back in again, or use the stty command to configure the old behavior. So with Bash on GNU/Linux and macOS, use Backspace instead of # (as the book suggests you test), and Ctrl-u instead of @.
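
One possible way back without logging out, offered only as a sketch: stty -g prints the current settings in a form that stty can read back later. The names saved_tty, book_keys_on, and book_keys_off are made up for this example, and it assumes a POSIX stty running on a real terminal.

```shell
# Hypothetical helpers (not from the book): save the terminal settings,
# switch to the book's erase and kill characters, and restore them later.
saved_tty=""

book_keys_on() {
    [ -t 0 ] || { echo "not a terminal"; return 1; }
    saved_tty=$(stty -g)       # settings in a form stty can read back
    stty erase '#' kill '@'    # the book's erase and kill characters
}

book_keys_off() {
    [ -n "$saved_tty" ] || { echo "nothing saved"; return 1; }
    stty "$saved_tty"          # put everything back the way it was
    saved_tty=""
}
```

Define both functions in the interactive shell, run book_keys_on, try the exercise, then run book_keys_off. If the terminal still seems confused, stty sane (available in the GNU Coreutils and BSD stty commands) resets it to reasonable defaults.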

Exercise 1-2 has four commands. The first types the date command with the expected output. The second has a # that has nothing to delete before running the date command. The third escapes the # which becomes a comment. The fourth escapes the escape to a literal \, which then escapes the # to a comment (instead of a backspace+delete), and thus returns from Bash a command not found.
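
For comparison, here is a rough sketch of how a modern POSIX shell treats the last three lines when # has not been remapped as the erase character. It assumes default terminal settings and a sh on the PATH. The run_line helper is a made-up name, and results may differ from the description above when stty erase '#' is in effect.

```shell
# run_line: hand one command line to a fresh sh and report what happened.
# (run_line is a hypothetical helper name used only in this sketch.)
run_line() {
    if out=$(sh -c "$1" 2>&1); then
        printf 'ok: %s\n' "$out"
    else
        printf 'failed: %s\n' "$out"
    fi
}

run_line '#date'      # the whole line is a comment, so nothing runs
run_line '\#date'     # looks for a command literally named #date
run_line '\\#date'    # looks for a command literally named \#date
```

The first call should succeed with no output, while the other two should fail with the shell's own not found message, whose exact wording varies between shells.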

Stopping a program

This section may not make sense until you realize that slow serial connections using dumb terminals and dial-up modems are assumed here. Unless you are dealing with network or internet latency, the XON/XOFF behavior of the Break key, or of Ctrl-q and Ctrl-s, won't make sense.
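
To see whether XON/XOFF flow control (the Ctrl-s and Ctrl-q behavior) is enabled on a terminal, something like the following may help. It is only a sketch: it assumes stty -a lists the ixon flag, as the GNU Coreutils and BSD versions do, and flow_control_status is a made-up name.

```shell
# flow_control_status: report whether the terminal on standard input
# has XON/XOFF output flow control (the ixon flag) switched on.
# (flow_control_status is a hypothetical name used only in this sketch.)
flow_control_status() {
    if [ ! -t 0 ]; then
        echo "not a terminal"
        return 0
    fi
    # Flatten the stty -a report so each flag is surrounded by spaces.
    case " $(stty -a | tr '\n;' '  ') " in
        *" -ixon "*) echo "XON/XOFF flow control is off" ;;
        *" ixon "*)  echo "XON/XOFF flow control is on" ;;
        *)           echo "ixon not reported" ;;
    esac
}

flow_control_status
```

If it reports that flow control is on, Ctrl-s should pause terminal output and Ctrl-q should resume it.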

Writing to others

Typically, the Delete key does not break out of a write session; only Ctrl-d does.

The manual

To quit a manual page, type q. Use the space bar or Enter to move forward in a manual page, or b or Ctrl-b to move back. Search with the / character.

Computer aided instruction

The learn command is typically not found on Linux systems. Professor Kernighan (one of the book's authors) has a copy from research Unix at his Princeton mirror.

©2019 David Egan Evans.