Software Tools in Free Pascal

This document briefly explains how to implement the programs in the book Software Tools in Pascal[KP81] by Brian W. Kernighan and P. J. Plauger, a follow-up to their earlier Software Tools[KP76]. The modern version of this is the Bell Labs portability kit, distributed as Plan 9 from User Space, replacing the original research Unix kit. An implementation of the book was also done in Haskell: http://www.crsr.net/Programming_Languages/SoftwareTools/.

I worked through the original Software Tools some years ago using GNU Fortran, and f2c with GNU C. I was able to compile the bootstrap Ratfor of the tools tape on my custom GNU/Linux distribution, and more recently on the Windows 10 subsystem for Linux. The only difficulty was needing to rename the index function to hindex (the h being for Hollerith) to get it to compile in a Fortran 77 compatibility mode. The C version of Ratfor from research Unix might be tracked down from the Unix Heritage site, as well as one written by Oz from Stanford. However, it became clear to me that Extended Fortran (i.e. Fortran 90 and later) was a far more robust language, no longer needing Ratfor, and that referencing the original Software Tools while working through The C Programming Language[KR78] [KR88] offered more advantages. Otherwise, unless you are interested in Fortran or PL/1 specifically, working on the original Software Tools is no longer as educational as working on the elaborated and modernized Pascal version.

Getting Started

From time to time, Software Tools in Pascal requires that primitives be built for the programs to function as intended. These were provided with a toolstape, now available from Kernighan's Princeton page. (See plan9.io for a partial mirror of the old plan9/cm website.) This tape relies on primitive forms of the Unix ar and nroff commands, so it is only partly useful.

The first exercise asks you to become familiar with your compiler environment. It provides a complete program to compile, copyprog, which is similar to the copytext program given as an example in the User Manual and Report, Second Edition (pg. 164, or section 13 of the Report). It is provided on the toolstape as wholecopy.p. I first started with the p2c package, such as is found on Slackware. Using p2cc, I was unable to get copyprog working without error, presumably due to I/O bugs.

Because it is mentioned in the Appendix for the Whitesmith's Primitives, here's an example with the Amsterdam Compiler Kit:

ack -o copyprog wholecopy.p

I generated an RPM for RHEL 7 from the ackit 6.1 alpha code tree. Installation on RHEL requires the correct configuration of the ACKDIR, ACKM, and ACKFE variables. I used the following shell profile configuration:

ACKDIR=/usr
ACKFE=/usr/share/ack/descr/fe
ACKM=linux386
export ACKDIR ACKFE ACKM

Using Free Pascal's ISO mode, I found building the tools very easy. For instance, copyprog could be built, on all supported platforms, with the following:

fpc -Miso -Xst -v0 -l- -ocopyprog wholecopy.p

getc and putc

Several reasons are given for getc. The first is to hide the details of what is unique to any particular system: its input and output devices. It is incredibly useful to hide not only how the standard input and output devices are selected, but also how lines and files are handled, that is, the markers and functions that identify where each one ends. The general answer to needing these is first explained in the authors' book The Elements of Programming Style[KP78]. The first point is to isolate the details of I/O into one place that is recognized as being non-portable, including different character sets. This is explained with both PL/1 and Pascal.

Under Fortran 66, there is no character type. (There is in Fortran 77.) There is only the Hollerith string type. Integers had to be passed around and converted to Holleriths. PL/1 had a character type, but one of the exercises asks why it wasn't used, and both books suggest this will be explained (which they do in several places). Passing integers around in place of characters makes less sense at first with Pascal, until an underlying key piece is explained: most of the compilers available to Kernighan and Plauger were written in C. The char type of C, to be portable, needs to be abstracted to account for the difference between signed and unsigned integers, not only differences in character set. This becomes especially necessary when using a negative integer (e.g. -1) as an end-of-file sentinel. There are low-level nuggets like this scattered through both books. The abstraction also helps with problems of efficiency, which merit a separate layer of their own. Compiler authors are dealing with complex software, and writing for small systems often required tricks to make a program usable. With modern computing this is mostly unnecessary, except for HPC needs.
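The idea of an integer-valued character with a negative end-of-file sentinel can be sketched as follows. This is a minimal sketch in the spirit of copyprog, not the book's listing: the program name copysketch is made up, the range -1..255 is my own assumption so that 8-bit input fits, and NEWLINE = 10 assumes ASCII.

```pascal
program copysketch (input, output);

const
  ENDFILE = -1;   { negative sentinel, outside any character code }
  NEWLINE = 10;   { assumption: ASCII line feed }

type
  character = -1..255;   { assumption: wide enough for 8-bit input }

{ getc -- read one character from standard input as an integer code }
function getc (var c : character) : character;
var
  ch : char;
begin
  if eof then
    c := ENDFILE
  else if eoln then begin
    readln;             { consume the system's line marker }
    c := NEWLINE
  end
  else begin
    read(ch);
    c := ord(ch)
  end;
  getc := c
end;

{ putc -- write one character code to standard output }
procedure putc (c : character);
begin
  if c = NEWLINE then
    writeln             { let the system write its own line marker }
  else
    write(chr(c))
end;

var
  c : character;

begin
  while getc(c) <> ENDFILE do
    putc(c)
end.
```

Everything system-specific (eof, eoln, readln, writeln) is confined to these two routines, so the main loop never sees how the host marks lines or files.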

Counting tools

charcount requires the putdec procedure (as described on pages 57-58), but it is not introduced in chapter 1. The standard procedure write, if it is fully supported, can be used until you arrive at the end of chapter 2 where putdec is described:

{ putdec(nc, 1) }
write(nc:1)
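To see why write(nc:1) is a fair stand-in, here is a rough sketch of a putdec-like procedure next to the standard write. It is not the book's putdec (which is built on other primitives); the digit buffer and the program name putdecdemo are my own, and the sketch does not handle the most negative integer.

```pascal
program putdecdemo (output);

{ putdec -- write n in decimal, right-justified in a field of at least w }
procedure putdec (n, w : integer);
var
  digits : array [1..20] of char;   { digits collected in reverse order }
  nd, i, m : integer;
begin
  nd := 0;
  m := abs(n);
  repeat
    nd := nd + 1;
    digits[nd] := chr(ord('0') + m mod 10);
    m := m div 10
  until m = 0;
  if n < 0 then begin
    nd := nd + 1;
    digits[nd] := '-'
  end;
  for i := nd + 1 to w do           { pad on the left up to width w }
    write(' ');
  for i := nd downto 1 do
    write(digits[i])
end;

begin
  { each pair of lines should look the same }
  putdec(1981, 1); writeln;
  write(1981:1); writeln;
  putdec(-42, 6); writeln;
  write(-42:6); writeln
end.
```

In both forms the width is a minimum, so a width of 1 prints the number with no padding.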

Kernighan was careful to use only a compatible subset of the definition in the 1974 (final) Report and of existing implementations (see pg. 28-29). This approach also explains the primitives of both books, which at first seem redundant, but in practice prove to be the only way to handle portability between implementations, as well as an opportunity to tune the efficiency of those primitives (sometimes due to inadequacies of a compiler).

#include

The include command is not provided until chapter 3, yet its use is introduced in the last program of chapter 1 (detab), and implied with charcount (see the wrapper on pg. 71). The book hints that #include was used by Kernighan with the Unix C preprocessor.

The Free Pascal $include can be used instead (similar to the PL/1 example of pg. 75 of the original Software Tools), but I found this made fixing mistakes harder as the line numbers didn't match up in error messages. Wirth's CDC compiler used external references to independently compiled libraries (i.e. object files, such as can be used with the -c flag of the c99 or gfortran commands). This is consistent with the Whitesmith's and ACK example in the appendix. Free Pascal supports a similar external referencing feature if you write libraries with another compiler (or use unit files). I found it a useful exercise in efficiency, using the example of copyprog, to assemble each procedure and function manually into a single program file until I had the include command built.

getarg

The getarg function under Free Pascal required modifications to the UCB example in the Appendix on page 331. Instead of argv and argc, paramcount and paramstr can be used. Replace (n < argc) with (n <= paramcount), and argv(n, arg) with arg := paramstr(n).
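The substitution can be sketched as a small stand-alone program. This is an approximation rather than the Appendix listing: the type name xstring, the constant values, the lower bound n >= 1, and the use of a shortstring temporary are all my assumptions.

```pascal
program getargdemo (output);

const
  MAXSTR = 100;   { placeholder size }
  ENDSTR = 0;     { end-of-string marker }

type
  character = -1..255;
  xstring = array [1..MAXSTR] of character;   { hypothetical name }

{ getarg -- copy command-line argument n into s; false if there is none }
function getarg (n : integer; var s : xstring; maxs : integer) : boolean;
var
  arg : shortstring;
  i, len : integer;
begin
  len := 0;
  if (n >= 1) and (n <= paramcount) then begin
    arg := paramstr(n);
    len := length(arg);
    if len > maxs - 1 then
      len := maxs - 1;              { leave room for ENDSTR }
    for i := 1 to len do
      s[i] := ord(arg[i]);
    getarg := true
  end
  else
    getarg := false;
  s[len + 1] := ENDSTR
end;

var
  s : xstring;
  i, j : integer;

begin
  { echo each argument on its own line }
  i := 1;
  while getarg(i, s, MAXSTR) do begin
    j := 1;
    while s[j] <> ENDSTR do begin
      write(chr(s[j]));
      j := j + 1
    end;
    writeln;
    i := i + 1
  end
end.
```

Run as getargdemo one two, this should print one and two on separate lines. Because paramstr returns the argument's actual length, there is no need to scan for trailing blanks.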

See the UCB globdefs.p example in the Appendix for the string type.

message and error

A goto and label could be used in each specific program, or even a branch in a simpler program where a writeln is at the end (as I did when refactoring the crypt example from Why Pascal is Not My Favorite Programming Language[Ker81], using the Free Pascal xor built-in), but an error procedure is fairly simple in Free Pascal. First, Free Pascal provides a halt statement, the same as is described in the User Manual and Report, Second Edition. Using the Free Pascal shortstring type, and writeln to direct output to STDERR, the macros suggested by the book can be avoided:

  PROCEDURE message (CONST s: shortstring);
  BEGIN writeln(openlist[STDERR].filevar, s)
  END; 
  PROCEDURE error (CONST s: shortstring);
  BEGIN message(s); halt
  END; 

Free Pascal can also write to erroutput (instead of the initialized STDIN/STDOUT/STDERR environment of the Appendix UCB primitives).
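A stand-alone variant that uses erroutput, and so needs no openlist or initio, might look like the following. It is only a sketch: the program name is made up, and passing an exit code to halt is my addition.

```pascal
program errdemo (output);

{ message -- print a line on standard error }
procedure message (const s : shortstring);
begin
  writeln(erroutput, s)
end;

{ error -- print a message and stop the program }
procedure error (const s : shortstring);
begin
  message(s);
  halt(1)   { assumption: a nonzero exit code is wanted }
end;

begin
  message('errdemo: this line goes to standard error');
  error('errdemo: stopping');
  writeln('not reached')
end.
```

Redirecting standard output to a file while running it should still show both messages on the terminal, and the file should stay empty.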

compare0

Currently, compare0 is problematic with Free Pascal 3.2: the use of files in the program header does not allow the required type declaration, and this feature is perhaps not sufficiently bug-free for the program to work.

The function getline as provided in the Appendix opens up a whole can of worms for other procedures not yet discussed. Unless the entirety of chapter 3's primitives are complete, it is best to follow the directions in the chapter to build this function with getc.

compare

The first thing needed for compare is the open primitive from UCB. This primitive pads intname with blanks, which is not explained in the Appendix, but which makes sense once compared against the BSD Unix manual. The for loop should be removed for Free Pascal.

The second thing is that Free Pascal uses the standard reset and rewrite commands, so the extended syntax cannot be used. Instead, use the Free Pascal assign procedure:

  assign(openlist[i].filevar, intname);
  IF (mode = IOREAD) THEN RESET(openlist[i].filevar)
  ELSE REWRITE(openlist[i].filevar);

A fix for the return status deficiency of the UCB example can be tested against ioresult at the end of the procedure:

  IF (ioresult <> 0) THEN open := IOERROR

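This requires that {$I-} precedes, and {$I+} follows, the procedure. With I/O checking switched off, a failed reset or rewrite is reported through ioresult instead of stopping the program with a run-time error.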
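Put together, the pieces might look like the sketch below. It is a simplified stand-in for the UCB open, not a copy of it: the constant values and the test file name are placeholders, and the descriptor search is reduced to a plain loop. In Free Pascal, {$I-} switches I/O checking off so that ioresult can be inspected, and {$I+} switches it back on.

```pascal
program opendemo (output);

const
  MAXOPEN = 10;   { placeholder values, not taken from the book }
  IOERROR = 0;
  IOAVAIL = 1;
  IOREAD = 2;
  IOWRITE = 3;

type
  filedesc = IOERROR..MAXOPEN;
  ioblock = record
    filevar : text;
    mode : IOAVAIL..IOWRITE
  end;

var
  openlist : array [1..MAXOPEN] of ioblock;
  k : integer;
  fd : filedesc;

{$I-}
{ open -- open a file for reading or writing; IOERROR on failure }
function open (const name : shortstring; mode : integer) : filedesc;
var
  i : integer;
  found : boolean;
begin
  i := 1;
  found := false;
  while (i <= MAXOPEN) and not found do
    if openlist[i].mode = IOAVAIL then
      found := true
    else
      i := i + 1;
  if not found then
    open := IOERROR
  else begin
    assign(openlist[i].filevar, name);
    if mode = IOREAD then
      reset(openlist[i].filevar)
    else
      rewrite(openlist[i].filevar);
    if ioresult <> 0 then
      open := IOERROR
    else begin
      openlist[i].mode := mode;
      open := i
    end
  end
end;
{$I+}

begin
  for k := 1 to MAXOPEN do
    openlist[k].mode := IOAVAIL;
  fd := open('no-such-file.txt', IOREAD);   { hypothetical file name }
  if fd = IOERROR then
    writeln('open failed')
  else
    writeln('opened as descriptor ', fd:1)
end.
```

If the named file does not exist, the program should report the failure and exit normally rather than abort with a run-time error.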

The open procedure relies on initio from pg. 326. In Free Pascal on Unix and GNU/Linux, instead of assigning /dev/tty to STDERR, use a blank string: assign(openlist[STDERR].filevar, ''). The rest can be taken verbatim from the UCB primitives.

The file descriptors described here seem confusing to some, but the approach is simple enough. Open files are numbered, which makes it simple to assign the internal file name and to keep count against the maximum number of open files. The MAXOPEN constant for the file handle count has a small value, but a modern OS can easily deal with thousands of files. In GNU/Linux, see the file /proc/sys/fs/file-max, printed using:

$ sysctl fs.file-max

xclose

Free Pascal has a built-in close procedure, whose name collides with close.p (first used in include.p), so the procedure must be renamed xclose (otherwise it calls itself recursively). Until the macro tool of chapter 8 is built, and the #define shown on page 340 for the UCSD wrapper can be used (with the macro or define syntax; see pg. 280 or 305), any program that uses close will have to be edited manually so that uses of close are changed to xclose. Following are the affected files:

outer.p

To assemble a program, follow the instructions in Chapter 3 and the appendix: build the assembled primitive files in their particular directories (resulting in the described globdefs.p, prims.p, and utility.p primitive files), then update the outer.p file to include the program file (e.g. copy.p) and call the appropriate main program. Assembly of the programs should then be fairly easy. On Windows, I made the following batch script for use with the Free Pascal x64 cross compiler:

include <outer.p >copy.pas
ppcrossx64 -Miso copy.pas
del *.o
rem comment the below to debug:
del copy.pas

As the appendix notes, this is inefficient for small programs that don't use all the primitives and utilities. Something similar to the Whitesmith's example can also be used with Free Pascal's external statement for function declarations. This is likely the closest to CDC 6000 3.4 Pascal of the Manual & Report, except that many of these utilities and some primitives are built using Pascal. It is therefore more about independent compilation in a build system than about using another language to supplement Pascal, since most of Free Pascal's built-in functions are already suitable. Though most modern command interpreters follow the full Unix conventions described in the book, adapting the UCSD custom interpreter might be a fruitful exercise.

Building and using the programs while reading this book was fun. I had planned a second pass of the book to work on the exercises with Free Pascal, but the weaknesses of Pascal that Kernighan makes abundantly clear are fixed in Pascal's successor language, so I have continued working on the exercises in Modula-2[W88].

References

[Ker81]
B. W. Kernighan, Why Pascal is Not My Favorite Programming Language, AT&T Bell Laboratories, Computing Science Technical Report No. 100, 2 April 1981
[KP76]
B. W. Kernighan, P. J. Plauger, Software Tools, Addison-Wesley, 1976
[KP78]
B. W. Kernighan, P. J. Plauger, The Elements of Programming Style, Second Edition, McGraw-Hill, 1978
[KP81]
B. W. Kernighan, P. J. Plauger, Software Tools in Pascal, Addison-Wesley, 1981
[KR78]
B. W. Kernighan, D. M. Ritchie, The C Programming Language, Prentice-Hall, 1978
[KR88]
B. W. Kernighan, D. M. Ritchie, The C Programming Language, Second Edition, Prentice-Hall, 1988
[W88]
N. E. Wirth, Programming in Modula-2, 4th Edition, Springer-Verlag, 1988

©2016-2021 David Egan Evans.