Software Tools in Free Pascal

This document briefly explains how to implement the programs from the book Software Tools in Pascal[KP81] by Brian W. Kernighan and P. J. Plauger, a follow-up to their earlier Software Tools[KP76]. A Software Tools User Group also went through the tools and exercises in Fortran, using C primitives. The modern version of this is the Bell Labs portability kit, distributed as Plan 9 from User Space, which replaces the original research Unix kit. Together these effectively made the Software Tools User Group, and its software distribution, redundant.

I worked through the original Software Tools[KP76] some years ago using GNU Fortran, and f2c with GNU C. I was able to compile the bootstrap Ratfor of the tools tape on my custom GNU/Linux distribution, and more recently on the Windows Subsystem for Linux. The only difficulty was in needing to rename the index function to hindex (the h being for Hollerith) to get it to compile with a Fortran 77 compliant compiler (a compatibility mode in GCC). The C version of Ratfor from research Unix might be tracked down from the Unix Heritage site, as well as one written by Oz from Stanford. However, it became clear to me that Extended Fortran (i.e. Fortran 90 and later) was a far more robust language, no longer needing Ratfor, and that referencing the original Software Tools while working through The C Programming Language[KR78] [KR88] was more beneficial. Thanks to Professor Kernighan for his help with my copy of his book, and for hinting at improvements to Fortran in his correspondence. Unless one is interested in Fortran or PL/1 specifically, working through the original Software Tools is no longer as educational as working through the elaborated and modernized Pascal version, except perhaps for the crypt tool.

About Pascal: a brief history of its evolution

The Pascal language is a derivative of Wirth's work on Algol. Algol started in 1958 as the International Algebraic Language (IAL) with a preliminary report. In 1960 a report was released as the Algorithmic Language (ALGOL[N60]). It was implemented by Edsger Dijkstra as part of his Ph.D. dissertation on the implementation of the X1. Wirth was a part of the Algol community, and did his Ph.D. dissertation at UC Berkeley on a dynamically typed version of Algol called Euler[WW66]. This was implemented in 1962-1965 on the IBM 704, then at Stanford on the Burroughs B5000, and later on the IBM 360/30 with an improved, stack-oriented interpreter.

In 1964, Wirth began work on a proposal to the IFIP Working Group 2.1 for Algol X, which was rejected in favor of what became Algol-68. He continued work on that proposal as Algol-W[WH66] at Stanford University with Tony Hoare, also implemented on the IBM 360 in a custom, high-level assembler, PL360[W68]. This was later continued at Stanford by others.

Wirth moved on to a professorship at ETH Zuerich, where he developed Pascal in 1968 as a language derived from Algol and Algol-W, and intended to stay close to them, on the CDC 6400 (a predecessor of the Cray machines) with the SCOPE-3.4 operating environment. The introduction to Pascal began with The Programming Language Pascal[W70] (their first technical paper, ETH-1), in November 1970. A version of this was published in Acta Informatica[WSV71]. A minor revision was published in July 1971 as the Second Edition[W71]. The Revised Report[W72] was published in November 1972, and a further minor revision in July 1973[W73]. The 1973 revision adds the optional program header that is more recognizable to Pascal users outside ETH.

During this time, the Pascal-P compiler[AJNN74] was designed as a portability kit. The CDC compiler was later rewritten for the language of the Revised Report against the P-system, version P4 being the final P-system release. Pascal-P is a small, minimal language intended for porting to other platforms through a unique approach: the compiler is written in Pascal and targets a stack machine interpreter, also written in Pascal, in such a way that the interpreter can be translated into a language available on another platform. Pascal-P was used to build the UCSD system, which added its own non-standard extensions; these were also picked up by Turbo Pascal. Both systems popularized Pascal, which makes this work of Wirth's students significant to the success of the language.

Pascal was slowly being tweaked through these ETH-published reports. Beginning with the book Systematisches Programmieren[Wir72] in 1972, and its English equivalent, Systematic Programming: An Introduction[W76] in 1976, these tweaks can be seen in between the reports. The first edition of the Jensen and Wirth PASCAL: user manual and report[JW74] was published in 1974, then a text on algorithms[W75] was published in the summer of 1975 in German, then in English as Algorithms + Data Structures = Programs[Wir76].

A subset language called Pascal-S[Wir75] (bigger than Pascal-P), was implemented by Wirth for the sake of teaching. Perhaps it was this language that led to the misunderstanding that Pascal was designed only as a teaching language, or the fact that Wirth published Pascal as a university professor, instead of as a student or committee member. Pascal, like its predecessors, was intended to be a full, complete language for programming.

In 1978, the second edition of the User Manual and Report[JW78] was published, which went through several reprints (accumulating corrections). I have the first and fourth printings of this edition. Most notably, it enforces the use of the program header. Other papers relating to Pascal were published at ETH. In 1980 a proposal for a standard started at ANSI, and later ISO, and was completed in 1983. The University of Minnesota continued the CDC work on the Pascal compiler, and Mickel and Miner issued a third edition of the User Manual and Report in November 1984, and its final fourth edition against the revised and final ANSI standard in February of 1991. The CDC edition of Pascal gave way to Welsh and Hay's Model Implementation. There was also an Extended Pascal, modeled after UCSD's units (and Ada's packages?).

Choosing Free Pascal

From time to time, Software Tools in Pascal requires that primitives be built for the programs to function as intended. These were provided with a toolstape, now available from Kernighan's Princeton page. (See plan9.io for a partial mirror of the old plan9/cm website.) This tape relies on original forms of the research Unix ar and nroff commands, so it is only partly useful in a modern Unix or GNU/Linux environment.

The first exercise asks you to become familiar with your compiler environment. It provides a complete program to compile, copyprog, which is similar to the copytext program provided as an example in the User Manual and Report, Second Edition, pg. 164 or section 13 of the Report. This is provided on the toolstape as wholecopy.p. I first started with the p2c package, such as is found on Slackware. Presumably due to I/O bugginess in p2cc, I was unable to get copyprog working without error.

Because it is mentioned in the Appendix for the Whitesmiths primitives, here's an example with the Amsterdam Compiler Kit:

ack -o copyprog wholecopy.p

I generated an RPM for RHEL 7 from the ackit 6.1 alpha code tree. Installation on RHEL requires the correct configuration of the ACKDIR, ACKM, and ACKFE variables. I used the following shell profile configuration:

ACKDIR=/usr
ACKFE=/usr/share/ack/descr/fe
ACKM=linux386
export ACKDIR ACKFE ACKM

The Whitesmiths compilers are no longer available, having been purchased by several companies and finally buried. They are connected to Plauger's company Whitesmiths, Ltd. and its Idris Unix system. The Pascal version of the book was mainly written by Kernighan, who used Bill Joy's BSD interpreter (not the corresponding compiler) for the initial work. The P4-derived UCSD OS for Z80, Intel 8080, and later Intel 8088 (the IBM PC) also has primitives available, which include a simplistic shell interpreter for handling files and redirection. It has been tempting to look at the UCSD primitives for p5, as well as to try out the Atari Pascal ISO-compatible p-system.

I found several other compilers, though all but p5 (carrying on the P4 system against the INCITS/ISO/IEC standard, and providing an alternative to the proprietary Model Implementation) were defunct. However, Free Pascal was starting to add its ISO mode, and I found this sufficiently suitable, and the most modern implementation, though as of version 3.2 it still has some bugs. The program example for copyprog was buildable, on all supported platforms, with the following command:

fpc -Miso -Xst -v0 -l- -ocopyprog wholecopy.p

getc and putc

Several reasons are given for getc and putc. The first is to hide the details of what is unique to any particular system: its input and output devices. Hiding not only the details of how to pick the standard in and out devices, but also how lines and files are handled in terms of markers and functions that identify these details, is incredibly useful. The general answer to needing these is first explained in the authors' book The Elements of Programming Style[KP78]. Isolate the details of I/O into one place that is recognized as being non-portable, including different character sets. This is explained with both PL/1 and Pascal.

Under Fortran 66, there is no character type. (There is in Fortran 77.) There is only the Hollerith string type. Integers had to be passed around and converted to Holleriths. PL/1 had a character type, but one of the exercises asks why it wasn't used, and both books suggest this will be explained (which they do in several places). Passing integers instead of characters around makes less sense at first with Pascal, until an underlying key piece is explained: most of the compilers available to Kernighan and Plauger were written in C. The char type of C, to be portable, needs to be abstracted to distinguish between signed and unsigned integers, not only differences in character set. This becomes especially needful when using a negative integer (e.g. -1) as an end-of-file sentinel. There are low-level nuggets like this scattered through both books. This lends itself to solving problems of efficiency, which also merit having a separate abstraction. Compiler authors are dealing with complex software. Writing for small systems often required tricks to make a program usable. With modern computing, perhaps this is mostly unnecessary (except with HPC needs).
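
As a minimal sketch of this idea, assuming the book's convention of an integer subrange named character with ENDFILE = -1, and assuming ASCII (NEWLINE = 10), getc and putc can be written with only standard Pascal I/O. Input bytes above 127 fall outside the subrange, so the sketch assumes plain ASCII text; it is an illustration, not the appendix code:

```pascal
program copysketch (input, output);
{ sketch: integer-valued characters with a negative end-of-file sentinel }
const
  ENDFILE = -1;   { sentinel that no real character can equal }
  NEWLINE = 10;   { assumes ASCII line feed }
type
  character = -1..127;
var
  c : character;

{ getc: read one character, hiding eof and eoln from the caller }
function getc (var c : character) : character;
var
  ch : char;
begin
  if eof then
    c := ENDFILE
  else if eoln then begin
    readln;
    c := NEWLINE
  end
  else begin
    read(ch);
    c := ord(ch)
  end;
  getc := c
end;

{ putc: write one character, turning NEWLINE back into a line end }
procedure putc (c : character);
begin
  if c = NEWLINE then
    writeln
  else
    write(chr(c))
end;

begin
  while getc(c) <> ENDFILE do
    putc(c)
end.
```

Only getc and putc know how the system marks lines and files; the main loop would stay the same if those details changed.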

Counting tools

charcount requires the putdec procedure (as described on pages 57-58), but it is not introduced in chapter 1. The standard procedure write, if it is fully supported, can be used until you arrive at the end of chapter 2 where putdec is described:

{ putdec(nc, 1) }
write(nc:1)

Kernighan was careful to use only a compatible subset of the definition of the 1974 (final) Report, the (at the time) proposed ANSI/ISO standard, and existing implementations (see pg. 28-29). This approach of using a compatible subset also explains the primitives approach of both books, which at first seems redundant, but ultimately becomes clear in practice as the only way to handle portability between implementations, as well as to provide the opportunity to tweak the efficiency of those primitives (sometimes due to inadequacies of a compiler).

#include

The include command is not provided until chapter 3, yet its use is introduced in the last program of chapter 1 (detab), and implied with charcount (see the wrapper on pg. 71). The book hints that #include was used by Kernighan with the Unix C preprocessor.

The Free Pascal $include directive can be used instead (similar to the PL/1 example of pg. 75 of the original Software Tools), but I found this made fixing mistakes harder, as the line numbers didn't match up in error messages. Wirth's CDC compiler used external references to independently compiled libraries (i.e. object files, such as can be produced with the -c flag of the c99 or gfortran commands). This is consistent with the Whitesmiths and ACK examples in the appendix. Free Pascal supports a similar external referencing feature if you write libraries with another compiler (or use unit files). I found it a useful exercise in efficiency, using the example of copyprog, to assemble each procedure and function manually into a single program file until I had the include command built.

getarg

The getarg function under Free Pascal required modifications to the UCB example in the Appendix on page 331. Instead of argv and argc, paramcount and paramstr can be used. Replace (n < argc) with (n <= paramcount), and argv(n, arg) with arg := paramstr(n).

See the UCB globdefs.p example in the Appendix for the string type.
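
The substitution described above can be sketched as a small self-contained program. This is only an illustration under stated assumptions: the type name xstring is hypothetical (standing in for the book's string type from globdefs.p), MAXSTR = 100 and ENDSTR = 0 are assumed values, and the blank-padding loop of the UCB version is replaced by Free Pascal's length function:

```pascal
program argsketch (output);
{ sketch: getarg on top of Free Pascal's paramcount and paramstr }
const
  MAXSTR = 100;   { assumed value }
  ENDSTR = 0;     { assumed string terminator }
type
  character = -1..127;
  xstring = array [1..MAXSTR] of character;  { hypothetical name }
var
  s : xstring;
  i : integer;

{ getarg: copy argument n into s; false if there is no such argument }
function getarg (n : integer; var s : xstring; maxs : integer) : boolean;
var
  arg : shortstring;
  i, len : integer;
begin
  len := 0;
  if (n >= 0) and (n <= paramcount) then begin
    arg := paramstr(n);
    len := length(arg);
    if len > maxs - 1 then
      len := maxs - 1;          { leave room for ENDSTR }
    for i := 1 to len do
      s[i] := ord(arg[i]);
    getarg := true
  end
  else
    getarg := false;
  s[len + 1] := ENDSTR
end;

begin
  if getarg(1, s, MAXSTR) then begin
    i := 1;
    while s[i] <> ENDSTR do begin
      write(chr(s[i]));
      i := i + 1
    end;
    writeln
  end
  else
    writeln('no argument')
end.
```

Run with one argument, the program echoes it; run with none, it prints the fallback message.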

message and error

Though a goto and label could be used for each specific program, or even a branch with a simpler program where a writeln is at the end, an error function is fairly simple in Free Pascal. First, Free Pascal provides a halt statement, the same as is described in the User Manual and Report, Second Edition. Using the Free Pascal shortstring type, and writeln for directing output to STDERR, the macros suggested by the book can be avoided:

  PROCEDURE message (CONST s: shortstring);
  BEGIN writeln(openlist[STDERR].filevar, s)
  END; 
  PROCEDURE error (CONST s: shortstring);
  BEGIN message(s); halt
  END; 

Free Pascal can also write to erroutput (instead of the initialized STDIN/STDOUT/STDERR environment of the Appendix UCB primitives).
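
A minimal sketch of the erroutput variant, assuming the ErrOutput text file from Free Pascal's system unit and a nonzero exit code passed to halt (the exit code and the message text are illustrative choices, not taken from the book):

```pascal
program errsketch (output);
{ sketch: message and error without the openlist table }

procedure message (const s : shortstring);
begin
  writeln(erroutput, s)   { standard error, via the system unit }
end;

procedure error (const s : shortstring);
begin
  message(s);
  halt(1)                 { assumed exit code signalling failure }
end;

begin
  error('errsketch: something went wrong')
end.
```

This version does not depend on initio having been called, which can be convenient for the small programs of the early chapters.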

NOTE. In the crypt example from Why Pascal is Not My Favorite Programming Language[Ker81], using the Free Pascal xor built-in allowed me to build crypt. However, instead of using halt, it made more sense to change the conditional so that the last part of the program became an else branch at the end of the program. (End of note.)

compare0

Currently, compare0 is problematic with Free Pascal 3.2, as the use of files in the program header does not allow the required type declaration; the ISO mode is perhaps not yet sufficiently bug-free for this program to work.

The function getline as provided in the Appendix opens up a whole can of worms, as it depends on other procedures that are not discussed until the entirety of chapter 3's primitives is complete.

compare

The first thing needed for compare is the open primitive from UCB. This primitive pads intname with blanks, which is not explained in the Appendix, but which makes sense once compared against the BSD Unix manual. The for loop that does the padding should be removed for Free Pascal.

The second thing is that Free Pascal uses the standard reset and rewrite commands, so the extended syntax cannot be used. Instead, use the Free Pascal assign procedure:

  assign(openlist[i].filevar, intname);
  IF (mode = IOREAD) THEN RESET(openlist[i].filevar)
  ELSE REWRITE(openlist[i].filevar);

A fix for the return status deficiency of the UCB example can be tested against ioresult at the end of the procedure:

  IF (ioresult <> 0) THEN open := IOERROR

This requires that {$I-} precedes, and {$I+} follows, the procedure, so that a failed reset or rewrite sets ioresult instead of stopping the program with a run-time error.
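
The status check can be tried in isolation with a minimal sketch, assuming Free Pascal's {$I-} directive to turn off automatic I/O checking around the call, and a hypothetical file name that is not expected to exist:

```pascal
program opensketch (output);
{ sketch: testing ioresult after assign and reset }
var
  f : text;
  status : integer;
begin
  assign(f, 'no-such-file.txt');   { hypothetical file name }
  {$I-}                            { I/O errors are recorded, not fatal }
  reset(f);
  status := ioresult;              { reading ioresult also clears it }
  {$I+}                            { restore automatic I/O checking }
  if status <> 0 then
    writeln('open failed, ioresult = ', status)
  else begin
    writeln('open succeeded');
    close(f)
  end
end.
```

With automatic checking left on, the failed reset would end the program before ioresult could be tested.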

The open procedure relies on initio from pg. 326. In Free Pascal on Unix and GNU/Linux, instead of assigning /dev/tty to STDERR, use a blank string: assign(openlist[STDERR].filevar, ''). The rest can be taken verbatim from the UCB primitives.

The file descriptors described here seem confusing to some; however, the approach is simple enough. It allows open files to be numbered, and makes it simpler to assign the internal file name in a way that keeps count against the maximum number of files. The MAXOPEN constant for the file handle count is a small number, but a modern OS can easily deal with thousands of files. In GNU/Linux, see the file /proc/sys/fs/file-max, printed using:

$ sysctl fs.file-max

xclose

Free Pascal has a built-in close procedure, whose name collides with close.p (first used in include.p), requiring the procedure to be called xclose (otherwise it calls itself recursively). Until the macro tool of chapter 8 is built, and the #define shown on page 340 for the UCSD wrapper can be used (using the macro or define syntax; see pg. 280 or 305), any program that uses xclose will have to be manually edited so that uses of close are changed to xclose. Following are the affected files:

outer.p

To assemble a program, follow the instructions in Chapter 3 and the appendix: build the assembled primitive files in their particular directories (resulting in the described globdefs.p, prims.p, and utility.p primitive files; a batch script similar to the one below could be used), then update the outer.p file to include the program file (e.g. copy.p) and call the appropriate main program. After that, assembly of the programs should be fairly basic. On Windows, I made the following batch script, for use with the Free Pascal x64 cross compiler, to assemble copy:

include <outer.p >copy.pas
ppcrossx64 -Miso copy.pas
del *.o
rem comment the below to debug:
del copy.pas

As the appendix notes, this is inefficient for small programs that don't use all the primitives and utilities. Something similar to the Whitesmiths example can also be used with Free Pascal's external directive for function declarations. Though most modern command interpreters follow the full Unix conventions described in the book, adapting the UCSD custom interpreter might be a fruitful exercise.

Building and using the programs while reading this book was fun, and gave an in-depth view of the fundamentals of good CLI and small tool programming. I had planned a second pass of the book to work on the exercises with Free Pascal, but the weaknesses of Pascal that Kernighan makes abundantly clear are fixed in Pascal's successor language, so I have continued working on the exercises in Modula-2[W88]. Some good information exists at Scott Moore's site on Pascal (as well as early Basic) programming and history.

References

[AJNN74]
K. V. Nori, U. Ammann, K. Jensen, H. H. Naegeli, The Pascal 'P': Implementation Notes, ETH Technical Report No. 10, December 1974
[MHW76]
R. M. De Morgan, I. D. Hill, B. A. Wichmann, A supplement to the ALGOL Revised Report, Computer Journal, Volume 19, Number 3, 1976, 276-288
[MHWAlgol76]
R. M. De Morgan, I. D. Hill, B. A. Wichmann, Modified Report on the Algorithmic Language Algol 60, Computer Journal, Volume 19, Number 4, 1976, 364-379
[JW74]
N. E. Wirth, K. Jensen, PASCAL: user manual and report, Springer-Verlag, 1974
[JW78]
N. E. Wirth, K. Jensen, PASCAL: user manual and report, Second Edition, Springer-Verlag, 1978
[Ker81]
B. W. Kernighan, Why Pascal is Not My Favorite Programming Language, AT&T Bell Laboratories, Computing Science Technical Report No. 100, 2 April 1981
[KP76]
B. W. Kernighan, P. J. Plauger, Software Tools, Addison-Wesley, 1976
[KP78]
B. W. Kernighan, P. J. Plauger, The Elements of Programming Style, Second Edition, McGraw-Hill, 1978
[KP81]
B. W. Kernighan, P. J. Plauger, Software Tools in Pascal, Addison-Wesley, 1981
[KR78]
B. W. Kernighan, D. M. Ritchie, The C Programming Language, Prentice-Hall, 1978
[KR88]
B. W. Kernighan, D. M. Ritchie, The C Programming Language, Second Edition, Prentice-Hall, 1988
[N60]
P. Naur, Report on the algorithmic language ALGOL 60, Comm. ACM, 3 May 1960, 299-314
[N63]
P. Naur, Revised report on the algorithmic language ALGOL 60, Comm. ACM, 6 Jan. 1963, 1-17
[W68]
N. E. Wirth, PL360, A Programming Language for the 360 Computers, Journal of the ACM, January 1968, Volume 15, No. 1
[W70]
N. E. Wirth, The Programming Language Pascal, ETH Technical Report No. 1, November 1970
[W71]
N. E. Wirth, The Programming Language Pascal, ETH Technical Report No. 1, July 1971
[Wir72]
N. E. Wirth, Systematisches Programmieren, Teubner-Verlag, Stuttgart, 1972.
[W72]
N. E. Wirth, The Programming Language Pascal, ETH Technical Report No. 5, November 1972
[W73]
N. E. Wirth, The Programming Language Pascal, ETH Technical Report No. 5, July 1973
[W75]
N. E. Wirth, Algorithmen und Datenstrukturen, Teubner-Verlag, Stuttgart, 1975
[Wir75]
N. E. Wirth, Pascal-S: A Subset and its implementation, ETH-12, 1975
[W76]
N. E. Wirth, Systematic Programming: An Introduction, Prentice Hall, Englewood, 1976.
[Wir76]
N. E. Wirth, Algorithms + Data Structures = Programs, Springer-Verlag, 1976
[W88]
N. E. Wirth, Programming in Modula-2, 4th Edition, Springer-Verlag, 1988
[WH66]
N. E. Wirth, C. A. R. Hoare, A Contribution to the Development of ALGOL, Comm. ACM, June 1966, Volume 9, Number 6
[WSV71]
N. E. Wirth, The Programming Language Pascal, Acta Informatica 1, Springer-Verlag, 1971, pg. 35-63.
[WW66]
N. E. Wirth, H. Weber, EULER: A Generalization of ALGOL, and its Formal Definition, Part I and Part II, Comm. ACM, February 1966, Volume 9, Numbers 1-2

©2016-2023 David Egan Evans.