This document is currently in-progress, started as a supplement to Software Tools in Free Pascal. It relates my considerations for implementing the Software Tools in Modula-2.
In considering Modula-2, understanding the history and motivations of Wirth's change from Pascal is useful. Perhaps of interest was the experience with the VENUS Multi-Access system, and its editor, and issues related to multiprogramming with that environment.
In 1973, the first Modula compiler was written on the ETH CDC machine with its
Pascal compiler, then ported to a PDP-11/40 GT44. It appears to be a small language (e.g. no
floating point handling), perhaps relying on the bootstrap for loading. Software was written to
handle the different hardware and peripheral devices, and to run some basic programs that
demonstrate the use of its multi-process handling. This was a case-insensitive language, similar
to Pascal, but adding modules. Modula appears to be implemented as a single-text language like
Pascal, with local modules that replaced the Algol own declaration, and with device and
interface (process) modules for specific, non-portable handling. At least one third-party Modula
compiler provided independent or separate compilation.
After exposure to MESA, by 1978 the language became case-sensitive and followed MESA's separate text imports for separate compilation. This was the beginning of Modula-2. Though EBNF was introduced in the Pascal Algorithms book, and a paper in 1977 promoted the benefits of converting BNF to EBNF, the 1977 Modula report seems to be the first place it was used to describe a language. (I have not looked for differences between the 1977 ETH Modula report and that published in Software - Practice and Experience, not having access to the latter.)
The first Modula-2 compiler was developed, based on the Modula compiler. The late 1978 report
(#27) seems to be adding back things from Pascal to Modula after adapting changes observed from
MESA. Some things stand out: the separate compilation of modules, upper-case reserved words and
case-sensitivity, and the switch to co-routines exemplified by the removal of the interface and
device module syntax (though still described as such with the Typewriter module
example). The addition of things from Pascal is not complete, however. One curious change: the
procedures section example now uses log2 with the new CARDINAL type
instead of gcd, likely giving a cleaner example of the new type. The
lineinput and trackreservation local module examples from Modula remain,
modified to conform to the new syntax, yet still clearly hinting at the direct relationship between
the two language versions. ALLOCATE and DEALLOCATE are now standard
procedures. The SYSTEM module is provided. The module syntax is still simple, adding
only a DEFINITION module for libraries (i.e. no implementation module). I can't help
but see a form of the 1988 Oberon syntax here (i.e. after 1987 but before the definition browser
of 1989).
The ETH 36 report, in its first (March) and second (December) 1980 editions, was the first to
provide a public report on the first compiler used by students at ETH on the DEC PDP-11/40, now
implemented on RT-11. The language now looks much like what we're familiar with: the
REAL type is added, and the IMPLEMENTATION module is now kept separate
from the program or main module. The report now describes how the compiler is used on
RT-11, and the InOut module is provided, built from SYSTEM. Wirth adds
InOut, Streams, and ProcessScheduler as portable utility
modules. (Jacobi wrote the device specific modules TTIO, Files, and the
stack interpreter Loader.)
The main changes to the second edition report seem related to modules and standard procedures.
ADR is moved to SYSTEM and ASH is removed, as is
ROUND. CHR and ORD are added. FLOAT is
restricted to CARDINAL (but will be used with both in PIM4). VAL is now
listed with the standard procedures. MathLib has no 0 (what was the 0 about?) and is added for the first time as a standard module.
With the introduction of PIM, the 1980 second edition report goes through what seem like big
changes, but in reality only some interpretive clarifications (necessary for other compiler
implementations) are given: the language remains effectively the same. The report is rewritten to
separate some parts into specific descriptions as a separate manual, the compiler usage
descriptions are removed (though the Lilith compiler port, effectively the same compiler, is
described in the Lilith Handbook in a similar way), and the manual portion of PIM now includes
differing Lilith module definitions along with the RT-11 ones. LineDrawing and
WindowHandler modules are added for the Medos-2 system, (see ETH paper #56). Some
modules are renamed, or split. A word for word comparison of PIM with PIM2 suggests that little
but the fonts and layout have changed, though the Processes implementation module is
fixed from its broken state in PIM. PIM2 also introduces text duplication and missing output from an
example (mostly fixed in PIM3, except that the windowing screenshot is moved out of context to the end of
a later chapter). Several code bugs exist that are not fixed in PIM2, and only a couple are fixed
in PIM3.
The changes in PIM3 are effectively described in the ETH paper #59, which also revisits the multiprogramming research of the original Modula but in context of coroutines. The compiler rewrite by Wirth as a single-pass compiler is described in ETH paper #64. A German translation is made of PIM3. The Pascal text of Algorithms + Structures = Programs is separated into two texts by Wirth (in both the German and English versions, the German being the originator). The German text goes through five editions until translated back into English, also apparently by Wirth himself (a sixth edition?). Compilerbau is the last chapter of the original Pascal edition turned into a small manual, which goes through four editions. There is no English translation, but the first English Oberon edition in 1995 is based on it, and perhaps Pascal-S, again by Wirth.
The Ceres-1, the first true 32-bit system with a National Semiconductor chip, has the Medos-2 system and Modula-2 compiler ported to it. PIM4 is released describing a couple of changes and clarifications from that port, and is translated into German and printed in 1991. The final edition of MacMETH adopts these changes. The changes from PIM3 to PIM4 are as follows:
- INTEGER is normalized as a base type for subranges, the implication being that CARDINAL can be implemented as a subrange of INTEGER, thus making CARDINAL and subranges compatible with INTEGER. The book switches from CARDINAL to INTEGER for its examples. (The ISO standard does not allow assignment compatibility of INTEGER and CARDINAL.)
- The MOD and DIV operators now expect compatibility with INTEGER, not to be interpreted as only CARDINAL. Wirth's compiler introduces REM. (It is not part of the report. The 1986 Algorithms and Data Structures includes an explanation of REM, which is in the EBNF.) The ISO standard uses / instead of DIV and includes REM formally.
- INTEGER as well as CARDINAL. (The ISO standard doesn't fully make this adjustment, e.g. TRUNC which has a CARDINAL value that is not compatible with INTEGER.)
- CHAR.
From this point, Modula-2 gives way to Oberon, and the controversial ISO/IEC standard is finally finished in 1996, based on PIM3 and picking and choosing from PIM4 changes. Some compilers had already switched to the ISO/IEC standard in 1993/4.
getc/putc
For a port of Software Tools in Pascal, and following Kernighan's approach and suggestions, a compatible subset of the language will be used, following Wirth's last report (PIM4 with a German second edition translation) and the ISO/IEC standard (10514-1[1996]). This paper takes a different approach from that of Software Tools in Free Pascal. There, the differences needed to make the existing software build with Free Pascal were described, and some technical detail was given. Here, the basics for getting started are documented, with a focus on how Modula-2 fixes the complaints about Pascal. Unlike with HOST, my goals are to learn from the text in Modula-2, not build a new library for Medos-2.
Wirth's library and the Medos-2 command interface are very simple. The command launcher takes a
single command. It fills in the command (so the entire command is not required to be typed), but a
space, newline (or null character) ends the command. Options, switches that use a / slash
character, must be given without a space. In some ways, this is perfect for the conditions of
Kernighan's tools. However, getc requires character buffering from the beginning, and so can't use
InOut (or FileSystem). As the HOST[KU87] paper (eth-3161-01) indicates,
too many module layers, creating primitives from primitives, may not be the best
in the end. However, to get started, making the
program work is the first priority. Optimizing, and perhaps making the
primitives more portable, can be done later. Portability is difficult, because
Modula-2 has no standard library. The PIM standard (3/4 in English, 1st/2nd in
German) libraries are really no longer common. The ISO/IEC standard is a
dialect in a way, not merely an interpreted implementation with extensions.
This is where Kernighan's libraries prove their usefulness, as a portable collection.
Here's an example of a (definition) module called ST, with
just enough for the copyprog command to work with the
copy procedure:
DEFINITION MODULE ST;

CONST
  (* Universal manifest constants. *)
  ENDFILE = -1;

TYPE
  character = [-1..127]; (* Byte-sized. ASCII + other stuff. *)

(* Primitives *)
PROCEDURE getc(VAR c: character): character;
PROCEDURE putc(c: character);

END ST.
Under Modula-2, the getc function returns its value with the
RETURN statement instead of an assignment to the function name, as in
Pascal or Modula(-1).
Here is an example of charcount. Note the use of the
INC standard procedure, instead of the typical assignment
nc := nc + 1. Also shown is Wirth's answer to Algol 68's
solution to the dangling IF and DO (e.g.
do .. od, if .. fi): all conditionals and loops end
with END. Multi-statement blocks are assumed,
instead of single statements as in Pascal, so a BEGIN is
unnecessary (and tends to be found only in procedure and module bodies).
MODULE charcount;
IMPORT ST;

(* charcount: count characters in standard input *)
PROCEDURE charcount;
VAR
  nc: CARDINAL;
  c: ST.character;
BEGIN
  nc := 0;
  WHILE (ST.getc(c) # ST.ENDFILE) DO
    INC(nc)
  END;
  ST.putdec(nc, 1);
  ST.putc(ST.NEWLINE)
END charcount;

BEGIN
  charcount
END charcount.
ENDFILE and NEWLINE are qualified by
their imported module, which avoids capitalizing objects merely to keep them
clear of the compiler's reserved words (a Mesa convention). However, this
makes the PROCEDURE less portable across differences in the surrounding code.
This is a design choice. Some compilers interpret IMPORT differently.
Future code examples here will assume FROM.
With putdec as described in Software Tools in Pascal, a problem arises: the
DIV operator of Pascal and Modula is not used the same way in the ISO
standard. To keep it portable, I added a divide function to the
primitives, and used it. This should be easily modifiable between /
and DIV depending on the compiler. This keeps non-portable changes
in the primitives, which are then the only code that has to change from compiler
to compiler.
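As a rough illustration of the idea, a divide primitive might look like the sketch below. The module name divdemo, the signature, and the choice of CARDINAL operands are assumptions made for the example, not the code from the ST library.

```modula-2
MODULE divdemo;

VAR q: CARDINAL;

(* divide: hypothetical stand-in for the divide primitive described
   above; name, signature and operand types are assumptions. *)
PROCEDURE divide(n, d: CARDINAL): CARDINAL;
BEGIN
  (* The only line that has to change from compiler to compiler,
     e.g. to n / d where a compiler requires it. *)
  RETURN n DIV d
END divide;

BEGIN
  q := divide(1234, 10)   (* whole-number division: q = 123 *)
END divdemo.
```

Callers such as putdec would then use divide(n, 10) throughout, so the choice of operator stays in one place.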
The current code can be found at http://oberon07.com/dee/software/ST/, tested on the Windows ADW and GNU/Linux gm2 compilers.
wordcount introduces the BLANK and
TAB character objects. Though the NOT operator could
be used, the ~ operator, introduced in PIM3, resembles a different
glyph from earlier times used to represent a negation (¬). I decided to
use this as it is used exclusively in the successor language Oberon, and is
not unusual in other programming languages.
One of the improvements to Pascal, originally explained in the Modula
report, is reserving ELSE to the final catch-all of
IF with the introduction of ELSIF.
(* wordcount: count words in standard input *)
PROCEDURE wordcount;
VAR
  nw: CARDINAL;
  c: character;
  inword: BOOLEAN;
BEGIN
  nw := 0;
  inword := FALSE;
  WHILE (getc(c) # ENDFILE) DO
    IF (c = BLANK) OR (c = NEWLINE) OR (c = TAB) THEN
      inword := FALSE
    ELSIF ~inword THEN
      inword := TRUE;
      INC(nw)
    END
  END;
  putdec(nw, 1);
  putc(NEWLINE)
END wordcount;
Line printer handling had an ANSI standard, also found in Fortran. However, into the 80s, this became less common. Modula-2 had worked mostly with laser printers, which had their own language. Today most printers have their own, far more complicated, printing language, and the ANSI standard has been withdrawn.
overstrike had its simplest examples in Wirth's book Systematic Programming.
I spent a bit of time with overstrike to make it more robust from the
exercises, having a bit of fun, but it is essentially a relic of the past. The
formfeed character constant FF was added to the standard
environment, and is not found in Kernighan's book.
The FOR loop in Modula-2 differs from Pascal's, for instance
using BY instead of DOWNTO. This is first used in the
settabs procedure in detab. However, the
putrep procedure used by the compress and
expand commands is a good example of one of the differences
between Modula-2 and Pascal. (Modula(-1) had no for loop.)
FOR m := n TO 1 BY -1 DO
  putc(c)
END
The function isupper is discussed in terms of a set, but also
a range. Since the syntax of both is different from what is in the book, I
found the following the simplest and most efficient:
PROCEDURE isupper(c: character): BOOLEAN;
VAR A, Z: INTEGER;
BEGIN
  A := ORD('A');
  Z := ORD('Z');
  RETURN (c >= A) AND (c <= Z)
END isupper;
In Software Tools in Pascal, the echo example replaces
crypt as shown in Software Tools. Kernighan indicated the
language was incapable of making a portable xor. There are a
couple of ways to do this in Modula-2. The more efficient, though less portable,
approach would be to use Modula-2's BITSET type and the
VAL function (to cast between CHAR and
INTEGER). If your compiler already has a built-in XOR
function, that is better. Perhaps there is a variation on Kernighan's
c := ((NOT b) AND a) OR ((NOT a) AND b); approach but in (PIM4)
standard Modula-2.
PROCEDURE xor(x, y: INTEGER): INTEGER;
VAR c: INTEGER;
BEGIN
  c := x BXOR y; (* ADW specific function BXOR *)
  RETURN c
END xor;
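One possible variation that needs no compiler-specific function is to build the result bit by bit using only DIV and MOD. The following is a minimal sketch under stated assumptions: the module name xordemo is made up for the example, and it takes CARDINAL rather than INTEGER operands, assuming small non-negative values such as character codes. It is likely slower than a built-in.

```modula-2
MODULE xordemo;

VAR r: CARDINAL;

(* xor: exclusive or built from DIV and MOD only.
   Assumes operands well below MAX(CARDINAL), e.g. character
   codes, so that bit * 2 cannot overflow. *)
PROCEDURE xor(x, y: CARDINAL): CARDINAL;
VAR result, bit: CARDINAL;
BEGIN
  result := 0;
  bit := 1;
  WHILE (x > 0) OR (y > 0) DO
    (* the lowest bits differ exactly when the xor bit is 1 *)
    IF (x MOD 2) # (y MOD 2) THEN
      result := result + bit
    END;
    x := x DIV 2;
    y := y DIV 2;
    bit := bit * 2
  END;
  RETURN result
END xor;

BEGIN
  r := xor(5, 3)   (* 101 xor 011 = 110, so r = 6 *)
END xordemo.
```

Since only non-negative values are divided, the result should not depend on how a given compiler or standard treats DIV and MOD for negative operands.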
In the context of strings, Modula-2 adds a standard open array parameter for handling any kind of ARRAY type and size (still static). This allows procedures to be written that handle varying string lengths. It also allows the compiler to optimize resource size around the parameter being passed, effectively solving the Pascal problem. However, the discussion around how to define a string is still relevant for older versions of Modula. PIM4 expects the null termination of a string; a single character string with null is assignment compatible with a single CHAR, which is a change from past versions of Modula. (The ISO standard appears to follow PIM3 in this respect.)
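A small sketch of an open array parameter in use follows. The length procedure and the module name strdemo are hypothetical, chosen for illustration, and are not among Kernighan's primitives. The procedure stops at the first null character or at HIGH(s), whichever comes first, so it should behave the same whether or not the compiler appends a terminating null to a string that fills its array.

```modula-2
MODULE strdemo;

VAR
  name: ARRAY [0..15] OF CHAR;
  n: CARDINAL;

(* length: hypothetical helper. Counts characters up to the
   first null character or the end of the array. *)
PROCEDURE length(s: ARRAY OF CHAR): CARDINAL;
VAR i: CARDINAL;
BEGIN
  i := 0;
  (* AND evaluates its right operand only if the left is TRUE,
     so s[i] is never indexed past HIGH(s) *)
  WHILE (i <= HIGH(s)) AND (s[i] # 0C) DO
    INC(i)
  END;
  RETURN i
END length;

BEGIN
  name := "Modula-2";     (* shorter than the array, so null-terminated *)
  n := length(name);      (* n = 8 *)
  n := length("tools")    (* n = 5 *)
END strdemo.
```

The same procedure accepts the 16-element array and the string literal, which a Pascal procedure with a fixed array parameter type could not do.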
©2017-2020, 2022-2023 David Egan Evans.