This document is currently in-progress, started as a supplement to Software Tools in Free Pascal. It relates my considerations for implementing the Software Tools in Modula-2.
In considering Modula-2, understanding the history and motivations of Wirth's change from Pascal is useful. Perhaps of interest was the experience with the VENUS Multi-Access system, and its editor, and issues related to multiprogramming with that environment.
In 1973, the first Modula compiler was written on the ETH CDC machine with its
Pascal compiler, then ported to a PDP-11/40 GT44. It appears to be a small language (e.g. no
floating point handling), perhaps relying on the bootstrap for loading. Software was written to
handle the different hardware and peripheral devices, and to run some basic programs that
demonstrate the use of its multi-process handling. This was a case-insensitive language, similar
to Pascal, but adding modules. Modula appears to be implemented as a single-text language like
Pascal, with local modules that replaced the Algol own statement, and had device and
interface (process) modules for specific, non-portable handling. At least one third-party Modula
compiler provided independent or separate compilation.
After exposure to MESA, by 1978 the language became case-sensitive and followed MESA's separate text imports for separate compilation. This was the beginning of Modula-2. Though EBNF was introduced in the Pascal Algorithms book, and in a 1977 paper promoting the benefits of converting BNF to EBNF, the 1977 Modula report seems to be the first place it was used to describe a language. (I have not looked for differences between the 1977 ETH Modula report and the one published in Software - Practice and Experience, not having access to the latter.)
The first Modula-2 compiler was developed, based on the Modula compiler. The late 1978 report
(#27) seems to be adding back things from Pascal to Modula after adapting changes observed from
MESA. Some things stand out: the separate compilation of modules, upper-case reserved words and
case-sensitivity, and the switch to co-routines exemplified by the removal of the interface and
device module syntax (though still described as such with the Typewriter module
example). The addition of things from Pascal is not complete, however. One curious change: the
procedures section example now uses log2 with the new CARDINAL type instead of gcd, likely
giving a cleaner example of the new type. The lineinput and trackreservation local module
examples from Modula remain, modified to conform to the new syntax, yet clearly hinting at a
direct relationship between the two language versions. ALLOCATE and DEALLOCATE are now standard
procedures. The SYSTEM module is provided. The module syntax is still simple, adding
only a DEFINITION module for libraries (i.e. no implementation module). I can't help
but see a form of the 1988 Oberon syntax here (i.e. after 1987 but before the definition
browser of 1989).
The ETH 36 report, in its first (March) and second (December) 1980 editions, was the first to
provide a public report on the first compiler used by students at ETH on the DEC PDP-11/40, now
implemented on RT-11. The language now looks about like what we're familiar with, the
REAL type is added, and the IMPLEMENTATION module now is used separate
from the program or main module. The compiler implementation is now described for how it's used on
RT-11, and the InOut module is provided built from SYSTEM. Wirth adds
InOut, Streams, and ProcessScheduler as portable utility
modules. (Jacobi wrote the device specific modules TTIO, Files, and the
stack interpreter Loader.)
The main changes to the second edition report seem related to modules and standard procedures.
ADR is moved to SYSTEM and ASH is removed, as is ROUND. CHR and ORD are added. FLOAT is
restricted to CARDINAL (but will be used with both in PIM4). VAL is now
listed with the standard procedures. MathLib has no 0
(what was the 0 about?) and is added for the first time as a standard module.
With the introduction of PIM, the 1980 second edition report goes through what seem like big
changes, but in reality only some interpretive clarifications (necessary for other compiler
implementations) are given: the language remains effectively the same. The report is rewritten to
separate some parts into specific descriptions as a separate manual, the compiler usage
descriptions are removed (though the Lilith compiler port, effectively the same compiler, is
described in the Lilith Handbook in a similar way), and the manual portion of PIM now includes
differing Lilith module definitions along with the RT-11 ones. LineDrawing and
WindowHandler modules are added for the Medos-2 system (see ETH paper #56). Some
modules are renamed, or split. A word for word comparison of PIM with PIM2 suggests that little
but the fonts and layout have changed, though the Processes implementation module is
fixed from its broken state in PIM, and PIM2 introduces text duplication and missing output from an
example (most fixed in PIM3, except the windowing screenshot is moved out of context to the end of
a later chapter). Several code bugs exist that are not fixed in PIM2, and only a couple are fixed
in PIM3.
The changes in PIM3 are effectively described in the ETH paper #59, which also revisits the multiprogramming research of the original Modula but in context of coroutines. The compiler rewrite by Wirth as a single-pass compiler is described in ETH paper #64. A German translation is made of PIM3. The Pascal text of Algorithms + Structures = Programs is separated into two texts by Wirth (in both the German and English versions, the German being the originator). The German text goes through five editions until translated back into English, also apparently by Wirth himself (a sixth edition?). Compilerbau is the last chapter of the original Pascal edition turned into a small manual, which goes through four editions. There is no English translation, but the first English Oberon edition in 1995 is based on it, and perhaps Pascal-S, again by Wirth.
The Ceres-1, the first true 32-bit system, built around a National Semiconductor chip, has the Medos-2 system and Modula-2 compiler ported to it. PIM4 is released describing a couple of changes and clarifications from that port, and is translated into German and printed in 1991. The final edition of MacMETH adopts these changes. The changes from PIM3 to PIM4 are as follows:
INTEGER is normalized as a base type for subranges, the implication being that CARDINAL can be implemented as a sub-range of INTEGER, thus making CARDINAL and sub-ranges compatible with INTEGER. The book switches from CARDINAL to INTEGER for its examples. (The ISO standard does not allow assignment compatibility of INTEGER and CARDINAL.)
MOD and DIV operators now expect compatibility with INTEGER, not to be interpreted as only CARDINAL. Wirth's compiler introduces REM. (It is not part of the report. The 1986 Algorithms and Data Structures includes an explanation of REM, which is in the EBNF.) The ISO standard uses / instead of DIV and includes REM formally.
INTEGER as well as CARDINAL. (The ISO standard doesn't fully make this adjustment, e.g. TRUNC which has a CARDINAL value that is not compatible with INTEGER.)
CHAR.
From this point, Modula-2 gives way to Oberon, and the controversial ISO/IEC standard is finally finished in 1996, based on PIM3 and picking and choosing from PIM4 changes. Some compilers had already switched to the ISO/IEC standard in 1993/4.
getc/putc
For a port of Software Tools in Pascal, and following Kernighan's approach and suggestions, a compatible subset of the language will be used following Wirth's last report (PIM4 with a German second edition translation) and ISO/IEC standard (10514-1[1996]). This paper takes a different approach from that of Software Tools in Free Pascal. There, the differences needed to make the existing software build with Free Pascal were described, and some technical detail was given. Here, the basics for getting started are documented, with a focus on how Modula-2 fixes the complaints about Pascal. Unlike with HOST, my goals are to learn from the text in Modula-2, not build a new library for Medos-2.
Wirth's library and the Medos-2 command interface are very simple. Command launching takes a single command. It fills in
the command (so the entire command is not required to be typed), but a space, newline, (or null
character) ends the command. Options, switches that use a / slash character, must be
without a space. In some ways, this is perfect for the conditions of Kernighan's tools. However,
for getc in the beginning, character buffering is required and can't use
InOut (or FileSystem). As the HOST [KU87] paper eth-3161-01 indicates,
too many module layers, creating primitives from primitives, may not be the best in the end.
However, to get started, making the
program work is the first priority. Optimizing, and perhaps making the
primitive more portable, can be done later. Portability is difficult, because
Modula-2 has no standard library. The PIM standard (3/4 in English, 1st/2nd in
German) libraries are really no longer common. The ISO/IEC standard is a
dialect in a way, not merely an interpreted implementation with extensions.
Kernighan's libraries see their usefulness as a portable collection.
Here's an example of a (definition) module called ST, with
just enough for the copyprog command to work with the copy procedure:
DEFINITION MODULE ST;
CONST
  (* Universal manifest constants. *)
  ENDFILE = -1;
TYPE
  character = [-1..127]; (* Byte-sized. ASCII + other stuff. *)
(* Primitives *)
PROCEDURE getc(VAR c: character): character;
PROCEDURE putc(c: character);
END ST.
The getc function under Modula-2 returns its value with the
RETURN statement instead of an assignment to the function name, as in
Pascal or Modula(-1).
Here is an example of charcount. Note the use of the
INC standard procedure, instead of the typical expression
nc := nc + 1. Also note Wirth's alternative to Algol 68's
solution to the dangling IF and DO (e.g. do .. od, if .. fi):
all conditionals and loops end with END. Multi-statement blocks are assumed,
instead of single statements like in Pascal, so a BEGIN is
unnecessary (and tends to only be found in procedure and module definitions).
MODULE charcount;
IMPORT ST;

(* charcount: count characters in standard input *)
PROCEDURE charcount;
VAR
  nc: CARDINAL;
  c: ST.character;
BEGIN
  nc := 0;
  WHILE (ST.getc(c) # ST.ENDFILE) DO
    INC(nc)
  END;
  ST.putdec(nc, 1);
  ST.putc(ST.NEWLINE)
END charcount;

BEGIN
  charcount
END charcount.
The uses of ENDFILE and NEWLINE are qualified by
their imported module, thus avoiding capitalized objects for the sake of
avoiding reserved words by the compiler (a Mesa convention). However, this
makes portability of the PROCEDURE to code differences less flexible.
This is a design choice. Some compilers interpret IMPORT differently.
Future code examples here will assume FROM.
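As a sketch of what the FROM form looks like, here is the charcount module above with unqualified imports. It assumes that ST also exports putdec and NEWLINE, which the charcount example uses but the minimal definition module shown earlier does not list.

```modula-2
MODULE charcount;
(* Assumption: ST exports every name listed here. *)
FROM ST IMPORT character, ENDFILE, NEWLINE, getc, putc, putdec;

(* charcount: count characters in standard input *)
PROCEDURE charcount;
VAR
  nc: CARDINAL;
  c: character;
BEGIN
  nc := 0;
  WHILE (getc(c) # ENDFILE) DO
    INC(nc)
  END;
  putdec(nc, 1);
  putc(NEWLINE)
END charcount;

BEGIN
  charcount
END charcount.
```

With this form the procedure body no longer names the module, so it can be moved between modules without editing each identifier.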
putdec, as described in Software Tools in Pascal, relies on DIV, and the
Pascal and Modula DIV operator is not used the same way in the ISO
standard. To keep it portable, I added a divide function to the
primitives, and used it. This should be easily modifiable between /
and DIV depending on the compiler. This keeps non-portable changes
in the primitives, which are then the only code that has to change from compiler
to compiler.
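A minimal sketch of such a primitive follows. The signature is an assumption for illustration, not necessarily the one in the published code, and the demonstration uses the PIM InOut module rather than the ST primitives so that it stands alone.

```modula-2
MODULE divdemo;
FROM InOut IMPORT WriteInt, WriteLn;

(* divide: integer quotient. Hypothetical signature. Only this body
   should need to change between compilers: keep DIV for PIM-style
   compilers, or use  a / b  where the ISO dialect is preferred. *)
PROCEDURE divide(a, b: INTEGER): INTEGER;
BEGIN
  RETURN a DIV b
END divide;

BEGIN
  (* Non-negative operands, so the result does not depend on how
     the dialect rounds negative quotients. *)
  WriteInt(divide(1234, 10), 1);
  WriteLn
END divdemo.
```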
The current code can be found at http://oberon07.com/dee/software/ST/, tested on the Windows ADW and GNU/Linux gm2 compilers.
wordcount introduces the BLANK and
TAB character objects. Though the NOT operator could
be used, the ~ operator, introduced in PIM3, resembles a different
glyph from earlier times used to represent negation (¬). I decided to
use this as it is used exclusively in the successor language Oberon, and is
not unusual in other programming languages.
One of the improvements to Pascal, originally explained in the Modula
report, is reserving ELSE to the final catch-all of
IF with the introduction of ELSIF.
(* wordcount: count words in standard input *)
PROCEDURE wordcount;
VAR
  nw: CARDINAL;
  c: character;
  inword: BOOLEAN;
BEGIN
  nw := 0;
  inword := FALSE;
  WHILE (getc(c) # ENDFILE) DO
    IF (c = BLANK) OR (c = NEWLINE) OR (c = TAB) THEN
      inword := FALSE
    ELSIF ~inword THEN
      inword := TRUE;
      INC(nw)
    END
  END;
  putdec(nw, 1);
  putc(NEWLINE)
END wordcount;
Line printer handling had an ANSI standard, also found in Fortran. However, into the 80s, this became less common. Modula-2 had worked mostly with laser printers, which had their own language. Today most printers have their own, far more complicated, printing language, and the ANSI standard has been withdrawn.
overstrike had its simplest examples in Wirth's book Systematic Programming.
I spent a bit of time with overstrike to make it more robust from the
exercises, having a bit of fun, but it is essentially a relic of the past. The
formfeed character constant FF was added to the standard
environment, and is not found in Kernighan's book.
The FOR loop in Modula-2 differs from Pascal's, for instance
using BY instead of DOWNTO. This is first used in the
settabs procedure in detab. However, the
putrep procedure used by the compress and
expand commands is a good example of one of the differences in
Modula-2 from Pascal. (Modula(-1) had no for loop.)
FOR m := n TO 1 BY -1 DO
  putc(c)
END
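To show the loop in context, here is a sketch of a complete putrep. The parameter list is an assumption based on how the procedure is described, and the PIM InOut module's Write stands in for putc so the example is self-contained.

```modula-2
MODULE repdemo;
FROM InOut IMPORT Write, WriteLn;

(* putrep: put out n copies of c. Hypothetical signature.
   The control variable must be a local variable, and the
   step after BY must be a constant expression. *)
PROCEDURE putrep(n: INTEGER; c: CHAR);
VAR m: INTEGER;
BEGIN
  FOR m := n TO 1 BY -1 DO
    Write(c)   (* stand-in for putc(c) *)
  END
END putrep;

BEGIN
  putrep(3, '*');   (* writes three asterisks *)
  putrep(0, '*');   (* body never runs when n < 1 *)
  WriteLn
END repdemo.
```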
The function isupper is discussed in terms of a set, but also
a range. Where the syntax of both is different than what is in the book, I
found the following the simplest and most efficient:
PROCEDURE isupper(c: character): BOOLEAN;
VAR A, Z: INTEGER;
BEGIN
  A := ORD('A');
  Z := ORD('Z');
  RETURN (c >= A) AND (c <= Z)
END isupper;
In Software Tools in Pascal, the echo example replaces
crypt as shown in Software Tools. Kernighan indicated the
language was incapable of making a portable xor. There are a
couple of ways to do this in Modula-2. The more efficient, though less portable,
approach would be to use Modula-2's BITSET type and the
VAL function (to cast between CHAR and INTEGER). If your compiler
already has a built-in XOR function, that is better. Perhaps there is a variation on Kernighan's
c := ((NOT b) AND a) OR ((NOT a) AND b); approach but in (PIM4)
standard Modula-2.
PROCEDURE xor(x, y: INTEGER): INTEGER;
VAR c: INTEGER;
BEGIN
  c := x BXOR y; (* ADW specific function BXOR *)
  RETURN c
END xor;
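For comparison, a BITSET variation might look like the sketch below. It relies on the set operator /, which is the symmetric set difference and so behaves as exclusive-or on the underlying bits. It assumes that INTEGER and BITSET occupy the same storage size and that the compiler accepts PIM-style type transfer functions (the type name used as a function); an ISO compiler would more likely want SYSTEM.CAST instead. This is not taken from the published code.

```modula-2
MODULE xordemo;
FROM InOut IMPORT WriteInt, WriteLn;

(* xor: exclusive-or of two integers via set arithmetic.
   Assumptions: INTEGER and BITSET have the same size, and
   T(x) type transfer is accepted (PIM dialect). *)
PROCEDURE xor(x, y: INTEGER): INTEGER;
BEGIN
  RETURN INTEGER(BITSET(x) / BITSET(y))
END xor;

BEGIN
  WriteInt(xor(5, 3), 1);
  WriteLn
END xordemo.
```

Because the result is transferred back to INTEGER, the bit numbering a particular compiler uses inside BITSET should not affect the answer.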
In the context of strings, Modula-2 adds a standard open array parameter for handling any kind of ARRAY type and size (still static). This allows procedures to be written that handle varying string lengths. It also allows the compiler to optimize resource size around the parameter being passed, effectively solving the Pascal problem. However, the discussion around how to define a string is still relevant for older versions of Modula. PIM4 expects the null termination of a string; a single character string with null is assignment compatible with a single CHAR, which is a change from past versions of Modula. (The ISO standard appears to follow PIM3 in this respect.)
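As an illustration of the open array parameter, here is a sketch of a string length function. The name length is hypothetical and InOut is the PIM library module. The loop stops at either the null character or the end of the array, so it should work whether or not the compiler stores a terminator for a string that exactly fills its array.

```modula-2
MODULE lendemo;
FROM InOut IMPORT WriteCard, WriteLn;

(* length: count characters before the terminating null, if any.
   HIGH(s) gives the upper index of whatever array was passed. *)
PROCEDURE length(s: ARRAY OF CHAR): CARDINAL;
VAR i: CARDINAL;
BEGIN
  i := 0;
  (* AND is evaluated conditionally, so s[i] is only
     examined while i is still within bounds. *)
  WHILE (i <= HIGH(s)) AND (s[i] # 0C) DO
    INC(i)
  END;
  RETURN i
END length;

BEGIN
  WriteCard(length("hello"), 1);   (* five characters *)
  WriteLn
END lendemo.
```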
©2017-2020, 2022-2023 David Egan Evans.