Pascal and its Successors - Niklaus Wirth
25.09.2002
Niklaus Wirth was born in February 1934 in Winterthur,
Switzerland. He received the degree of Electronics Engineering from the
Swiss Federal Institute of Technology (ETH) in Zurich in 1959, an M.Sc.
from Laval University, Canada, in 1960, and a Ph.D. from the University
of California at Berkeley in 1963. He was Assistant Professor of Computer
Science at Stanford University (1963 - 1967), and then at the University
of Zurich. In 1968 he became Professor of Informatics at ETH Zurich. He
spent two sabbatical years at Xerox PARC in California, and he has been
retired since April 1999.
Pascal, 1968-1972
Freed from the constraining influence of a working group's consensus,
Wirth developed the language Pascal in Zurich. The basis was Algol-W
and the desire to have a language that would satisfy the requirements
of system design (compilers, operating systems, etc.). Also, there was
to be a basis of clear concepts and structures, definable axiomatically
and independently of any particular computer, as the language was to be
suitable also for teaching in an academic environment. Pascal has satisfied
these requirements; it is today one of the most widely used languages
in computer science education. The first Pascal compiler was designed
in Zurich for the CDC 6000 computer family, and it became operational
in 1970. Already in 1972 Pascal was used in introductory programming courses.
Contact:
Departement Informatik
ETH Zentrum
CH-8092 Zürich
(Switzerland)
Abstract
The programming language Pascal was designed in 1969 in the spirit of Algol
60 with a concisely defined syntax representing the paradigm of structured programming.
Seven years later, with the advent of the microcomputer, it became widely known
and was adopted in many schools and universities. In 1979 it was followed by
Modula-2 which catered for the needs of modular programming in teams. This was
achieved by the module construct and the separate compilation facility. In an
effort to reduce language complexity and to accommodate object-oriented programming,
Oberon was designed in 1988. Here we present some aspects of the evolution of
this family of programming languages.
0. Introduction
Many times I have been asked how one "invents" a programming language.
One cannot really tell, but it certainly is a matter of experience with the
subject of programming, and of careful deliberation. Sometimes I answered: "Like
one designs an airplane. One must identify a number of necessary building blocks
and materials, and then assemble and combine them properly into a functioning
whole". This answer may not be entirely satisfactory, but at least in both
cases the result either flies or crashes.
Programming languages were one of the first topics that established computing
science as a discipline with its own identity. The topic belonged neither to
mathematics nor to electrical engineering. It was Algol 60 that introduced rigor
and precision to the subject through its formal definition of syntax. A flurry
of activities began in academia to investigate language properties, to find
faults and inconsistencies, to devise powerful algorithms of syntax analysis,
and to cope with the challenges of compilers. Soon the range of application
of Algol was felt to be too narrow. A new, better language was required, perhaps
a successor to Algol. Committees were established and hot controversies raged,
some protagonists dreaming of grandiose formal systems, some thinking more modestly
of a practical improvement. It was this environment that bred Pascal.
1. Structured Programming and Pascal
Pascal was born in 1969 out of an act of liberation [0]. In more than one sense.
Confronted with the duty to teach programming, I had been faced with the dire
options of Fortran and Algol. The former did not appeal to my taste as a scientist,
the latter not to those of the practical engineer. I liberated myself from this
jail by designing Pascal, convinced that an elegant style and an effective implementation
were not mutually exclusive. I felt strongly -- and still do -- that a language
used in teaching must display some style, elegance, consistency, while at the
same time also reflecting the needs (but not necessarily bad habits) of practice.
I wanted to design a language for both my classroom and my "software factory".
The second liberation alluded to was from the design constraint imposed by committee
work. In 1966, Algol W [1] had been a compromise bowing to many divergent opinions
and requests from both an Algol committee and an Algol community. Surely, many
of them were inspiring and beneficial, but others were incompatible and hindering.
Some members had high ambitions of creating a language with novel features whose
consequences were to be the subject of further research, whereas I had been
brought up as an engineer feeling uneasy with proposals whose realization was
still the subject of speculation. I wanted to have at least a concrete idea
of how a construct was to be represented on available computers, and these,
let me add, were rather ill-suited for any feature not already present in Fortran.
The general idea dominating the design of Pascal was to provide a language appealing
to systematic thinking, mirroring conventional mathematical notation, satisfying
the needs of practical programming, and encouraging a structured approach. The
rules governing the language should be intuitive and simple, and freely combinable.
For example, if x+y stands for an expression, x+y should be usable as a
subexpression, in assignments, as a procedure parameter, or as an index. Likewise,
if a widely established convention interprets x-y-z to mean (x-y)-z, we should
not redefine this to denote x-(y-z). Or if x=y is used for centuries to denote
equality of x and y, we should refrain from the arrogance of replacing it by
x==y. Clearly, Pascal was to build up on the notational grounds established
by mathematics and Algol. Pascal and its successors were therefore called Algol-like.
Today, it is hard to imagine the circumstances prevailing in the 1960s. We must
recall that the computing community was strictly split into two professional
camps. The scientists and engineers used Fortran for programming their large-scale,
word-oriented, binary computers, whereas the business community used Cobol for
their smaller, character-oriented, decimal machines. System programmers were
labouring within computer companies using proprietary machine-code assemblers.
There were attempts to unite the two worlds, such as the highly innovative Burroughs
B-5000 computer, or IBM's programming language PL/I. Both were ill-fated and
devoured considerable budgets. Pascal was another such attempt, although less
ambitious and without budget or industrial support. It applied the idea of recursively
defined structures not only to executable statements, but also to data types.
As elements, it adopted arrays (vectors, matrices) from Fortran and Algol, as
well as records and files from Cobol. It allowed them to be freely combined
and nested.
The other fact about the 1960s that is difficult to imagine today is the scarcity
of computing resources. Computers with more than 8K of memory words and less
than 10 µs for the execution of an instruction were called super-computers. No
wonder it was mandatory for the compiler of a new language to generate at least
equally dense and efficient code as its Fortran competitor. Every instruction
counted, and, for example, generating sophisticated subroutine calls catering
to hardly ever used recursion was considered an academic pastime. Index checking
at run-time was judged to be a superfluous luxury. In this context, it was hard
if not hopeless to compete against highly optimized Fortran compilers.
Yet, computing power grew with each year, and with it the demands on software
and on programmers. Repeated failures and blunders of industrial products revealed
the inherent difficulties of intellectually mastering the ever increasing complexity
of the new artefacts. The only solution lay in structuring programs, to let
the programmer ignore the internal details of the pieces when assembling them
into a larger whole. This school of thought was called Structured Programming
[2], and Pascal was designed explicitly to support this discipline. Its foundations
reached far deeper than simply "programming without go to statements"
as some people believed. It is more closely related to the top-down approach
to problem solving.
Besides structured statements, the concept of data types characterized Pascal
profoundly. It implies that every object, be it a constant, a variable, a function,
or a parameter has a type. Data typing introduces redundancy, and this redundancy
can be used to detect inconsistencies, that is, errors. If the type of all objects
can be determined by merely reading the program text, that is, without executing
the program, then the type is called static, and checking can be performed by
the compiler. Surely errors detected by the compiler are harmless and cheap
compared to those detected during program execution in the field, by the customer.
Thus static typing became an important concept in software engineering, the
discipline emerging in the 1970s coping with the construction of large software
complexes.
A particularly successful concept was the integration of pointers into static
typing as suggested by Hoare [3] and adopted in Pascal. The simple idea is to
associate a fixed type not only with every data object, but also with every
pointer, such that a pointer variable can at any time only refer to an object
of the type to which it is bound (or to none at all). Programming with pointers,
then called list processing, notoriously fraught with pitfalls, now became as
safe as programming without pointers.
Yet, Pascal also suffered from certain deficiencies, more or less significant
depending on personal perception and application. One of them had its origin
in a too dogmatic interpretation of static typing, requiring that the type of
every procedure parameter be known at compile-time. Since this included index
bounds in the case of array types, the frequently convenient dynamic arrays
were excluded. In hindsight, this rigidity was silly and kept many Algolites
from adopting Pascal. Arrays are typically passed by reference, and for dynamic
arrays only the array bounds must be added to this information. The limited
additional complexity of the compiler would certainly have been outweighed by
the gained language flexibility.
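The remark that a dynamic array parameter needs only its bounds in addition to the reference can be sketched as follows. This is a minimal illustration in Python rather than Pascal; the names ArrayDescriptor and total are hypothetical, and the list storage merely stands in for the memory reached through the reference.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ArrayDescriptor:
    """What might be passed for a dynamic array parameter (an assumption)."""
    storage: List[int]  # stands in for the reference to the first element
    low: int            # lower index bound, known only at run time
    high: int           # upper index bound, known only at run time

    def get(self, index: int) -> int:
        # Index check against the bounds carried in the descriptor.
        if not self.low <= index <= self.high:
            raise IndexError(f"index {index} outside {self.low}..{self.high}")
        return self.storage[index - self.low]


def total(a: ArrayDescriptor) -> int:
    """A single procedure serves arrays of any bounds."""
    return sum(a.get(i) for i in range(a.low, a.high + 1))


small = ArrayDescriptor([1, 2, 3], low=1, high=3)
large = ArrayDescriptor([10, 20, 30, 40, 50], low=0, high=4)
print(total(small), total(large))
```

The element type stays fixed and statically known; only the two bounds travel with the reference, which is the limited extra cost the text refers to.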
Certain other deficiencies were due to the author's lack of courage to throw
some rules inherited from Algol overboard, for fear of antagonizing influential
Algol programmers. The prime entry in this list is the famed go to statement,
retained although, in principle, always replaceable by an if, while, or repeat
construct. Another retained mistake was the lack of full type specification
of parameters of formal procedures, through which, in principle, the entire
type system could be undermined. This is illustrated by the following condensed,
artificial example. Incidentally, it may also serve as an example of programming
puzzles popular at the time.
  PROCEDURE P(b: BOOLEAN; q: PROCEDURE);
    VAR i: INTEGER;
    PROCEDURE Q; BEGIN i := i+1 END Q;
  BEGIN i := 0;
    IF b THEN P(FALSE, Q) ELSE q;
    Print(i)
  END P
The puzzle: Which sequence of numbers will be printed by the call P(TRUE, P)?
Note that no parameter types need be specified for q!
We are here confronted with a case where a certain combination of concepts
leads to difficulties in interpretation, although each concept in isolation
is harmless and well-defined. Here it is the combination of nested procedures,
local scopes, and recursion that causes the problem. It is one of the outstanding
challenges of language design to exclude unexpected effects.
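The puzzle can be traced mechanically. The following is a minimal sketch in Python rather than Pascal, assuming that Print appends to a list (named printed here purely for illustration) and that Python closures behave like Pascal's nested procedures, so that each Q refers to the i of the activation of P that created it.

```python
from typing import Callable, List

printed: List[int] = []  # collects what Print(i) would output (an assumption)


def P(b: bool, q: Callable[..., None]) -> None:
    i = 0

    def Q() -> None:
        # Q increments the i of the activation of P in which it was created.
        nonlocal i
        i = i + 1

    if b:
        P(False, Q)  # pass this activation's Q down to a new activation
    else:
        q()          # call whatever was passed in, without arguments
    printed.append(i)


P(True, P)
print(printed)  # [0, 1]
```

Under these assumptions the inner activation prints 0 and the outer one prints 1, because the Q handed down increments the i of the first activation. The parameter q bound to P itself is never called here; had it been called without arguments, nothing in the unspecified parameter type of q would have flagged the mismatch.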
Last but not least, Pascal adopted from Algol a few syntactic ambiguities; a
deadly sin. I refer to the lack of an explicit closing symbol for nestable constructs.
The prime example is the conditional statement. As a consequence, the nested
if statement
IF b0 THEN IF b1 THEN S0 ELSE S1
can be interpreted as
IF b0 THEN [IF b1 THEN S0 ELSE S1]
or alternatively as
IF b0 THEN [IF b1 THEN S0] ELSE S1
This case demonstrates that one should not commit a mistake simply because everybody
else does, particularly if there exists a known, elegant solution to eliminate
the mistake. A thorough account of the development of Pascal is contained in
[4].
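The ambiguity, and its removal by an explicit closing symbol, can be checked by enumerating every parse of the statement. The following is a small sketch in Python under stated assumptions: statements and conditions are single tokens, the only construct is IF, and the function names parses, finish and all_parses are made up for this illustration.

```python
from typing import Any, Iterator, List, Tuple

KEYWORDS = {"IF", "THEN", "ELSE", "END"}


def finish(tokens: List[str], tree: Any, pos: int, closed: bool) -> Iterator[Tuple[Any, int]]:
    """Complete an IF statement; with closed=True an END symbol is required."""
    if not closed:
        yield tree, pos
    elif pos < len(tokens) and tokens[pos] == "END":
        yield tree, pos + 1


def parses(tokens: List[str], pos: int, closed: bool) -> Iterator[Tuple[Any, int]]:
    """Yield every (tree, next_pos) for a statement starting at tokens[pos].

    closed=False: IF c THEN s [ELSE s]        (no closing symbol)
    closed=True:  IF c THEN s [ELSE s] END    (explicit closing symbol)
    """
    if pos >= len(tokens):
        return
    tok = tokens[pos]
    if tok != "IF":
        if tok not in KEYWORDS:
            yield tok, pos + 1  # a simple statement such as S0
        return
    if pos + 2 >= len(tokens) or tokens[pos + 2] != "THEN":
        return
    cond = tokens[pos + 1]
    for then_part, p in parses(tokens, pos + 3, closed):
        # Alternative 1: the statement has no ELSE part.
        yield from finish(tokens, ("IF", cond, then_part), p, closed)
        # Alternative 2: an ELSE part follows.
        if p < len(tokens) and tokens[p] == "ELSE":
            for else_part, p2 in parses(tokens, p + 1, closed):
                yield from finish(tokens, ("IF", cond, then_part, else_part), p2, closed)


def all_parses(text: str, closed: bool) -> List[Any]:
    """Return the trees of all parses that consume the whole text."""
    tokens = text.split()
    return [tree for tree, end in parses(tokens, 0, closed) if end == len(tokens)]


print(len(all_parses("IF b0 THEN IF b1 THEN S0 ELSE S1", closed=False)))          # 2
print(len(all_parses("IF b0 THEN IF b1 THEN S0 ELSE S1 END END", closed=True)))   # 1
print(len(all_parses("IF b0 THEN IF b1 THEN S0 END ELSE S1 END", closed=True)))   # 1
```

Without a closing symbol the same text yields two trees; with END the programmer writes one of two different texts, and each has exactly one reading.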
2. Modular Programming and Modula-2
With various defects clearly identified and new challenges in programming emerging,
time seemed ripe for a fresh start, for a successor language. The two foremost
novel challenges were multiprogramming and information hiding. For me personally,
a third, quite practical challenge became an ambition: To create a language
adequate for describing entire systems, from storage allocator to document editor,
from process manager to compiler, and from display driver to graphics editor.
I perceived that many problems in software development stemmed from the mixing
of parts written in different languages. The challenge became real within our
project to design and build the workstation Lilith in 1978 [6]. Its precursor
was Xerox PARC's pioneering workstation Alto [5]. The Alto's software was mostly
written in Mesa; Lilith's software entirely in Modula-2. It would have been
prohibitive to implement more than one language. Evidently, Modula was born
out of an act of necessity [7].
The cornerstone of Modula-2 was the module construct. Whereas Pascal had served
to build monolithic programs, Modula-2 was suitable for systems consisting of
a hierarchy of units with properly defined interfaces. Such a unit was called
a module, and later a package in Ada. In short, a module is like a Pascal program
with the addition of an explicit interface specification to other modules. This
is obtained as follows: Modules are described by two distinct texts, a definition
part and an implementation part. In the former all objects are defined which
are visible to other modules, typically types and procedure signatures. They
are said to be exported. The latter part contains all local, hidden objects,
and the bodies of procedures, i.e. their implementations. Hence the term information
hiding. The heading contains lists of identifiers to be imported from other
modules. A small example follows:
  DEFINITION MODULE Files;
    TYPE File; (*opaque type*)
      Rider = RECORD eof: BOOLEAN END ; (*other fields hidden*)
    PROCEDURE Set(VAR r: Rider; f: File; pos: INTEGER);
    PROCEDURE Read(VAR r: Rider; VAR ch: CHAR);
    PROCEDURE Write(VAR r: Rider; ch: CHAR);
    PROCEDURE Length(f: File): INTEGER;
  END Files.
This key feature catered for the urgent demands for programming in teams. Now
it became possible to determine jointly a modular decomposition of the task
and to agree on the interfaces of the planned system. Thereafter, the team members
could proceed independently in implementing the parts assigned to them. This
style is called modular programming. The concept of module arose earlier in
work by Parnas and, in conjunction with multiprogramming, by Hoare and Brinch
Hansen, where the module construct was called a monitor [8, 9]. The module was
also present in a concrete form in Mesa, which in Modula was simplified and
generalized.
The module construct would, however, have remained of mostly academic interest
only, were it not for the technique of separate compilation, which was from
its inception combined with the module. By separate compilation we understand
that (1) full type checking is performed by the compiler not only within a module,
but also across module interfaces, and (2) that compatibility (or version) checking
between modules to be joined is achieved by a simple key comparison when the
modules are linked and loaded. We refrain from expounding technical details,
but emphasize that this is a crucial requirement for the design of complex systems,
yet still poorly handled by most systems of commercial provenience.
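The version check by key comparison can be sketched as follows. The text does not say how keys are derived, so everything concrete here is an assumption made for illustration: the key is a truncated SHA-256 hash of the interface text, and the names interface_key, CompiledModule and link are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict


def interface_key(definition_text: str) -> str:
    """Derive a key from a module's interface text (hashing is an assumption)."""
    return hashlib.sha256(definition_text.encode("utf-8")).hexdigest()[:16]


@dataclass
class CompiledModule:
    name: str
    key: str  # key of this module's own interface
    # For each imported module: the key seen when this module was compiled.
    imports: Dict[str, str] = field(default_factory=dict)


def link(modules: Dict[str, CompiledModule]) -> None:
    """Refuse to join modules whose recorded import keys no longer match."""
    for client in modules.values():
        for server, expected in client.imports.items():
            if modules[server].key != expected:
                raise RuntimeError(
                    f"{client.name} was compiled against another version of {server}"
                )


OLD = "PROCEDURE Length(f: File): INTEGER;"
NEW = "PROCEDURE Length(f: File): LONGINT;"

files = CompiledModule("Files", interface_key(OLD))
editor = CompiledModule("Editor", interface_key("PROCEDURE Open;"), {"Files": files.key})
link({"Files": files, "Editor": editor})  # keys agree, linking proceeds

files = CompiledModule("Files", interface_key(NEW))  # interface changed, recompiled
try:
    link({"Files": files, "Editor": editor})
except RuntimeError as err:
    print(err)
```

The point of the design is that the expensive work, full type checking across interfaces, happens in the compiler, while the linker and loader only compare one key per import.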
Besides the successful feature of the module with separate compilation, the
language also had some drawbacks. Surely, the evident deficiencies of Pascal
had been mended. The syntax was now unambiguous, type specifications complete,
and the set of basic data types adequately comprehensive. But as a result, the
language, and with it the compiler, had become relatively large and bulky, although
still orders of magnitude less so than comparable commercial ventures. The goal
of making the language powerful enough to describe entire systems was achieved
by introducing certain low-level features, mostly for accessing particular machine
resources (such as I/O device registers) and for breaching, overriding the rigid
type system. Such facilities, for example type casting, are inherently contrary
to the notion of abstraction by high-level language, and should be avoided.
They were called loopholes, because they allow one to break the rules imposed by
the abstraction. But sometimes these rules appear too rigid, and use of a
loophole becomes unavoidable. The dilemma was resolved through the module facility,
which would allow one to confine the use of such "naughty" tricks to specific,
low-level server modules. It turned out that this was a naive view of the nature
of programmers. The lesson: If you introduce a feature that can be abused, then
it will be abused, and frequently so!
3. Object-oriented Programming and Oberon
The advent of the personal computer around 1980 dramatically changed the way in
which computers were used. Direct, immediate interaction replaced remote access
and batch processing. User interfaces became an important issue. They were shaped
by the novel mouse and the high-resolution display, replacing the screens of 24
lines by 80 characters. They established the paradigm of windows and multi-tasking.
It had been pioneered by the Xerox Alto workstation, and in particular the Smalltalk
project [10]. Along with it came the object-oriented style of programming. Object-oriented
design emerged from the specialized subject of discrete event simulation and
its language Simula [11], whose authors Dahl and Nygaard had realised that its
concepts had a scope far beyond simulation. Some of the proponents of object-oriented
programming even suggested that all of programming should be converted to the
new view of the world.
We felt that a revolution was undesirable to cure the lamented ills of the software
profession, and we considered evolution as the wiser approach. Tempted to design
a version of Modula stripped down to essentials, we also wanted to identify
those features that were indispensable to encompass object-orientation. Our
findings were revealing: A single feature would suffice, all other ingredients
were already present in Modula. The one feature to be added had to allow the
construction of data type hierarchies, called sub-classing in Smalltalk. Our
own term was type extension: The new type adds attributes to those of the old
type. Type extension had the welcome side effect of practically eliminating
all needs for loopholes.
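Type extension can be illustrated with a minimal sketch. It is written in Python rather than Oberon, with subclassing standing in for type extension; the types Figure and Circle and the procedures move and area_hint are hypothetical examples, not taken from the Oberon report.

```python
from dataclasses import dataclass


@dataclass
class Figure:
    """The base type."""
    x: int
    y: int


@dataclass
class Circle(Figure):
    """An extension: it has the attributes of Figure and adds radius."""
    radius: int


def move(f: Figure, dx: int, dy: int) -> None:
    """Written for Figure, yet applicable to every extension of Figure."""
    f.x += dx
    f.y += dy


def area_hint(f: Figure) -> str:
    # A type test replaces the unchecked cast that a loophole would require.
    if isinstance(f, Circle):
        return f"circle of radius {f.radius}"
    return "plain figure"


c = Circle(0, 0, 5)
move(c, 2, 3)
print(c, area_hint(c), area_hint(Figure(1, 1)))
```

A procedure that accepts the base type works unchanged on the extended type, and where the extended attributes are needed a checked type test is enough.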
The absence of loopholes is the acid test for the quality of a language. After
all, a language constitutes an abstraction, a formal system, determined by a
set of consistent rules and axioms. Loopholes serve to break these rules and
can be understood only in terms of another, underlying system, an implementation.
The principal purpose of a language, however, is to shield the programmer from
implementation details, and to let him think exclusively in terms of the higher-level
abstraction. Hence, a language should be fully defined without reference to
any implementation or computer.
The language Oberon was born out of the ambition to simplify language to the
essentials. The result turned out to be a considerably more powerful and more
elegant language than its predecessors. The defining report of Pascal required
30 pages, that of Modula grew to 45 pages, Oberon's could do with 16 [12]. Not
surprisingly, implementations profited substantially from this achievement after
25 years of experience in language design, from a continuous evolution.
One of the simplifications was the reunification of the definition and implementation
parts of modules. An Oberon module is again defined by a single text. Its heading
contains a single list of server modules (rather than of individual, imported
objects). Declarations of objects that are to be accessible in client modules
are specially marked. Unmarked, local objects remain hidden. From a didactic
point of view this reunification may be regretted, because ideally definition
parts are designed among team members and form contracts between them, whereas
implementation parts can thereafter be designed by individual members without
regard for the others, as long as the definition part remains untouched. However,
the proliferation of files and the burden to keep corresponding parts consistent
was considered a drawback. Moreover, reunification eliminated the compiler's
duty to check for consistency between definition and implementation parts. Also,
a definition part can readily be extracted from the module text by a simple
tool.
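How simple such an extraction tool can be is suggested by the following sketch in Python. It rests on loud assumptions: exported objects are marked with an asterisk after their identifier, only procedure headings written on a single line are handled, and the sample module text and the name extract_definition are invented for illustration. A real tool would reuse the compiler's parser and also cover constants, types, variables and record fields.

```python
import re
from typing import List

# A procedure heading whose identifier carries the export mark "*".
HEADING = re.compile(r"^\s*PROCEDURE\s+([A-Za-z][A-Za-z0-9]*)\*\s*(.*)$")


def extract_definition(module_text: str) -> List[str]:
    """Return the headings of marked procedures, with the mark removed."""
    exported = []
    for line in module_text.splitlines():
        m = HEADING.match(line)
        if m:
            exported.append(f"PROCEDURE {m.group(1)}{m.group(2)}")
    return exported


# Hypothetical module text: Length is marked for export, Flush is not.
SAMPLE = """MODULE Files;
  PROCEDURE Length*(f: File): INTEGER;
  BEGIN RETURN f.len
  END Length;
  PROCEDURE Flush(f: File);
  BEGIN
  END Flush;
END Files."""

print(extract_definition(SAMPLE))
```

Only the marked heading survives, which is the information a client of the module needs.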
4. Conclusions and Outlook
In his article about Oberon, M. Franz wrote in [13]: "The world will remember
Niklaus Wirth primarily as 'that language guy' and associate his name with Pascal."
His observation is accurate; also the invitation to speak at this Conference
hinted that I should concentrate on Pascal. My disobedient digression stems
from my conviction that its successors Modula and Oberon are much more mature
and refined designs than Pascal. They form a family, and each descendant profited
from experiences with its ancestors. In the end, the time span was 25 years.
Why, then, did Pascal capture all the attention, while Modula and Oberon got so
little? Again I quote Franz: "This was, of course, partially of Wirth's
own making". He continues: "He refrained from ... names such as Pascal-2,
Pascal+, Pascal 2000, but instead opted for Modula and Oberon". Again Franz
is right. In my defense I can plead that Pascal-2 and Pascal+ had already been
taken by others for their own extensions of Pascal, and that I felt that these
names would have been misleading for languages that were, although similar,
syntactically distinct from Pascal. I emphasized progress rather than continuity,
evidently a poor marketing strategy.
But of course the naming is far from the whole story. For one thing, we were
not sufficiently active -- today we would say aggressive -- in making our developments
widely known. Instead of asking what went wrong with Modula and Oberon, however,
let us rather ask what went right with Pascal. In my own perception, the following
factors were decisive:
1. Pascal, incorporating the concepts of structured programming, was sufficiently
different from Fortran, and sufficiently progressive, to make a switch worthwhile. Particularly
so in America, where Algol had remained virtually unknown.
2. In the early 1970s, an organization of users (Pascal User Group PUG) was
formed and helped to make Pascal widely known and available. It also published
a Newsletter.
3. Pascal was ported just in time for the first microcomputers (UCSD) [14],
and thereby reached a large population of newcomers unconstrained by engrained
habits and legacy code.
4. Pascal was picked up by start-up companies. Borland's Pascal implementation
was the first compiler to be available for less than $50, turning it into a
household article.
5. UCSD as well as Borland properly integrated the compiler into a complete
development tool, including a program editor, a file system, and a debugging
aid. This made Pascal highly attractive to schools and to beginners. It changed
the manner in which programs were "written". A fast write-compile-test-correct
cycle and interactivity were the new attractions.
6. Shortly after the initial spread, an ever growing number of textbooks on
programming in Pascal appeared. They were as important as the compiler for Pascal's
popularity in schools and universities. Textbooks and software entered a symbiotic
partnership.
Perhaps it is worth observing that this chain reaction started around 1977,
fully seven years after Pascal had been published and implemented on a CDC mainframe
computer. Meanwhile, Pascal had been ported to numerous other large computers,
but remained largely within universities. This porting effort was significantly
facilitated by our project resulting in the Pascal P-compiler generating P-code,
the predecessor of the later M-code (for Modula) and Java byte-code.
In contrast to Pascal, Modula and Oberon did not appear at a time when computing
reached new segments of the population. The module concept was not perceived
in teaching as sufficiently significant to warrant a change to a new, albeit
similar language. Textbooks had been selected, investments in learning had
been made, time was not ripe for a change. Industry did not exactly embrace
Modula either, with a few exceptions, mainly in Britain. A more palatable solution
was to extend Pascal, retaining upward compatibility and old shortcomings. And
there appeared competition in the form of C++ and Ada with powerful industrial
backing.
Oberon fared even worse. It was virtually ignored by industry. This is astounding,
as not only the elegant and powerful language was presented in 1988, but also
a compact and fast compiler in 1990, along with a modern, flexible development
environment for workstations, complete with window system, network, document
and graphics editor, neatly fitting into about 200 Kbytes of memory, the compiler
alone taking less than 50 Kbytes. The entire system was fully described in a
single, comprehensive book, including its source code [15]. It carried the proof
that such a system could be built using only a small fraction of manpower typically
allotted to such endeavors, if proper methods and tools were employed.
One is tempted to rationalize this history with recent, deep changes in the
computing field. Computing resources are no longer at a premium, memory is counted
in megabytes rather than kilobytes as 15 years ago, instruction cycles in nanoseconds
instead of microseconds, and clock frequencies in gigahertz instead of megahertz.
The incentive to economize has dwindled alarmingly. The only scarce resource
is manpower, competent manpower. Industry has difficulties even in finding good
C programmers; the difficulties of finding Oberon programmers are insurmountable.
So how could one reasonably expect companies to adopt Oberon?
We recognize a deadlock: The adoption of new tools and methods is impracticable,
yet the retention of the old ones implies stagnation. Are we therefore condemned
to eternally produce an ever growing mountain of software of ever growing complexity,
software that nobody fully understands, although everybody is well aware of
its defects and deficiencies? In order to avert this highly unpleasant vision
of the future, some fresh starts will have to be undertaken now and then. They
require the courage to discard and abandon, to select simplicity and transparency
as design goals rather than complexity and obscure sophistication.
In the field of programming languages two fresh starts have been attempted recently.
I refer to Java and C#. Both tend to restrict rather than augment the number
of features, trying to suggest a certain programming discipline. Hence they
move in the right direction. This is encouraging. But there remains still a
long way to reach Pascal, and longer even to Oberon.
References
0. N. Wirth. The Programming Language Pascal. Acta Informatica, 1, 35 - 63 (1971)
1. N. Wirth and C.A.R. Hoare. A Contribution to the development of ALGOL.
Comm. ACM 9, 6, 413 - 432 (June 1966)
2. E. W. Dijkstra. Notes on Structured Programming.
In Structured Programming. O.-J. Dahl, E. W. Dijkstra and C.A.R. Hoare, Acad.
Press, 1972.
3. C.A.R. Hoare. Notes on Data Structuring.
In Structured Programming. O.-J. Dahl, E. W. Dijkstra and C.A.R. Hoare, Acad.
Press, 1972, and
C.A.R. Hoare. Record Handling.
In F. Genuys, Ed., Programming Languages. Acad. Press, 1968.
4. N. Wirth. Recollections about the Development of Pascal. In T.J. Bergin,
R.G. Gibson.
History of Programming Languages II, Addison-Wesley 1996, ISBN 0-201-89502-1.
5. C.P. Thacker et al. Alto: A personal computer. Xerox PARC, Tech. Report CSL-79-11.
6. N. Wirth. The personal computer Lilith.
Proc. 5th International Conf. on Software Engineering, IEEE Computer Society
Press, 1981.
7. N. Wirth. Programming in Modula-2. Springer, 1974. ISBN 0-387-50150-9.
8. C.A.R. Hoare. An Operating System Structuring Concept.
Comm. ACM, 17, 10 (Oct. 1974), 548-557.
9. P. Brinch Hansen. Operating System Principles.
Prentice-Hall 1973, ISBN 0-13-637843-9.
10. A. Goldberg and A. Kay, Eds. Smalltalk-72 Instruction Manual.
Tech. Report, Xerox PARC, March 1976.
11. Ole-Johan Dahl and C.A.R. Hoare. Hierarchical Program Structures.
In Structured Programming. O.-J. Dahl, E. W. Dijkstra and C.A.R. Hoare, Acad.
Press, 1972.
12. N. Wirth. The Programming Language Oberon.
Software, Practice and Experience, 1988.
13. M. Franz. Oberon: The overlooked Jewel.
In L. Böszörményi, J. Gutknecht, G. Pomberger. The School of Niklaus Wirth.
ISBN 1-55860-723-4 and 3-932588-85-1.
14. UCSD Pascal Users Manual. SofTech, 1981.
15. N. Wirth and J. Gutknecht. Project Oberon.
Addison-Wesley, 1992. ISBN 0-201-54428-8.
Paper to be presented at sd&m Conference on Computer Pioneers,
Bonn, 28-29 June 2001