SwissDelphiCenter.ch : Pascal and its Successors

Pascal and its Successors - Niklaus Wirth
25.09.2002

Niklaus Wirth was born in February 1934 in Winterthur, Switzerland. He received a degree in Electronics Engineering from the Swiss Federal Institute of Technology (ETH) in Zurich in 1959, an M.Sc. from Laval University, Canada, in 1960, and a Ph.D. from the University of California at Berkeley in 1963. He was Assistant Professor of Computer Science at Stanford University (1963 - 1967), and then at the University of Zurich. In 1968 he became Professor of Informatics at ETH Zurich. He spent two sabbatical years at Xerox PARC in California, and he has been retired since April 1999.

Pascal, 1968-1972
Freed from the constraining influence of a working group's consensus, Wirth developed the language Pascal in Zurich. The basis was Algol W and the desire to have a language that would satisfy the requirements of system design (compilers, operating systems, etc.). Also, there was to be a basis of clear concepts and structures, definable axiomatically and independently of any particular computer, as the language was to be suitable also for teaching in an academic environment. Pascal has satisfied these requirements; it is today one of the most widely used languages in computer science education. The first Pascal compiler was designed in Zurich for the CDC 6000 computer family, and it became operational in 1970. As early as 1972 Pascal was used in introductory programming courses.

Contact:
Departement Informatik
ETH Zentrum
CH-8092 Zürich
(Switzerland)

Abstract
The programming language Pascal was designed in 1969 in the spirit of Algol 60 with a concisely defined syntax representing the paradigm of structured programming. Seven years later, with the advent of the microcomputer, it became widely known and was adopted in many schools and universities. In 1979 it was followed by Modula-2, which catered for the needs of modular programming in teams. This was achieved by the module construct and the separate compilation facility. In an effort to reduce language complexity and to accommodate object-oriented programming, Oberon was designed in 1988. Here we present some aspects of the evolution of this family of programming languages.

0. Introduction
Many times I have been asked how one "invents" a programming language. One cannot really tell, but it certainly is a matter of experience with the subject of programming, and of careful deliberation. Sometimes I answered: "Like one designs an airplane. One must identify a number of necessary building blocks and materials, and then assemble and combine them properly into a functioning whole". This answer may not be entirely satisfactory, but at least in both cases the result either flies or crashes.

Programming languages were one of the first topics that established computing science as a discipline with its own identity. The topic belonged neither to mathematics nor to electrical engineering. It was Algol 60 that introduced rigor and precision to the subject through its formal definition of syntax. A flurry of activities began in academia to investigate language properties, to find faults and inconsistencies, to devise powerful algorithms of syntax analysis, and to cope with the challenges of compilers. Soon the range of application of Algol was felt to be too narrow. A new, better language was required, perhaps a successor to Algol. Committees were established and hot controversies raged, some protagonists dreaming of grandiose formal systems, some thinking more modestly of a practical improvement. It was this environment that bred Pascal.

1. Structured Programming and Pascal
Pascal was born in 1969 out of an act of liberation [0]. In more than one sense. Confronted with the duty to teach programming, I had been faced with the dire options of Fortran and Algol. The former did not appeal to my taste as a scientist, the latter not to that of the practical engineer. I liberated myself from this jail by designing Pascal, convinced that an elegant style and an effective implementation were not mutually exclusive. I felt strongly -- and still do -- that a language used in teaching must display some style, elegance, consistency, while at the same time also reflecting the needs (but not necessarily bad habits) of practice. I wanted to design a language for both my classroom and my "software factory".

The second liberation alluded to was from the design constraint imposed by committee work. In 1966, Algol W [1] had been a compromise bowing to many divergent opinions and requests from both an Algol committee and an Algol community. Surely, many of them were inspiring and beneficial, but others were incompatible and hindering. Some members had high ambitions of creating a language with novel features whose consequences were to be the subject of further research, whereas I had been brought up as an engineer feeling uneasy with proposals whose realization was still the subject of speculation. I wanted to have at least a concrete idea of how a construct was to be represented on available computers, and these, let me add, were rather ill-suited for any feature not already present in Fortran.

The general idea dominating the design of Pascal was to provide a language appealing to systematic thinking, mirroring conventional mathematical notation, satisfying the needs of practical programming, and encouraging a structured approach. The rules governing the language should be intuitive and simple, and freely combinable. For example, if x+y stands for an expression, x+y should be usable as a subexpression, in assignments, as a procedure parameter, or as an index. Likewise, if a widely established convention interprets x-y-z to mean (x-y)-z, we should not redefine this to denote x-(y-z). Or if x=y has been used for centuries to denote equality of x and y, we should refrain from the arrogance of replacing it by x==y. Clearly, Pascal was to build upon the notational grounds established by mathematics and Algol. Pascal and its successors were therefore called Algol-like.

Today, it is hard to imagine the circumstances prevailing in the 1960s. We must recall that the computing community was strictly split into two professional camps. The scientists and engineers used Fortran for programming their large-scale, word-oriented, binary computers, whereas the business community used Cobol for their smaller, character-oriented, decimal machines. System programmers were labouring within computer companies using proprietary machine-code assemblers. There were attempts to unite the two worlds, such as the highly innovative Burroughs B-5000 computer, or IBM's programming language PL/I. Both were ill-fated and devoured considerable budgets. Pascal was another such attempt, although less ambitious and without budget or industrial support. It applied the idea of recursively defined structures not only to executable statements, but also to data types. As elements, it adopted arrays (vectors, matrices) from Fortran and Algol, as well as records and files from Cobol. It allowed them to be freely combined and nested.

The other fact about the 1960s that is difficult to imagine today is the scarcity of computing resources. Computers with more than 8K words of memory and less than 10 µs for the execution of an instruction were called super-computers. No wonder it was mandatory for the compiler of a new language to generate at least equally dense and efficient code as its Fortran competitor. Every instruction counted, and, for example, generating sophisticated subroutine calls catering to hardly ever used recursion was considered an academic pastime. Index checking at run-time was judged to be a superfluous luxury. In this context, it was hard if not hopeless to compete against highly optimized Fortran compilers.

Yet, computing power grew with each year, and with it the demands on software and on programmers. Repeated failures and blunders of industrial products revealed the inherent difficulties of intellectually mastering the ever increasing complexity of the new artefacts. The only solution lay in structuring programs, to let the programmer ignore the internal details of the pieces when assembling them into a larger whole. This school of thought was called Structured Programming [2], and Pascal was designed explicitly to support this discipline. Its foundations reached far deeper than simply "programming without go to statements" as some people believed. It is more closely related to the top-down approach to problem solving.

Besides structured statements, the concept of data types characterized Pascal profoundly. It implies that every object, be it a constant, a variable, a function, or a parameter has a type. Data typing introduces redundancy, and this redundancy can be used to detect inconsistencies, that is, errors. If the type of all objects can be determined by merely reading the program text, that is, without executing the program, then the type is called static, and checking can be performed by the compiler. Surely errors detected by the compiler are harmless and cheap compared to those detected during program execution in the field, by the customer. Thus static typing became an important concept in software engineering, the discipline emerging in the 1970s coping with the construction of large software complexes.

A particularly successful concept was the integration of pointers into static typing as suggested by Hoare [3] and adopted in Pascal. The simple idea is to associate a fixed type not only with every data object, but also with every pointer, such that a pointer variable can at any time only refer to an object of the type to which it is bound (or to none at all). Programming with pointers, then called list processing, notoriously fraught with pitfalls, now became as safe as programming without pointers.

Yet, Pascal also suffered from certain deficiencies, more or less significant depending on personal perception and application. One of them had its origin in a too dogmatic interpretation of static typing, requiring that the type of every procedure parameter be known at compile-time. Since this included index bounds in the case of array types, the frequently convenient dynamic arrays were excluded. In hindsight, this rigidity was silly and kept many Algolites from adopting Pascal. Arrays are typically passed by reference, and for dynamic arrays only the array bounds must be added to this information. The limited additional complexity of the compiler would certainly have been outweighed by the gained language flexibility.

Certain other deficiencies were due to the author's lack of courage to throw some rules inherited from Algol overboard, in fear of antagonizing influential Algol programmers. The prime entry in this list is the famed go to statement, retained although, in principle, always replaceable by an if, while, or repeat construct. Another retained mistake was the lack of full type specification of parameters of formal procedures, through which, in principle, the entire type system could be undermined. This is illustrated by the following condensed, artificial example. Incidentally, it may also serve as an example of programming puzzles popular at the time.

PROCEDURE P(b: BOOLEAN; q: PROCEDURE);
  VAR i: INTEGER;
  PROCEDURE Q; BEGIN i := i+1 END Q;
BEGIN i := 0;
  IF b THEN P(FALSE, Q) ELSE q;
  Print(i)
END P

The puzzle: Which sequence of numbers will be printed by the call P(TRUE, P)? Note that no parameter types need be specified for q!

We are here confronted with a case where a certain combination of concepts leads to difficulties in interpretation, although each concept in isolation is harmless and well-defined. Here it is the combination of nested procedures, local scopes, and recursion that causes the problem. It is one of the outstanding challenges of language design to exclude unexpected effects.
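
For readers who wish to trace the puzzle, the following is a minimal sketch in Python rather than Pascal. It assumes static (lexical) scoping, so that each Q stays bound to the variable i of the activation of P that created it. The list parameter printed is an addition of this sketch and merely stands in for Print.

```python
def P(b, q, printed):
    i = 0                      # VAR i: INTEGER; a fresh i per activation of P

    def Q():                   # nested procedure; refers to the enclosing i
        nonlocal i
        i = i + 1

    if b:
        P(False, Q, printed)   # hand the local Q to a new activation of P
    else:
        q()                    # call whatever procedure was received
    printed.append(i)          # stands in for Print(i)


printed = []
P(True, P, printed)            # the call P(TRUE, P) of the puzzle
print(printed)
```

Under this assumption the inner activation prints 0 and the outer one prints 1, because the call of q in the inner activation increments the i of the outer activation. Note also that the outermost call passes P itself, a procedure with parameters, where a parameterless procedure is expected; nothing checks this, which is the gap in the type system described above.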

Last but not least, Pascal adopted from Algol a few syntactic ambiguities; a deadly sin. I refer to the lack of an explicit closing symbol for nestable constructs. The prime example is the conditional statement. As a consequence, the nested if statement

IF b0 THEN IF b1 THEN S0 ELSE S1
can be interpreted as
IF b0 THEN [IF b1 THEN S0 ELSE S1]
or alternatively as
IF b0 THEN [IF b1 THEN S0] ELSE S1

This case demonstrates that one should not commit a mistake simply because everybody else does, particularly if there exists a known, elegant solution to eliminate the mistake. A thorough account of the development of Pascal is contained in [4].
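
The two readings of the nested conditional shown above can be compared in a short sketch. Python is used here only because its block structure forces each reading to be written out explicitly; the function names and the trace list are assumptions made for illustration.

```python
def else_binds_to_inner_if(b0, b1):
    """IF b0 THEN [IF b1 THEN S0 ELSE S1]"""
    trace = []
    if b0:
        if b1:
            trace.append("S0")
        else:
            trace.append("S1")
    return trace


def else_binds_to_outer_if(b0, b1):
    """IF b0 THEN [IF b1 THEN S0] ELSE S1"""
    trace = []
    if b0:
        if b1:
            trace.append("S0")
    else:
        trace.append("S1")
    return trace


# The readings agree only when both conditions hold.
for b0 in (False, True):
    for b1 in (False, True):
        print(b0, b1, else_binds_to_inner_if(b0, b1), else_binds_to_outer_if(b0, b1))
```

For b0 true and b1 false the first reading executes S1 and the second executes nothing; for b0 false the situation is reversed. An explicit closing symbol for the conditional statement makes the intended reading visible in the program text.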

2. Modular Programming and Modula-2
With various defects clearly identified and new challenges in programming emerging, time seemed ripe for a fresh start, for a successor language. The two foremost novel challenges were multiprogramming and information hiding. For me personally, a third, quite practical challenge became an ambition: To create a language adequate for describing entire systems, from storage allocator to document editor, from process manager to compiler, and from display driver to graphics editor. I perceived that many problems in software development stemmed from the mixing of parts written in different languages. The challenge became real within our project to design and build the workstation Lilith in 1978 [6]. Its precursor was Xerox PARC's pioneering workstation Alto [5]. The Alto's software was mostly written in Mesa; Lilith's software entirely in Modula-2. It would have been prohibitive to implement more than one language. Evidently, Modula was born out of an act of necessity [7].

The cornerstone of Modula-2 was the module construct. Whereas Pascal had served to build monolithic programs, Modula-2 was suitable for systems consisting of a hierarchy of units with properly defined interfaces. Such a unit was called a module, and later a package in Ada. In short, a module is like a Pascal program with the addition of an explicit interface specification to other modules. This is obtained as follows: Modules are described by two distinct texts, a definition part and an implementation part. In the former all objects are defined which are visible to other modules, typically types and procedure signatures. They are said to be exported. The latter part contains all local, hidden objects, and the bodies of procedures, i.e. their implementations. Hence the term information hiding. The heading contains lists of identifiers to be imported from other modules. A small example follows:

DEFINITION MODULE Files;
  TYPE File; (*opaque type*)
    Rider = RECORD eof: BOOLEAN END; (*other fields hidden*)
  PROCEDURE Set(VAR r: Rider; f: File; pos: INTEGER);
  PROCEDURE Read(VAR r: Rider; VAR ch: CHAR);
  PROCEDURE Write(VAR r: Rider; ch: CHAR);
  PROCEDURE Length(f: File): INTEGER;
END Files.

This key feature catered for the urgent demands for programming in teams. Now it became possible to determine jointly a modular decomposition of the task and to agree on the interfaces of the planned system. Thereafter, the team members could proceed independently in implementing the parts assigned to them. This style is called modular programming. The concept of module arose earlier in work by Parnas and, in conjunction with multiprogramming, by Hoare and Brinch Hansen, where the module construct was called a monitor [8, 9]. The module was also present in a concrete form in Mesa, which in Modula was simplified and generalized.
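
What an implementation part behind the definition of Files might hide can be sketched as follows. This is an illustration in Python, not the Modula-2 implementation: the list-based storage, the behaviour at the end of a file, and the name set_rider (chosen because set is a Python built-in) are all assumptions, and Python enforces hiding only by the convention of a leading underscore.

```python
class File:
    """Opaque to clients: they may hold a File but not look inside it."""

    def __init__(self):
        self._data = []              # hidden representation (assumed)


class Rider:
    """A position in a File; only the field eof is meant to be visible."""

    def __init__(self):
        self.eof = False             # the exported field
        self._file = None            # hidden fields
        self._pos = 0


def set_rider(r, f, pos):            # Set(VAR r: Rider; f: File; pos: INTEGER)
    r._file = f
    r._pos = max(0, min(pos, len(f._data)))   # keep the position within the file
    r.eof = False


def read(r):                         # Read(VAR r: Rider; VAR ch: CHAR)
    data = r._file._data
    if r._pos < len(data):
        ch = data[r._pos]
        r._pos += 1
        return ch
    r.eof = True                     # assumed: reading past the end sets eof
    return "\0"


def write(r, ch):                    # Write(VAR r: Rider; ch: CHAR)
    data = r._file._data
    if r._pos < len(data):
        data[r._pos] = ch            # overwrite an existing character
    else:
        data.append(ch)              # extend the file
    r._pos += 1


def length(f):                       # Length(f: File): INTEGER
    return len(f._data)


f, r = File(), Rider()
set_rider(r, f, 0)
for ch in "hi":
    write(r, ch)
set_rider(r, f, 0)
text = read(r) + read(r)
```

A client written against the four procedures and the field eof keeps working if the hidden storage is later replaced, which is the point of agreeing on the definition part first.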

The module construct would, however, have remained of mostly academic interest only, were it not for the technique of separate compilation, which was from its inception combined with the module. By separate compilation we understand that (1) full type checking is performed by the compiler not only within a module, but also across module interfaces, and (2) compatibility (or version) checking between modules to be joined is achieved by a simple key comparison when the modules are linked and loaded. We refrain from expounding technical details, but emphasize that this is a crucial requirement for the design of complex systems, yet still poorly handled by most systems of commercial provenance.
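
The key comparison of point (2) can be sketched as follows. The text does not say how a key is obtained; deriving it from a hash of the definition text is purely an assumption of this sketch, as are the names CompiledModule, interface_key, and link.

```python
import hashlib


def interface_key(definition_text):
    """Derive a short key from the text of a definition part (assumed scheme)."""
    return hashlib.sha256(definition_text.encode("utf-8")).hexdigest()[:16]


class CompiledModule:
    def __init__(self, name, definition_text, imports=()):
        self.name = name
        self.key = interface_key(definition_text)
        # Keys of the imported interfaces as the compiler saw them.
        self.expected_keys = {m.name: m.key for m in imports}


def link(modules):
    """Refuse to join modules whose recorded and actual keys differ."""
    actual = {m.name: m.key for m in modules}
    for m in modules:
        for name, key in m.expected_keys.items():
            if actual.get(name) != key:
                raise RuntimeError(
                    f"{m.name} was compiled against another version of {name}")
    return True


files_v1 = CompiledModule("Files", "PROCEDURE Length(f: File): INTEGER;")
editor = CompiledModule("Editor", "", imports=[files_v1])
files_v2 = CompiledModule("Files", "PROCEDURE Length(f: File): LONGINT;")

link([files_v1, editor])             # keys agree, the modules may be joined
```

Linking editor with files_v2 instead raises an error, because the client recorded the key of the interface it was compiled against and that interface has since changed.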

Besides the successful feature of the module with separate compilation, the language also had some drawbacks. Surely, the evident deficiencies of Pascal had been mended. The syntax was now unambiguous, type specifications complete, and the set of basic data types adequately comprehensive. But as a result, the language, and with it the compiler, had become relatively large and bulky, although still orders of magnitude less so than comparable commercial ventures. The goal of making the language powerful enough to describe entire systems was achieved by introducing certain low-level features, mostly for accessing particular machine resources (such as I/O device registers) and for breaching or overriding the rigid type system. Such facilities, for example type casting, are inherently contrary to the notion of abstraction by high-level language, and should be avoided. They were called loopholes, because they allow the rules imposed by the abstraction to be broken. But sometimes these rules appear too rigid, and use of a loophole becomes unavoidable. The dilemma was resolved through the module facility, which would allow the use of such "naughty" tricks to be confined to specific, low-level server modules. It turned out that this was a naive view of the nature of programmers. The lesson: If you introduce a feature that can be abused, then it will be abused, and frequently so!

3. Object-oriented Programming and Oberon
The advent of the personal computer around 1980 dramatically changed the way in which computers were used. Direct, immediate interaction replaced remote access and batch processing. User interfaces became an important issue. They were shaped by the novel mouse and the high-resolution display, replacing the screens of 24 lines by 80 characters. They established the paradigm of windows and multi-tasking. This paradigm had been pioneered by the Xerox Alto workstation, and in particular the Smalltalk project [10]. Along with it came the object-oriented style of programming. Object-oriented design emerged from the specialized subject of discrete event simulation and its language Simula [11], whose authors Dahl and Nygaard had realised that its concepts had a scope far beyond simulation. Some of the proponents of object-oriented programming even suggested that all of programming should be converted to the new view of the world.

We felt that a revolution was undesirable to cure the lamented ills of the software profession, and we considered evolution as the wiser approach. Tempted to design a version of Modula stripped down to essentials, we also wanted to identify those features that were indispensable to encompass object-orientation. Our findings were revealing: A single feature would suffice, all other ingredients were already present in Modula. The one feature to be added had to allow the construction of data type hierarchies, called sub-classing in Smalltalk. Our own term was type extension: The new type adds attributes to those of the old type. Type extension had the welcome side effect of practically eliminating all needs for loopholes.
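
The idea of type extension can be sketched with Python classes standing in for Oberon record types. The names Figure, Circle, move, and describe are hypothetical, and the remark about loopholes is one plausible reading of the paragraph above rather than a statement about Oberon's implementation.

```python
from dataclasses import dataclass


@dataclass
class Figure:                 # the old (base) type
    x: int
    y: int


@dataclass
class Circle(Figure):         # the new type adds an attribute to those of Figure
    radius: int


def move(f: Figure, dx: int, dy: int) -> None:
    """Written for Figure, yet applicable to every extension of Figure."""
    f.x += dx
    f.y += dy


def describe(f: Figure) -> str:
    # A checked type test takes the place of an unchecked type cast.
    if isinstance(f, Circle):
        return f"circle of radius {f.radius}"
    return "plain figure"


c = Circle(0, 0, 5)
move(c, 1, 2)                 # a Circle is accepted wherever a Figure is expected
```

Because an extended object can be handed to code written for the base type, and its actual type can be tested safely, there is little reason left to override the type system by force.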

The absence of loopholes is the acid test for the quality of a language. After all, a language constitutes an abstraction, a formal system, determined by a set of consistent rules and axioms. Loopholes serve to break these rules and can be understood only in terms of another, underlying system, an implementation. The principal purpose of a language, however, is to shield the programmer from implementation details, and to let him think exclusively in terms of the higher-level abstraction. Hence, a language should be fully defined without reference to any implementation or computer.

The language Oberon was born out of the ambition to simplify language to the essentials. The result turned out to be a considerably more powerful and more elegant language than its predecessors. The defining report of Pascal required 30 pages, that of Modula grew to 45 pages, Oberon's could do with 16 [12]. Not surprisingly, implementations profited substantially from this achievement after 25 years of experience in language design, from a continuous evolution.

One of the simplifications was the reunification of the definition and implementation parts of modules. An Oberon module is again defined by a single text. Its heading contains a single list of server modules (rather than of individual, imported objects). Declarations of objects that are to be accessible in client modules are specially marked. Unmarked, local objects remain hidden. From a didactic point of view this reunification may be regretted, because ideally definition parts are designed among team members and form contracts between them, whereas implementation parts can thereafter be designed by individual members without regard for the others, as long as the definition part remains untouched. However, the proliferation of files and the burden to keep corresponding parts consistent was considered a drawback. Moreover, reunification eliminated the compiler's duty to check for consistency between definition and implementation parts. Also, a definition part can readily be extracted from the module text by a simple tool.

4. Conclusions and Outlook
In his article about Oberon, M. Franz wrote in [13]: "The world will remember Niklaus Wirth primarily as 'that language guy' and associate his name with Pascal." His observation is accurate; also the invitation to speak at this Conference hinted that I should concentrate on Pascal. My disobedient digression stems from my conviction that its successors Modula and Oberon are much more mature and refined designs than Pascal. They form a family, and each descendant profited from experiences with its ancestors. At the end, the time span was 25 years.

Why, then, did Pascal capture all the attention, and Modula and Oberon got so little? Again I quote Franz: "This was, of course, partially of Wirth's own making". He continues: "He refrained from ... names such as Pascal-2, Pascal+, Pascal 2000, but instead opted for Modula and Oberon". Again Franz is right. In my defense I can plead that Pascal-2 and Pascal+ had already been taken by others for their own extensions of Pascal, and that I felt that these names would have been misleading for languages that were, although similar, syntactically distinct from Pascal. I emphasized progress rather than continuity, evidently a poor marketing strategy.

But of course the naming is far from the whole story. For one thing, we were not sufficiently active -- today we would say aggressive -- in making our developments widely known. Instead of asking what went wrong with Modula and Oberon, however, let us rather ask what went right with Pascal. In my own perception, the following factors were decisive:

1. Pascal, incorporating the concepts of structured programming, was sufficiently different from Fortran, and sufficiently more progressive, to make a switch worthwhile. Particularly so in America, where Algol had remained virtually unknown.

2. In the early 1970s, an organization of users (Pascal User Group PUG) was formed and helped to make Pascal widely known and available. It also published a Newsletter.

3. Pascal was ported just in time for the first microcomputers (UCSD) [14], and thereby reached a large population of newcomers unconstrained by engrained habits and legacy code.

4. Pascal was picked up by start-up companies. Borland's Pascal implementation was the first compiler to be available for less than $50, turning it into a household article.

5. UCSD as well as Borland properly integrated the compiler into a complete development tool, including a program editor, a file system, and a debugging aid. This made Pascal highly attractive to schools and to beginners. It changed the manner in which programs were "written". A fast write-compile-test-correct cycle and interactivity were the new attractions.

6. Shortly after the initial spread, an ever-growing number of textbooks on programming in Pascal appeared. They were as important as the compiler for Pascal's popularity in schools and universities. Textbooks and software entered a symbiotic partnership.

Perhaps it is worth observing that this chain reaction started around 1977, fully seven years after Pascal had been published and implemented on a CDC mainframe computer. Meanwhile, Pascal had been ported to numerous other large computers, but remained largely within universities. This porting effort was significantly facilitated by our project resulting in the Pascal P-compiler generating P-code, the predecessor of the later M-code (for Modula) and Java byte-code.

In contrast to Pascal, Modula and Oberon did not appear at a time when computing reached new segments of the population. The module concept was not perceived in teaching as sufficiently significant to warrant a change to a new, albeit similar language. Textbooks had been selected, investments in learning had been made, the time was not ripe for a change. Industry did not exactly embrace Modula either, with a few exceptions, mainly in Britain. A more palatable solution was to extend Pascal, retaining upward compatibility and old shortcomings. And there appeared competition in the form of C++ and Ada with powerful industrial backing.

Oberon fared even worse. It was virtually ignored by industry. This is astounding, as not only the elegant and powerful language was presented in 1988, but also a compact and fast compiler in 1990, along with a modern, flexible development environment for workstations, complete with window system, network, document and graphics editor, neatly fitting into about 200 Kbytes of memory, the compiler alone taking less than 50 Kbytes. The entire system was fully described in a single, comprehensive book, including its source code [15]. It carried the proof that such a system could be built using only a small fraction of the manpower typically allotted to such endeavors, if proper methods and tools were employed.

One is tempted to rationalize this history with recent, deep changes in the computing field. Computing resources are no longer at a premium, memory is counted in megabytes rather than kilobytes as 15 years ago, instruction cycles in nanoseconds instead of microseconds, and clock frequencies in gigahertz instead of megahertz. The incentive to economize has dwindled alarmingly. The only scarce resource is manpower, competent manpower. Industry has difficulty even in finding good C programmers; the difficulties of finding Oberon programmers are insurmountable. So how could one reasonably expect companies to adopt Oberon?

We recognize a deadlock: The adoption of new tools and methods is impracticable, yet the retention of the old ones implies stagnation. Are we therefore condemned to eternally produce an ever-growing mountain of software of ever-growing complexity, software that nobody fully understands, although everybody is well aware of its defects and deficiencies? In order to avert this highly unpleasant vision of the future, some fresh starts will have to be undertaken now and then. They require the courage to discard and abandon, to select simplicity and transparency as design goals rather than complexity and obscure sophistication.

In the field of programming languages two fresh starts have been attempted recently. I refer to Java and C#. Both tend to restrict rather than augment the number of features, trying to suggest a certain programming discipline. Hence they move in the right direction. This is encouraging. But there still remains a long way to go to reach Pascal, and an even longer one to reach Oberon.

References
0. N. Wirth. The Programming Language Pascal. Acta Informatica, 1, 35 - 63 (1971)
1. N. Wirth and C.A.R. Hoare. A Contribution to the development of ALGOL.
Comm. ACM 9, 6, 413 - 432 (June 1966)
2. E. W. Dijkstra. Notes on Structured Programming.
In Structured Programming. O.-J. Dahl, E. W. Dijkstra and C.A.R. Hoare, Acad. Press, 1972.
3. C.A.R. Hoare. Notes on Data Structuring.
In Structured Programming. O.-J. Dahl, E. W. Dijkstra and C.A.R. Hoare, Acad. Press, 1972, and
C.A.R. Hoare. Record Handling.
In F. Genuys, Ed., Programming Languages. Acad. Press, 1968.
4. N. Wirth. Recollections about the Development of Pascal. In T.J. Bergin, R.G. Gibson.
History of Programming Languages II, Addison-Wesley 1996, ISBN 0-201-89502-1.
5. C.P. Thacker et al. Alto: A personal computer. Xerox PARC, Tech. Report CSL-79-11.
6. N. Wirth. The personal computer Lilith.
Proc. 5th International Conf. on Software Engineering, IEEE Computer Society Press, 1981.
7. N. Wirth. Programming in Modula-2. Springer, 1974. ISBN 0-387-50150-9.
8. C.A.R. Hoare. An Operating System Structuring Concept.
Comm. ACM, 17, 10 (Oct. 1974), 548-557.
9. P. Brinch Hansen. Operating System Principles.
Prentice-Hall 1973, ISBN 0-13-637843-9.
10. A. Goldberg and A. Kay, Eds. Smalltalk-72 Instruction Manual.
Tech. Report, Xerox PARC, March 1976.
11. Ole-Johan Dahl and C.A.R. Hoare. Hierarchical Program Structures.
In Structured Programming. O.-J. Dahl, E. W. Dijkstra and C.A.R. Hoare, Acad. Press, 1972.
12. N. Wirth. The Programming Language Oberon.
Software, Practice and Experience, 1988.
13. M. Franz. Oberon: The overlooked Jewel.
In L. Boszormenyi, J. Gutknecht, G. Pomberger. The School of Niklaus Wirth. ISBN 1-55860-723-4 and 3-932588-85-1.
14. UCSD Pascal Users Manual. SofTech, 1981.
15. N. Wirth and J. Gutknecht. Project Oberon.
Addison-Wesley, 1992. ISBN 0-201-54428-8.

Paper to be presented at sd&m Conference on Computer Pioneers, Bonn, 28-29. 6. 2001