Programming language specification

A programming language specification defines a programming language. A specification must define at least the following two things:

What phrases are legal programs in the language.
What legal programs mean in the language.

The first of these corresponds roughly to syntax, and the second corresponds roughly to semantics.

Syntax

Syntax in a programming language is usually described using a combination of

regular expressions to describe lexemes, and
context-free grammars to describe the context-free structure of the language.
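
As a minimal sketch of this division of labor, the following Python fragment uses a regular expression to recognize lexemes and a recursive-descent parser to recognize the context-free structure. The toy grammar (expr ::= term ('+' term)*, term ::= factor ('*' factor)*, factor ::= NUM | '(' expr ')') and the names tokenize and parse are assumptions made for illustration; they are not drawn from any real language specification.

```python
import re

# Lexemes of a hypothetical toy language, described by a regular expression:
# unsigned integers and the symbols + * ( )
TOKEN_RE = re.compile(r"\s*(?:(?P<NUM>\d+)|(?P<OP>[+*()]))")

def tokenize(text):
    """Split text into (kind, value) tokens; raise SyntaxError on illegal input."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character near position {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens

def parse(tokens):
    """Return a syntax tree if the tokens form a legal phrase, else raise SyntaxError."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():  # expr ::= term ('+' term)*
        node = term()
        while peek() == ("OP", "+"):
            advance()
            node = ("+", node, term())
        return node

    def term():  # term ::= factor ('*' factor)*
        node = factor()
        while peek() == ("OP", "*"):
            advance()
            node = ("*", node, factor())
        return node

    def factor():  # factor ::= NUM | '(' expr ')'
        kind, value = advance()
        if kind == "NUM":
            return ("num", int(value))
        if (kind, value) == ("OP", "("):
            node = expr()
            if advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError("trailing input")
    return tree
```

Under these assumptions, parse(tokenize("1 + 2 * 3")) yields a tree in which the multiplication is nested inside the addition, while an input such as "1 + " is rejected as not being a legal phrase.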

Semantics

There are six ways in which programming language semantics are described; all languages use at least one, and some languages combine more than one:

Axiomatic semantics.
Denotational semantics.
Operational semantics.
Natural language descriptions.
Reference implementations.
Test suites.

The first three of these are grounded in mathematics, and have the advantage of being precise, compact, and unambiguous. Programming languages whose semantics are described using one of these methods can reap many benefits. For example:

Formal semantics enable mathematical proofs of program correctness;
Formal semantics facilitate the design of type systems, and proofs about the soundness of those type systems;
Formal semantics establish unambiguous and uniform standards for implementations of the language, making it more likely that programs written in those languages will be portable across implementations.
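
To give a flavor of how a formal description pins down behavior, here is a rough sketch of a small-step operational semantics for a hypothetical expression language, written as executable Python rather than as inference rules. The tuple encoding of expressions and the function names step and evaluate are assumptions made for illustration. Each branch of step plays the role of one reduction rule, and together the rules fix a left-to-right evaluation order, leaving no room for implementations to disagree.

```python
def step(e):
    """Perform one reduction step; return None if e is already a value.

    Expressions are ("num", n), ("+", left, right) or ("*", left, right).
    """
    if e[0] == "num":
        return None  # values do not reduce further
    op, left, right = e
    if left[0] != "num":
        # Rule: reduce the left operand first.
        return (op, step(left), right)
    if right[0] != "num":
        # Rule: once the left operand is a value, reduce the right operand.
        return (op, left, step(right))
    # Rule: both operands are values, so apply the operator.
    a, b = left[1], right[1]
    return ("num", a + b if op == "+" else a * b)

def evaluate(e):
    """Apply step repeatedly until a value is reached."""
    while (nxt := step(e)) is not None:
        e = nxt
    return e
```

For instance, evaluating ("+", ("num", 1), ("*", ("num", 2), ("num", 3))) first reduces the product to ("num", 6) and then the sum to ("num", 7). A real formal semantics would state the same rules in mathematical notation, which is what makes proofs about programs and type systems possible.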

By contrast, natural language descriptions tend to be imprecise, verbose, and ambiguous. They do not lend themselves to proofs, either about individual programs or about the programming language's type system. On the other hand, it is relatively easy for inexperienced language designers to write a natural-language description of a programming language's semantics. Additionally, formulating a rigorous mathematical semantics of a large, complex, practical programming language is a daunting task even for experienced specialists, and the resulting specification can be difficult to understand except by a small priesthood of experts.

To these objections, advocates of formal semantics reply that if a language is so complicated that a formal semantics cannot be defined for it, then a natural language description is likely to fare no better. A natural language description can always be provided as a supplement to a formal semantics. Formal semantics advocates also point out that the imprecision of natural language as a vehicle for programming language semantics has caused problems in the real world: for example, the semantics of Java threads were specified in English, and it was later discovered that the specification did not provide adequate guidance for implementors. Writing truly portable multithreaded Java programs remains challenging to this day.

Regardless of the relative merits of formal and natural-language semantics, in practice most widely used languages are specified using a natural language description. This description usually takes the form of a reference manual for the language. The manuals for widely used languages usually run to hundreds of pages. For example, the print version of The Java Language Specification, 3rd Ed. is 596 pages long.

By contrast, The Definition of Standard ML, Revised, which uses operational semantics to describe ML, is 114 pages long. The Revised⁵ Report on the Algorithmic Language Scheme (R5RS) uses denotational semantics to describe Scheme, and is 50 pages long. (These comparisons should be taken with the caveat that Scheme and ML are both arguably simpler languages than Java.)

The fifth means of specifying language semantics is with a reference implementation. In this approach, a single implementation of the programming language is designated as authoritative, and its behavior is held to define the proper behavior of a program written in this language. The chief attraction of this approach is that it is precise and requires no human interpretation: disputes as to the meaning of a program can be settled simply by executing the program on this implementation (provided that the implementation behaves deterministically for that program).

On the other hand, this approach also has several drawbacks. Chief among them is that it conflates limitations of the reference implementation with properties of the language. For example, if the reference implementation has a bug, then that bug must be considered to be an authoritative behavior. Another drawback is that programs written in this language are likely to rely on quirks in the reference implementation, hindering portability across different implementations.

Nevertheless, several languages effectively take the reference implementation approach. For example, the Perl interpreter is considered to define the authoritative behavior of Perl programs. In the case of Perl, nobody has ever produced an independent implementation of the language, and the Perl executable itself is highly portable, so some of the drawbacks of using a reference implementation to define the language semantics are moot.

The final way of specifying the meaning of a language is with a test suite. In this approach, the language designer writes a number of example programs in the language, and then describes how those programs ought to behave, perhaps by writing down their correct outputs. The programs, plus their outputs, are called the "test suite" of the language. Any correct language implementation must then produce exactly the correct outputs on the test suite programs.

This technique's chief advantage is that it is easy to determine whether a language implementation passes a test suite. The user can simply execute all the programs in the test suite, and compare the outputs to the desired outputs. If the outputs are the same, the language implementation passes. If not, the implementation fails, and therefore must be incorrect.

However, when used by itself, the test suite approach has major drawbacks as well. For example, users want to run their own programs, which are not part of the test suite; indeed, a language implementation that could only run the programs in its test suite would be largely useless. But a test suite does not, by itself, describe how the language implementation should behave on any program not in the test suite; determining that behavior requires some extrapolation on the implementor's part, and different implementors may disagree. In addition, it is difficult to use a test suite to test behavior that is intended or allowed to be nondeterministic.

Therefore, in common practice, test suites are used only in combination with one of the other language specification techniques, such as a natural language description or a reference implementation.
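
The pass/fail check and the extrapolation problem can both be sketched in a few lines of Python. Everything here is hypothetical: the one-operator "language" of integer division, the TEST_SUITE contents, and the two implementations impl_floor and impl_trunc are assumptions invented for illustration. Both implementations pass the suite, yet they disagree on a program the suite does not cover, because the suite alone does not say how negative quotients should round.

```python
# A hypothetical test suite: (program text, expected output) pairs.
TEST_SUITE = [("7 / 2", "3"), ("8 / 4", "2"), ("9 / 3", "3")]

def impl_floor(program):
    """Hypothetical implementation A: division rounds toward negative infinity."""
    a, b = (int(x) for x in program.split("/"))
    return str(a // b)

def impl_trunc(program):
    """Hypothetical implementation B: division rounds toward zero."""
    a, b = (int(x) for x in program.split("/"))
    return str(int(a / b))

def passes(impl, suite):
    """An implementation passes if it reproduces every expected output exactly."""
    return all(impl(prog) == expected for prog, expected in suite)
```

Here passes(impl_floor, TEST_SUITE) and passes(impl_trunc, TEST_SUITE) are both true, but on the program "-7 / 2" the first implementation answers "-4" and the second "-3". Settling which answer is correct requires some other form of specification.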