Metacompiler
A metacompiler in the most general case is a compiler that takes as input computer programs written in a metalanguage.[1][2][3] The metalanguage is a formally defined transformational metalanguage.[3][4] The metalanguage serves two functions: syntax analysis and transformation.
However, many are more general string-processing, object-oriented programming languages having uses other than compiler development. A notable example is the metacompilers developed at SDC, funded by the United States Department of Defense, which were primarily used to write document analysis applications.[2][3]
The early history of metacompilers is closely associated with the ACM SIGPLAN Working Group 1 on Syntax Directed Compilers. The group was started primarily by the efforts of Howard Metcalfe, who designed two compilers in the fall of 1962.[5] One, based on work by Ledley and Wilson, used a bottom-up parsing method.[6] The other, using a top-down approach based on work by Glennie, generated random sentences from context-free grammars.[5][7] Dewey Val Schorre presented the basic concepts of transformational metalanguage programming in a talk on META I at the 1963 ACM conference in Denver, Colorado.[3][4] Schorre's paper on the META II compiler-writing language was published in the proceedings of the 1964 ACM national conference. Schorre's transformational metalanguages are analytical grammars that also include semantic transform operations as part of the metalanguage specification.[8][9] That same year, System Development Corporation (SDC), a government-supported think tank, began a major concerted effort in metacompiler development. This effort led to the powerful metacompilers BOOK1, BOOK2, and META 5, written in LISP. META 5, released to many users, was used in writing many string-manipulating document analysis applications. META 5 had multiple push-down stacks, incorporated backup of the input stream, and had symbol attributes and other facilities enabling it to parse context-sensitive grammars.[10][2] SDC released CWIC in 1970. CWIC used a Schorre transformational metalanguage and included a dialect of LISP 2 used in generating and optimizing output object code. Like META II, CWIC was written in its own languages.
Early metacompilers were compilers taking as input a META "language", as remains the practice today where documentation describes the META "language": META I, META II, METAFOUR, META 5, etc. This naming convention was started by Dewey Val Schorre, who in the META II document gave license to freely use his work:
META II is not presented as a standard language, but as a point of departure from which a user may develop his own META "language". - Dewey Val Schorre[11]
Programs coded in a META "language" were of course compiled by that META "language" compiler. Eventually a portmanteau of metalanguage compiler, metacompiler, became the generic term: a compiler whose input is a metalanguage.[2][3]
Between 1964 and 1970 many compiler-compilers were developed. These were complete compiler development systems that included all phases of translation, from parsing to code generation. A prime example, "A GENERAL PURPOSE TABLE-DRIVEN COMPILER", used five phases of translation, each written in its own specialized language. It was not described as being a metacompiler or specifically using metalanguages.[12] However, its documentation clearly describes it as using specialized metalanguages for each phase of translation:
Phase 1: Syntax analysis. Using a BNF-like metalanguage, transformed input source code into tree structures.
Phase 2: Generator. Performed optimizations on the tree and transformed it into macro-like instructions.
Phase 3: ISO (In Sequence Optimization). Processed the macro sequences, optimizing them: constant calculations moved outside of loops, redundant calculations eliminated.
Phase 4: Macro expansion. Macro instructions expanded into assembly code.
Phase 5: Assembly. Transformed assembly instructions into binary object code.
During the 1960s, metacompiler and compiler-compiler referred to complete compiler writing systems. When yacc came out in the 1970s it corrupted the meaning of compiler-compiler: its name is an acronym for yet another compiler-compiler, yet yacc is only a parser generator.
Metacompilers, by supplying functionality and automating aspects that are the same regardless of the object language, reduce the time, effort, and cost of producing and maintaining compilers, translators, and interpreters, to a point where it becomes economically feasible to include, in the solution of a problem, the design of a domain-specific language appropriate to the specification of that problem.[2][13]
Besides being useful for the development of programming languages, they are themselves programming languages useful for a wide range of string- or symbol-processing applications.[14][13][3]
String-processing, object-oriented, transformational metalanguage programming[4]
The early metacompilers were originally explained as using syntax equations resembling BNF. However, as Schorre explained in his talks on META I, they use a form of transformational metalanguage, generally resembling phrase structure grammar rules. Today syntax formula is a closer fit to the technology. Each syntax-analyzing formula is translated into a recursive test function. A top driving rule defines a valid program module's structure. In these languages the terms token and lexeme are used differently: tokens, rather than being equivalent to lexemes, are a subset of lexemes. Lexeme applies to any sequence of characters recognized as a unit. Tokens are recognized by token formulae and kept for later use. A quoted string is used in a formula to recognize a constant lexeme that normally is not kept.
Early metalanguages META I and META II used built-in token recognizers:
- .ID identifier token.
- .NUM integer token.
- .STRING quoted string token.
Saved token objects were then addressed in code production transformations using *<number>, i.e. *1, *2, etc.
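As an illustration of the idea (a minimal Python sketch, not Schorre's code; the names scan_id and saved are assumptions of this example), a built-in recognizer such as .ID can be pictured as a function that skips leading blanks, matches an identifier lexeme, and saves the token so a later transformation can address it as *1, *2, etc.:

 saved = []  # saved tokens; *1 is the most recent, i.e. saved[-1]

 def scan_id(text, pos):
     """Return the new input position on success, or None on failure."""
     while pos < len(text) and text[pos].isspace():  # skip leading blanks
         pos += 1
     start = pos
     if pos < len(text) and text[pos].isalpha():     # .ID: a letter ...
         pos += 1
         while pos < len(text) and text[pos].isalnum():  # ... then alphanumerics
             pos += 1
         saved.append(text[start:pos])  # keep the token for *<number>
         return pos
     return None  # test failed; the input is not advanced

 scan_id("  count1 = 5", 0)
 print(saved[-1])  # -> count1, addressable as *1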
Advanced parser languages, as found in CWIC, have programmed token rules. CWIC's token rules create objects that are pushed on the parse stack. Symbol objects cataloged in the dictionary (symbol tables) are the default result of a token recognition. Supplied generator functions may be used to intercede the default dictionary cataloging of symbols. These interceding functions are used to create numeric and string token objects, bypassing cataloging. The quoted string token formula below recognizes string tokens:
string .. """" $(-"""" .ANY | """""","""") """" MAKESTR[]; ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ 1 2 3 4 5 6 7 8 9 10 11
The string formula above defines a string constant as used in the CWIC parser language:
- 1 string: the function name or token identifier. It may be used in a syntax formula to recognize the token. It may also be used to identify a dictionary entry's type.
- 2 .. signifies a token formula. The symbol to its left names the formula; the right side is the lexing formula.
- 3 Conforming to the string rule, """" is a string constant lexeme test for a single " character.
- 4 $ is zero or more of the lexing term following it; in this case a grouped, parenthesized, sequence of tests.
- 5 -"""" is a negative look ahead. A single " character is tested for; if matched, the negative look ahead fails. If not matched, the test succeeds. The parse is not advanced in either case.
- 6 .ANY matches any character. In a token formula the matched character is appended to the token in progress.
- 7 | is the alternative operator. In this case, when -"""" in the left alternative fails, the right alternative is attempted. If, however, .ANY, the second test of the left alternative, fails, a backtracking long failure would result.
- 8 """""" is a test for two " characters; if successful, ,"""" appends a single " character to the token.
- 9 ,"""": the comma operator in a token formula appends the following string to the token.
- 10 """" matches a single " character: the string constant's right bounding " character.
- 11 MAKESTR[] is a supplied interceding function that creates a string object from the matched token characters.
Token formulae skip initial, or leading, skip_class characters. $ is a sequence operator: zero or more of what follows. ( ) are grouping operators. $( ... ) matches a sequence of the enclosed grouped expression.
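A minimal runnable sketch of the string formula above, assuming Python and an invented scan_string function standing in for the compiled rule: a leading ", then zero or more of (any character that is not a ", or "" appending a single "), the closing ", and finally the MAKESTR[] step making the string object:

 def scan_string(text, pos):
     """Return (string_object, new_pos) on success, or None on failure."""
     if pos >= len(text) or text[pos] != '"':
         return None                        # leading " test failed
     pos += 1
     chars = []
     while pos < len(text):
         if text[pos] == '"':
             if text[pos:pos + 2] == '""':  # """""" : two " characters found
                 chars.append('"')          # ,"""" : append a single "
                 pos += 2
                 continue
             pos += 1                       # the right bounding " matched
             return ''.join(chars), pos     # MAKESTR[] makes the object
         chars.append(text[pos])            # -"""" failed, so .ANY appends
         pos += 1
     return None                            # unterminated string: failure

 print(scan_string('"say ""hi""" rest', 0))  # -> ('say "hi"', 12)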
The analytical syntax formulae are string matching rules that collect characters into objects. The objects may simply be output using transform operators, as in META II's:
- .OUT(<output pattern>)
The <output pattern> consists of quoted strings and star-stack references, i.e. *1, *2, etc.; *1 is the most recently recognized token, *2 the next most recent.
Parser programming language
The parser programming language is programmed using grammar equations:
- rule = id rule_type parsing_expression ";";
- rule_type = "=" | ".." | ":";
- "=' syntax rule
- ".." token making rule
- ":" character class rule
Unlike BNF grammar rules, parser programming languages use equations, or perhaps more specifically formulae:
<name> <defined as operator> <form> ; <name> = <ID> ;
In CWIC a parsing formula would be defined as:
parse = id ("=" parse_expr | ".." token_expr | ":" class_expr);
As a programming language, parse is compiled into a function that first calls id. If successful, the equation type is determined by testing for the string "=". If successful, parse_expr is called. If "=" is not recognized, the alternative ".." is attempted; again, if unsuccessful, the ":" alternative is attempted. Grouping the ("=" parse_expr | ".." token_expr | ":" class_expr) alternatives eliminates backtracking, as all equation forms start with an id. Backtracking is programmed using backtrack alternatives, indicated by using \ to separate alternatives. A backtrack alternative saves the parse state before attempting its left alternative. On failure the parse state is reset to the saved state and program control transfers to the right-hand alternative. Lexeme failure is a backup failure, only resetting the input state and returning failure.
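A sketch, under assumed names, of the function the parse formula might compile to (a Python illustration, not CWIC's generated code). Grouping the alternatives on the shared leading id means no state saving is needed; a backtracking \ alternative would additionally save and restore the full parse state before trying its right alternative:

 class State:
     def __init__(self, text):
         self.text, self.pos = text, 0
     def match(self, lit):                  # test a literal against the input
         p = self.pos
         while p < len(self.text) and self.text[p].isspace():
             p += 1
         if self.text.startswith(lit, p):
             self.pos = p + len(lit)        # advance only on success
             return True
         return False

 def id_rule(s):                            # stand-in for the compiled id rule
     p = s.pos
     while p < len(s.text) and s.text[p].isspace():
         p += 1
     q = p
     while q < len(s.text) and (s.text[q].isalnum() or s.text[q] == '_'):
         q += 1
     if q > p:
         s.pos = q
         return True
     return False

 # stand-ins for the three right-hand-side rule functions
 parse_expr = token_expr = class_expr = lambda s: True

 def parse(s):              # parse = id ("=" parse_expr |".." token_expr |":" class_expr);
     if not id_rule(s):
         return False                       # id failed: the whole formula fails
     if s.match("="):
         return parse_expr(s)
     if s.match(".."):
         return token_expr(s)
     if s.match(":"):
         return class_expr(s)
     return False                           # no equation type matched

 print(parse(State("dgt : '0'|'1';")))      # -> True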
Object-oriented, as originally used in LISP 2 talks, referred to a programming paradigm wherein every datum is an object and variables are typeless containers (pointers). The datum types are fixed, defined by the language. A declared variable x, for example, may at one point contain an integer and at another a list, string, or floating point number.
In the first generation of metacompilers, only character sequences recognized as tokens by the parser language become token objects. The more generalized objects were publicly described in CWIC (Compiler for Writing and Implementing Compilers). The generator language of CWIC was a dialect of LISP 2.
The early metacompilers were simply string processing languages taking as input a source code character stream and transforming it into an output object code character stream.
The metacompiler's metalanguages are specialized string-processing, object-oriented programming languages designed for the purpose of parsing and transforming character sequences.[2]
A metacompiler is a software development tool used chiefly in the construction of compilers, translators, and interpreters for other programming languages.[15] They are a subset of a specialized class of compiler writing tools called compiler-compilers that employ metaprogramming languages. Metaprogramming is the writing of computer programs with the ability to treat programs as their data.[17][18][16] The input to a metacompiler is a metaprogram written in a specialized metalanguage designed chiefly for the purpose of constructing compilers.[15][16][2] The language of the compiler produced is called the object language.[16] The minimal input producing a compiler is a metaprogram specifying the object language grammar and semantic transformations into an object program.[16][2][3]
Metacompilers reduce the task of writing compilers by automating the aspects that are the same regardless of the object language. This makes possible the design of domain-specific languages which are appropriate to the specification of a particular problem. A metacompiler reduces the cost of producing translators for such domain-specific object languages to a point where it becomes economically feasible to include in the solution of a problem a domain-specific language design.[2][13]
Metacompiler metalanguages are powerful string and symbol processing languages that are also useful for generating a wide range of other software engineering and analysis tools.[2][13]
Besides being useful for domain-specific language development, a metacompiler is itself a prime example of a domain-specific language, designed for the domain of compiler writing.
A metacompiler is a metaprogram usually written in its own metalanguage or an existing computer programming language. The process of a metacompiler, written in its own metalanguage, compiling itself is equivalent to a self-hosting compiler. Most common compilers written today are self-hosting compilers. Self-hosting is a powerful tool of many metacompilers, allowing the easy extension of their own metaprogramming metalanguage. The feature that sets a metacompiler apart from other compiler-compilers is that it takes as input a specialized metaprogramming language that describes all aspects of the compiler's operation. A metaprogram produced by a metacompiler is as complete a program as a program written in C++, BASIC, or any other general programming language. The metaprogramming metalanguage is a powerful attribute allowing the ease of development of computer programming languages and other computer tools. Command line processors and text string transforming and analysis tools are easily coded using the metaprogramming metalanguages of metacompilers.
A full featured development package would include a linker and a run-time support library. Usually a machine oriented language is required for writing the support library. C or C++ could be used as a machine oriented language. A library consisting of support functions required for the compilation process usually rounds out the full metacompiler package.
A metacompiler is a computer program that takes as input a metalanguage program in which a programming language specification is coded, i.e. the specification of source code syntax and its semantically equivalent transformation to machine code. The language specification defines the grammar and the semantic transformations that operate on parsed language constructs. When the input is a complete programming language specification, the output is a compiler for that language.[1][19][20] Metacompilers work by performing semantically equivalent program transformations on parsed language constructs.
A metacompiler may be used to construct compilers, translators, or interpreters for a variety of programming languages.[15] With their powerful string analyzing and transforming features they are useful for many string processing applications.
Basically, the metalanguage consists of rules. A rule can either succeed or fail. A rule consists of alternatives that are sequences of other rule invocations. A rule succeeds if any of its alternatives succeeds; these are tried in sequence. An alternative succeeds if all of its rule invocations succeed. The language provides operators to create evaluation loops without recursion.
The metacompiler class of compiler-compiler is characterized by its reductive grammar specification language incorporating semantic transformations. A language is specified top-down by a set of rules.
Reductive Grammar
A reductive grammar is a set of rules, analytical in nature, defining a language's structure. They were used by philosophers and linguists in the early 20th century. A reductive grammar rule defines a named language formation. A verb phrase, noun phrase etc. Defining language in this fashion allowed language to be discussed.
A book, for example, has a structure that may contain sections, chapters, a table of contents, an index, paragraphs, sentences, phrases, and words. A reductive grammar would define a book top-down, starting with a book rule.
- book = intro TOC sections index;
As a programming language, the above rule specifies the top level parts and their order, order being a sequential formation.
Each rule defines a structure's formation: textual constructs of sentences made up of phrases, phrases in turn made up of words, and words made up of characters.
A reductive grammar rule names a formation of symbols, the name relating to the formation's function, i.e. a set of reductive grammar rules expressing the form of an algebraic expression using related rule names.
Consider the following algebraic expression grammar expressed in CWIC.
Parts of an expression: 1 – exponent (power), 2 – factor (coefficient), 3 – term, 4 – sum_operator, 5 – factor (constant), factor (variables)
- expression = term $(('+'|'-') term);
- term = factor $factor;
- factor = power | coefficient | variable | constant;
- power = variable exponent;
- coefficient = number | constant;
- variable = 's'|'t'|'u'|'v'|'w'|'x'|'y'|'z';
- constant = 'c' | 'c0' | 'c1' | 'c2' | 'c3' | 'c4';
- number .. dgt $dgt;
- dgt: '0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9';
- exponent = number | constant;
A metalanguage is in turn described using a metasyntax.[15][1][19][20]
Metacompilers employ powerful string- and symbol-analyzing and processing features that are useful for a wide range of software engineering and analysis applications. They are metaprogramming tools that reduce the effort of writing compilers and other language analysis tools by automating the aspects that are the same regardless of the object language. This makes possible the economic design and development of domain-specific languages which are appropriate to the specification of a particular problem. Besides being useful for domain-specific language development, a metacompiler is itself a prime example of the use of domain-specific languages.[23][13][16]
Metacompilers belong to a class of software development tools known as compiler-compilers. Metacompiler and compiler-compiler were at one time synonymous; by common use the term compiler-compiler has become more general. Using a metacompiler, one programs the parser, defining top-down the allowable structure and composition of the source language and its semantic transformation to the target language or object code.[1][21]
Metalanguages
The first use of a specialized metalanguage to specify a programming language was BNF in the ALGOL 60 reports. BNF, Backus–Naur Form, was specifically developed by John Backus to talk about ALGOL. What was used before BNF? A natural language such as English, Danish, or Russian. And natural language remained the means of talking about the semantics of ALGOL. But with BNF the two could be bound together as never before. BNF was immediately and unanimously accepted by the IAL group.
Early programming languages were primarily mathematical in nature. FORTRAN (FORmula TRANslation) was created by John Backus, a math major who had just acquired his BS degree when recruited by IBM to head up the IBM language development group. It is important to know this to understand the reasoning behind BNF. BNF is a metalanguage of formulae that specify a language construct, formation, or pattern. The ALGOL 60 report describes them as metalinguistic formulae. They are further explained by Peter Naur to be a set of named linguistic classes. A class name is bounded by the < > characters.
In algebra we learn that a mathematical expression is made up of terms that are added or subtracted. i.e.
- a+b-c
An expression specification in BNF can be written as follows.
- <expression> ::= <term>|<expression><adding operator><term>
- <adding operator> ::= +|-
Likewise, term can be defined:
- <term> ::= <factor>|<term><product operator><factor>
- <product operator> ::= *|/
The linguists would describe BNF as a production language, ignoring the semantic descriptions. A programmer might agree, for the rules describe how to write valid language constructs. But a compiler writer would say they are reductive, or analytical, for they described the ALGOL programming language top-down, making for an easy translation to a top-down recursive descent parser.
But both are individually wrong, for BNF is a metalanguage describing the syntax of a valid program while having a natural language description of each class (its semantics) precisely describing what it does. It is used to communicate to a programmer writing a program the precise meaning of every language construct, and to the compiler writer the same thing. BNF, as originally defined, is a documentary specification metalanguage to be used along with natural language semantic descriptions of each BNF language construct, or class. The semantic specification for <expression> would be written in a natural language. It would specify the acceptable data types of the <term>s, upward conversion to maintain accurate results, and any other significant details of implementation.
The programming metalanguages of metacompilers are similar to BNF. Analytic grammar rules are boolean functions with embedded semantic transforms, with the ability to code the specifications of many programming languages in the programming metalanguage, as well as their own metasyntax specifications.[1][19][20][24][25][15]
Each rule defines a language element or construct. In the process of change to a programming language, the < > brackets were dropped and words of the target language were coded as quoted strings. BNF required writing a separate rule for alternatives within a rule, the <adding operator> above for example, and a series of like constructs was coded as immediate left recursion. i.e.
- <expr>::=<term>|<expr><adding operator><term>
In the above BNF class, expr is defined as a single term or a term followed by a sequence of <adding operator> term. The immediate <expr> left recursion, <expr> referencing itself, implies a loop. BNF simply was not a simple language to write a compiler for, and it could not define itself. The < > are cumbersome to write around every name, and computers were very memory limited. In 1963–1964 Dewey Val Schorre made a more conventional programming language following essentially the same concept. Quoted strings were used, as was already the norm in the 1401 Autocoder. META I and META II were written on an IBM 1401. Schorre removed the < >, no longer needed once quoted strings were used, and added ( ) grouping and a $ loop operator. Parsing is a complex conditional, in essence one huge conditional: if program then ...
Can one even imagine the enormity of the boolean expression deciding whether a program meets a language specification? That is a compiler's job. Using a programming metalanguage, that job is broken down into deciding language construct correctness. As in BNF, each language construct has a named rule. The rule is a boolean expression returning success (true) or failure (false). The above expr rule written in META II:
- EXPR = TERM $(('+'/'-') TERM);
- TERM = FACTOR $(('*'/'/') FACTOR);
- FACTOR = .ID / .NUMBER / '(' EXPR ')';
Built-in token rules .ID, .NUMBER, and .STRING were used to recognize tokens of the language. The early Schorre metalanguages, recognizing that tokens require different algorithms than syntax rules, provided built-in token functions that followed ALGOL rules. Program input was on punch cards; the | character not being available, a / was used.[2][13]
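As the META II paper states, each syntax equation is translated into a recursive function that tests the input for a particular phrase structure. A sketch of that translation in Python (an illustration with assumed single-character tokens, not Schorre's generated code), where the $ loop becomes a while loop:

 class Src:
     def __init__(self, text):
         self.t, self.i = text, 0
     def match(self, ch):
         if self.i < len(self.t) and self.t[self.i] == ch:
             self.i += 1
             return True
         return False
     def id_or_num(self):            # stand-in for .ID / .NUMBER
         if self.i < len(self.t) and self.t[self.i].isalnum():
             self.i += 1
             return True
         return False

 def expr(s):                        # EXPR = TERM $(('+'/'-') TERM);
     if not term(s):
         return False
     while s.match('+') or s.match('-'):
         if not term(s):
             return False            # operator without an operand fails
     return True

 def term(s):                        # TERM = FACTOR $(('*'/'/') FACTOR);
     if not factor(s):
         return False
     while s.match('*') or s.match('/'):
         if not factor(s):
             return False
     return True

 def factor(s):                      # FACTOR = .ID / .NUMBER / '(' EXPR ')';
     if s.id_or_num():
         return True
     return s.match('(') and expr(s) and s.match(')')

 s = Src("a+b*(c-2)")
 print(expr(s) and s.i == len(s.t))  # -> True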
Features of metacompilers
- Specialized metalanguages, specifically designed for writing compilers, that are easy to read, write, and maintain, having complete programmed control of all aspects of grammar analysis[** 1] and transformation to object code.[** 2] Metacompilers define the transformation of source code to object code completely within the metalanguage(s) of the metacompiler.[11][3][2]
- A defining feature of a metacompiler is that it compiles a program written in a metasyntax-defined metalanguage able to express many programming languages, including its own metasyntax. Not all metacompilers compile themselves, but utilizing the self-expressing feature allows them to do so, making them a specialized class of self-hosting compilers able to quickly and easily port or extend themselves.[11][3][2]
- Programmable parsers provide the ability to avoid backtracking by factoring ambiguous language constructs.[** 3][11]
- Context-sensitive domain-specific sub-languages, like COBOL divisions, can be handled by programmed analytical parsers.[2]
- Grammar analysis may include context-sensitive lexical rules.[** 4][2]
- Advanced metacompilers, having extensive backtracking features and lexical rules, can analyze most context-sensitive languages.[2]
- A full featured development package includes a linker and a run-time support library.
- A machine oriented language is required for writing the support library. MOL 360 was used with the CWIC metacompiler. Today C or C++ might be used as a machine oriented language.
Examples
These examples are primarily based on advanced metacompilers developed by SDC (System Development Corporation) and their predecessors in the Schorre line of metacompilers. Other metacompilers exist, usually of limited capability or lacking documentation.
These examples attempt to illustrate how a compiled metalanguage operates. They are quite different from parser generators in common use.
All metacompilers in the line of these Schorre-based metalanguages use the same basic programmed syntax rule method of language analysis and are able to program the generation of object code. The code generation method advanced in the later generations. CWIC, the basis for these examples, is the last publicly documented generation in the Schorre metacompiler line. Small differences in the illustrated code reflect modern character sets and generalizations illustrating concepts.
Syntax Rules
Syntax rules are test functions specifying a language construct: a pattern of tests matched against the input character stream. They analyze it, breaking it into elements, and, on success, use embedded transform operations to output it in a different equivalent form. A rule is itself a test that returns success or failure. Transformations do not occur on failure.
 ; Assignment statement example:
 assign = id '=' expr :STORE!2 genit[*1];
 expr   = term $(('+':ADD|'-':SUB) term!2);
 term   = factor $(('*':MPY|'/':DIV) factor!2);
 factor = '(' expr ')' | id | number;
Above is a simple arithmetic assignment defined in a metalanguage similar to the CWIC SYNTAX language. It is a functional programming language made up of rules that analyze a program starting with a top driving rule defining the parsing of a source file. Syntax rules are functions that analyze the input character stream, returning success or failure. On success the recognized language constructs are transformed and placed on the parse stack.
Rules are best thought of as logic equations. The | (or) operator is used to allow alternate language constructs. ('+' | '-') is read as + or -. The character constants '+' and '-' are tests on the input character stream. The input is advanced over characters that match.
The programming model consists of two stacks and a symbol table. The parse stack contains parsed objects and constructed lists. The SYNTAX rules use node and tree building operators to combine parse stack entries into list or tree objects, creating an abstract syntax tree. The :<node name> operator pushes the <node name> on the node stack. The tree builder !<number> operator builds a tree object, popping the top node stack (node name) object and the top <number> of parse stack entries, forming a <number>+1 entry list and pushing it on the parse stack. Generally the parse stack contains abstract tree elements that are combined and eventually passed to an unparse transform rule. Token rules create atomic objects on the parse stack. Atomic objects are numbers, strings, and symbols.
Syntax rules are made up of logical operators, grouping operators, looping operators, tree building operators, and tests, consisting of literal strings or rules, that are matched against the input stream returning success or failure. A rule is itself a test. Starting with the assign rule, defining an assignment to be an id followed by an = followed by an expr, the assign rule specifies a series of tests to be made against the input stream of characters. First an id must be matched. When a test succeeds the input stream has been advanced over the matched characters. The '=' is matched against the input stream. And finally expr is called to match an arithmetic expression. If all three tests are successful, a STORE tree is built and passed to the genit[*1] unparse transform rule. The [*1] indicates genit to be a generator transform rule; *1 is the top parse stack object, which is popped and passed as the single argument to genit.
The backtracking alternative allows ambiguous language constructs to be recognized. A more common use of backtracking is handling errors. A backtrack alternative resets the parser state to the beginning state of its left alternant before proceeding with the right alternant. Error recovery usually tries skipping to a point where normal operation can continue. For example, a ';' may be scanned for to find the end of a statement: ($-';' ';') skips characters that are not a ; and then matches a ;. A normal alternant will not be attempted if any part of the left alternant succeeds.
Operators of the syntax language
- |  Alternant separator. (A | B): if A partially succeeds, B will not be attempted.
- ||  Backtracking alternant. (A || B): if A partially succeeds, the state is restored to when A began and B is attempted.
- ( )  Groups constructs.
- $  Loop, zero or more of. '(' id $(',' id) ')' matches an argument list to a function.
- %  Zero or more with backtracking.
- "  String delimiter. "ABC" is the character string ABC.
- '  Character delimiter. A string of one character. Character class rules are restricted to single character strings.
- :  Defines a character class rule. In syntax rules, creates a node and pushes it on the node stack.
- !  Forms a tree from the top node stack entry and parse stack entries. The ! is followed by the number of parse stack entries to be removed and placed in the list.
- < >  Creates a list of the parse stack entries created within their bounds.
- -  Negates the following test, or grouped test. (-record_contains) is true if record_contains fails. The input stream is not advanced in either case.
- ~  Not the following character: matches any character except the following character constant. ~';' matches any character except ;. string .. '"' $~'"' '"' MAKESTRING[]; is the token rule for a string.
- ,  Pushes a string on the parse stack. ,"xyz" pushes the string "xyz" onto the parse stack.
- +  Preceding a string constant, keeps the string. In a syntax rule it is pushed on the parse stack. In a token rule it becomes part of the token.
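A runnable sketch of the two-stack tree building described above (a Python illustration of the mechanism, not CWIC itself; the driving calls below mimic what parsing a = b + c with the assign rule would do):

 parse_stack, node_stack = [], []

 def node(name):                 # the  :<node name>  operator
     node_stack.append(name)

 def tree(n):                    # the  !<number>  operator
     branches = parse_stack[-n:]       # top <number> parse stack entries
     del parse_stack[-n:]
     parse_stack.append([node_stack.pop()] + branches)

 # Parsing  a = b + c  with the assign rule would drive the stacks so:
 parse_stack.append('a')         # id token object
 parse_stack.append('b')         # id token object
 parse_stack.append('c')         # id token object
 node('ADD'); tree(2)            # :ADD ... !2  builds ['ADD','b','c']
 node('STORE'); tree(2)          # :STORE ... !2
 print(parse_stack)              # -> [['STORE', 'a', ['ADD', 'b', 'c']]]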
Notes
- ^
Transforming unparse rules recognize tree patterns, evaluate and transform them into code sequences.
 genit(STORE[x,NUM:(y)]) => <mov y,x>;
      (STORE[x,genit(y)]) => <mov y,x>;
      (#u1[srcit(x),genit(y)]) => <#v1 x,y>; return y;
      (x) => y := allocreg(); <mov x,y>; return y;
      #u1 = ADD,SUB,MPY,DIV;
      #v1 = add,sub,mpy,div;

 srcit(#u1[srcit(x),genit(y)]) => <#v1 x,y>; return y;
      (x) => return x;
      #u1 = ADD,SUB,MPY,DIV;
      #v1 = add,sub,mpy,div;
The above are unparsing transform rules that transform the abstract syntax tree from the syntax rules into sequential code.
An unparse transform rule starts with its name, followed by a sequence of transforms. Each transform is an unparse rule and an action separated by '=>'.
<unparse rule> => <action>;
The unparse rule is used to match tree patterns and attributes. Branches are assigned to local variables. An add tree would be matched by (ADD[left,right]). The left and right tree branches can be assigned to variables, left and right for example; the variables would then contain the tree branches. The branches are objects that could be, for example, a tree, id, or number (see the syntax example above). A better way is to pass them, from within the unparse rule, to a transform that will return an atomic object: a technique called tree crawling. The unparse rule has a transform call in the position of the tree branch it is to be called with. The transform's "arguments" are then local variables that receive the transform's returns. (ADD[gen(x),gen(y)]) would call the transform gen with the left branch; the returned object from gen would be in the local variable x. Likewise the right tree branch calls gen, with the returned object in y. The expectation is that the local variables will then contain atomic objects appropriate for the action of the transform. In the above example two transform rules are used, the srcit transform's actions differing only in returning atomic objects unchanged. The genit transform loads atomic objects into an allocated register. This allows atomic source objects to be used directly in the arithmetic instructions.
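A hedged Python analogue of the genit/srcit transforms above (the instruction mnemonics and the allocreg helper are assumptions of this sketch; CWIC's generator language compiled such tree patterns rather than interpreting them):

 OPS = {'ADD': 'add', 'SUB': 'sub', 'MPY': 'mpy', 'DIV': 'div'}
 code, regs = [], [0]

 def allocreg():                      # assumed register allocator
     regs[0] += 1
     return "r%d" % regs[0]

 def genit(t):                        # result must end up in a register
     if isinstance(t, list) and t[0] in OPS:
         x = srcit(t[1])              # tree crawling: transform calls
         y = genit(t[2])              # occupy the branch positions
         code.append("%s %s,%s" % (OPS[t[0]], x, y))
         return y                     # the register holding the result
     r = allocreg()                   # (x) => load an atom to a register
     code.append("mov %s,%s" % (t, r))
     return r

 def srcit(t):                        # result may be any source operand
     if isinstance(t, list) and t[0] in OPS:
         x = srcit(t[1])
         y = genit(t[2])
         code.append("%s %s,%s" % (OPS[t[0]], x, y))
         return y
     return t                         # (x) => x : atoms pass through

 genit(['ADD', 'b', 'c'])
 print(code)                          # -> ['mov c,r1', 'add b,r1']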
- ^
Factoring syntax rules allows ambiguous language constructs to be recognized without backtracking. For example, in the factor rule from the earlier example, it may be that a function call is required. A problem arises with recognition of an id or a function: we do not know until after an id is processed whether an argument list follows. The change is fairly simple.
 factor = '(' expr ')' | (id ('(' arg_list ')' :ARGS!2 | .EMPTY)) | number;
 arg_list = +[id $(',' id)]+;
- ^
Lexical analysis
Token rules, first appearing in CWIC, a third generation metacompiler, are called from syntax rules to recognize valid words of the language. They define a sequence of characters that make up valid tokens in the context of the calling rule. Tokens by default are cataloged as symbols unless an interceding conversion function is called. Early metacompilers did not have symbol tables; built-in token matching functions for identifiers, integers, and strings simply created tagged strings.
 let: 'A'|'B'|'C'|'D'|'E'|'F'|'G'|'H'|'I'|'J'|'K'|'L'|'M'|
      'N'|'O'|'P'|'Q'|'R'|'S'|'T'|'U'|'V'|'W'|'X'|'Y'|'Z'|
      'a'|'b'|'c'|'d'|'e'|'f'|'g'|'h'|'i'|'j'|'k'|'l'|'m'|
      'n'|'o'|'p'|'q'|'r'|'s'|'t'|'u'|'v'|'w'|'x'|'y'|'z';
 bin: '0'|'1';
 oct: bin|'2'|'3'|'4'|'5'|'6'|'7';
 dgt: oct|'8'|'9';
 hex: dgt|'A'|'B'|'C'|'D'|'E'|'F'|'a'|'b'|'c'|'d'|'e'|'f';
 alphanum: let|dgt;
 symchr: alphanum|'_';

 number .. ('0h'|'0H') hex $hex MAKEHEX[]|
           ('0o'|'0O') oct $oct MAKEOCT[]|
           ('0b'|'0B') bin $bin MAKEBIN[]|
           dgt $dgt;

 id .. let $symchr;
Character class rules, used in token rules, simply define a name, of a class, to be associated with a character list. The list may include other character classes by name, i.e. the alphanum class is inclusive of the let and dgt classes. Lexical rules are used to define the tokens of a language. The .. defines the preceding identifier as a token rule; above, id and number are token rules. Token rules are simple: in recognizing a token they can use string constants and refer only to character class rules. An interceding conversion function may be called only after the token has been recognized, and interceding functions can only appear as the final operation in token recognition. Character class rules are defined by a : rule; above, let: defines the let character class. Token rules are called from syntax rules.
The normal operation of a token rule is to catalog the token in the symbol table. This may be interceded by calling conversion functions. Numeric conversion functions are used in the number token rule example above.
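A Python sketch of the number token rule above (the character classes are modeled as sets; the scan_number name and the returned (value, position) pair are assumptions of this illustration, standing in for the MAKEHEX[], MAKEOCT[], and MAKEBIN[] interceding functions):

 import string

 classes = {                      # character class rules as Python sets
     'bin': set('01'),
     'oct': set('01234567'),
     'dgt': set(string.digits),
     'hex': set(string.hexdigits),
 }

 def scan_number(text, pos):
     """number .. ('0h'|'0H') hex $hex MAKEHEX[] | ... | dgt $dgt;"""
     for prefix, cls, base in (('0h', 'hex', 16), ('0o', 'oct', 8),
                               ('0b', 'bin', 2)):
         if text[pos:pos + 2].lower() == prefix:
             p = q = pos + 2
             while q < len(text) and text[q] in classes[cls]:
                 q += 1
             if q > p:                              # one or more class chars
                 return int(text[p:q], base), q     # MAKEHEX/MAKEOCT/MAKEBIN[]
     q = pos                                        # last alternative: dgt $dgt
     while q < len(text) and text[q] in classes['dgt']:
         q += 1
     return (int(text[pos:q]), q) if q > pos else None

 print(scan_number("0h1F ", 0))   # -> (31, 4)
 print(scan_number("42;", 0))     # -> (42, 2)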
FORTH metacompiler
Programming in FORTH is adding new words to the language and defining their function. Adding words to the FORTH language is metaprogramming, extending the language and creating a new FORTH dialect. FORTH is a specialized metacompiler for FORTH language dialects.
Starting from a root vocabulary, that is defined using machine operations, a standard FORTH implementation can be defined. FORTH may be used to compile itself.
Many Forth advocates call the process of creating a new implementation of FORTH meta-compilation, and hold that its compiling itself is what makes it a metacompiler. But that ignores the fact that the FORTH metacompiler is actually compiling new vocabulary, creating a full FORTH language implementation from a minimal base vocabulary. It is the programmed creation of new dictionary entries that makes FORTH a metacompiler.
Historical
Metacompilers have played a significant role in both computer science and the build-up of the computer industry. Early metacompilers, including META II[11] and its descendant TREE META,[3] are the better known metacompilers, but many more metacompiler designs were tried during the early history of compiler development.
The ideas about grammars, self-generation and extensions are widely used by Program transformation systems.
Early History of Metacompilers
The early history of metacompilers is closely tied with the history of SIG/PLAN Working Group 1 on Syntax Driven Compilers. The group was started primarily through the effort of Howard Metcalfe in the Los Angeles area.[5] In the fall of 1962 Howard Metcalfe designed two compiler-writing interpreters. One used a bottom-to-top analysis technique based on a method described by Ledley and Wilson.[6] The other used a top-to-bottom approach based on work by Glennie to generate random English sentences from a context-free grammar.[7]
At the same time, Val Schorre described two "meta machines", one generative and one analytic. The generative machine was implemented and produced random algebraic expressions. Meta I, the first metacompiler, was implemented by Schorre on an IBM 1401 at UCLA in January 1963. His original interpreters and metamachines were written directly in a pseudo-machine language. Meta II, however, was written in a higher-level metalanguage able to describe its own compilation into the pseudo-machine language.[26][27][28]
Lee Schmidt at Bolt, Beranek, and Newman wrote a metacompiler in March 1963 that utilized a CRT display on the time-sharing PDP-1.[29] This compiler produced actual machine code rather than interpretive code and was partially bootstrapped from Meta I.
Schorre bootstrapped Meta II from Meta I during the spring of 1963. The paper on the refined metacompiler system, presented at the 1964 Philadelphia ACM conference, is the first paper on a metacompiler available as a general reference. The syntax and implementation technique of Schorre's system laid the foundation for most of the systems that followed. The system was implemented on a small 1401 and was used to implement a small ALGOL-like language.
Many similar systems immediately followed.
Roger Rutman of A. C. Sparkplug developed and implemented LOGIK, a language for logical design simulation, on the IBM 7090 in January 1964.[30] This compiler used an algorithm that produced efficient code for Boolean expressions.
Another paper in the 1964 ACM proceedings describes Meta III, developed by Schneider and Johnson at UCLA for the IBM 7090.[31] Meta III represents an attempt to produce efficient machine code for a large class of languages. Meta III was implemented completely in assembly language. Two compilers were written in Meta III: CODOL, a compiler-writing demonstration compiler, and PUREGOL, a dialect of ALGOL 60. (It was pure gall to call it ALGOL).
Late in 1964, Lee Schmidt bootstrapped the metacompiler EQGEN from the PDP-1 to the Beckman 420. EQGEN was a logic equation generating language.
In 1964, System Development Corporation began a major effort in the development of metacompilers. This effort includes the powerful metacompilers Book1 and Book2, written in LISP, which have extensive tree-searching and backup capability. An outgrowth of one of the Q-32 systems at SDC is Meta 5.[32] This system was successfully released to a wide number of users and had many string-manipulation applications other than compiling. The Meta 5 system incorporates backup of the input stream and enough other facilities to parse any context-sensitive language. It has many elaborate push-down stacks, attribute setting and testing facilities, and output mechanisms. The fact that Meta 5 successfully translates JOVIAL programs to PL/I programs clearly demonstrates its power and flexibility.
The LOT system was developed during 1966 at Stanford Research Institute and was modeled very closely after Meta II.[33] It had new special-purpose constructs allowing it to generate a compiler which would in turn be able to compile a subset of PL/I. This system had extensive statistic-gathering facilities and was used to study the characteristics of top-down analysis.
SIMPLE is a specialized translator system designed to aid the writing of pre-processors for PL/I. SIMPLE, written in PL/I, is composed of three components: an executive, a syntax analyzer, and a semantic constructor.[34]
The TREE META compiler was developed at Stanford Research Institute in Menlo Park, California, in April 1968.
The early metacompiler history is well documented in the TREE META manual. TREE META paralleled some of the SDC developments. Unlike earlier metacompilers, it separated the semantics processing from the syntax processing. The syntax rules contained tree building operations that combined recognized language elements with tree nodes. The tree structure representation of the input was then processed by a simple form of unparse rules. The unparse rules used node recognition and attribute testing that, when matched, resulted in the associated action being performed. In addition, like tree elements could also be tested in an unparse rule. Unparse rules were also a recursive language, being able to call unparse rules, passing elements of the tree, before the action of the unparse rule was performed.
The concept of the metamachine originally put forth by Glennie is so simple that three hardware versions have been designed and one actually implemented. The latter at Washington University in St. Louis. This machine was built from macro-modular components and has for instructions the codes described by Schorre.
CWIC (Compiler for Writing and Implementing Compilers) is the last known Schorre metacompiler. It was developed at System Development Corporation. With the full power of LISP 2, a list processing language, optimizing algorithms could operate on syntax-generated lists and trees before code generation. CWIC also had a symbol table built into the language.
Information about later descendants of these metacompilers is not generally available. With the resurgence of domain-specific languages and the need for parser generators which are easy to use, easy to understand, and easy to maintain, metacompilers are becoming a valuable tool for advanced software engineering projects.
CWIC
In 1969–1970, Erwin Book, Dewey Val Schorre, and Steven J. Sherman developed CWIC.[2] The SDC CWIC (Compiler for Writing and Implementing Compilers) documents are archived at the Charles Babbage Institute Center for the History of Information Technology (Box 12, folder 21).
CWIC is the last documented metacompiler in the Schorre line. It is a metacompiler system composed of three special-purpose languages, each intended to permit the description of certain aspects of translation in a straightforward, natural manner. The syntax language is used to describe the recognition of source text and the construction from it of an intermediate tree structure. The generator language is used to describe the transformation of the tree into the appropriate object language. MOL/360 (Machine Oriented Language-360) is a mid-level systems programming language for the IBM System/360 family of computers, used to provide an interface with the machine and its operating system. MOL/360 was independently developed in 1968 and used to develop the ADEPT time-sharing system.
The syntax language is similar to that of Dewey Val Schorre's line of metacompilers. It most resembles TREE META, having tree building operations in the syntax language. The unparse rules of TREE META are extended to work with the object-oriented generator language based on LISP 2.
An easily accessible source of information on CWIC is "The CWIC/36O system, a compiler for writing and implementing compilers", ACM SIGPLAN Notices, June 1970, volume 5, issue 6, pages 11–29. Archived System Development Corporation documents can be requested from the Charles Babbage Institute Center for the History of Information Technology. The CWIC Technical Manual is archived there (Box 12, folder 21).
CWIC includes three languages:
- Syntax: Transforms the source program input into list structures using specialized grammars. A parsed expression structure is passed to a generator by placement of a generator call in a rule. A tree is a list whose first element is a node. The language has operators, < and >, specifically for making lists. The colon : operator is used to create node objects. :ADD creates an ADD node. The exclamation ! operator combines a number of parsed entries with the most recently created node to make a tree. The syntax rules are processed by generator functions, returning success or failure. The syntax language is very close to TREE META; both use a colon to create a node. CWIC's tree-building !<number> functions the same as TREE META's [<number>].
- Generator: a named series of transforming rules, each consisting of an unparse (pattern matching) rule and an output production written in a LISP 2-like language. The translation was to IBM 360 binary machine code. Other facilities of the generator language generalized output.[2]
- MOL-360: an independent mid-level implementation language, developed in 1968, used for writing the underlying support library.
Generators Language
The Generator Language has semantics similar to Lisp. The tree is thought of as a recursive list. The general form of a Generators Language function is:
 function-name(first-unparse_rule) => first_generator_action;
              (second-unparse_rule) => second_generator_action;
              (third-unparse_rule) => third_generator_action;
              . . .
The code to process a given tree includes the features of a general purpose programming language, plus a form, <stuff>, which emits (stuff) onto the output file. A generator call may be used in an unparse_rule. The generator is passed the element of the unparse_rule pattern in which it is placed, and its return values are bound to the variables listed in (). For example:
 expr_gen(ADD[expr_gen(x),expr_gen(y)]) =>
     <AR + (x*16)+y;>
     releasereg(y);
     return x;
 (SUB[expr_gen(x),expr_gen(y)]) =>
     <SR + (x*16)+y;>
     releasereg(y);
     return x;
 (MUL[expr_gen(x),expr_gen(y)]) =>
     . . .
 (x) =>
     r1 = getreg();
     load(r1, x);
     return r1;
 ...
That is, if the parse tree looks like (ADD[<something1>,<something2>]), expr_gen(x) would be called with <something1> and return x. A variable in the unparse rule is a local variable that can be used in the generator action. expr_gen(y) is called with <something2> and returns y. Of note: a generator call in an unparse rule is passed the element in the position it occupies. Hopefully, in the above, x and y will be registers on return. The last transform is intended to load an atomic object into a register and return the register. The first production would be used to generate the 360 "AR" (Add Register) instruction with the appropriate values in general registers. The above example is only part of a generator. Every generator expression evaluates to a value that can then be further processed. The last transform could just as well have been written as:
(x)=> return load(getreg(), x);
In this case load returns its first parameter, the register returned by getreg(). The functions load and getreg are other CWIC generators.
CWIC addressed domain-specific languages before the term "domain-specific language" existed.
From the authors of CWIC:
"A metacompiler assists the task of compiler-building by automating its noncreative aspects, those aspects that are the same regardless of the language which the produced compiler is to translate. This makes possible the design of languages which are appropriate to the specification of a particular problem. It reduces the cost of producing processors for such languages to a point where it becomes economically feasible to begin the solution of a problem with language design."[2]
See also
- Compiler-compiler
- History of compiler construction
- Self-hosting compilers
- Domain-specific language
- Domain analysis
- Program transformation
References
- ^ a b c d e f g A metacompiler, in the most general sense of the term, is a program that reads a metalanguage program as input and translates that program into a set of instructions. "A Tree Meta For The SDS 940 Argumentation Research Center Stanford Research Institute Menlo Park California" (PDF). p. 101/D1. Retrieved 5 January 2015.
- ^ a b c d e f g h i j k l m n o p q r s t u v w x y Erwin Book; Dewey Val Schorre; Steven J. Sherman (June 1970). "The CWIC/36O system, a compiler for writing and implementing compilers". ACM SIGPLAN Notices. 5 (6): 11–29. doi:10.1145/954344.954345.
- ^ a b c d e f g h i j k l C. Stephen Carr, David A. Luther, Sherian Erdmann, "The TREE-META Compiler-Compiler System: A Meta Compiler System for the Univac 1108 and General Electric 645", University of Utah Technical Report RADC-TR-69-83 (PDF). p. 3 par 3b. Retrieved 5 January 2015.
- ^ a b c Zellig Sabbettai Harris, transformational metalanguages. See Zellig Harris.
- ^ a b c d e Howard Metcalfe, "A Parameterized Compiler Based on Mechanical Linguistics", Planning Research Corporation R-311, March 1, 1963; also in Annual Review in Automatic Programming, Vol. 4.
- ^ a b c d Robert Ledley and J. B. Wilson, "Automatic Programming, Language Translation Through Syntactical Analysis", Communications of the Association for Computing Machinery, Vol. 5, No. 3, pp. 145-155, March 1962.
- ^ a b c d A. E. Glennie, "On the Syntax Machine and the Construction of a Universal Computer", Technical Report Number 2, AD 240-512, Computation Center, Carnegie Institute of Technology, 1960.
- ^ ACM 1964 META II paper.
- ^ Harris argued, following Sapir and Bloomfield, that semantics is included in grammar, not separate from it, form and information being two faces of the same coin.Transformational structure in language
- ^ "META 5: A Tool to Manipulate Strings of Data", Proceedings of the 21st National Conference of the Association for Computing Machinery, 1966.
- ^ a b c d e META II: A SYNTAX-ORIENTED COMPILER WRITING LANGUAGE (Dewey Val Schorre, UCLA Computing Facility, 1964).
- ^ "A GENERAL PURPOSE TABLE-DRIVEN COMPILER". http://dl.acm.org/citation.cfm?id=1464128
- ^ a b c d e f g h Neighbors, J.M. Software Construction using Components. Technical Report 160, Department of Information and Computer Sciences, University of California, Irvine, 1980.
- ^ a b c d e f g h i j k Metacompiler: (computer science) A compiler that is used chiefly to construct compilers for other programming languages. "Sci-Tech Dictionary McGraw-Hill Dictionary of Scientific and Technical Terms, 6th edition, published by The McGraw-Hill Companies, Inc".
- ^ a b c d e f g h i j k l Metaprogramming, Wikipedia, the free encyclopedia.
- ^ a b "Course on Program Analysis and Transformation", by Prof. Harald Sondergaard. Retrieved 18 September 2014.
- ^ a b Czarnecki, Krzysztof; Eisenecker, Ulrich W. (2000). Generative Programming. ISBN 0-201-30977-7.
- ^ a b c d A metacompiler or compiler-compiler is a program whose input is a description of the syntax of a programming language plus a description of its semantics, i.e. of its equivalent in terms of a machine language. The language in which such a description is coded is called a metalanguage. The output of a metacompiler is a compiler for the described language. This compiler can then be used to process a program written in that language, i.e. to examine it for syntactic correctness and to translate it into a machine language or its equivalent. "The CWIC/36O system, a compiler for writing and implementing compilers", Erwin Book, Dewey Val Schorre and Steven J. Sherman. ACM SIGPLAN Notices. June 1970. pp. 12–13. doi:10.1145/954344.954345.
- ^ a b c d In its most general form a metacompiler is a program, written for a machine M, which will accept specifications for a programming language Li and its equivalent in the language of machine Mi, and produce a compiler which runs on machine M. Source programs which are the input to this compiler are written in language Li. The output of this compiler is object language which runs on machine Mi. "The CWIC/36O system, a compiler for writing and implementing compilers", Erwin Book, Dewey Val Schorre and Steven J. Sherman. ACM SIGPLAN Notices. June 1970. p. 13. doi:10.1145/954344.954345.
- ^ a b Meta, Wikipedia, the free encyclopedia; see meta#On higher level of abstraction.
- ^ reductive grammar: (computer science) A set of syntactic rules for the analysis of strings to determine whether the strings exist in a language. McGraw-Hill Dictionary of Scientific & Technical Terms, 6E, Copyright © 2003 by The McGraw-Hill Companies, Inc.
- ^ A metacompiler assists the task of compiler writing by automating its non-creative aspects, those aspects that are the same regardless of the language which the produced compiler is to translate. This makes possible the design of languages which are appropriate to the specification of a particular problem. It reduces the cost of producing processors for such languages to a point where it becomes economically feasible to begin the solution of a problem with language design. See Syntax Rules. "The CWIC/36O system, a compiler for writing and implementing compilers", Erwin Book, Dewey Val Schorre and Steven J. Sherman. ACM SIGPLAN Notices. June 1970. p. 13. doi:10.1145/954344.954345.
- ^ "META II is a compiler writing language which consists of syntax equations into which instructions to output assembly language are inserted. The method of writing compilers, given in detail in this paper, may be explained briefly as follows. Each syntax equation is translated into a recursive function which tests the input for a particular phrase structure, and if successful the input is advanced over the phrase. The first compiler writen by hand was for a language called META I. META I was then used to implement the improved compiler META II." "METAII " (PDF). Retrieved 5 January 2015.
- ^ There are two classes of formal-definition compiler-writing schemes. The productive grammar approach is the most common. A productive grammar consists primarily of a set of rules that describe a method of generating all possible strings of the language. The reductive or analytical grammar technique states a set of rules that describe a method of analyzing any string of characters and deciding whether that string is in the language. "The TREE-META Compiler-Compiler System: A Meta Compiler System for the Univac 1108 and General Electric 645, University of Utah Technical Report RADC-TR-69-83. C. Stephen Carr, David A. Luther, Sherian Erdmann" (PDF). p. 3 par 3b. Retrieved 5 January 2015.
- ^ Dewey Val Schorre (1963). "A Syntax-Directed SMALGOL for the 1401". ACM Natl. Conf., Denver, Colo.
- ^ Meta I is described in the paper given at the 1963 Colorado ACM conference. See SMALGOL.
- ^ "Free Online META II Course". Learn META II using the free online META II compiler. Course developed by Jim Neighbors. META II was used in the analysis and transformation of code. See Software Construction using Components.
- ^ L. O. Schmidt, "The Status Bitt", ACM SIGPLAN Working Group 1 News Letter, 1964.
- ^ Roger Rutman, "LOGIK, A Syntax Directed Compiler for Computer Bit-Time Simulation", Master Thesis, UCLA, August 1964.
- ^ F. W. Schneider and G. D. Johnson, "A Syntax-Directed Compiler-Writing Compiler to Generate Efficient Code", Proceedings of the 19th National Conference of the Association for Computing Machinery, 1964.
- ^ D. Oppenheim and D. Haggerty, "META 5: A Tool to Manipulate Strings of Data", Proceedings of the 21st National Conference of the Association for Computing Machinery, 1966.
- ^ Charles R. Kirkley and Johns F. Rulifson, "The LOT System of Syntax Directed Compiling", Stanford Research Institute Internal Report ISR 187531-139, 1966.
- ^ George J. E. (1967a). Syntax Analyzer, Recognizer, Parser and Semantic Interpretation System, Stanford Linear Accelerator Center, November 15, 1967.
[[Category:Compiler construction]]
[[Category:Program transformation tools]]
[[Category:Extensible syntax programming languages]]
[[Category:Domain-specific programming languages]]
[[Category:Program analysis]]
[[Category:Parser generators]]
[[Category:Software design]]