User:Thepigdog/Relational meta programming
In a relational programming environment, meta programming is the generation of statements from data. For example a parser may create statements from text.
Creating statements that are immediately available to the running program creates problems. To avoid these problems, execution of a program is divided into phases. Execution proceeds in one phase until nothing more can be done.
Then all meta statements generated in this phase are converted into statements, and execution continues, using both the original statements and the statements newly created from meta statements.
History
Many code generation systems have been written that generate source code as text. A typical system records meta data in XML and then generates source code. This technique is often used to implement a software framework, or parser. Code is generated to implement,
- an efficient parser for a syntax.
- classes to use for database access.
- classes for allowing access to code from different libraries or languages.
- controller code for interfacing between the GUI (presentation layer) and the model classes, in Model View Controller or any of its variants.
Code generation is often used to generating code for each class and method, which cannot be automatically achieved in the base language.
More recently reflection is often used instead of XML as the preferred repository of the interface definition. A class may be queried for data describing the signature. Extra data may be added to the signature as attributes.
The advantage of generating programs as text is that the generated code may be viewed and checked. The design of code generation may be regarded as two tasks,
- Design the code that needs to be generated.
- Design the means of generating the code.
The problem with this process is that it is more work than simply writing code. More recently languages like ruby allow the construction of methods at run time. Two problems appear with this approach,
- The generated methods are not readily available for inspection.
- There is no clear demarcation of when the code generation occurs.
So the developer may not know what methods are available at a particular point in the program, because new methods may be created in any part of the program.
These problems may be alleviated by careful design and discipline.
Motivation
Everything that can be done using meta programming, can be done without it, but at the expense of implicitly or directly interpreting actions, instead of directly calling methods. Interpreting actions is deciding what action to take based on data, instead of directly coding actions in the language. In the extreme case interpreting actions means constructing an interpreter within the language.
Meta programming allows direct access to the language interpreter/compiler, so avoids the need to interpret actions. Meta programming allows a general description of a problem to be given, and then analyzed to generate the specific actions to be generated to provide an efficient solution to the problem to be generated for the specific case.
For logic programming, general descriptions of problems are given as possibly unsolved equations. Then equation solving algorithms are used and the specific constraints and parameters needed are used to generate an efficient solution for a particular solution.
Program model
A program is logic describing some functionality. A program then has a dual function, as,
- A high level semantic description of functionality, which can be analyzed and queried.
- An implementation of functionality.
Current programming languages provide only the second function. The program exists only as text or as an implementation.
The program model paradigm is that the source code is a representation of the program, supported by the following tools,
- A parser to create the program from the source code text.
- A writer to create text from the program.
- An interpreter/compiler to run the program.
The program may be represented in multiple languages, with a parser/writer pair for each language.
The program may be analyzed by meta programs, in order to,
- Display the logic.
- Analyze the code.
- Perform equation solving.
- Simplify the code.
- Ask particular questions about code logic.
- Restructure code to match a new paradigm.
- Check for security vulnerabilities.
The key requirements for a logic programming system to implement the program model are,
- Language independence (the source code text is not the program).
- Parsers and writers trivially written as logic programs.
- Minimal core of functionality implementing the running of programs as an interpreter.
- Compiler written as a meta program.
- Meta programs may read and write programs.
- The representation of program, and meta program is identical.
To bootstrap the development, programs are represented as data structures in another programming language to allow the creation of programs without writing a parser for the language. This insures that the program logic is the core component, not the syntactical representation of the program as text.
Polymorphic meta programming
Code generation systems create program code as text. This is a clumsy method of constructing code. A more flexible method is to construct structured data in the program that may be directly converted to code. This data may be constructed in the data structures of the language, but this is also a clumsy method of representing program code.
Meta program code should be represented in exactly the same form and language structure as code. The only distinction should be an operator called meta (represented here by the Greek letter eta () that deliminates code from meta code. Two situations occur,
- Meta code within code - represented by .
- Code within meta code - represented by .
For example a precedence parser may include a rule for a term in an expression like,
This expression describes how the function term translates a string into a meta expression. The clumsy expression on the right is,
This expression may be simplified by using of polymorphism. An operator applied to meta expression inputs returns a meta expression. The same operator applied to expression inputs returns an expression.
This allows the clumsy expression above to be written as,
Because and return meta expressions the operator + applied to meta expressions returns a meta expression representing the addition of the factor and the term.
For many situations this eliminates the need for and makes the program more readable. Polymorphism may be used only where there is no meaningful interpretation of the expression. For example,
Adding two meta expressions has no natural meaning so may be used to represent the meta expression for the addition of the two values,
However in,
has means the is the meta expression x a member of the set of member expressions a, b, and x. So the expression is true. Then it is not valid to use polymorphism to interpret this statement as,
Generating a meta identifier from a string
GenMetaId is a function that converts a string into a meta identifier, which is a meta expression. Then an operator applied to the meta expression creates another meta expression. All operators are polymorphic because if applied to an expression, they produce a result, but applied to a meta expression, they produce another meta expression.
A precedence parser
A parser is a function that converts a character string into an a statement that will be asserted in the next phase. A precedence parser allows binary operators such as +, *, -, / but assigns to each operator a precedence number. The precedence number determines how expressions are grouped. For example,
because the operator * has a lower precedence than +.
where,
- precedence returns the precedence number for an operator string.
- GenMetaId creates and returns a meta identifier from a string
- IsName validates the syntax of the string.
- constant parses a string and makes a constant value
- ws checks that a string is one or more characters of white space.
- ews checks that a string is zero or more characters of white space.
Phased evaluation
A relational programming language will equate a meta expression with an expression, which asserts a new meta fact in the logic system. If these meta facts are immediately interpreted as facts, then new facts may be added at any time to the logic system.
The open-world assumption allows the addition of new facts as long as they do not contradict old facts, as this would make the system of facts inconsistent.
???? notes ????
Negation as failure is obviously wrong. The prior probability of a statement being true which cannot be shown to be true is related to the information content of the statement. So it is small but not zero. It seems not good to assume false :(
Curry's paradox
One example of Curry's paradox is created if there is an eval function that converts a string into an expression.
- s = "eval(s) → y"
Then,
- eval(s) = eval(s) → y
which is contradiction produced solely from an expression. It is then a paradox, or inconsistency in the language.
This highlights a problem when we wish to introduce meta programming features into a logic programming environment. For example we may wish to write a parser for a logic programming language in a logic programming language.
For simplicity parsing the program text should produce expressions that may be immediately asserted and used, which leads to the contradiction above. The contradiction is avoided if the parser produces a meta expression. A meta expression is data describing an expression, but for which evaluation has not yet been requested.
This data may be manipulated using logic. When the meta expression is asserted, then it will become an actual expression in the next evaluation phase.
For the paradox described above, eval(s) is a meta expression. Asserting a meta expression in this phase asserts it as logic in the next phase. So,
- eval(s) = meta (eval(s) → y)
and in the next phase,
- eval(s) = meta (eval(s) → y) → y
The contradiction never arises as it is comparing an expression with a meta expression. ???
Open and closed logic systems
When we assert,
then we may believe that.
but it is possible that s could have other values. If the system of equations is closed, then negation as failure will determine that s only has the two elements a and b.
The use of meta programming effectively opens the system of equations by allowing new statements to be constructed from strings. Phased evaluation allows these statements, but only in the next phase of evaluation.
Statement migration between phases
In the next meta phase, all statements that were true in the previous meta phase are true in this phase, plus any meta expressions asserted true in the previous phase.
- What gets evaluated???
- How is the parsing of text request ???
- How is the resulting logic invoked ???
- Saving the program at the end of a phase for later execution (e.g. like a compiler).
- Manipulation of meta statements.
Inheritance and polymorphism
The type set of a variable is the smallest possible set membership consistent with the simple set membership constraints on a variable. A type in a programming language is the implementation of a type set.
In a programming language the type for a variable is usually declared in a type declaration. Mathematics typically uses set membership conditions instead instead of type declarations. Set membership constraints on a variable may be combined to give the type set for the variable.
The TypeSimplify function defined below implements these conditions as a function to determine the types of variables. Type states are used to accumulate the type set information for each variable.
The last rule uses to refer to the identity of the variable. This identity will be used to identify the variable in the type state. These rules also imply,
which is the form often used for efficiency.
These rules may be use to obtain a set of types for variables T, from an expression E. E may be a conjunction of many conditions.
The type for any variable v may be obtained from T by an expression like,
...
State
A state is a type which is usually used to modify the implicit state in an imperative language. It is a set of values associated with variables. The usual definition is,
Description | Definition |
---|---|
Defined 1 | |
Defined 2 | |
Defined 3 | |
Get value 1 | |
Get value 2 | |
Default value |
State (set implementation)
A state may be implemented using a set of variable name, value pairs. The definition is given below,
Description | Definition |
---|---|
Empty state | |
Defined | |
Get value | |
Default value | |
Assignment | |
Assignment (alternate definition) |
Type state
Some additional rules apply to type states. These rules are more easily described using the set implementation,
Description | Definition |
---|---|
Conjunction | |
Disjunction |
For type states, the default value is defined as,
Example
State and imperative programming
Imperative programs are characterized by having implicit state. On the surface imperative programs are very different from functional programs. However monads provide a way of introducing imperative commands into a functional language, by hiding the state in the monad object.
But monads do not give a completely natural way of representing imperative programming. They do not provide a general model for translating an imperative program into a functional language.
Transformations on imperative programs
Mathematical expressions may be transformed using mathematical laws to give equivalent expressions. The same ability is needed for imperative programs. A mathematical model for an imperative program allows a change in the execution order of the program, without changing the overall effect of the program.
State holder
Using the relational evaluation model, a simple model of state transfer may be given.[1] A statement in an imperative program has,
- Input state i
- Input parameters p
- return value x
- Output state o
It is natural in a functional to consider the input state as another parameter. But using the relational evaluation model this is not necessary. the relation model allows the output to be used in determining the input. For example in,
x is determined to be 3, when it is only input to the function.
A state holder is represented by a .
- s - state identifier.
- i - input.
- x - value.
- o - output.
The state identifier will be used to break up monolith state into into individual state variables. Monolithic state is represented by a single program state including all variables.
Because a state holder takes its input from its output, it does not behave in the same way as other mathematical objects. But with its own set of laws it may be included in a mathematical framework.
The advantage of using a state holder is that state may be transferred through any parameter, providing a full model of imperative programming. The state transfer is completely hidden, and there is a direct translation between imperative programs and state holder programs. State only needs to be transferred where needed.
In comparison monads provide only a linear transfer through primary parameters, but not through every parameter.
A disadvantage is that translating an imperative program to a state holder form does not make it obey the fundamental axioms of equality. Different equality axioms are followed by state holders. Separate transformations must be applied to put the code in a functional form.
State identifiers
The usual approach to modelling imperative programs is to consider them to have a single state. State variables access this state to retrieve the value of the variable in that state. However this approach specifies a single execution order on the program. For example,
This is obvious if x and y are modeled separately as two different state variables, but not obvious if there is a single monolithic state. The monolithic state is modified and accessed in both the statements, therefore the must be executed before .
However if x and y are two separate states then and are independent statements that each access there own state variable and the order of execution does not effect the result.
The way in which states interact is defined using the following rules,
State holder rule exclusions
State holders represent imperative statements. Imperative statements have side effects on the state, and so are not mathematical objects. So a state holder does not obey some of the fundamental laws of mathematics. For example,
Instead it only is defined by,
State holder objects do not obey the substitution rule,
The application of lambda abstracts relies on substitution. The beta reduction rule,
does not hold if s is a state holder. So it is not always true that,
The rules to apply are determined by the type of the variable. Variables of type state holder must be excluded from these rules.
State holder rules
Equality and function definition
Equality is defined to have different types depending on the return type of the variable. Where s is any state holder type, the signature,
gives equality is defined by,
For the type,
Equality has it's mathematical definition, and substitution is allowed,
In imperative languages there will be function definitions which are equivalent to mathematical equality, where equality used in the body of the function is always of the type,
Assignment statement
A C style assignment operator may be defined using monolithic state as.
Using state identifiers, the assignment statement may be defined as,
Value lookup
For a state s,
where is the expression y with free occurrences of x replaced by i[x]. Free and bound variables are concepts from lambda calculus. A lambda abstraction binds it's variable. Also a state holder binds it's state identifier.
For a state variable x,
where is the expression y with free occurrences of x replaced by i.
Statement composition
The semicolon ; is used in many languages to separate statements, which are executed sequentially.
Function application
Function application of state holders are defined by the following rules,
Using the state identifier rules, the last rule implies,
Lambda abstraction
Lambda abstraction of state holders is defined by,
Conditional evaluation and execution control
Conditional evaluation is a form of function application where some parameters are ignored if the value is not needed. Where the value being omitted is a state holder, the transfer of state through the parameter is omitted in some languages, giving control of the execution path.
In this case these functions are treated differently and the function application rules are not applied.
If statement
An if statement may be regarded as a function, with 3 parameters.
- Condition
- True case
- False case
The condition function may be state-full or stateless. Stateless is defined by,
where a and b may be state holders, or other types.
State-full is defined by
And expression
In some imperative languages such as C the second condition is only executed if required to compute the result. This is defined by,
Or expression
Similarly for an or exression || in C the second condition is only executed if required to compute the result. This is defined by,
Loops
The while statement is defined by,
If b and c are not a state holders this statement either does nothing or expands recursively forever. If b or c are state holders, the value returned by c does not need to be the same each time, even though the expression is the same.
Local state variables
Return values
The natural structure is that the body of the function is an expression. The return value of the function is the value of the expression. The function call is then equal to the body of the function, with formal parameters replaced by actual parameters. The return value may be stateless, or it may be a state holder.
However, many imperative languages like C ignore the value returned from the expression and rely on the return statement to capture a value from a state. An extra structure called the return frame is needed to implement this structure.
Return frame
The return frame accepts the return value. The value may be stateless, or a state holder.
Return statement
The return statement is defined by,
Pointers arrays and references
Time branching
Roles
A role is an implicit parameter to a function, identified by a name, whose value is requested from its calling context. If the calling context, within another function definition, does not provide the value, then the role becomes a request of any call to the function. The request is passed up the calling hierarchy until the value is provided.
The purpose of roles is to provide an implicit mechanism for constructing parameter parsing through a hierarchy of multiple function calls without explicitly defining the actual or the formal parameter. The role requests the value and the parameter parsing to transport the value is constructed automatically by meta programming.
This has advantages,
- The calling sequence is constructed automatically, eliminating bugs caused by manual construction.
- The reader of the program is not distracted by parameters that are not needed locally in the function.
- A change to a program may introduce the need for a value in a function, without adding all the parameter passing needed to deliver the value.
Role preprocessor
A role pre-processor for a language processes source text and identifies two meta functions,
- provide - Identifies the value for later use.
- request - Requests the creation of all the parameter passing needed to get the value here.
The pre-processor implements a request by adding an extra formal parameter to a function for each value requested in it. The parameter will be used to pass in the requested value. Then for each function call, add the actual parameter and implicitly regard that value as requested in its containing function.
Recursively implement requests in that way until a provide meta function is found for the same name.
The pre-processor could obtain the type of the variable from the provided variable. Alternately the provide meta function could give the type.
Example
In this example, someParameter is provided in the main function, but is needed in MyDetailedLowLevelCalculation.
Making someParameter a global variable would make it harder to later parameterize any of the functions to be called with different values.
Another alternative would be to create a class as a container for all my methods. someParameter could be a member variable which is populated in the constructor. But in a more complex example these methods may already be across different classes. I may end up parsing the value between classes, to get it where I want it.
The most direct and flexible approach is to parse the value as a parameter, but as well as being distracting for the reader, and adding additional parameters to each call, the developer has to do more work, all of which may need to be removed later if the requirements change.
In this example a role-preprocessor adds the extra parameters. The text is show after the pre-processor has been run with added text surrounded by comments as in,
/* Role processor add */ someParameter, logger /* end add */
In the code ... represents code omitted for the sake of brevity. The example code is,
#include ...
#define provide(x, t) // Role processing directive does not do anything here
#define request(x)
class ObjectToBeProcessed;
void MyProcessing();
void MyProcessObject();
double MyDetailedLowLevelCalculation(double v);
void main(int argc, char **argv)
{
Logger logger;
double someParameter = atof(argv[0]);
...;
provide(someParameter, double); // I have the value someParameter, if anyone wants it.
provide(logger, Logger &); // Also I am providing the logger.
MyProcessing( /* Role processor add */ someParameter, logger /* end add */);
}
void MyProcessing( /* Role processor add */ double someParameter, Logger &logger /* end add */)
{
...
{
ObjectToBeProcessed *o = ...;
MyProcessObject(/* Role processor add */ someParameter, logger, /* end add */ o);
}
...
}
void MyProcessObject( /* Role processor add*/ double someParameter, Logger &logger /* end add */)
{
...
double myValue = ...;
double value = MyDetailedLowLevelCalculation( /* Role processor add */ someParameter, /* end add */ myValue);
// Add logging here, for debugging purposes. I might take it out later.
// The logger would normally be a global variable, but again this reduces flexibility.
request(logger);
logger << "MyProcessObject: result = " << to_string(value);
...
}
double MyDetailedLowLevelCalculation( /* Role processor add */ double someParameter, /* end add */ double v)
{
// Oops, now I need some parameter.
request(someParameter); // Please construct all the parameter passing to get the value here.
// See the code added by the role processor.
return v * someParameter; // Some calculation
}
A role pre-processor would be hard to implement in a language without meta processing functionality, because the pre-processor would need to parse and understand at least some of the syntax of the language. Identifying the type of the variable would be challenging.
However a role pre-processor may be written as a few rules in a language supporting meta-programming.
Roles in lambda calculus
Named lambda expression is producer.
Request is
Axioms
Alpha rename applies to and the named .(\lambda_x y.f) w
The following example demonstrates a request. Here the request gets the value w from .
- - beta reduce
Notes.
- Reduction all ways reduce to a single value. Proof?
- However the same lamda term can represent different values.
Solving equations with normal forms
Many simple equations may be solved using normal forms. A normal form of an expression is a representation of an expression such that if the normal forms of two expressions are identical if the two expressions are equal.
Here means the meta expression for x.
In constructing normal forms it is often the case where some duplicate instances of variables are removed. The remaining expression will have a common structure, so that general approaches to solving the equation may be applied.
Normal forms and data structures
There is a correspondence beteen normal for forms and data structures. For a single operator an expression represents a tree structure.
Any change of bracketing may represent a different value. The normal form is the expression including the bracketing. The normal form of two elements combined is
where v is a variable or constant.
The associative law is,
If this law holds for a particular operator then the normal form of the structure reduces to a list,
- ????
Next if the operator is a Abelian then the commutative law applies,
Then the normal form of the structure reduces to a set, ???
where,
Combining normal forms
In this definition it is usual to take equality to allow renaming of any local variables.
References
- ^ Mccarthy, J. (1962). "Towards a Mathematical Science of Computation". In IFIP Congress.