New home at iMonad.com [20.11.2008]
last revision: 0.02 [28.02.2006]
created: [13.08.2005]
by Rossen Radev
Arbol is primary developed to serve as a programming language for Genetic Programming experiments. It is functional programming language inspired by the ideas of other small and esoteric languages ( Unlambda, Lazy K, Joy, Iota, Zot, ...) and purely functional language Haskell (the STG machine). Current version of Arbol language is 1.0. The following sections describe structure of Arbol programs and show some usage examples.
We begin with an example, to give the flavor of what the Arbol program look like:
// Simple I/O ((Z a)b) = (((a b)a)I1) // some input (((ZZ a)b)c) = ((a b)a); O1 = c // and output main = ((Z ZZ) Z)
This program outputs its input, i.e. like a Unix cat program.
The Arbol programs are composed of unordered list of definitions (rewrite rules) separated with new line. Each definition have left and right part separated by "=".
Each definition is a rewrite rule that have tree representation.
For example:
((Z a)b) = (((a b)a)I1)
Where Z is the name of the pattern (definition) and the a,b,c,d,e,f,...,z are variables. The "@" sign denotes the function application.
On the right side of the definition allowed patterns can contain any number of variables found in the left hand side, brackets denoting the application operators and the names of the rewrite rules that are defined before or after current definition. There is one main definition with the name "main" that have no variables.
Because of the uniqueness of this definition in the program, the name "main" can be omitted:
((Z ZZ) Z)
Instead of
main = ((Z ZZ) Z)
For introduction on combinators see the Unlambda manual, Combinator or the paper "To Dissect a Mockingbird".
Input and output in functional languages is tricky business. In Combinatory Logic the input is performed by applying the combinator X to the input stream of combinators a1, a2, a3, ..:
X {a1, a2, a3, ...}
= (X a1) {a2, a3, a4, ...}
= ((X a1) a2) {a3, a4, a5, ...}
= ...
This input stream can be data or program itself. This is done by "attaching" the combinator via the Apply node to the first data and in this way creating the new combinator.
In Unlambda the specialized input/output functions are used (i.e. 256 primitive ".*" functions). When evaluated they have the side effect of reading or writing the characters to the standard input/output.
We use another approach to the input and output:
Our approach is not limited to 256 characters but capable to input/output different type of
data structures and characters (if encoded using some encoding scheme for example Church Numerals).
We have input and output streams (I1, I2, ...., O1, O2, ...). Some of the definitions (rewrite rules)
can work not only on the program tree but on program tree and input-output streams.
So we can write rule that rewrites the tree and attach the subtree (detaching it from the input stream)
and can write rule that rewrites the tree and detach the subtree (attaching it to the output stream).
The input stream is denoted by I1, I2, ... . After applying the rewrite rule the I1, I2, ... are replaced with the current subtree available from the input streams. To denote the output we write two or more definitions separated by semicolon. The first one represents the graph rewrite rule applied to the program tree and the following definitions represent the rewrite rules applied to output streams.
For example:
Input (as in diagram above)
((Z a)b) = (((a b)a)I1)
Output
(((ZZ a)b)c) = ((a b)a); O1 = c
We can use the input and output together in one definition:
(((XYZ a)b)c) = (((a b)a) I1); O1 = c
It is possible to connect directly the input and output too.
(ZZZ a)= (ZZZ a); O1=I1
Here we can find the analogy in biology. The production of proteins for example is performed by transforming the input stream of atoms and molecules to output complex molecules (proteins). The transformation is performed by interpreting the code contained in the DNA.
The standard prelude contains definitions for some of the most used combinators
(I a) = a ((K a)b = a (((S a)b)c) = ((a c)(b c)) (ZA a) = (a I1) ((ZB a)b) = b; O1 = a
"Hello World"
The input/output system make difficult to write simple "Hello World" program. The only possible way is to encode the some alphabet in some combinator scheme and to create program that outputs the "Hello World" string encoded as combinator trees stream.
To be written
Self-reproducing code - Quine:
To be written
We have to note the fact that in the current version of the language there is no way the self-replicating program to reproduce the definitions. The program can produce only the copy of the "main" program tree. If we use the "minimal" version of the language (i.e. using only the combinators from the prelude) it make possible to write such self-replicating programs.
Here we see the analogy with the chemistry in biology. The chemical properties of components involved in the self reproduction of organisms exist "per se", and are not necessary to be reproduced. The reproduction is done only by reproducing of the structure (i.e. composition of chemical components).
Most of the times in our GP experiments we use the fixed set of definitions, and we write programs that are constructed only from the "main" definition. The definitions in this sense play the role of "Language definitions", and we program in a such fixed sub language (artificial chemistry).
Reverse of input:
To be written
We have developed (and continue to develop) the Arbol language to serve as a programming language for Genetic Programming experiments. Currently we develop the GP system for forecasting of time series. Some results and more information will be available soon.
Coming soon. The implementation is written in OCaml. If it is too slow there is possibility to write it in C in the future.
The first experimental version of interpreter have been written in OCaml (G-machine implementation).
Current version have C kernel and is simplified version of Spineless Tagless G-Machine.
Status: