The Arbol Programming Language

New home at iMonad.com [20.11.2008]
last revision: 0.02 [28.02.2006]
created: [13.08.2005]
by Rossen Radev

Arbol language

Table of contents

Introduction

árbol (Spanish) ár·bol m. - tree

Arbol is primary developed to serve as a programming language for Genetic Programming experiments. It is functional programming language inspired by the ideas of other small and esoteric languages ( Unlambda, Lazy K, Joy, Iota, Zot, ...) and purely functional language Haskell (the STG machine). Current version of Arbol language is 1.0. The following sections describe structure of Arbol programs and show some usage examples.

Tutorial

We begin with an example, to give the flavor of what the Arbol program look like:

// Simple I/O
((Z a)b)     = (((a b)a)I1)      // some input
(((ZZ a)b)c) = ((a b)a); O1 = c  // and output
main         = ((Z ZZ) Z)

This program outputs its input, i.e. like a Unix cat program.

Program structure

The Arbol programs are composed of unordered list of definitions (rewrite rules) separated with new line. Each definition have left and right part separated by "=".

Each definition is a rewrite rule that have tree representation.
For example:

((Z a)b)     = (((a b)a)I1)
Rewrite rule that - input

Where Z is the name of the pattern (definition) and the a,b,c,d,e,f,...,z are variables. The "@" sign denotes the function application.

On the right side of the definition allowed patterns can contain any number of variables found in the left hand side, brackets denoting the application operators and the names of the rewrite rules that are defined before or after current definition. There is one main definition with the name "main" that have no variables.

Because of the uniqueness of this definition in the program, the name "main" can be omitted:

((Z ZZ) Z)

Instead of

main = ((Z ZZ) Z)

Combinators

For introduction on combinators see the Unlambda manual, Combinator or the paper "To Dissect a Mockingbird".

Input/Output

Input and output in functional languages is tricky business. In Combinatory Logic the input is performed by applying the combinator X to the input stream of combinators a1, a2, a3, ..:

     X        {a1, a2, a3, ...}
=  (X a1)     {a2, a3, a4, ...}
= ((X a1) a2) {a3, a4, a5, ...}
= ...

This input stream can be data or program itself. This is done by "attaching" the combinator via the Apply node to the first data and in this way creating the new combinator.

In Unlambda the specialized input/output functions are used (i.e. 256 primitive ".*" functions). When evaluated they have the side effect of reading or writing the characters to the standard input/output.

We use another approach to the input and output:
Our approach is not limited to 256 characters but capable to input/output different type of data structures and characters (if encoded using some encoding scheme for example Church Numerals). We have input and output streams (I1, I2, ...., O1, O2, ...). Some of the definitions (rewrite rules) can work not only on the program tree but on program tree and input-output streams. So we can write rule that rewrites the tree and attach the subtree (detaching it from the input stream) and can write rule that rewrites the tree and detach the subtree (attaching it to the output stream).

The input stream is denoted by I1, I2, ... . After applying the rewrite rule the I1, I2, ... are replaced with the current subtree available from the input streams. To denote the output we write two or more definitions separated by semicolon. The first one represents the graph rewrite rule applied to the program tree and the following definitions represent the rewrite rules applied to output streams.

For example:
Input (as in diagram above)

((Z a)b)     = (((a b)a)I1)
Rewrite rule that - input

Output

(((ZZ a)b)c) = ((a b)a); O1 = c
Rewrite rule that - input

We can use the input and output together in one definition:

(((XYZ a)b)c) = (((a b)a) I1); O1 = c

It is possible to connect directly the input and output too.

 (ZZZ a)= (ZZZ a); O1=I1

Here we can find the analogy in biology. The production of proteins for example is performed by transforming the input stream of atoms and molecules to output complex molecules (proteins). The transformation is performed by interpreting the code contained in the DNA.

Standard prelude

The standard prelude contains definitions for some of the most used combinators

 
 (I a)      =  a
((K a)b     =  a
(((S a)b)c) =  ((a c)(b c))
(ZA a)      =  (a I1)
((ZB a)b)   =  b; O1 = a
This prelude contains combinators (SKI calculus) that are enough to program without additional definitions (i.e. Turing-complete language). In this way we can write programs that contain only one definition - "main", and our programs become similar to the Unlambda programs (instead of the input/output functions).

Examples

"Hello World"

The input/output system make difficult to write simple "Hello World" program. The only possible way is to encode the some alphabet in some combinator scheme and to create program that outputs the "Hello World" string encoded as combinator trees stream.

To be written

Self-reproducing code - Quine:

To be written

We have to note the fact that in the current version of the language there is no way the self-replicating program to reproduce the definitions. The program can produce only the copy of the "main" program tree. If we use the "minimal" version of the language (i.e. using only the combinators from the prelude) it make possible to write such self-replicating programs.

Here we see the analogy with the chemistry in biology. The chemical properties of components involved in the self reproduction of organisms exist "per se", and are not necessary to be reproduced. The reproduction is done only by reproducing of the structure (i.e. composition of chemical components).

Most of the times in our GP experiments we use the fixed set of definitions, and we write programs that are constructed only from the "main" definition. The definitions in this sense play the role of "Language definitions", and we program in a such fixed sub language (artificial chemistry).

Reverse of input:

To be written

Genetic Programming with Functional Programming Languages

We have developed (and continue to develop) the Arbol language to serve as a programming language for Genetic Programming experiments. Currently we develop the GP system for forecasting of time series. Some results and more information will be available soon.

Download

Coming soon. The implementation is written in OCaml. If it is too slow there is possibility to write it in C in the future.

Interpreter implementation

The first experimental version of interpreter have been written in OCaml (G-machine implementation).

Current version have C kernel and is simplified version of Spineless Tagless G-Machine.

Status:

References

[iMonad home]