Log in


Previous Entry Share Next Entry
2008-01-20 17:24: Fix

I recently had occasion to write a Forth interpreter in Linden Scripting Language (in Second Life). I've done it before (more than twice) and it only took me an hour or two. I've written the odd Scheme interpreter (and compiler) too. Forth and Scheme are both elegant little things, and I've always wanted to think of a way of combining them, but with the aim of making Scheme smaller and simpler, but making Forth a bit more workable.

Well, I've come up with something that's smaller than either, but may not be at all workable.

This is somewhat stream-of-consciousness language design. If you think this is somewhat rambling and random you should have seen what I edited it down from. I'm not claiming that this is brilliant stuff — it's just something that caused me a brainstorm and that I've enjoyed thinking about. I'm posting it in case other people might enjoy it, and who knows, it might be useful for something. But it's probably an esolang.

For the sake of this article I'm calling my language "Fix".

One of the big advantages of a Forth interpreter is that there is no need for recursive parsing, and that helps keep it very small. Naturally there are recursive things in the language (nested if/then and loops etc.) but these are implemented in Forth itself in a way which I won't explain right now. It would be reduce the size of a Scheme considerably if you didn't need a recursive reader.

I found out some other people have thought about it. There's Joy and Factor but they didn't go far enough, I think, and I wanted to explore really deep functional ideas.

The part that is right, though, is to consider each "word" as a function that takes a stack and returns a stack. All the interpreter does is execute words in order, apply them to the stack. (See Concatenative programming language). However, you need some means of abstraction. In Forth you define a word like this:

: my-word ... ;

where ":" reads the next token and defines a new word in the dictionary, then flips the interpreter into "compilation" mode, where instead of executing words their addresses are pushed onto the end of the dictionary. The ";" adds a "return" instruction to the dictionary and flips the interpreter back into "immediate" mode where it runs words as it reads them. It's extremely simple. Replace "..." with the words in the body of the definition.

So what if it was an anonymous definition? Instead of defining a new word in the dictionary the address where the words are compiled could just be pushed onto the stack. I'll write the anonymous word definer as "{" and the end as "}". So instead of:

: double 2 * ;
5 double

which would print "10". You could write:

5 { 2 * } execute
where "execute" calls the address at the top of the stack. Note the syntax similarity to PostScript.

(Note: this immediately introduces problems of garbage collection, since the compiled code is now inaccessible.)

I'm a fan of Church encoding and the first thing I think about in a language is how it might be achieved. I'm particularly fond of Church encodings in System F but that's another story.

The Church booleans are:

  true  ::= λxy.x
  false ::= λxy.y

So that

  true "yes" "no" --> "yes"
  false "yes" "no" --> "no"

In Fix, you could define these thus:

  true  ::= swap drop
  false ::= drop

("Swap" is a Forth word which swaps the top two things on the stack, "drop" discards the top thing.)

So now

"no" "yes" true --> "yes"
"no" "yes" false --> "no"

Bear with me on the order reversal. It will make sense in a moment. Consider the Church numbers, where the number N is something which applies a function N times:

  0    ::= λfa.a
  1    ::= λfa.fa
  2    ::= λfa.f(fa)
  succ ::= λnfa.f(nfa)
  add  ::= λmn.m succ n   or λmnfa.mf(nfa)
  mul  ::= λmn.m (add n)  or λmnf.n(mf)
  exp  ::= λmn.nm

Well, in Fix, the number N can be encoded as something which executes the top element of the stack N times, applying it to the rest of the stack, without caring how many items it produces or consumes. So where can we keep the thing being applied while the whole stack is fed to the top thing? This problem occurs in Forth too, and there are two words ">R" and "R>" which transfer one thing from the operand stack to the return stack, and vice versa. I hadn't realized it before, but thinking back to the Forth I've written, these are exactly so you can do higher-order stuff in Forth.

  0    ::= false          (See, I told you it would make sense.)
  1    ::= execute        (I'm going to start writing "1" for execute.)
  2    ::= dup >R 1 R> 1
  succ ::= {dup >R} swap cat {R> 1} cat

What?! OK, well it turns out that concatenating two sequences of words is the same thing as composing two functions. So

  {a b c} 1 {d e f} 1

is the same as

  {a b c d e f} 1

So we introduce an operator "cat" which joins sequences:

  {a b c} {d e f} cat

(This wouldn't actually be implemented as list concatenation. It can simply compile a two-word object that calls the first then second thing, so it's a lot like a list cons.)

So let's see how "succ" works. Here's an execution trace. The stack is on the left, the remaining program is on the right.

        0                       succ
        0                       {dup >R} swap cat {R> 1} cat
        0 {dup >R}              swap cat {R> 1} cat
        {dup >R} 0              cat {R> 1} cat
        {dup >R 0}              {R> 1} cat
        {dup >R 0} {R> 1}       cat
        {dup >R 0 R> 1}

It should be clear to see that "dup >R 0 R>" is a no-op, so this is the same as "1". In general,

        n                       succ
        {dup >R}n{R> 1}

which is a function which executes the top thing on the stack one more time than "n". So, now you can write "{drop} 2" and drop two things from the stack! This is obviously a Good Thing.

        a b c                   {drop} 2
        a b c {drop}            2
        a b c {drop}            dup >R 1 R> 1
        a b c {drop} {drop}     >R 1 R> 1
        a b c {drop}            1 R> 1
        a b                     R> 1
        a b {drop}              1

An aside about efficiency. WIth this definition, the number N uses O(N) levels of the return stack, when there's clearly something better:

        3    ::= dup R> 1 R> dup R> 1 R> 1

On the other hand, there's no reason to just apply "succ" to numbers. I haven't though of another use for it, but I bet there is one. Now:

        factorial ::=
                {1} {1} rot
                {over mul swap succ swap} swap 1
                swap drop

How does it work? Where's the recursion? Ha!

For my next trick, let's consider lists. There's something I like to call "Church lists" by analogy with "Church numbers" even though I'm not sure Church had anything to do with them. In Lambda calculus:

        nil     ::= λfa.a
        [x]     ::= λfa.fxa
        [x y]   ::= λfa.fy(fxa)
        [x y z] ::= λfa.fz(fy(fxa))
        cons    ::= λelfa.fe(lfa)

In other words, a list takes a function and an identity, and applies the function to each element in turn. A list is represented by its own "fold" function. So:

        [1 2 3] + 0

would evaluate to 6, and

        [1 2 3] cons nil

evaluates to [3 2 1]. Let's do this in "Fix". First of all, I'm going to define "R@" which copies the top element of the return stack to the operand stack, and is equivalent to "R> dup >R".

        []      ::= 0           (Neato!)
        [x]     ::= >R x R@ 1 R> []
        [x y]   ::= >R x R@ 1 y R@ 1 R> []
        [x y z] ::= >R x R@ 1 y R@ 1 z R@ 1 R> []
        cons    ::= {>R} swap cat {R@ 1 R>} cat swap cat

Let's see "cons" in action:

        {[]} x                  cons
        {[]} x                  {>R} swap cat {R@ 1 R>} cat swap cat
        {[]} {>R}x{R@ 1 R>}     swap cat
        {>R}x{R@ 1 R> []}

In other words, it stashes a copy of the function, pushes x, appies the function to it, and leaves another copy of the function for the rest of the list. In this case the rest is [], which discards the function. Let's try adding up the contents of a list:

        0 +                     [1 2 3]
        0 +                     >R {1} R@ 1 R> [2 3]
        0                       {1} R@ 1 R> [2 3]
        0 1                     R@ 1 R> [2 3]
        0 1 +                   1 R> [2 3]
        1                       R> [2 3]
        1 +                     [2 3]
        1 +                     >R {2} R@ 1 R> [3]
        1 2 +                   1 R> [3]
        3 +                     [3]
        3 +                     >R {3} R@ 1 R> []
        3 3 +                   1 R> []
        6 +                     []

I'll leave executing "[] cons [1 2 3]" as an exercise.

One of the consequences of making numbers, booleans, etc. functions is that you have to defer execution of them if you don't want them to do anything. So to get a literal number 5 on the stack you have to say "{5}" in your code. Otherwise 5 will execute and call the top thing on the stack five times.

So what are these braces? Well, the open brace pushes the compile pointer onto the stack and flips the interpreter into "compile" mode, to start compiling subsequent words. When the close brace is reached it appends a "return" opcode and flips back to execute mode. But what about nesting? It's clear that we want programs like:

  {0} < {{"non-negative"}} {{"negative"}} rot 1 print

It's time to talk about how Forth does "IF". Remember that Forth's syntax is a bit weird:

  0 < IF ." negative" ELSE ." non-negative" ENDIF

When a Forth word is defined it can be declared as "immediate", which means that it gets run straight away even when the interpreter is compiling. The word can therefore mess with the compiler state or do some compiling itself. "IF" is immediate, and typically does something like this:

  1. Push the current compilation pointer (what will be the PC).
  2. Compile a conditional branch but leave the destination address empty. (In Forth this means compiling a call to a word which implements the run-time semantics of "IF" by messing with its return address, since all instructions are CALLs.)

What "ENDIF" does it to fill in the branch at the address on the stack to point to the current compilation pointer. So it's back-patching the IF branch. ELSE does this:

  1. Push the compilation pointer.
  2. Compile an unconditional branch with no destination.
  3. swap
  4. Back-patch the IF branch at the top of stack.

So what about when you're compiling and you meet a "{"? Well, it turns out to be very simple. You just push the CP and compile a dummy CALL. When you reach a "}" you compile a return, then back-patch the CALL to come to the current CP, then compile a "R>". At run-time, this will copy the address of the instruction after the CALL to the stack, which is the entry point of the code block.

It nests.


  dup {0} > {drop {{"negative"}} {{0} = {{"zero"}} {{"positive"}} rot 1} rot 1 print

(Don't worry about the double braces around the strings for the moment.) I'm going to hand-compile this into a sort of psuedo-assembler. Fortunately this is trivial.

		CALL dup
		CALL l0
		CALL 0
	l0:	CALL R>
		CALL >
		CALL l1
		CALL drop
		PUSH "negative"
	l1:	CALL R>
		CALL l2
		CALL l3
		CALL 0
	l3:	CALL R>
		CALL =
		CALL l4
		PUSH "zero"
	l4:	CALL R>
		CALL l5
		PUSH "positive"
	l5:	CALL R>
		CALL 1
	l2:	CALL R>
		CALL 1
		CALL print

(Please don't worry about how ">" and "=" work for the moment, or the fact that "{w}" can be optimised, and especially "{0}".)

So the only other things that "{" and "}" need to do is keep track of a nesting level and switch back to immediate execution mode. This only complicates things slightly. "{" needs to push two things instead of one -- the CP and the mode -- and "}" needs to restore the mode.

If I'm right, nothing else is needed for flow control. (By the way, this is not how PostScript does it.)

More about quoting. In Lisp, some things need quoting and some don't. Numbers, for example, don't need to be quoted, because their result is themselves. Forth is similar: "executing" a number pushes it on to the stack. In my proposal this is not the case: numbers, booleans, and lists are functions. Strings ought to be too. So what is the representation of one of these things? It has a lot to do with boxing.

Mostly, we accept the idea that we represent functions by points to something. It's the same here. "2" could be represented by a pointer to something which runs the top thing on the stack twice. (Forget about optimising this for a moment.) But if you just say "2" then it runs! If you say "{2}" then you get a pointer to something which runs the top thing on the stack twice, in other words, 2. There's something strange going on here: it's like function abstraction and quoting have become the same thing.

In theory, all the numbers appear in the dictionary of definitions where we find things like "drop" and "dup", and which we extend when we define new words. So in theory there's a word "2" which we can look up to get its entry point. But if we use tagging, then we can just use a binary representation of 2 as the representation of the 2 function.

Furthermore, a real implementation can easily spot that for any word, "{word}" is equivalent to the entry point of the word, and can push that instead. It's only when there's a sequence of words in brackets that we need to build new code to compose the operations inside.

A note on strings. If strings are lists of characters then we expect them to be self-folders, so if "emit" sends one character to the screen, this works:

  {emit} "Hello, world!"

because it's like:

  {emit} ['H' 'e' 'l' 'l' 'o' ',' ' ' 'w' 'o' 'r' 'l' 'd' '!']

The string is a function which applies the top thing on the stack to each of its characters in turn. To stop the string running we have to quote it.

Now, cunningly, my earlier program

  dup {0} > {drop {{"negative"}} {{0} = {{"zero"}} {{"positive"}} rot 1} rot 1 print

could be written

  {emit} dup {0} > {drop "negative"} {{0} = {"zero"} {"positive"} rot 1} rot 1

Anyway, about loops and recursion. "Who needs 'em?" is what I say. Every piece of data is a loop over itself. What else are you going to do? My factorial says:

  {1} {1} rot {over mul swap succ swap} swap 1 swap drop

The two 1s are the counter and the accumulated total so far, then it simply repeats the "over mul swap succ swap" N times.

But what about recursion? Well, there are fixed point combinators. Curry's Y is

  Y = λf.(λx.f(xx))(λx.f(xx))

So that's

  {dup 1} swap cat dup 1

Let's try it.

	f			{dup 1} swap cat dup 1
	{dup 1 f}		dup 1
	{dup 1 f} {dup 1 f}	1
	{dup 1 f}		dup 1 f
	{dup 1 f} {dup 1 f}	1 f
	{dup 1 f} 		dup 1 f f
	{dup 1 f} {dup 1 f}	1 f f
	{dup 1 f}		dup 1 f f f
	{dup 1 f} {dup 1 f}	1 f f f

So it's non-terminating but is definitely producing a recursion of the function in some sense. Perhaps what we need is the applicative-order fixed point combinator

  Y = λf.(λx.f(λy.xxy))(λx.f(λy.xxy))

So we just need Y to feed the function to itself and the stack when it's applied. To do this we need an operator "defer" which turns the top thing on the stack into an operator which pushes that thing back on the stack when run, so that "defer 1" is a no-op. You can think of this as adding a layer of braces. (Or, you can notice that it is eta-abstraction.) Then:

  Y ::= {defer {dup 1} cat} swap cat dup 1

Here's a recursive factorial written with the Y combinator.

  {swap dup {0} = {dup pred rot 1 mul} {{drop} 2 {1}} rot 1} Y

Let's try it. I'm writing "F" for the recursive part of factorial.

  2 F                                   {defer {dup 1} cat} swap cat dup 1
  2 {defer {dup 1} cat}F                dup 1
  2 {defer {dup 1} cat}F                defer {dup 1} cat F
  2 {{defer {dup 1} cat}F dup 1}        F
  {{defer {dup 1} cat}F dup 1} 2        dup pred rot 1 mul
  {{defer {dup 1} cat}F dup 1} 2 1      rot 1 mul
  2 1 {{defer {dup 1} cat}F dup 1}      1 mul
  2 1                                   {defer {dup 1} cat}F dup 1 mul
  2 1 {defer {dup 1} cat}F              dup 1 mul
  2 1 {defer {dup 1} cat}F              defer {dup 1} cat F mul
  2 1 {{defer {dup 1} cat}F dup 1}      F mul
  2 1 0                                 {defer {dup 1} cat}F dup 1 mul mul
  2 1 0 {defer {dup 1} cat}F            dup 1 mul mul
  2 1 0 {defer {dup 1} cat}F dup 1}     F mul mul
  2 1 0 {defer {dup 1} cat}F dup 1}     {drop} 2 {1} mul mul
  2 1 1                                 mul mul

I think it's time I implemented the interpreter and compiler to find out what they look like.

Tags: , , , ,


[User Picture]
Date:2010-05-13 16:20 (UTC)



So I have a practical compiler and VM running in Linden Scripting Language. It's pretty small and fast, and has just six core words:

defines a word, binding a string to an entry address
start a block
end a block, leaving the entry point on the stack
call the entry point on the stack
fixed point operator
select and calls one of two entry points depending on a third value

In addition, any call just before a "}" is converted to a jump, so tail recursion works. Here's a recursive factorial program.

"fact" { 
    over 0 =
    { drop drop 1 }
    { over 1 - swap ! * }
} recursive ;

The "recursive" word is the witty thing here. It compiles a little bit of code which pops an address from the stack, then pushes its own address and jumps to that address. So when "fact" is bound above, it's actually bound to one of those chunks, which then jumps to the main body. When the body does a "!" it's actually calling itself, recursive, or "Y f" if you prefer.

Here's an iterative version which doesn't consume O(n) stack levels.

"fact-loop" {
    over 0 =
    { drop drop }
        rot 2 pick *
        rot 1 -
        rot !
} recursive ;

"ifact" {
    1 swap fact-loop
} ;
[User Picture]
Date:2010-05-13 16:32 (UTC)

Re: Update


Incidentally, I'm toying with a CPS version of the language. The current version has two stacks: the operand stack (with the data on it) and an operator stack (with return addresses). This actually causes a problem with the VM being re-entrant when it calls other language code, and it prevents me doing nice things with coroutines. I think it can work quite well with just one stack.

Just two changes are needed.

  1. When an address is called, the VM will push the return address onto the operand stack (the only stack in this case).
  2. When the end of a block is reached, pop and jump to the address at the top of the operand stack.

These are trivial changes. What they mean is that the entire state of computation is now on the operand stack. The recursive factorial program now looks like this:

"fact" {
  2 pick 0 =
  { drop swap drop 1 swap }
  { 2 pick 1 - swap ! rot * swap }
} recursive ;

At the start of the first line, "2 pick 0 =", there are three things on the stack: the recursion address, then the return address, then the number. "2 pick" gets a copy of the number to check if it's zero. In the zero case, we drop the recursion, replace the number with 1, and then jump to the return address. In the non-zero case, we copy the number, decrement it, call the recursion (which jumps back here later), multiply by the number we first thought of, then jump to the return address.


Powered by LiveJournal.com