Archive

Archive for July, 2007

NEXT please! Compiling iteration structures in BBC BASIC

This posting has been re-written from an earlier version which contained unnecessary complexity

Following on from my proposal on how to compiler BBC BASIC functions with variant return types, in response to Richard Russell’s compilation challenges, I’d like to move on to compiling BBC BASIC loop structures. Nonetheless, several readers have comments how useful even a compiler that did not support these features would be.

The claimed difficulty arises is because BBC BASIC maintains a stack at run-time for managing loop structures such as FOR..NEXT, REPEAT..UNTIL and WHILE..ENDWHILE. As the interpreter executes the program it presumably uses the instructions it encounters to manipulate this stack. They key point here is that the tokens listed above are not merely loop delimiters used by the parser, they are actually statements which modify the internal state of the interpreter.

The situation is further complicated by the quirky behaviour of the NEXT statement. First of all, the loop variable following the NEXT is optional. If it is omitted the next iteration of the FOR loop which is topmost on the stack (i.e. the inner-most loop) is executed. However, it is also possible to supply a comma-separated list of loop variables, for example NEXT i%,j%. This is equivalent to NEXT i% : NEXT j%.

Richard goes on to illustrate the point with a real-world example,

Here there are two NEXT I% statements; which gets used depends at run-time on the contents of the array. This is a real-world example (although the code could no doubt be written more elegantly).

DEF FNscan(A()) : LOCAL I%
FOR I% = 0 TO DIM(A(),1)
IF A(I%) = 0 THEN
I% = DIM(A(),1)
NEXT I%
= TRUE
ENDIF
REM Some more code here
NEXT I%
= FALSE

This time, I will use a different example, which is very similar in structure to Richard’s code, but which excludes the array, for the sake of clarity.

FOR i% = 1 TO 5
IF i% = 3 THEN
PRINT "Three!"
NEXT
ENDIF
PRINT "i% is ";i%
NEXT

Note there is one FOR statement and two NEXT statements, so it is not possible to pair them. I simply print the loop counter, or in one case print a different string and NEXT early.

Formatted like this we can see that the inner NEXT is exactly like the continue statement in C#.

for (int i = 1; i <= 5 ; ++i)
{
   if (i == 3)
   {
      Console.WriteLine("Three!");
      continue; // next
   }
   Console.WriteLine("i% is {0}", counter);
   continue; //next
}

So, we can do something isomorphic in C#, C# is compilable to CIL, ergo this particular BBC BASIC construct is compilable to C#. Obviously.

However, more nefarious constructs are possible in BBC BASIC, such as this:

FOR i% = 1 TO 5
FOR j% = 1 TO 5
PRINT "i% = ";i%,"j% = ";j%
NEXT i%

where we iterate on the outer loop, from ‘within’ the inner one. It may be possible to weed these out at compile time in simple cases such as this, and replace the FOR j%= 1 with a simple assignment. If not, its definitely possible to deal with it at run-time

The equivalent code using the C# a simple ForNext class I devised looks like this:

ForNext outer = new ForNext("i%", 1, 5, 1);
outer.execute(delegate(int counter)
        {
            ForNext inner = new ForNext("j%", 1, 5, 1);
            inner.execute(delegate(int counter2)
            {
                Console.WriteLine("i% = {0} j% = {1}", counter, counter2);
                ForNext.Next("i%");
            });
         });

For the run-time solution, I create a ForNext loop object, initialized with the counter name which is just a label and the start, end and step values. I then pass the loop body to the ForNext object as a C# anonymous delegate. NEXT statements are implemented as calls to the static method ForNext.Next() with an optional counter name.

When run, the code produces,

i% = 1     j% = 1
i% = 2     j% = 1
i% = 3     j% = 1
i% = 4     j% = 1
i% = 5     j% = 1

Of course, some machinery in the ForNext class is required to make this work. In this case we use the .NET exception handling machinery to manage the stack of FOR..NEXT loops on our behalf, calling ForNext.Next() exits the current iteration of the top-most loop on the stack and if necessary begins the next iteration.

class ForNext
{
    public delegate void Body(int counter);
    private string id;
    private int end;
    private int step;
    private int counter;

    private class NextException : System.Exception
    {
        public string id;

        public NextException(string id)
        {
            this.id = id;
        }

        public NextException() :
            this(null)
        {
        }
    }

    public ForNext(string id, int start, int end, int step)
    {
        this.id = id;
        this.counter = start;
        this.end   = end;
        this.step  = step;
    }

    public void execute(Body body)
    {
        while (counter <= end)
        {
            try
            {
                body(counter);
            }
            catch (NextException e)
            {   
                if (e.id != null && e.id != this.id)
                {
                    throw;
                }
                counter += step;
            }
        }
    }

    public static void Next()
    {
        if (counter < end)
        {
            throw new NextException();
        }
    }

    public static void Next(string id)
    {
        if (counter < end)
        {
            throw new NextException(id);
        }
    }
}

Note how in this example we pass the loop counter label as ForNext.Next("i%").

Of course, relying on the .NET exception handling mechanism for all such loops in BBC BASIC would be overkill. The majority of FOR..NEXT constructs are must simpler than this. The challenge moves on to detecting the simple cases using static analysis of the program to be compiled, and generating normal loop code in these simple cases. We would only fall back to the mechanism described here when required, if multiple unlabelled NEXT statements are present. It remains to be seen how feasible detection of these cases will be.

Of course, in functional languages, explicit iteration structures such as FOR..NEXT need not be present at all, since all iteration is achieved through recursion, usually with tail-calls. Another feature invented in the functional language LISP and now commonly available in more sophisticated languages is the concept of continuations, which are basically “GOTOs with parameters”. Both exception handling, as used in this post, and other complex control structures can be implemented using continuations, as can all the iteration structures of BBC BASIC, even with mismatched FOR..NEXT statements, even when the NEXT statement is not labelled with a counter variable.

Categories: .NET, C++, computing, OWL BASIC, software Tags: