Tulip is an interpreter for a minimal Lisp-like language.
Tulip works by translating Lisp code into Perl functions via ``eval'' and ``sub''. This is less efficient than you'd think, since all variable lookups are actually done using Perl hashes instead of the real Perl namespace.
I have attempted to document this program reasonably well, so hopefully, this code will all make sense. In general, I've tried to go for simplicity over performance or, say, usefulness.
(Note: Any resemblance between Tulip and a published Lisp or Scheme standard is purely coincidental).
Tulip and related files are released under the terms of the new BSD license, as follows:
Copyright (c) 2005, Chris Reuter
All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. Neither the name of the developer nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
We start with a few includes. This should all be part of your standard Perl distribution.
(Note that I don't have a Perl version specified. I wrote this using 5.8.3, although it should work on any version recent enough to support the use of ``foreach my $x (...)''. I haven't tested this though, so your mileage may vary.
use strict; use FileHandle; use Getopt::Long;
We use global variables for them and parse the argument list with Getopt::Long (which, I should mention, r0xx0rz!!!!)
They are:
These are really only useful for debugging.
my $show_trans = 0; my $show_macro = 0; my $help = 0; my $verbose = 0; GetOptions ('show-trans' => \$show_trans, 'show-translated' => \$show_trans, 'show-macro' => \$show_macro, 'verbose' => \$verbose, 'help' => \$help) or do { print "Invalid argument.\n"; $help = 1; }; $show_trans and do {$verbose = 1}; if ($help) { print "Tulip, a minimal Lisp-like language in Perl Options: --show-trans -- Show the generated perl code but don't run it. --show-macro -- Print out macro expansions as they're evaluated. --verbose -- Display more messages. --help -- Print this message. "; exit (0); }
We use a hash to hold the global namespace. Actually, this is a bit more complicated than that, as explained below in the section describing the Context Stack.
The variable t
is defined here as a 3-element array. I do this
because t
should be non-nil but not of any other type. That is,
t
is an object whose type consists only of itself. Internally, we
normally just use references to this global.
The variable $EOF
is similar but is not visible to Lisp programs.
It's used internally to detect end-of-file at read time.
my %globals = (); my $true = ['t_obj', 't_obj', 't_obj']; my $EOF = ['eof', 0, 0]; $globals{nil} = undef; $globals{t} = \$true;
Macros are simple. They're just Lisp functions that get called by the
parser. The built-in function _macro
stores a reference to a
function in global hash %macros
and translate_expr
calls it
whenever a list begins with a macro name.
The macro function takes one argument--the expression--and its return value is parsed in place of the macro call.
my %macros = ();
A context (or ``activation record'' or ``stack frame'') is the collection of variables visible in the current scope. Unlike C (or Perl), we can't just use a stack for them.
Consider the following example:
(defun+ _init () (count) (setq count 0) (defun next () (setq count (+ count 1))))
(_init)
(printnl (next)) (printnl (next)) (printnl (next))
The local variable count
needs to hang around as long as the
function next
exists. Putting it on a stack would make it go away
while it was still needed.
Furthermore, Tulip is lexically scoped, so a function doesn't have access to the caller's locals. This means that there are two different ways of organizing contexts, by call and by scope.
Here's how it works:
return [{}, $outer_context];Each lambda contains a reference to the context in which it was created. The global variable
@stack
contains the call stack. Whenever a
function starts up, it first calls push_context
which creates a new
context and pushes it to the top of the stack. This context takes the
lambda's creation context as its outer context. Before returning, it
calls pop_context
to remove the context from the stack.
Global variables are stored in the context at the bottom of the
stack. This is also referenced by the global variable
$global_context
. Its $outer_context
is undef.
When a function tries to access a variable, it checks the current
context to see if that name is defined. If not, it tries the context
referenced by $context-
[1]>, continuing to chase pointers until it
either finds a context with the name defined or reaches the end of the
chain and flags an error.
In this way, contexts are popped off the stack when their functions return but they stay alive as long as something points to them.
(In other words, we use the Perl garbage collector to do our memory management.)
my @stack = (); my $global_context = [\%globals, undef]; push @stack, $global_context;
Create a new context with the argument as the reference to the outer context.
sub new_context ( $ ) { my $outer_context = shift; return [{}, $outer_context]; }
Create a new context with the argument as the reference to the outer context, then push it onto the call stack.
sub push_context ( $ ) { my $outer_context = shift; push @stack, new_context ($outer_context); }
Pop the top context off the stack.
sub pop_context () { pop @stack; }
Return the current context.
sub curr_context () { return $stack [-1]; }
Find the innermost context which as defined $var_name
sub context_with_name ( $ ) { my $var_name = shift; for (my $curr = curr_context; defined($curr); $curr = $curr->[1]) { if (exists($curr->[0]->{$var_name})) { return $curr; } } return undef; }
Define a variable in context $cc
with name $name
and initial value
$value
(which may be undef). Raise an error if it's already
defined.
sub def_var_in ( $$$ ) { my ($name, $value, $cc) = @_; die "Variable '$name' already defined in current context." if exists($cc->{$name}); $cc->{$name} = $value; }
Define a variable with the name $name
in the current context and set
$value
to it. $value
may be undef.
sub def_var ( $$ ) { my ($name, $value) = @_; my $cc = curr_context->[0]; def_var_in ($name, $value, $cc); }
Define a variable in the global namespace.
sub def_global ( $$ ) { my ($name, $value) = @_; def_var_in ($name, $value, $global_context->[0]); }
Set the variable named by $name
to $value
in the current innermost
context, IF $name
HAS BEEN DEFINED.
sub set_var ( $$ ) { my $name = shift; my $value = shift; my $context = context_with_name ($name); die "Undefined variable: $name" unless defined($context); # Protect the 2 reserved names we've got. if ($context == $global_context) { if ($name eq 't' || $name eq 'nil') { die "Attempted to set read-only variable '$name'.\n"; } } $context->[0]->{$name} = $value; }
Return the value of the variable named by $name
if it exists.
sub lookup_var ( $ ) { my $name = shift; my $cc = context_with_name ($name); if (!exists ($cc->[0]->{$name})) { die "Unknown name: $name\n"; } return $cc->[0]->{$name}; }
Lookup up a variable and raise an error if it's not a function.
sub lookup_var_func ( $ ) { my $result = lookup_var (shift); die "Expecting function.\n" unless is_lambda ($result); return $result; }
Tulip implements pairs as 2-element arrays with undef (the Perl undefined value) shoehorned in as the empty list (i.e. nil).
For those that aren't familiar with Lisp lists, here's an overview:
Lisp has a built-in type called a pair or ``cons''. It is a data structure containing two pointers, called (for historical reasons) the ``car'' and ``cdr'' respectively. (I could have gone and called them ``first'' and ``rest'' or something, but that would make it too easy and we can't have that.)
A proper list is either:
Thus, the cdr of a proper list points to another proper list.
Cool, eh?
sub lforeach ( $$ );
Convert a Tulip list to a perl array.
sub flatten ( $ ) { my $list = shift; my @result = (); my $curr; ensure_cons ($list); while (defined ($curr = car ($list))) { push @result, $curr; $list = cdr ($list); } return \@result; }
Convert a perl array to a lisp list. Takes variable argument list.
sub mk_list { my $result = undef; my $curr; return undef if (scalar (@_) == 0); do { $curr = pop; $result = [$curr, $result]; } while (scalar(@_) > 0); return $result; }
Function to create a pair.
sub mk_pair ( $$ ) { my ($left, $right) = @_; return [$left, $right]; }
Return the first element of a list or undef (i.e. nil)
sub car ( $ ) { my $list = shift; ensure_cons ($list); defined ($list) or return undef; return ${$list}[0]; }
Return the rest of a list.
sub cdr ( $ ) { my $list = shift; ensure_cons ($list); defined ($list) or return undef; return ${$list}[1]; }
Return the second element of a list
sub cadr ( $ ) { return car (cdr (shift)); }
Return the third element of a list
sub caddr ( $ ) { return car (cdr (cdr (shift))); }
Return the length of a list.
sub llength ( $ ) { my $list = shift; my $count = 0; ensure_cons ($list); lforeach $list, sub {$count++}; return $count; }
Return the last element in a list.
sub llast { my $list = shift; ensure_cons ($list); if (!defined (cdr ($list))) { return car ($list); } return llast (cdr ($list)); }
Evaluate a sub over all elements of a list.
sub lforeach ( $$ ) { my $list = shift; # The list my $sub = shift; # The thing to evaluate over it return unless defined($list); for ( ; defined ($list); $list = cdr ($list)) { my $is_last = !defined(cdr ($list)); &{$sub}(car ($list), $is_last); } }
We implement types using the Perl type system.
All Lisp objects are stored in Perl scalar variables. Under some circumstances, these values are references to Perl objects. From their layout, we can identify their type.
We use the following internal representation:
[$car, $cdr]
$expr != 0 || $expr eq '0'
is true if $expr is a number and false if it is a symbol.
Note that I may change the number of elements in this array at some later time, so an object is a lambda if it is an array of at least 3 elements and a code reference as the first item.
A lambda looks like this:
[\&code, $context, \@arg_list, \@local_list]
Context is a reference to the context (i.e. stack frame) in which this lambda was created. The remaining elements are the names (in Perl lists) of formal arguments and local variables.
For example, suppose we want to pass a list to a function:
(foo (bar quux baz))
won't work because the evaluator interprets the argument (the list ``(bar quux baz)'') as an expression. To work around this, we have the quote special form:
(foo '(bar quux baz))
This causes the parser (as such) to parse the thing after the quote but return it as a quoted object, i.e. as a reference to a 1-dimensional array.
You can quote anything but it really only makes sense to quote lists and names.
['t_obj', 't_obj', 't_obj']
Its type isn't really important except in that it's testably different from all of the other types.
t is the value returned by predicates when they succeed. Since every value except nil is considered true, t is only important for not being nil.
['eof', 0, 0]
It is not actually visible to user programs. It is mostly used for catching unexpected EOF errors in tulip_read().
Currently, only FileHandle is used in this way.
Given an expression, convert it to a quoted version.
sub quote ( $ ) { my $expr = shift; return [$expr]; }
Given a quoted thing, unquote it. It is harmless to unquote an unquoted thing.
sub unquote ( $ ) { my $expr = shift; if (!is_quote_expr ($expr)) { return $expr; } return $expr->[0]; }
This section contains type tests.
Return true if the result is a blessed Perl object. (Actually, this is a bit more picky--any type which is all upper-case is automatically rejected, based on the theory that this is the reserved namespace for built-in types.)
sub is_opaque_obj ( $ ) { my $expr = shift; my $type = ref($expr); if ($type eq "" || $type =~ /^[A-Z]+/) { return 0; } return 1; }
Return true if the result is a lambda.
sub is_lambda ( $ ) { my $expr = shift; return ref($expr) eq 'ARRAY' && ref($expr->[0]) eq 'CODE' && scalar(@{$expr}) >= 3; }
Return true if expr is a number, false otherwise.
sub is_number ( $ ) { my $expr = shift; return ref($expr) eq "" && ($expr != 0 || $expr eq '0'); }
Determine whether the argument is a name to be looked up (i.e. a bare name--at run time, this will be replaced with whatever it's been set to.)
sub is_symbol ( $ ) { my $expr = shift; return defined($expr) && (ref ($expr) eq "") && (!is_number ($expr)); }
Return true if the argument is a quoted symbol
sub is_quoted_symbol ( $ ) { my $expr = shift; return is_quote_expr ($expr) && is_symbol ($expr->[0]); }
Return true if expr is a quoted thing.
sub is_quote_expr ( $ ) { my $expr = shift; return ref($expr) eq 'ARRAY' && scalar(@{$expr}) == 1; }
Return true if expr is a string. Note that strings and symbols are the same type here.
sub is_string ( $ ) { my $expr = shift; return ref ($expr) eq "SCALAR"; }
Return true if expr is a cons cell. Nil (i.e. undef) is considered a cons.
sub is_cons ( $ ) { my $expr = shift; return 1 if (!defined($expr)); return 1 if (ref($expr) eq 'ARRAY' and scalar(@{$expr}) == 2); return 0; }
Determine whether the argument is a proper list. Slow.
sub is_proper_list ( $ ); # Since the function recurses. sub is_proper_list ( $ ) { my $expr = shift; return 0 unless is_cons ($expr); # The empty list is a proper list. return 1 if !defined ($expr); # It's not a proper list if the cdr sin't a cons. return 0 if !is_cons(cdr($expr)); # Otherwise, recurse return is_proper_list (cdr ($expr)); }
Determine whether expression is a literal.
sub is_literal ( $ ) { my $expr = shift; return 1 if is_quote_expr ($expr); return 1 if is_number ($expr); return 1 if is_string ($expr); return 0; }
Return true if $expr
is the magical (internal) EOF object.
sub is_eof ( $ ) { my $expr = shift; return $expr == \$EOF; }
This section contains routines which raise an error if the argument is not of hte expected type.
Ensure that the argument is a cons cell
sub ensure_cons ( $ ) { my $arg = shift; die "Expecting a list, got '@{[printable_string($arg)]}'\n" unless is_cons ($arg); }
Ensure that the argument is a string
sub ensure_string ( $ ) { my $arg = shift; die "Expecting a string, got '@{[printable_string($arg)]}'\n" unless is_string ($arg); }
Ensure that the argument is a symbol
sub ensure_symbol ( $ ) { my $arg = shift; die "Expecting a symbol, got '@{[printable_string($arg)]}'\n" unless is_symbol ($arg); }
Ensure that the argument is a lambda
sub ensure_lambda ( $ ) { my $arg = shift; die "Expecting a function, got '@{[printable_string($arg)]}'\n" unless is_lambda ($arg); }
Ensure that the argument is a number.
sub ensure_num ( $ ) { my $arg = shift; die "Expectinga number, got '@{[printable_string($arg)]}'\n" unless is_number ($arg); }
This section contains the code used to translate Lisp into Perl.
The workhorse function here is translate_elem()
and that's where you
should start looking if you want to understand how this works.
Translation is pretty braindead. Instead of using Perl local
variables, we emit calls to lookup_var()
to read from variables.
(Writing gets done by calls to the built-in functions _set and
_define.)
We call Lisp functions by generating calls to the function ``call_func'' with the first argument being the lambda to call and the remaining arguments being translated expressions to produce those.
Arguments to Lisp functions are passed a single proper list.
Probably the easiest way to understand all of this is to run some code through tulip with the --show-translated flag. That will print out the translated code.
Given an unquoted list, return a Perl expression that recreates that
list. If $quoted
is true, the resulting code produces a quoted list
rather than an unquoted one.
sub translate_quoted_list ( $$$ ) { my $list = shift; my $quoted = shift; my $indent = shift; my $close = 0; my $result = ""; lforeach $list, sub { my $elem = shift; ++$close; $result .= "["; $result .= translate_literal ($elem); $result .= ", "; }; $result .= 'undef' . (']' x $close); if ($quoted) { $result = "\[$result\]"; } $result = (' ' x $indent) . $result; return $result; }
Make the given string into a Perl string constant.
sub make_str_constant ( $ ) { my $arg = shift; local $_ = ${$arg}; s/\'/\\'/g; return "'$_'"; }
Given a literal expression, produce a perl expression that reproduces it.
sub translate_literal ( $$ ) { my $expr = shift; my $indent = shift; my $istr = ' ' x $indent; my $quoted = 0; my $result = ""; # The argument may be quoted or unquoted and we need to handle both # cases. The Usual Way is to strip off the quote, do the # translation and then wrap the result with an array if the quote # was there. if (is_quote_expr ($expr)) { $quoted = 1; $expr = unquote ($expr); } # We handle lists here, returning if we've got one. The quoting # gets dealt with by translate_quoted_list since this is a somewhat # more complicated case. if (is_cons ($expr)) { return translate_quoted_list ($expr, $quoted, $indent); } # Otherwise, set $result to the translated value. if (is_number ($expr)) { $result = $expr; } elsif (is_string ($expr)) { $result = '\\'.make_str_constant($expr); } elsif (is_symbol ($expr)) { $result = "'$expr'"; } else { die "Unknown literal type.\n"; } # Requote of necessary. if ($quoted) { $result = "\[$result\]"; } # And fixup the formatting. $result = $istr . $result; return $result; }
Translate an element in a list. If $want_func
is 1, add code to
check to make sure the result is executable. This is the entry
point for evaluating a single Lisp thing of unknown nature.
sub translate_elem ( $$$ ) { my $element = shift; my $indent = shift; my $want_func = shift; my $result; $element = expand_macros ($element); if (is_symbol ($element)) { my $func = $want_func ? "_func" : ""; $element =~ s.\\.\\\\.g; return (' ' x $indent)."lookup_var$func (\'$element\')"; } if (!defined ($element)) { return "undef"; } if ($want_func) { die "Expecting function, got '@{[printable_string($element)]}'\n"; } if (is_literal ($element)) { # Since quotedness is intrinsic to the type, we need to strip that # off when passing the element as an argument. if (is_quote_expr ($element)) { $element = unquote ($element); } $result = translate_literal ($element, $indent); } else { $result = translate_expr ($element, $indent); } return $result; }
Given a function call (or something that looks like one, anyway), see if it's actually a macro invocation and if so, call the macro function on it and return the result. Otherwise, just return the argument.
sub expand_macros ( $ ) { my $expr = shift; my $name; # Non-lists contain no macros if (!is_cons ($expr)) { return $expr; } $name = car ($expr); if (!is_symbol ($name)) { return $expr; } if (!defined ($macros{$name})) { return $expr; } if ($show_macro) { print "Macro $name. Before:\n", printable_string($expr), "\n"; } $expr = call_func ($macros{$name}, $expr); if ($show_macro) { print "After:\n", printable_string($expr), "\n"; } return $expr; }
Given a list containing a Lisp expression, translate it into Perl.
sub translate_expr ( $$ ) { my $expr = shift; my $indent = shift; my $result; my $indent_string = ' ' x ($indent*2); my $lookup_expr = translate_elem (car ($expr), $indent+1, 1); $result = $indent_string . "call_func (\n"; $result .= $indent_string . $lookup_expr . ",\n"; lforeach cdr($expr), sub { my $element = shift; $result .= $indent_string . translate_elem ($element, $indent+1, 0); $result .= ",\n"; }; $result =~ s/\,\s*$/\)/; return $result; }
Given a list of lisp expressions, generate a function which evaluates them one after the other in order.
sub translate_progn ( $$$ ) { my $progn = shift; my $tmp = shift; my $indent = shift; my $indent_string = ' ' x $indent; my $result = "${indent_string}$tmp = do \{\n"; lforeach $progn, sub { my $expr = shift; my $is_last = shift; $result .= translate_elem ($expr, $indent+1, 0); $result .= ";\n"; }; $result .= "$indent_string\};\n"; return $result; }
Given a lambda expression, produce a perl function that performs its actions.
sub translate_lambda ( $ ) { my $expr = shift; my $result; $result = 'sub { my $args = shift; my $lambda = shift; my $result; make_locals ($lambda, $args); '; $result .= translate_progn ($expr, '$result', 1); $result .= ' pop_context; return $result; } '; if ($show_trans) { print "Lambda translation:\n$result\n"; } return $result; }
Make sure that $arg_list
is a valid lisp argument list.
sub check_args ( $ ) { my @arg_list = flatten (shift); if ($arg_list[-2] eq '&rest') { $arg_list[-2] = undef; } foreach my $arg (@arg_list) { if ($arg eq '&rest') { die "\&rest flag needs to be second-last argument.\n"; } } }
This is the meat of the lambda function.
sub lambda ( $$$ ) { my ($args, $locals, $body) = @_; my $src = translate_lambda ($body); check_args ($args); my $perl_code = eval $src; die "Internal complile error: $@\nCode:\n$src\n" if ($@ ne ""); return [$perl_code, curr_context, $args, $locals]; }
Translate and print. This is a debugging aid.
sub show_translated ( $ ) { my $expr = shift; print translate_elem($expr, 0, 0); print "\n"; }
These are routines that get called by translated code exclusively.
Uh, not much else to say.
Create and initialize the current context. This gets called by generated code.
sub make_locals { my ($lambda, $args) = @_; my @flat_args = @{flatten($args)}; my $rest_flag = 0; # Create the new context. push_context ($lambda->[1]); # Create the formal arguments and bind the actual arguments to them. # I use nested blocks (i.e. '{{' and '}}') here so that the 'next' # and 'last' will work (albeit not exactly as expected in the case # of last). If this convuses you, type 'perldoc perlsyn' and skip # to the section labeled 'Loop Control'. lforeach $lambda->[2], sub {{ my $formal = shift; my $is_last = shift; if ($formal eq '&rest') { $rest_flag = 1; next; } if ($rest_flag) { die "\&rest must refer to the last item in an argument list." unless $is_last; def_var ($formal, mk_list (@flat_args)); last; # This does what we want but only because this is the last item } my $actual = shift @flat_args; def_var ($formal, $actual); }}; # Create the local variables lforeach $lambda->[3], sub { my $local_name = shift; def_var ($local_name, undef); }; }
Call the function given by the first argument (a lambda) with the rest of the argument list.
sub call_func { my $lambda = shift; my $num_args = scalar (@_); my @formals = @{flatten ($lambda->[2])}; # Blech! O(n) algorithm! my $expected_args = scalar (@formals); my $plural = $expected_args > 1 ? "s" : ""; # If this function can take a variable number of arguments, make # sure there are enough. if ($expected_args >= 2 && $formals[-2] eq '&rest') { --$expected_args; if ($expected_args > $num_args) { die "Expecting at least $expected_args argument$plural, got " . "$num_args.\n"; } } elsif ($expected_args != $num_args) { die "Expecting $expected_args argument$plural, got $num_args.\n"; } return &{$lambda->[0]} (mk_list(@_), $lambda); }
This is guts of the Tulip 'eval' function.
Not much to it, actually. It just translates and calls Perl's ``eval'' on the code. Easy!
Evaluate a tulip list as a function
sub tulip_eval { my $expr = shift; my $code = translate_elem ($expr, 0, 0); print "Translated expression:\n$code\n" if ($show_trans); my $result = eval $code; if ($@ ne "") { die "Perl Error: $@\nCode: $code\n\n"; } return $result; }
The function 'read' is what passes for a parser in Tulip. The actual entry point is called ``tulip_read'' although it's accessible from Lisp by the name ``read''.
There are several hooks into read from Lisp:
prim()
(see below) and
is mostly a wrapper around FileHandle->getline.
These may be overridden by user programs in order to wreak all kinds of havoc on the Lisp language. Note that I haven't tested these features yet.
These routines define 3 global variables. Yes, this is terrible coding style--sue me.
The variables $line_count and $last_file are used to keep track of
where in the file we are so that synerr()
can report the location.
Since these get updated by _read-line() and load()
(see below), too
much syntactical fiddling could throw these off.
$line_buffer holds some or all of the current line being tokenized. Yawn.
sub read_tolerantly( $ ); my $line_buffer = undef; # External cache for one line my $line_count = 0; # Current line number fetched by fill_buffer my $last_file = ""; # Last file opened by readEval.
Die, displaying line number and filename information. This is only
meaningful if $last_file
and $line_count
were both set recently.
sub synerr ( $ ) { my $msg = shift; chomp $msg; die "Read error: $msg at $line_count in $last_file.\n"; }
If necessary, read the next line from input and store the result in
$line_buffer
, calling a the _read-line primitive.
sub fill_buffer { my $fh = shift; my $result = undef; my $read_line = $globals{'_read-line'}; while ($line_buffer =~ /^\s*$/) { $line_buffer = &{$read_line->[0]} (mk_list ($fh)); ++$line_count; if (!defined($line_buffer)) { return; } } }
Return a pair containing the first argument (a symbol) and the
second converted to a Lisp string object (i.e. a reference). This
is here because I don't want to return a reference to $1
, since
that's magical and (I think) likely to change underneath me.
sub mk_tok_pair ( $$ ) { my ($type, $tok) = @_; return mk_pair ($type, \$tok); }
Read the next token from $fh
. This is callable as a primitive.
sub next_token { my $arglist = shift; my $fh = car ($arglist); local $_; my $token = undef; fill_buffer ($fh); # Return undef if we've reached EOF. return undef if (!defined ($line_buffer)); $_ = $line_buffer; s/^\s*//; # If we've reached a comment, clear the line and try again. if (/^\;/) { $line_buffer = ""; return next_token ($arglist); } SWITCH: { my $kw_re = '^([-_+-~!%^&*|[:alpha:]]+)'; s{^(\'|\(|\))}{} and do {$token = mk_tok_pair ('punctuation', $1); last;}; s/$kw_re//o and do {$token = mk_tok_pair ('name', $1); last;}; s/^\"([^\"]*)\"// and do { my $tok = $1; $tok =~ s/\\n/\n/g; $token = mk_pair ('string', \$tok); last; }; s/^(\d+(\.\d+))// and do { $token = mk_tok_pair ('number', $1); last; }; synerr "Invalid token near: $line_buffer.\n"; }; $line_buffer = $_; return $token }
Read a quoted item. If the item is a list or a symbol, return it quoted. Otherwise, just return it. # Note: I'm almost positive I can remove the tests and just quote everything. I even think it might be more correct, but you know what? It's late and I want to get this thing done and play Diablo II for a while, so I'm not going to bother.
sub read_quoted ( $ ) { my $fh = shift; my $rest = read_tolerantly ($fh); if ($rest == $EOF) { synerr "Unexpected end-of-file.\n"; } # Quoted names are symbols if (is_symbol ($rest)) { return quote ($rest); } # If it's a list, return it in an array ref, as per the convention. if (is_cons ($rest)) { return quote($rest); } return $rest; }
Read a list and return it.
sub read_list ( $ ) { my $fh = shift; my @result_array; my $elem; while (1) { $elem = read_tolerantly ($fh); if ($elem == $EOF) { synerr "Unexpected end-of-file.\n"; } if ($elem eq ")") { return mk_list (@result_array); } push @result_array, $elem; } # If we get here, we've hit the end-of-file without finding a # closing paren. Oops. synerr "Missing ')' at end of file.\n"; }
This is the guts of the read function, only it returns ``)'' instead of treating it as an error, since it may have been called recursively.
sub read_tolerantly ( $ ) { my $fh = shift; my $result = undef; my $next_tok = $globals{'_next-token'}; my $tok = &{$next_tok->[0]} (mk_list($fh)); my $tok_text; if (!defined ($tok)) { return $EOF; } $tok_text = ${cdr ($tok)}; my $toktype = car ($tok); if ($toktype eq 'punctuation') { if ($tok_text eq ")") { return $tok_text; } if ($tok_text eq "'") { return read_quoted ($fh); } if ($tok_text eq "(") { return read_list ($fh); } } # A quoted name is just a string, which can be used as a name. if ($toktype eq 'string') { return \$tok_text; } if ($toktype eq 'name' or $toktype eq 'number') { return $tok_text; } synerr "Unknown input.\n"; # I don't think we can get here. }
The lisp read function. This expects to be called at the top level of nesting.
sub tulip_read { my $fh = shift; my $result = read_tolerantly ($fh); synerr "Unexpected ')'" if ($result eq ")"); # Users don't see EOF. if ($result == $EOF) { return undef; } return $result; }
This section contains all of the primitives and the code used to support them and their definitions.
The function 'prim' is used to define primitives. It takes 3 arguments:
- The number of formal arguments (or -1 if it's a variable number) - The public name of the function - A reference to the sub implementing the function.
The sub must take one or two arguments. The first argument is the list of actual arguments passed by the calling Lisp routine. The second argument (which is usually ignored here) is the lambda of the function.
Note that there is structurally no difference between a primitive and a lambda at this stage. Both are a Perl function and some extra information and what they do with it is up to them.
Defining new primitives is simple. Just call prim with a reference to a function taking one argument (the arg. list) and you're done.
Routine to define a primitive.
sub prim ( $$$ ) { my ($num_args, $name, $function) = @_; my $formals; my $lambda; if ($num_args < 0) { $formals = mk_list (qw'&rest args'); } else { $formals = mk_list (split (/ /, 'x ' x $num_args)); } $lambda = [$function, $global_context, $formals, []]; $globals{$name} = $lambda; }
Given a lisp object, return a string suitable for printing that describes it.
sub printable_string ( $ ); # Needed for recursive call to parse sub printable_string ( $ ) { my $arg = shift; my $desc; if (!defined ($arg)) { return "nil"; } if (is_number ($arg)) { $desc = $arg; } elsif (is_opaque_obj ($arg)) { $desc = "<".ref($arg).">"; } elsif (is_string ($arg)) { $desc = "\"${$arg}\""; } elsif (is_symbol ($arg)) { $desc = $arg; } elsif (is_lambda ($arg)) { $desc = "<function>"; } elsif (is_quote_expr ($arg)) { return "'" . printable_string (unquote ($arg)); } elsif ($arg == \$true) { $desc = "t"; } elsif (is_proper_list ($arg)) { $desc = "("; lforeach $arg, sub { my $item = shift; my $is_last = shift; $desc .= printable_string ($item); $desc .= " " unless $is_last; }; $desc .= ")"; } elsif (is_cons ($arg)) { $desc = "(cons "; $desc .= printable_string (car ($arg)); $desc .= " "; $desc .= printable_string (cdr ($arg)); $desc .= ')'; } else { $desc = "<unknown>"; } return $desc; }
The lisp print function. It attempts to return a parseable description, although that's not necessarily going to work correctly.
sub tulip_print ( $ ) { my $arg = car (shift); my $line = printable_string ($arg); print $line; return undef; } prim 2, 'cons', sub { my $arg = shift; return [car ($arg), cadr ($arg)] }; prim 1, 'car', sub {return car (shift->[0])}; prim 1, 'cdr', sub {return cdr (shift->[0])}; prim 1, 'cadr', sub {return cadr (shift->[0])}; prim 1, 'caddr', sub {return caddr (shift->[0])}; prim 1, 'length', sub {return llength (shift->[0])}; prim 1, 'last', sub {return llast (car (shift))}; prim 2, 'set', sub { my $arg = shift; my $name = car ($arg); my $value = cadr ($arg); if (!is_symbol ($name)) { die "First argument to 'set must be a symbol.\n"; } set_var ($name, $value); return $value; }; prim 1, 'print', \&tulip_print; prim 1, 'printnl', sub {tulip_print(shift); print "\n"; undef}; prim 1, 'prints', sub { my $arg = car (shift); ensure_string ($arg); print ${$arg}; }; prim -1, 'sprintf', sub { my $args = shift; my @arg_list; my $result; ensure_string (car ($args)); my $fmt = ${car($args)}; lforeach cdr ($args), sub { my $item = shift; if (ref($item) eq 'SCALAR') { push @arg_list, ${$item}; } else { push @arg_list, $item; } }; $result = sprintf $fmt, @arg_list; return \$result; }; prim 2, '_define', sub { my $args = shift; my ($name, $value) = @{flatten ($args)}; def_global ($name, $value); }; prim 3, '_lambda', sub { my $args = shift; return lambda (car ($args), cadr ($args), caddr ($args)); }; prim 2, '_macro', sub { my $args = shift; my $name = car ($args); my $function = cadr ($args); ensure_symbol ($name); ensure_lambda ($function) if (defined ($function)); $macros{$name} = $function; return undef; }; prim 1, 'eval', sub { my $args = shift; my $expr = car ($args); return tulip_eval ($expr); }; prim -1, 'list', sub {return shift}; prim 2, '_while', sub { my $arg = shift; my $test = car ($arg); my $block = cadr ($arg); while (1) { my $tf = tulip_eval ($test); if (!defined ($tf)) { return undef; } tulip_eval ($block); } return undef; # not reached. }; prim 2, '+', sub { my $arg = shift; return car ($arg) + cadr ($arg); }; prim 2, '-', sub { my $arg = shift; return car ($arg) - cadr ($arg); }; prim 2, '*', sub { my $arg = shift; return car ($arg) * cadr ($arg); }; prim 2, '/', sub { my $arg = shift; return car ($arg) / cadr ($arg); }; prim 3, '_if', sub { my $arg = shift; my $tf = tulip_eval (car ($arg)); if (defined ($tf)) { return tulip_eval (cadr ($arg)); } else { return tulip_eval (caddr ($arg)); } }; prim 2, '==', sub { my $arg = shift; my $left = car ($arg); my $right = cadr ($arg); return $left eq $right ? $globals{t} : undef; }; prim 2, '<', sub { my $arg = shift; my $left = car ($arg); my $right = cadr ($arg); ensure_num ($left); ensure_num ($right); return $left < $right ? $globals{t} : undef; }; prim 1, 'read', sub { my $args = shift; return tulip_read (car ($args)); }; prim 1, 'open', sub { my $args = shift; my $filename = car ($args); my $fh = new FileHandle $filename; return $fh; }; prim 1, 'close', sub { my $args = shift; my $fh = car ($args); $fh->close; return undef; }; prim 2, 'fputs', sub { my $args = shift; my $fh = car ($args); my $str = cadr ($args); ensure_string ($str); print $fh ${$str}; return undef; }; prim 2, 'fget', sub { my $args = shift; my $fh = car ($args); return $fh->getc; }; prim 1, '_read-line', sub { my $arg = shift; my $fh = car ($arg); if ($fh->eof) { return undef; } else { return $fh->getline; } }; prim 1, '_next-token', \&next_token; prim 1, 'quote', sub {return quote (car (shift))}; prim 1, 'unquote', sub {return unquote (car (shift))}; prim 1, 'string->symbol', sub { my $arg = car (shift); ensure_string ($arg); return ${$arg}; };
This is the mainline routine. It's pretty simple. It goes through the argument list, opens each filename on it and evaluates their contents.
It evaluates each expression as soon as it has read it in, so expressions that change reading behaviour (by, say, overriding _next-token) take effect immediately.
Read, then evaluate all expressions in the file referenced by the given FileHandle.
sub readEval ( $ ) { my $fh = shift; while (!$fh->eof) { my ($expr, $result); $expr = tulip_read ($fh); next if (!defined ($expr)); $result = tulip_eval ($expr); if ($verbose) { print printable_string ($result), "\n"; } } }
Open the given filename for reading, then call readEval on it.
sub load ( $ ) { my $fileName = shift; my $fh; $fh = new FileHandle ("< $fileName") or die "Unable to open filename $fileName\n"; $line_count = 0; $last_file = $fileName; print "Reading in $fileName:\n" if ($verbose); eval { # Comment out the eval to get better backtraces from Perl readEval ($fh); }; if ($@ ne "") { synerr "$@"; } $fh->close; }
Mainline routine
sub go { my $file; for my $fn (@ARGV) { print "Loading $fn:\n" if ($verbose); load ($fn); } } go;