Site update
I have now updated jj (the software that runs this blog) to be slightly less broken. It is also now on http://codeberg.org instead of http://github.com, although the latter repository is still archived.
This will be of interest to the zero people who use this software and are not me.
# Posted 2024-10-20 18:20:38 UTC; last changed 2024-10-20 18:17:15 UTC
Loom: A Programming Language
I'm a programming language nerd enthusiast, and one of the ways
this manifests itself is in the occasional urge to design a new
language. There have been
multiple
such attempts
in my past and I succumbed to the urge again last year.
Here's the result. It's called Loom.
The initial implementation is available here, along with a language reference and library reference.
If you want a more gentle introduction to it, read on.
This post is mostly about the ideas behind it but it can also serve as a weirdly overthought tutorial.
Overview
Roughly speaking, Loom is a dialect of Smalltalk with C++-style syntax. Its goal is to be:
- Purely object-oriented in the Smalltalk sense
- Homoiconic
- Minimal
- Transparent
- Easy to implement
The core ideas were stolen from Smalltalk when they left the doors unlocked one night while the syntax was accidentally-on-purpose stolen from sclang. I also shoplifted some useful concepts from Ruby, and a few Lisp ideas may also have somehow found their way into my bag.
The three main ideas behind Loom are:
- Everything is an object.
- Everything is done by sending a message[1] to an object.
- Both compiling and running Loom code are simple enough processes that you can hold them in your head.
Running stuff
If you clone and (successfully) build the sources linked above, you'll
have the Loom interpreter. When run with no arguments, it will drop into
an interactive session (aka a REPL):
$ ./src/loom
Loom REPL. Hooray!
> 3 + 4
7
(It helps to have rlwrap installed; the script in src/loom
will use it
if it's available.)
You quit it with an EOF character (CTRL+D on *nix).
And if you run it with a Loom program, it will attempt to execute it, as one does:
$ ./src/loom examples/sieve.loom 40
Solving up to 40...
Primes up to 40:
2 3 5 7 11 13 17 19 23 29 31 37
Okay, onward to the language itself. Let's start with some basic stuff:
Basic Stuff
Comments begin with #
or //
and go to the end of the line:
// comment
# also a comment
2 + 3; // comment after a statement
(I just couldn't pick a favourite.)
Numbers and strings are as you'd expect from a C-style syntax:
123 # Decimal integer
420.69 # Decimal float
0xBADCAFE # Hex integer
0b1011 # Binary integer
"Hello!\nworld!\n" # Some C-style escapes are supported
There's also syntax for creating symbols and vectors. These look like literals but aren't:
[1, 2, 3] # Vector (i.e. array)
:foo # The symbol 'foo'
Symbols represent names within the system, just like in Lisp, Smalltalk, Ruby, and other right-thinking languages.
And Vectors are what I call arrays, because I'm pretentious. And also
because I'm reserving the word Array
for a possible future type that's
more primitive. (Vectors can be resized in place; arrays can't. In the
future, I may want to implement Vectors around Array instances so I
don't need to use the host system's types. But I digress.)
Names (used for variables and methods) follow the standard C convention:
foo_bar_quux
_fooBarQuux42Baz
That is, anything matching the regexp /^[_a-zA-Z][_a-zA-Z0-9]*$/.
However:
- Any variable name beginning with an upper-case letter is a constant:
def Pi = 3.14159
(This also works for method arguments and locals; the latter isn't useful and is kind of a bug.)
- There are a few well-known constants (Self, True, False, Nil, and Here) whose lowercase names (self, true, false, etc.) are reserved by the parser and expanded into their upper-case versions. So (e.g.) nil is just another way of writing Nil.
- Any character (almost) can be part of a name if it's quoted with backtick characters (`):
`$20, same as in town` = 20;
PoliteObject.new.`please initialize this instance`();
This last property leads to some clever hackery we can do with syntax.
Message syntax
Sending a message to an object follows the usual C++-style syntax we know and love:
object.message(arg1, arg2, arg3)
Since this is the only thing you can do, Loom coding (and reading) would normally be a huge slog. We work around this in a number of ways, mostly by fiddling with the syntax.
For example, consider some basic arithmetic:
3.mult(4).add(1);
We can (and do) use backticks and give the methods more operator-like names:
3.`*`(4).`+`(1);
and this is slightly better but still not very readable.
So the parser[2] treats any token made of operator characters as
implying the .`...` part. And if there's no open parenthesis
token ("(") afterward, it treats this as equivalent to taking the next
token and wrapping the parens around that.
Which means the above can be written as
3 * 4 + 1;
3 * (4) + 1;
3 * 4 + (1);
This also means that Loom doesn't need brackets[3]. If you need to change the order of evaluation, you can use the argument list parens:
a + (b * c) + (d * e);
// Equivalent to
a.`+`(b.`*`(c)).`+`(d.`*`(e));
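The left-to-right rewrite can be sketched in a few lines of Python. This is only an illustration of the rule described above, not Loom's actual parser: it assumes the input is already tokenized, that parenthesized arguments arrive as nested lists, and the name desugar is made up for the example.

```python
def desugar(expr):
    """Rewrite an infix token list as chained message sends, strictly left to right."""
    if isinstance(expr, str):          # a single token: a number or a name
        return expr
    result = desugar(expr[0])          # the first token is the initial receiver
    # The rest of the list is (operator, argument) pairs; there is no precedence.
    for op, arg in zip(expr[1::2], expr[2::2]):
        result = f"{result}.`{op}`({desugar(arg)})"
    return result


print(desugar(["3", "*", "4", "+", "1"]))
print(desugar(["a", "+", ["b", "*", "c"], "+", ["d", "*", "e"]]))
```

Because each operator simply becomes a send to whatever has been built so far, 3 + 4 * 2 would group as (3 + 4) * 2 unless the multiplication is wrapped in argument parens.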
We do other things with syntax. If a (non-operator) message has no arguments, it's safe to leave off the trailing parens:
b = a.foo;
So getter methods are basically free. For setters, the parser looks for
a trailing =
token and if it finds it, first renames the message to
have a trailing underscore and then passes the expression after the =
as its argument. The following are equivalent:
b.foo = bobo.count + 1;
b.foo_(bobo.count() + 1);
Semi-related, C-style array access syntax gets expanded into at
and
atPut
message sends:
x = a[n + 1];
// is equivalent to
x = a.at(n + 1);
x[n + 1] = 42;
// is equivalent to
x.atPut(n + 1, 42);
So vector access looks the way you'd expect it to, but so does the (very
slow) Dictionary class. Anything that implements at
and atPut
can be
accessed with this syntax.
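Python happens to use the same trick, which makes for a handy analogy: its bracket syntax is sugar for two hook methods, just as Loom's is sugar for at and atPut. The sketch below is a Python stand-in, not Loom code; SlowDictionary and at_put are names invented for the example.

```python
class SlowDictionary:
    """Toy association-list dictionary exposing an at/at_put protocol."""

    def __init__(self):
        self.pairs = []

    def at(self, key):
        for k, v in self.pairs:
            if k == key:
                return v
        return None

    def at_put(self, key, value):
        # Drop any existing entry for the key, then append the new pair.
        self.pairs = [(k, v) for k, v in self.pairs if k != key] + [(key, value)]
        return value

    # Bracket access and assignment are routed to the two methods above.
    __getitem__ = at
    __setitem__ = at_put


d = SlowDictionary()
d["n"] = 42
print(d["n"], d.at("n"))
```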
Okay, onward to the deep end.
Quoting
Loom does Lisp-style quoting. You mostly don't need to worry about it unless you're poking around the internals, but as I intend to do just that, this is necessary.
The syntax for a quoted expression is the expression surrounded by
special brackets :(
and ). For example:
x = :( foo );
Quotes keep the things between the brackets from being evaluated. So in
the above snippet, x
gets the symbol foo
instead of the value of
the variable named foo.
(And yes, the :foo syntax above is just shorthand for :(foo).)
Most Loom objects just evaluate to themselves, so quoting them has no effect. The exceptions are symbols (as above), message send expressions, and quoted expressions themselves.
There's one extra bit of quote-related syntax. A Vector expression
prefixed with a colon (:[ instead of [) is equivalent to quoting
each element of the vector. The following are equivalent:
:[a, b, c]
[:a, :b, :c]
Vector.with(:a, :b, :c)
Quotes end up being vital for a lot of metaprogramming-related things, and since the underlying machinery of Loom is already based on metaprogramming, we need them.
(I initially tried to avoid adding this feature. I thought I could
simply decompose each object into an expression that recreated it, so
that (e.g.) :foo
would expand to "foo".intern. This might be
viable, but debugging any kind of metaprogramming was a nightmare
problem that Quote mostly removed.)
Objects and Classes
Now, let's talk about object-oriented programming. Here's a class definition:
def ContactInfo = Object.subclass(:[name, address, work_phone,
home_phone, tags]);
Let's start to the left of the =.
The def keyword defines a global constant, ContactInfo, and assigns
the result of the expression after the = to it. (def is syntax
that expands to a call to Here.defglobal(...). I'll get to Here later.)
To the right, we see Object. This is the root class which, like all
other classes, is an object. Its method subclass creates the new
class, and the new class's instance variables (aka slots) are defined by the
array of symbols subclass receives as its first argument.
Most classes have an initializer method (constructor
in C++-speak):
ContactInfo::initialize = { | name_arg |
name = name_arg;
tags = [];
};
This is an ordinary method but it gets called by the class's
instantiation method, new; its arguments (the name(s) between the |
characters) are all passed to initialize:
def Ringo = ContactInfo.new("Ringo Starr");
Instance variables are private to the object, so to get at them from outside, we'll need to add a getter and setter method:
ContactInfo::name = { return name };
ContactInfo::name_ = { | new_name | return name = new_name };
(Recall that something like this:
Ringo.name = "Richard Starkey";
gets expanded to a call to name_, the setter.)
Loom actually has built-in shorthand for this (and also the read-only and write-only variants), so you'll rarely need to write them by hand.
ContactInfo.accessible(:address);
ContactInfo.accessible(:work_phone);
ContactInfo.accessible(:home_phone);
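For readers who think better in a mainstream language, here is a rough Python analogue of what a helper like accessible might do: install a getter named after the slot and a setter with a trailing underscore. It is an assumption-laden sketch, not Loom's implementation; the slots dictionary and the Python class are invented for the illustration.

```python
def accessible(cls, slot):
    """Install a getter `slot` and a setter `slot_` on cls (Loom-style naming)."""

    def getter(self):
        return self.slots[slot]

    def setter(self, value):
        self.slots[slot] = value
        return value

    setattr(cls, slot, getter)
    setattr(cls, slot + "_", setter)


class ContactInfo:
    def __init__(self, name):
        # Hypothetical storage: one dictionary holding all instance variables.
        self.slots = {"name": name, "address": None}


accessible(ContactInfo, "address")
ringo = ContactInfo("Ringo Starr")
ringo.address_("Liverpool")
print(ringo.address())
```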
Methods can also take variadic arguments:
ContactInfo::tag = {|*all_tags|
tags = tags + all_tags;
}
Ringo.tag(:ringo, :the_best_drummer_in_liverpool);
They also (obviously) have local variables, declared between a second,
optional pair of pipe (|) characters:
ContactInfo::set_field_count = { ||
| sum |
sum = tags.size;
[name, address, work_phone, home_phone].each{|fld|
fld.is_nil.not .if { sum = sum + 1 }
};
return sum;
};
In this case, we need to also specify an empty argument list. However, it's safe to omit empty argument lists if the resulting code is unambiguous. (This is any case except for when there are temporaries but no arguments.)
We can also add methods to individual objects:
Ringo::*is_pete_best = { return false };
This includes classes:
ContactInfo::*new_beatle = { return self.new("Paul McCartney") };
def Paul = ContactInfo.new_beatle;
All of this syntax for defining new methods expands into ordinary message send expressions. For example, this
ContactInfo::dial = { ... }
expands into something like this
ContactInfo.inner_add_method(:dial, ...);
So all of this is available for metaprogramming.
Sending Messages
In addition to the language's message-send syntax, message-based languages typically provide a way to programmatically send a message to an object. This is typically done by method(s) of the base class that take the name and message arguments as their own arguments, then send them and return the result. This is how Loom does it as well.
In Loom, there are two methods of class Object: send and sendv.
send is a variadic method whose first argument is the message name (a
symbol) and the remaining arguments are passed to the message. For
example,
3.send(:`+`, 4) # 7
This is equivalent to either of
3 + 4
3.`+`(4)
But because the message is an argument, we can compute it:
msg = self.select_at_random(:[`+`, `-`, `*`, `/`]);
3.send(msg, 4) # ???
sendv is like send, but not variadic. Instead, it takes exactly
two arguments where the second is a vector containing the message's
arguments. With sendv, the above examples would look like this:
3.sendv(:`+`, [4]) # 7
msg = self.select_at_random(:[`+`, `-`, `*`, `/`]);
3.sendv(msg, [4]) # ???
This is important because, while Loom methods can take variadic
arguments, there is currently no other way to unpack a vector of
arguments into an argument list the way (e.g.) Ruby's *
prefix does.
In the future, something like this will probably work
args = [];
// ...append arguments to args...
thing.msg(*args); // Not implemented yet
but for now, you'll need to use sendv:
args = [];
// ...append arguments to args...
thing.sendv(:msg, args);
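The same pair of operations is easy to mimic in Python, which may help if the distinction between the two is unclear. This is an analogy only: getattr stands in for Loom's method lookup, and the dunder names stand in for Loom's backtick-quoted operator names.

```python
import random


def send(receiver, message, *args):
    """Variadic form: the message's arguments follow the message name."""
    return getattr(receiver, message)(*args)


def sendv(receiver, message, args):
    """Vector form: the message's arguments arrive packed in one list."""
    return getattr(receiver, message)(*args)


print(send(3, "__add__", 4))
print(sendv(3, "__add__", [4]))

# Because the message is just a value, it can be computed at run time.
msg = random.choice(["__add__", "__sub__", "__mul__"])
print(sendv(3, msg, [4]))
```

The vector form is the one to reach for when the argument list is itself built at run time.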
The Machinery of Objects and Classes
Under the hood, the Loom object system is actually (crudely) prototype-based, by which I mean that 1) objects have their own method dictionaries and 2) can delegate method lookup to one or more other objects.
In practice, it isn't a very good prototype system, but there's enough there to use as the basis for a powerful class-based object system.
The core idea behind this is that we have a special kind of object
called a trait. Traits are ordinary objects with the usual method
dictionary (and delegate list), but they also have a second method
dictionary/delegate list pair. (We call these inner
methods and
delegates.)
If an object has a trait as a delegate, the trait's inner dictionary
(and inner delegate list) will be used instead of the usual (outer)
one.
This gives us the foundation for classes and the rest is just library code implementing common-sense conventions. In Loom, a class is just a trait that:
1. Provides the method new (to create new instances).
2. Provides a method named slots that returns the list of instance variables.
3. Provides the method subclass to create a subclass.
4. Is part of the common class hierarchy rooted at Object, using its first inner delegate slot as the superclass.
Items 1, 2 and 3 are provided by the metaclass Class, which serves as
the class of all named classes (including Class itself) and item 4 is
de-facto enforced by method subclass since all objects that provide it
are already in the hierarchy.
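The inner/outer lookup rule can be modelled in a few lines of Python. This is a simplified sketch of the idea as described above, not the real C++ machinery: Obj, Trait and lookup are names chosen for the example, and details such as slots and the Class metaclass are left out.

```python
class Obj:
    def __init__(self, delegates=()):
        self.methods = {}                 # this object's own method dictionary
        self.delegates = list(delegates)  # objects to ask when lookup misses


class Trait(Obj):
    def __init__(self, delegates=(), inner_delegates=()):
        super().__init__(delegates)
        self.inner_methods = {}           # what the trait lends to its delegators
        self.inner_delegates = list(inner_delegates)


def lookup(obj, name, via_delegation=False):
    """Find a method; a trait reached through delegation uses its inner side."""
    use_inner = via_delegation and isinstance(obj, Trait)
    methods = obj.inner_methods if use_inner else obj.methods
    delegates = obj.inner_delegates if use_inner else obj.delegates
    if name in methods:
        return methods[name]
    for delegate in delegates:
        found = lookup(delegate, name, via_delegation=True)
        if found is not None:
            return found
    return None


Object = Trait()
Object.inner_methods["describe"] = lambda: "an object"
ContactInfo = Trait(inner_delegates=[Object])   # first inner delegate acts as superclass
ContactInfo.methods["new"] = lambda: Obj(delegates=[ContactInfo])  # class-side method

ringo = lookup(ContactInfo, "new")()
print(lookup(ringo, "describe")())
print(lookup(ringo, "new"))
```

The instance finds describe through the inner dictionaries of its class chain, while new stays on the class's outer side and is invisible to instances.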
Traits also give us mixins (which I call AddonTraits for dumb reasons):
def Boopable = AddonTrait.new;
Boopable::boop = { "Booped.".println };
def BoopableContact = ContactInfo.subclass([], Boopable);
BoopableContact.new("George Harrison").boop;
These can be mixed into new classes by passing them to subclass
after
the slot list.
Blocks and Control Flow
Loom, like Smalltalk, has easy lambdas (called blocks here[4]), and
as in Smalltalk and Lisp, they're used for flow control.
(By lambda, I mean an anonymous function that has access to the
(possibly local) scope in which it was defined.)
You normally define a block with braces, just like method bodies, and
you invoke it with the call
method:
blk = {"***block body***".println};
blk.call(); # "***block body***"
Blocks can (but don't have to) take arguments and define local variables:
add = {|a, b| |result| result = a + b; result};
add.call(3, 4); # 7
And they capture their local context:
Thing::counter = {||
|total|
total = 0;
return { total = total + 1; total }
};
def x = Thing.new.counter;
x.call; # 1
x.call; # 2
x.call; # 3
x.call; # 4
If you're familiar with Lisp, Ruby, or Smalltalk, this is old hat to you. (If not and I just blew your mind, feel free to take a moment.)
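For comparison, the same counter written as a Python closure looks like this. It is only an analogy for the capture behaviour; Python needs the nonlocal declaration where Loom relies on its scoping rules.

```python
def counter():
    total = 0

    def block():
        nonlocal total        # rebind the variable captured from the enclosing call
        total = total + 1
        return total

    return block


x = counter()
print(x(), x(), x(), x())
```

Each call to counter produces a fresh total, so two counters made this way count independently.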
Loom uses blocks for nearly all flow control. For example, the if
statement is implemented by adding methods to the Boolean types:
def Boolean = Object.subclass([]);
def True = Boolean.new;
def False = Boolean.new;
True::*if = {|body| return body.call()};
False::*if = {|body| return false};
Since all boolean operations return True or False, something like
this
a > b .if { "a is bigger!".println };
works as expected. If a > b returns True, it will invoke True's
if and that will evaluate the block. If it returns False, it will
instead invoke False's if, which does not.
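Here is the dispatch trick in miniature, as a Python sketch. The class names, if_ and the greater helper are all invented for the illustration (greater cheats by using Python's own conditional, standing in for Loom's built-in comparison).

```python
class TrueClass:
    def if_(self, body):
        return body()          # run the block and pass its result along


class FalseClass:
    def if_(self, body):
        return self            # ignore the block entirely


TRUE, FALSE = TrueClass(), FalseClass()


def greater(a, b):
    """Stand-in for Loom's a > b, which answers with one of the two Boolean objects."""
    return TRUE if a > b else FALSE


log = []
greater(5, 3).if_(lambda: log.append("a is bigger!"))
greater(1, 3).if_(lambda: log.append("never runs"))
print(log)
```

No branching happens inside if_ itself; which body runs is decided purely by which object receives the message.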
Short-circuited AND and OR operations work in much the same way:
a > b && { self.is_really_better(a, b) } .if { self.do_thing(a) };
(Aside: the parser will treat one or more blocks following an ordinary message send as arguments for that message. So the following are equivalent:
a.b({1}, {2});
a.b({1}) {2};
a.b() {1} {2};
a.b {1} {2};
Which can make the code look a bit cleaner. In the case of the &&
operator, normal parsing rules apply; there's an implicit pair of
parens around the first block.)
The foreach
loop's equivalent is provided by the Vector
method
each
(by way of a mixin named Enumerable):
[1,2,3,4,5].each{|n| n.str + "," .print } # 1,2,3,4,5
We also have the usual other map/reduce/etc methods:
[1,2,3,4,5].map{|n| n*n} # [1, 4, 9, 16, 25]
[1,2,3,4,5].select{|n| n*2 > 4} # [3, 4, 5]
[1,2,3,4,5].inject(0) {|sum, n| sum + n} # 15
And the for
loop's equivalent is the same thing, but over an object
(class Range) that pretends to be an array of increasing integers:
1 -> 5.each{|n| n.str + " " .print } # 1 2 3 4 5
And the typical while
loop is just as easy. All it needs is... um...
Okay, fine, while
is a built-in method of Block
written in C++.
You call it like this:
{n < 5} .while { n = n + 1 ; n.str + " " .print }
How Methods Work
As mentioned above, the brace-delimited function syntax ({ ... }) is
syntactic sugar expanded by the parser into a set of message sends. It
makes some sense to think of it as a fancier form of the quoted
array expression. That is, something like this
{ a + 1; b + 2 }
expands to something a lot like the expansion of
:[ a + 1, b + 2 ]
(This is before we talk about the arguments and local variables, of course. Also, the parser treats semicolons as separators but will forgive extras more easily.)
The missing piece of this is what happens when you quote a Loom message send:
:( a + 1 ) # a.+(1)
:( a + 1 ).class # MsgExpr
That's right, there's a class representing a message send expression. It looks like this:
def MsgExpr = Object.subclass(:[
receiver, # The expression to the left of the "."
message, # The message, a symbol
args # The vector of argument expressions
]);
A method body is just an array of these (or symbols, or other objects), and a trivial Loom interpreter looks something like this:
Evaluator::eval_obj = { | context, obj |
obj.class == MsgExpr .if { ||
| receiver, args |
receiver = self.eval_obj(context, obj.receiver);
args = obj.args.map{|arg| self.eval_obj(context, arg) };
return receiver.sendv(obj.message, args);
};
obj.class == Symbol .if { return context.lookup_name(obj) };
return obj;
};
Evaluator::eval_method_body = { | context, method_body |
method_body.each{|expr| self.eval_obj(context, expr) };
return context.lookup(:Self);
};
There are more fiddly little details to it than that, but this is the core idea.
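The same core idea fits in a short, runnable Python sketch. It is an analogue rather than a port: Symbol and MsgExpr are minimal stand-ins, the context is a plain dictionary, and Python's getattr plays the part of message dispatch.

```python
from dataclasses import dataclass, field


@dataclass
class Symbol:
    name: str


@dataclass
class MsgExpr:
    receiver: object                           # expression to the left of the "."
    message: str                               # the message name
    args: list = field(default_factory=list)   # argument expressions


def eval_obj(context, obj):
    if isinstance(obj, MsgExpr):
        receiver = eval_obj(context, obj.receiver)
        args = [eval_obj(context, arg) for arg in obj.args]
        return getattr(receiver, obj.message)(*args)   # "send" the message
    if isinstance(obj, Symbol):
        return context[obj.name]                       # variable lookup
    return obj                                         # everything else is itself


# Roughly (a + 1) * b, written out as nested message sends.
expr = MsgExpr(MsgExpr(Symbol("a"), "__add__", [1]), "__mul__", [Symbol("b")])
print(eval_obj({"a": 2, "b": 5}, expr))
```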
If you quote a block definition,
:( {2+3} )
you'll get something like this:
ProtoMethod.new([], nil, [], :[2.+(3)], nil).make_block(Here)
Which is to say that you're getting a little bit more than just a list
of expressions. Block (and method) definitions expand into an instance
of class ProtoMethod, which looks like this:
def ProtoMethod = Object.subclass(:[
args, # Vector of formal arguments
restvar, # nil or the name of the variadic argument list
locals, # Vector of local variable names
body, # Vector of expressions that make up the method body
annotation # nil or a descriptive string intended for error messages
]);
The first three arguments get filled from the argument and local variables list and the fourth is the actual method body.
This could be interpreted as a method or block by something like the
Evaluator example above. However, in the actual implementation,
methods and blocks are opaque internal C++ structures that are easy to
access from the actual (C++) evaluator. ProtoMethod serves as the
intermediate step. Actual methods are created by a pair of built-in
methods, make_block and make_method. This is analogous to how
Lisp's lambda converts several lists into a callable function.
So there's nothing stopping you from constructing a
ProtoMethod.new(...)
expression programmatically and turning it into
an executable object.
(You can also get just the ProtoMethod
by prefixing your Block
declaration with a colon:
:{2+3} # ProtoMethod([],nil,[],[2.+(3)],nil)
This is occasionally useful.)
There's one subtle gotcha here, though. Loom requires you to declare variables before you use them. This is a guard against typo-based bugs and also makes the scopes of names unambiguous.
So this method definition will result in an error:
def Thing = Object.subclass([]);
Thing::bar = { return some_undefined_variable };
But this one won't:
Thing::foo = { |a| a .if { return another_undefined_variable } }
The reason for this, if you think about it for a few moments[5], is
pretty clear. The inner block expands into an expression like
ProtoMethod.new(...).make_block(...). That is, not a function, but an
expression that will create the function. So the method doesn't
touch any undefined variables at all. It's only when it gets run and
tries to define the block that it does something wrong. Which is, of
course, far too late for our purposes.
And because the whole thing is just done with ordinary(ish) objects and methods, it's not like I'll always be able to guarantee the name correctness of a block or method. So I've kind of painted myself into a corner, haven't I?
Well, not really. Every brace expression gets expanded into something static enough that it's relatively straightforward to search it for undefined names[6]. So this is what we do.
If you're doing something clever with ProtoMethods
like creating
them programmatically, the system (probably) won't help you, but at
that point, undefined names are the least of your worries. For
ordinary blocks and methods, Loom will give you a warning
(upgradeable to error) if you get a name wrong.
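A search like that might look roughly like the following Python sketch. The real checker is written in Loom and works on ProtoMethod and MsgExpr objects; here the expression tree is modelled with plain tuples, and every name in the snippet is an assumption made for the illustration.

```python
def undefined_names(expr, declared):
    """Collect symbols used in expr that are not in the declared set."""
    kind = expr[0] if isinstance(expr, tuple) else None
    if kind == "sym":                       # ("sym", name)
        return set() if expr[1] in declared else {expr[1]}
    if kind == "send":                      # ("send", receiver, message, args)
        _, receiver, _message, args = expr
        missing = undefined_names(receiver, declared)
        for arg in args:
            missing |= undefined_names(arg, declared)
        return missing
    if kind == "block":                     # ("block", params_and_locals, body)
        _, params, body = expr
        inner = declared | set(params)      # a block sees outer names plus its own
        missing = set()
        for statement in body:
            missing |= undefined_names(statement, inner)
        return missing
    return set()                            # literals evaluate to themselves


# Models: { |a| a .if { return another_undefined_variable } }
inner_block = ("block", [], [("sym", "another_undefined_variable")])
method = ("block", ["a"], [("send", ("sym", "a"), "if", [inner_block])])
print(undefined_names(method, set()))
```

The point is that nested blocks are visited as data while the enclosing method is being checked, so the typo is found before anything runs.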
Here, or How Variable Assignment Works
The thing I've mostly skirted around so far is how variable assignment works in Loom. You'll recall that <reverb>Everything Is Done With Message Sends</reverb>. Most things are easy enough to do that way, but variables aren't objects so you can't send them messages.
In Smalltalk (and Lisp), variable assignment is one of the few things that still needs to be done by its own top-level thing instead of calling a method or function. Finding a way to do this was a shower problem for me for a while, and when I hit on this idea, it was enough to inspire me to actually build a language around it.
It goes like this:
Each context has a local constant named Here (aliased to here) that
references the context itself[7]. Here's class (Context) provides
methods to access its names or those of outer scopes according to the
expected scoping rules. The method set handles assignment.
The parser simply expands conventional variable assignments into
here.set(...)
message sends:
foo = bar + 1; // This...
here.set(:foo, bar + 1); // ...becomes this.
here.set
follows the same scoping rules that the evaluator uses when
looking up variables (current block, outer blocks, method, object, and
global) and stores the value in the appropriate namespace and slot.
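A stripped-down model of this in Python, assuming a simple chain of scopes (the real lookup also covers the object and global namespaces, which this sketch leaves out), could look like the following. The declare and _owner names are invented for the example.

```python
class Context:
    """One scope; `outer` links to the enclosing scope (block -> method -> ...)."""

    def __init__(self, outer=None):
        self.names = {}
        self.outer = outer

    def declare(self, name, value=None):
        self.names[name] = value

    def _owner(self, name):
        # Walk outward until some scope has declared the name.
        ctx = self
        while ctx is not None:
            if name in ctx.names:
                return ctx
            ctx = ctx.outer
        raise NameError(f"undeclared name: {name}")

    def has(self, name):
        try:
            self._owner(name)
            return True
        except NameError:
            return False

    def get(self, name):
        return self._owner(name).names[name]

    def set(self, name, value):
        self._owner(name).names[name] = value   # store where the name lives
        return value


method = Context()
method.declare("total", 0)
block = Context(outer=method)
block.set("total", block.get("total") + 1)   # what `total = total + 1` expands to
print(method.get("total"))
```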
As with blocks, this means that you can defeat the compile-time checks for undefined names if you're overly clever:
here.set("unknown_" + "variable" .intern, 42)
And that's fine. The name checking really only cares about likely accidents, which means the boring infix-style assignment you get from the syntactic sugar. That's where the name typos you don't expect will come from.
But having here as the way to access your local scope gives you all
kinds of extra flexibility. Consider this little debug printf method
you can monkeypatch onto Context:
Context::pvar = {|name|
self.has(name) .if {
name.str + "=" + (self.get(name).str) .println
};
};
Now if you want to print a variable, you can just do
Bar::do_thing = {
// ...
here.pvar(:a);
// ...
}
and you'll get a nicely-formatted message.
Odds and Ends
How return works
The final bit of syntactic magic is the return
statement. Like
everything else, it's syntactic sugar wrapping a message send.
Specifically, it invokes Context::return.
Here's a typical method with a return statement:
Bar::thing = { |a| return a + 1 }
The return a + 1
part expands to:
here.method_scope.return(a + 1);
method_scope
returns the Context
belonging to the current method
call[8]. This is important because we expect return to operate at the
to operate at the
method level:
ContactInfo::dial = {
self.location.time_of_day < (Time.noon) .if { return nil };
return self.really_dial;
};
That is, we expect the return
after the if
to cause dial
to return
before the next expression (calling really_dial). If return nil
had expanded to here.return(nil), it would only have exited from
the block itself and not the method.
This mechanism can be (ab)used in clever ways. For example, this method
Thing::quux = {
{
{
Here.outer.return(42);
return "nope"; // skipped
}.call;
return "also nope"; // skipped
}.call;
return "Yup"; // run
};
will return the string "Yup"
because the innermost return will cause
the outer two blocks to also exit and let control flow fall to the next
statement.
Doing stuff like this is generally a bad idea, but it illustrates how
powerful Context::return
can be. Future versions of Loom may add extra
control statements (e.g. break
and continue) built on this stuff.
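One way to picture a return that targets a specific context is to model it with an exception that unwinds until it reaches that context. The Python sketch below takes that approach purely as an illustration; nothing here says Loom's C++ evaluator works this way, and run, return_ and _Return are invented names.

```python
class _Return(Exception):
    def __init__(self, target, value):
        self.target, self.value = target, value


class Context:
    def __init__(self, outer=None):
        self.outer = outer

    @property
    def method_scope(self):
        ctx = self
        while ctx.outer is not None:   # the outermost context is the method's
            ctx = ctx.outer
        return ctx

    def return_(self, value):
        raise _Return(self, value)     # unwind until `self` is reached


def run(body, outer=None):
    """Call body with a fresh context; stop unwinding if the return targets it."""
    here = Context(outer)
    try:
        return body(here)
    except _Return as r:
        if r.target is here:
            return r.value
        raise


def quux(here):
    def outer_block(h1):
        def inner_block(h2):
            h2.outer.return_(42)       # exits outer_block, like Here.outer.return(42)
            return "nope"
        run(inner_block, h1)
        return "also nope"
    run(outer_block, here)
    return "Yup"


def dial(here):
    run(lambda h: h.method_scope.return_(None), here)   # an ordinary `return nil`
    return "dialled"


print(run(quux), run(dial))
```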
Exceptions and Ensure
Loom also has exceptions. They got added late in the process, just because they made it so much easier to write tests for failing conditions.
Initially, Loom had a Context
method named fail, which quit the
program with a message. That worked well enough for a while, but the
tests got increasingly awkward so I added catchable exceptions.
Here's an example:
{
here.throw("Some error")
}.catch(String) {|e|
"Caught exception '" + e + "'" .println;
};
And it does pretty much what you expect. Block::catch is like call,
except that if Context::throw is called with an object whose class
matches[9] catch's first argument, catch calls its second argument with it
and execution continues from there on.
It probably would have been possible to write this in pure Loom using
Context::return, but currently it's just two native C++ functions with
about 25 lines of code.
In an earlier draft of this post, the next couple of paragraphs talked
about how this exception system was pretty weak overall. The underlying
problem is that there's no way to guarantee that cleanup code will run
after an exception the way Java does with finally
or Ruby does with
ensure, leading to all kinds of hard-to-track-down errors.
But then, I asked myself how hard it would really be to just fix that rather than document the failings. So I tried it, and it took maybe half a day to implement.
Here's an example of the feature:
{
fh = File.new(filename);
self.process(fh);
}.callAndEnsure {
fh.close();
}
The block argument (in this case, containing fh.close()) is
always called after the receiving block exits, regardless of
how. It can throw an exception or do a return or just run to the end.
You can also combine it with exception catching, as one does:
{
fh = File.new(filename);
self.process(fh);
}.catchAndEnsure(ProcessingException) { |e|
return nil;
} {
fh.close();
}
Both of these methods are written in pure Loom, by the way. The
underlying machinery is provided by built-in method Context::ensure.
This takes a block and evaluates it just before the context returns.
Here's the source code for catchAndEnsure
to illustrate this:
Block::catchAndEnsure = {|klass, handler, ensure_block|
here.ensure(ensure_block);
return self.catch(klass, handler);
};
The ensure block gets attached to the method's here
instead of the
call to self
, but that's good enough. After self.catch(...)
exits,
here
will also always return so ensure_block
will also be evaluated.
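The observable behaviour matches what try/finally gives you in other languages. The Python sketch below mirrors the two Loom methods using that construct; it shows the contract (the ensure block always runs), not the Context::ensure mechanism, and the function names are simply snake_case versions of the Loom ones.

```python
def call_and_ensure(body, ensure_block):
    try:
        return body()
    finally:
        ensure_block()             # runs on normal exit, return, or exception


def catch_and_ensure(body, klass, handler, ensure_block):
    try:
        return body()
    except klass as e:
        return handler(e)          # only exceptions of klass (or subclasses)
    finally:
        ensure_block()


log = []


def process():
    log.append("process")
    raise ValueError("bad input")


result = catch_and_ensure(process, ValueError,
                          lambda e: None,
                          lambda: log.append("close"))
print(result, log)
```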
Bypassing Overridden Methods (i.e. super)
I ended up writing a lot of Loom code before the first time I needed to be able to call a superclass's version of a method the current object had overridden. Which surprised me; I'd assumed that I'd need it much sooner than that[10].
But I did need it, and it was unexpectedly tricky to figure out how to do it without any magic.
tl;dr, I ended up adding it to Context as a method named super_send.
This works just like self.send but the method search starts at the
superclass of the class that defined this method. (Not self; it's
possible that it's already inherited this method, so the method
determines the starting point.)
Here's an example:
Thing::blatt = {|x| return here.super_send(:blatt, x); }
And, symmetrically with Object::send, there's a sendv version:
Thing::blatt = {|x| return here.super_sendv(:blatt, [x]); }
It belongs to Context because at the time of the
super_send call, here is the only well-known object that knows both
self and the current method.
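Why the defining class matters is easiest to see with a subclass that inherits the overriding method. The following Python sketch uses the two-argument form of super to get the same "start above the class that defined this method" behaviour; the super_send helper and the classes are made up for the example.

```python
def super_send(defining_class, receiver, message, *args):
    """Start method lookup just above defining_class, not above type(receiver)."""
    return getattr(super(defining_class, receiver), message)(*args)


class Base:
    def blatt(self, x):
        return x + 1


class Thing(Base):
    def blatt(self, x):
        # Thing defines this method, so the search must begin at Thing's superclass.
        return super_send(Thing, self, "blatt", x) * 2


class SubThing(Thing):
    pass               # inherits Thing's blatt unchanged


print(SubThing().blatt(1))
```

Had the search started above the receiver's own class (SubThing), it would have found Thing's blatt again and recursed forever.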
Final Thoughts
Loom is the first language I've designed that I actually want to use.
Most of my experiments in language design are successful in that they
produce a result, but that result has usually been, "That wasn't a good
idea after all."
Loom didn't do that.
The tooling is awful, libraries are nonexistent, and the whole thing runs at geological speeds. And yet, it's fun to write Loom code. Writing runtime code was almost always easier in Loom than in C++. This despite the fact that I have an extremely good C++ toolchain with astoundingly good debugger support.
It's been fun even when I did really complex things. Figuring out if a Block uses an undefined name was difficult and required a lot of thought and iterative design, but doing that in Loom was not only possible but easier.
So that's how I'll end it. Loom doesn't suck. I'm as surprised by this as you are.
1. If you're unfamiliar with the Smalltalk concept of "sending a message to an object", just mentally replace the term with "calling a method". That's close enough for our purposes.
2. Calling it a parser is perhaps overly generous, but it's the thing that turns text into internal data structures, so there we go.
3. I had planned to add brackets and full BEDMAS infix evaluation order rules, but I found that just this and the other little bits of syntax were enough.
4. This term was also stolen from Smalltalk.
5. This is totally something I saw coming from the start and didn't catch me by surprise long after the basic Loom system was up and running.
6. Fun fact: the code that does this is written in Loom itself. It also serves as an optimizer because it will evaluate the ProtoMethod.new(...) expressions ahead of time.
7. Smalltalk also has Here (named thisContext, though) but doesn't take the next step of using it for variable assignment.
8. Or nil, if there isn't one. That's not really possible on the current C++ implementation, though, since each input expression gets turned into its own weird mutant unnamed method.
9. By which I mean, is an instance of the class passed to catch or one of its subclasses.
10. You would think that decades of writing OOP code would have been the hint, but nope.
# Posted 2023-11-19 20:25:49 UTC; last changed 2023-11-19 21:35:40 UTC
Low-effort Retrocomputing
So the other day, I wondered if anyone had put up pre-installed disk images for any of the really old Linux distros. I found installation media at archive.org and this blog post with (excellent) instructions on how to do it, but nobody had done the work (so I wouldn't have to) and put it up for download.
So I took a run at it and installed Slackware 3.0 (from 1995) on a QEMU disk image.
You can download it here. The archive's sha256 hash is
28be1e75f8c5b8e9338f18589549ebc871e6daae3d4a7433826684d0cae446d1
(Please be considerate of my bandwidth.)
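If you'd rather check the hash with a script than by eye, a small Python helper along these lines will do it. The file name in the usage comment is a placeholder (use whatever name the download was saved under); only the expected hash comes from this post.

```python
import hashlib

EXPECTED = "28be1e75f8c5b8e9338f18589549ebc871e6daae3d4a7433826684d0cae446d1"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so a large disk image need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage, with a hypothetical local file name:
#   print(sha256_of("slackware-image.tar.gz") == EXPECTED)
```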
The image has two accounts: root
and a user account named bob;
both have the password slack. It boots into X11 but you can get to
a text console by pressing Ctrl+Alt+1
if you need to. Note that the
console's termcap seems to be a bit messed up.
Networking works, but there's no web browser or ssh
client. There's
a C compiler though (I installed everything) so you should be able
to build period-appropriate ssh from sources if you need to. There
are also Linux builds of Netscape 3 on the 'net, although I have no
idea if they'll run on Slackware.
Configuring X took some hand-fiddling. You can see my work in
/etc/XF86Config
and compare it to the original generated config file
/etc/XF86Config.orig
if you want.
Anyway, feel free to download and play with it if you're curious or nostalgic or want to do a pre-1995-tech dev jam.
# Posted 2023-01-19 16:43:08 UTC; last changed 2023-01-19 16:46:42 UTC
Your Sucks Programming Language Favourite
Much to my chagrin, I've found myself lately becoming a defender of C++. People who a) know me and b) appreciate irony should feel free to smirk right about now.
To be fair, modern C++ has improved significantly, reaching rarefied heights of not-badness only dreamed of fifteen years ago. But that's kind of beside the point. When you choose a programming language for a project, the quality of the language itself is often less important than external stuff; the quality of available implementations, tools, research, etc.
If I'm going to bet my (hypothetical) business on investing a zillion dollars to write a program that I can then sell, I want to know that:
- The development tools aren't going to rot or disappear because the vendor lost interest (e.g. Visual Basic).
- I'll be able to hire skilled developers whenever I need to.
- Good quality tools, books, training, etc. will all be available when I need them.
(And as a developer, I want to bet my non-hypothetical livelihood on developing the skills that are most likely to keep me employed. Being a really badass Haskell programmer doesn't really do much for my job search1.)
So let's concede that Rust (for example) is a better language than C++. C++ will still be a better choice for most commercial ventures in that space because it has:
- Multiple high-quality implementations, two of which are FOSS2
- A huge selection of high-quality third-party tools
- An enormous community of developers with whom you can exchange knowledge
- A literal half-century of concerted research on how to use it effectively
C++ sucks in a variety of ways but we know exactly how it sucks and how to work around it. Rust's suckage is still unknown, and I want the thing that keeps me from being homeless3 to have a really good track record.
And this principle applies to Scala-vs-Java, Zig-vs-C, Haskell-vs-anything, anything-please-anything-vs-PHP or any other language debate. $FAVORITE_LANGUAGE may be a better language than $CHOSEN_LANGUAGE but that doesn't mean it's going to get the job done better, faster, cheaper or more reliably. All of that depends on the entirety of the language's ecosystem.
Note that I'm not saying don't use $FAVORITE_LANGUAGE. Just be aware of what's riding on that decision. For a hobby project or an in-house tool that took a month to write, it's going to be fine. But for the hundred person-year project the business depends on? I mean, I'd really like $FAVORITE_LANGUAGE to be viable in ten years but I'm not going to bet the mortgage on it.
Also, you should go out and learn all kinds of programming languages--especially weird ones that will never fly in Industry--because it will make you a better programmer. I got a lot of benefit as a C programmer from asking myself, "How would I do this in Smalltalk?"
I'm a programming language nerd. I've spent a lot of time thinking about how languages work and how they make people think about programming. I learn new languages for fun and I've designed and implemented several. So I absolutely get the desire to use better languages and the frustration of having to deal with the broken status quo. In a perfect world, we'd all be using Smalltalk.
Unfortunately, our world is fallen and so C++ is a necessary evil.
1. Okay, that's an exaggeration. A good hiring process will recognize that Haskell skills are often transferable to whatever the company is using. Unfortunately, a lot of otherwise-fine employers have terrible hiring processes and will reject any résumé not listing the exact version of their preferred web framework. As those companies have money they will exchange for relatively pleasant work, I would like to retain the option of working there.
2. "But if $FAVORITE_LANGUAGE is FOSS, that means it will be available forever!" No, not really. If nobody else is working on it, you'll find yourself having to maintain the toolchain by yourself. At that point, it's almost always easier to just rewrite your program in something else.
3. Yeah, yeah, I know; the real problem is Capitalism.
# Posted 2021-07-04 17:16:50 UTC; last changed 2021-07-04 18:09:16 UTC
Getting the Singleton Class of a BasicObject in Ruby
Ruby objects provide the method singleton_class which returns the object's singleton class. Unfortunately, BasicObject doesn't have this because it's Object's superclass. So to get it, we need to be somewhat clever.
And having spent way too much time figuring out how to do this, I'm writing it here so that a) I don't lose it again and b) others will have less trouble than I did. (I'm not on Medium so, um, hello from the fourth page of your Google search results.)
TL;DR: How do I do it?
In an instance, you'd do something like this:
obj = BasicObject.new
obj.instance_exec(obj) {
  class << self
    lself = self
    self.define_method(:my_singleton_class) { lself }
  end
}
Notice how I copy self to lself on line 4. That's because self will have changed when the method is called but the block that forms the body of my_singleton_class captures the local variable.
Also: this won't work on older Ruby versions (before 2.5, if I'm reading the changelog right) because define_method was private back then; if yours is that old, see your version's Module documentation for define_method for a hacky workaround.
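To check that what comes back really is the singleton class, here's a small sketch that repeats the trick and then pokes at the result. The greet method and the variable names sc, other and via_syntax are invented for the demonstration.

```ruby
obj = BasicObject.new
obj.instance_exec {
  class << self
    lself = self
    define_method(:my_singleton_class) { lself }
  end
}

sc = obj.my_singleton_class
puts sc.singleton_class?   # prints "true"
puts sc.superclass         # prints "BasicObject"

# A method added to the singleton class shows up on obj alone.
sc.send(:define_method, :greet) { "hello" }
puts obj.greet             # prints "hello"

other = BasicObject.new
other_raised = begin
  other.greet
  false
rescue NoMethodError
  true
end
puts other_raised          # prints "true"

# From code that already holds a reference to obj, the class << syntax
# reaches the same class without calling any method on obj.
via_syntax = class << obj; self; end
puts via_syntax.equal?(sc) # prints "true"
```

The last three lines are there as a cross-check: class << obj is syntax rather than a method call, so it works even on an object with almost no methods.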
Doing this with a BasicObject subclass is even simpler:
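class Thingy < BasicObject
  def my_singleton_class
    # A class body evaluates to its last expression, so this returns
    # the singleton class without needing an explicit return.
    class << self
      self
    end
  end
end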
What's it good for?
Any case where you want an object to handle a method call by doing something other than call a method. For example, a DSL or a proxy object that forwards the call to something else.
Typically, you'd create a class with no methods, then implement method_missing to catch the failing method lookups and do the right thing with them.
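class Proxy
  def initialize(target) @target = target; end

  # Catch every call the proxy doesn't define itself and forward it.
  def method_missing(name, *args, &block)
    # Kernel#warn stands in for whatever logger you actually use.
    warn "Called #{name} with #{args}"
    @target.send(name, *args, &block)
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private) || super
  end
end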
BasicObject is the ideal base class for this because it has very few methods, but if that's not enough (if you need to get rid of those few as well) you can always override (most of) them with a method that calls method_missing directly. This is straightforward when creating a subclass but there are times when it's necessary or easier to add methods to the object instead, and for that you need to get the singleton class.
In my case, I'm writing a DSL where every method whose name starts with a letter is valid; this means they all need to turn into calls to method_missing.
(Handling the case where the user uses method_missing as a name in the DSL is left as an exercise to the reader.)
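Here's roughly what that looks like in miniature. This is a toy, not my actual DSL: the Recorder class, its __calls accessor and the choice to shadow only instance_eval are all invented for the example.

```ruby
class Recorder < BasicObject
  def initialize
    @calls = []
  end

  # Every unknown name is treated as a DSL word and simply recorded.
  def method_missing(name, *args)
    @calls << [name, args]
    self
  end

  # The leading underscores keep this out of the DSL's namespace.
  def __calls
    @calls
  end

  def my_singleton_class
    class << self
      self
    end
  end
end

rec = Recorder.new

# instance_eval is one of the few methods BasicObject does define. Shadow it
# on this one object so that it goes through method_missing as well.
rec.my_singleton_class.send(:define_method, :instance_eval) do |*args|
  method_missing(:instance_eval, *args)
end

rec.move 10
rec.instance_eval "never evaluated"
p rec.__calls
# prints [[:move, [10]], [:instance_eval, ["never evaluated"]]]
```

A real DSL would loop over the remaining letter-named BasicObject methods instead of handling just one, but the mechanism is the same.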
What's a singleton class anyway?
So normally in OOP, an object is an instance of a class and this is the case with Ruby as well:
[] # => []
[].class # => Array
[].class.class # => Class
But when Ruby creates an object from a class, it also first creates another anonymous class called the singleton class. This gets inserted into the new object's inheritance hierarchy: that is, the singleton class becomes a subclass of the new object's class and the object becomes an instance of the singleton instead of the original class.
x = [] # => []
x.class # => Array
x.singleton_class # => #<Class:#<Array:0x00007fbc862d41b0>>
x.singleton_class.superclass # => Array
This is how you can add methods to individual Ruby objects: you're actually defining them in the object's singleton class.
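A quick way to see this for yourself (the shout method is made up for the demonstration):

```ruby
x = %w[a b]

# Define a method on this one array only.
def x.shout
  map(&:upcase)
end

p x.shout                                   # prints ["A", "B"]
p x.singleton_methods                       # prints [:shout]
p x.singleton_class.instance_methods(false) # prints [:shout]
p %w[c d].respond_to?(:shout)               # prints false
```

The method lives in x's singleton class, so no other array gets it.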
Fun fact: singleton classes are also objects and thus have their own singleton classes:
x.singleton_class
# => #<Class:#<Array:0x00007fbc869b0060>>
x.singleton_class.singleton_class
# => #<Class:#<Class:#<Array:0x00007fbc869b0060>>>
x.singleton_class.singleton_class.singleton_class
# => #<Class:#<Class:#<Class:#<Array:0x00007fbc869b0060>>>>
This can go as deeply as you want it to.
The reason Ruby doesn't immediately fill up all available RAM with singleton classes and then die is that they are not created until the first time a program uses them. As a result, most objects don't have singleton classes at all.
Isn't this whole singleton class thing kind of overkill?
Not really.
See, Ruby is a language where everything is an object (in the OOP sense of the term), and so this means that classes are also objects. But since all objects have classes, that means each class is also an instance of a class. And so is that class. And this is if we ignore the singleton classes, which we are for the moment.
So how does this end? Well, it's pretty boring actually. Each class is an instance of the class named Class, including Class itself. Class is an instance of itself and that's all we really need.
[] # => []
[].class # => Array
[].class.class # => Class
[].class.class.class # => Class
[].class.class.class.class # => Class
But wait! How do we do class methods or class instance variables?
class Thing
  def self.instance
    @instance = Thing.new unless @instance
    return @instance
  end
  # ...etc...
end
In Smalltalk, this gets done by giving each class object its own distinct class (the metaclass) to hold the methods and variable declarations. Metaclasses are unnamed but you can get one with the class method, just like in Ruby. The metaclass's inheritance tree mirrors the class's tree (i.e. if Item is derived from Thing, then Item.class is derived from Thing.class) with class Class as the abstract base class of the hierarchy.
t class. => Thing
t class superclass. => Object
t class superclass superclass. => nil
t class class. => Unnamed class ('Thing class')
t class class superclass. => Unnamed class ('Object class')
t class class superclass superclass. => Class
All metaclasses are instances of the class Metaclass:
t class. => Thing
t class class. => Unnamed class ('Thing class')
t class class class. => Metaclass
This includes Metaclass itself, which is how the loop closes:
Metaclass class => 'Metaclass class'
Metaclass class class => Metaclass
(Disclaimer: I've somewhat simplified the above. I also haven't run it.)
In Ruby, each Class instance (i.e. class) has a singleton class that holds the class methods and variables. That is, singleton classes serve as metaclasses. The nice thing about this is that it's a generalization of what Smalltalk does for classes, and it gives you per-instance methods for free.
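You can watch this happen from Ruby itself. The sketch below reuses the Thing example and adds an Item subclass (borrowed from the Smalltalk discussion) to show the mirrored hierarchy.

```ruby
class Thing
  def self.instance
    @instance ||= Thing.new
  end
end

class Item < Thing; end

# The class method is an ordinary instance method of Thing's singleton class.
p Thing.singleton_class.instance_methods(false)             # prints [:instance]

# The singleton classes mirror the class tree, like Smalltalk metaclasses.
p Item.singleton_class.superclass == Thing.singleton_class  # prints true

# The "class instance variable" is just an instance variable of the Thing object.
Thing.instance
p Thing.instance_variables                                  # prints [:@instance]
```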
This is not to say that it's necessarily a better way than Smalltalk's. There are advantages and disadvantages to each approach but I'm far too lazy to write about them here.
# Posted 2021-05-07 01:55:31 UTC; last changed 2021-05-07 01:57:49 UTC