This is cool. I wrote a similar Lisp but got stuck debugging my copying GC. The interaction between the Lisp environment chain and the C stack is tricky.
I prefer something like this over Norvig's Lisp interpreter written in Python. Norvig's is good for understanding environments and eval/apply, but IMO it's cheating to use the host language's GC for allocations.
I would add integers, even if it costs a few lines of code... using `itos` and re-parsing the string when evaluating an int is just too painful for me.
Cool, that's an interesting solution! I tried hard to keep the GC implementation as straight-forward as possible, as a teaching aid, but I'll admit that debugging the GC was a bit of a struggle when roots weren't clearly indicated as such.
I felt the same way about Norvigs version which was one reason to do this. I feel like the recursive lisp interpreter-on-an-interpreter thing is not as clear when so much of the host language is relied on.
Oh, and I hear you about integers as strings. This choice was in part for compactness and in part inspired by the first LISP papers where there are only symbols and no other types.
but copying GC can be easily implemented with sbrk, or on the bare metal if no MMU is present. mpirstitz's GC uses mmap but could easily be replaced with sbrk. the main challenge of dynamic memory management is fragmentation, which is completely handled by using a copying GC. an implementation on top of malloc is not leaning on malloc's fragmentation handling at all.
Always good fun. Reminds me of XLISP which was my first exposure to Lisp.
There's a lot to be said for small implementations that gets to the heart of the issues. Most great oeuvres of software had a kernel beginning like this.
Some fun directions you could take this: Change the GC to Baker-style incremental (~ "real-time"), use cdr-coding, use tagged integers, add a compiler and compile to byte-code or threaded code.
EDIT: grammar.
Yeah, I am thinking that adding a similarly minimal compiler to this would be interesting as a next step.
There is a previous effort in the ”old-version” directory which had tagged integers and strings and an attempt at macros, but it was a bit too unfocused and I never finished it.
With this one the goal was to make it as small as possible but still be a complete implementation including GC, which made for a clearer end goal.
Spot on. I have a partially finished byte-compiled Scheme which I abandoned when it grew to big for its original target platform, sigh. So much to be said for something that actually runs.
To prevent reading from continuing indefinitely, each packet should end with STOP followed by a large number of right parentheses. An unpaired right parenthesis will cause a read error and terminate reading.
STOP )))))))))))))))))
1. Mac OS apparently has a conflicting definition of "isnumber", as "__DARWIN_CTYPE_TOP_inline int isnumber(int _c)" in "/usr/include/ctype.h:323". Thanks, Mac OS. So I'm running this on another machine of mine.
2. My usual test: iteration expressed as anonymous recursion, which should be possible to run in constant space:
pi@raspberrypi:~/LISP $ echo '
((lambda (f) (f f 10 0))
(lambda (f n tt)
(cond ((equal? n 0) tt)
(#t (f f (- n 1) (+ n tt))))))' | ./komplott
55
pi@raspberrypi:~/LISP $ echo '
((lambda (f) (f f 100 0))
(lambda (f n tt)
(cond ((equal? n 0) tt)
(#t (f f (- n 1) (+ n tt))))))' | ./komplott
5050
pi@raspberrypi:~/LISP $ echo '
((lambda (f) (f f 1000 0))
(lambda (f n tt)
(cond ((equal? n 0) tt)
(#t (f f (- n 1) (+ n tt))))))' | ./komplott
Out of memory
Aborted
I noticed from the source that eval did recursive calls to itself. I expected that to be its bane, but I'm surprised it hit an OOM rather than a stack overflow... It looks like, in lisp_eval's "apply" case, it'll call gc_protect on all the relevant stuff, then only do gc_pop after everything returns; perhaps that's why.
In the absence of other looping constructs, looping by tail recursion seems the best available option, and therefore an important usage to support. Implementing that, along with a moving GC, in a language that doesn't do its own tail call elimination is a pain.
Yeah the TCO is not complete, it allocates a call frame for each call regardless which is why it ran out of memory. It doesn’t use up the C call stack though, so you could in theory increase the GC heap size to make it survive longer. Due to the primitive environment implementation it’ll get slower and slower as it goes.
I'm ok with minimalism, but interning strings in a simple array with linear search only scales to ~100 entries. a hash table can be written in 15 lines.
Interesting. That threw me off. I overlooked the previous paragraph:
In Emacs Lisp, an obarray is actually a vector. Each element of the vector is a bucket; its value is either an interned symbol whose name hashes to that bucket, or 0 if the bucket is empty. Each interned symbol has an internal link (invisible to the user) to the next symbol in the bucket. Because these links are invisible, there is no way to find all the symbols in an obarray except using mapatoms (below). The order of symbols in a bucket is not significant.
Because it's a good programming exercise towards the goal of mastering C, and because being a functional programming language, LISP embodies software that is easy to debug, easy to understand and easy to extend.
When I took a programming languages class with Dan Friedman [0], the entire class was taught in scheme. One of our assignments was to implement a typerchecker for the simply-typed lambda calculus in minikanren [1]. I recall it being a relatively straightforward assignment.
The current version of the course is locked away on Canvas, but an older version (nearly the same as it was in 2017) is at [0]. It has all of the assignments, including the partially-complete type checker. (Look in Assignment 10.)
We spent a couple of weeks learning minikanren out of "The Reasoned Schemer", using Will Byrd's implementation of minikanren.
I've been studying this for the past day. I just wanted to say: thank you. This is the most helpful and relevant resource for my work that I've come across in some time.
I really appreciate that you took the time to send me this.
I prefer something like this over Norvig's Lisp interpreter written in Python. Norvig's is good for understanding environments and eval/apply, but IMO it's cheating to use the host language's GC for allocations.
I would add integers, even if it costs a few lines of code... using `itos` and re-parsing the string when evaluating an int is just too painful for me.