Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Show HN: Lisp with copying GC in 537 lines of C (plus Lisp 1.5 on top) (github.com/krig)
152 points by krig on Oct 14, 2018 | hide | past | favorite | 34 comments


This is cool. I wrote a similar Lisp but got stuck debugging my copying GC. The interaction between the Lisp environment chain and the C stack is tricky.

I prefer something like this over Norvig's Lisp interpreter written in Python. Norvig's is good for understanding environments and eval/apply, but IMO it's cheating to use the host language's GC for allocations.

I would add integers, even if it costs a few lines of code... using `itos` and re-parsing the string when evaluating an int is just too painful for me.


Here's my own attempt at implementing a somewhat readable copying GC in C - https://github.com/matp/tiny-lisp/blob/master/tiny-lisp.c#L1...


Cool, that's an interesting solution! I tried hard to keep the GC implementation as straight-forward as possible, as a teaching aid, but I'll admit that debugging the GC was a bit of a struggle when roots weren't clearly indicated as such.


I felt the same way about Norvigs version which was one reason to do this. I feel like the recursive lisp interpreter-on-an-interpreter thing is not as clear when so much of the host language is relied on.

Oh, and I hear you about integers as strings. This choice was in part for compactness and in part inspired by the first LISP papers where there are only symbols and no other types.


To be fair to Norvig, malloc is also a relatively hefty abstraction comparable to a GC on a systems level.


That is certainly a good point. The problem I felt with dropping below malloc is that the code becomes very platform-dependent.


but copying GC can be easily implemented with sbrk, or on the bare metal if no MMU is present. mpirstitz's GC uses mmap but could easily be replaced with sbrk. the main challenge of dynamic memory management is fragmentation, which is completely handled by using a copying GC. an implementation on top of malloc is not leaning on malloc's fragmentation handling at all.


Always good fun. Reminds me of XLISP which was my first exposure to Lisp.

There's a lot to be said for small implementations that gets to the heart of the issues. Most great oeuvres of software had a kernel beginning like this.

Some fun directions you could take this: Change the GC to Baker-style incremental (~ "real-time"), use cdr-coding, use tagged integers, add a compiler and compile to byte-code or threaded code. EDIT: grammar.


Thanks!

Yeah, I am thinking that adding a similarly minimal compiler to this would be interesting as a next step.

There is a previous effort in the ”old-version” directory which had tagged integers and strings and an attempt at macros, but it was a bit too unfocused and I never finished it.

With this one the goal was to make it as small as possible but still be a complete implementation including GC, which made for a clearer end goal.


Spot on. I have a partially finished byte-compiled Scheme which I abandoned when it grew to big for its original target platform, sigh. So much to be said for something that actually runs.


ahahahaAHAHAhahaha...

    To prevent reading from continuing indefinitely, each packet should end with STOP followed by a large number of right parentheses. An unpaired right parenthesis will cause a read error and terminate reading.

    STOP )))))))))))))))))


Things I see...

1. Mac OS apparently has a conflicting definition of "isnumber", as "__DARWIN_CTYPE_TOP_inline int isnumber(int _c)" in "/usr/include/ctype.h:323". Thanks, Mac OS. So I'm running this on another machine of mine.

2. My usual test: iteration expressed as anonymous recursion, which should be possible to run in constant space:

  pi@raspberrypi:~/LISP $ echo '
  ((lambda (f) (f f 10 0))
   (lambda (f n tt)
     (cond ((equal? n 0) tt)
           (#t (f f (- n 1) (+ n tt))))))' | ./komplott
  55
  pi@raspberrypi:~/LISP $ echo '
  ((lambda (f) (f f 100 0))
   (lambda (f n tt)
     (cond ((equal? n 0) tt)
           (#t (f f (- n 1) (+ n tt))))))' | ./komplott
  5050
  pi@raspberrypi:~/LISP $ echo '
  ((lambda (f) (f f 1000 0))
   (lambda (f n tt)
     (cond ((equal? n 0) tt)
           (#t (f f (- n 1) (+ n tt))))))' | ./komplott
  Out of memory
  Aborted
I noticed from the source that eval did recursive calls to itself. I expected that to be its bane, but I'm surprised it hit an OOM rather than a stack overflow... It looks like, in lisp_eval's "apply" case, it'll call gc_protect on all the relevant stuff, then only do gc_pop after everything returns; perhaps that's why.

In the absence of other looping constructs, looping by tail recursion seems the best available option, and therefore an important usage to support. Implementing that, along with a moving GC, in a language that doesn't do its own tail call elimination is a pain.


Yeah the TCO is not complete, it allocates a call frame for each call regardless which is why it ran out of memory. It doesn’t use up the C call stack though, so you could in theory increase the GC heap size to make it survive longer. Due to the primitive environment implementation it’ll get slower and slower as it goes.

Thanks for the note on Mac OS, I’ll fix that.


Before I try, just always have a question - is it lisp-1 or lisp-2? Is it scheme/lisp?

Given the lisp15.scm is a scm it seems to be underlying is a lisp 1.5 running under scheme. But given it is doing make, is it just share code?

Too many of this but really want to have a code for simple C based lisp to play with. Is that it?


Considering the example code and test code is all (define func-name (lambda (....) )), this has got to be a lisp-1.


The base language interpreter written in C is a lisp-1, and it’s not even close to a full scheme implementation but it is a subset of scheme.

The LISP 1.5 interpreter runs on top and is implemented in the base language.


I'm ok with minimalism, but interning strings in a simple array with linear search only scales to ~100 entries. a hash table can be written in 15 lines.


It's actually a linked list (with linear search).

Good enough for such a simple interpreter methinks...


Sounds like a great pedagogical tool to me!


It works surprisingly well, and appeals to me for its compactness. But i maybe I’ll switch to hashing if I can squeeze that in.


This isn’t true. Emacs interns strings in a global obarray, which is a simple array with linear search. I hear emacs scales well.


That is incorrect. An Emacs obarray is a kind of hash table [1].

[1] https://www.gnu.org/software/emacs/manual/html_node/elisp/Cr...


(make-vector length 0)

Interesting. That threw me off. I overlooked the previous paragraph:

In Emacs Lisp, an obarray is actually a vector. Each element of the vector is a bucket; its value is either an interned symbol whose name hashes to that bucket, or 0 if the bucket is empty. Each interned symbol has an internal link (invisible to the user) to the next symbol in the bucket. Because these links are invisible, there is no way to find all the symbols in an obarray except using mapatoms (below). The order of symbols in a bucket is not significant.

Thanks!


Good job


Thanks!


[flagged]


Heavens forbid people having fun by implementing a Lisp.


I feel like these are both excellent points.


Well, I’m also learning to play the piano, and that had certainly been done before and better than me.

But the reason for both is that it’s fun and I enjoy doing it.


Because it's a good programming exercise towards the goal of mastering C, and because being a functional programming language, LISP embodies software that is easy to debug, easy to understand and easy to extend.


Yes, like implementing a statically typed language, but we all have to start somewhere.


When I took a programming languages class with Dan Friedman [0], the entire class was taught in scheme. One of our assignments was to implement a typerchecker for the simply-typed lambda calculus in minikanren [1]. I recall it being a relatively straightforward assignment.

[0] https://en.wikipedia.org/wiki/Daniel_P._Friedman

[1] http://minikanren.org/


I actually need to do this for my lisp. Do you know if that class is available online? I only found things like http://mullr.github.io/micrologic/literate.html

which doesn’t seem to do type checking.


The current version of the course is locked away on Canvas, but an older version (nearly the same as it was in 2017) is at [0]. It has all of the assignments, including the partially-complete type checker. (Look in Assignment 10.)

We spent a couple of weeks learning minikanren out of "The Reasoned Schemer", using Will Byrd's implementation of minikanren.

[0] https://web.archive.org/web/20070610012057/http://www.cs.ind...


I've been studying this for the past day. I just wanted to say: thank you. This is the most helpful and relevant resource for my work that I've come across in some time.

I really appreciate that you took the time to send me this.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: