Gambit-C namespaces

One of the first issues I had when evaluating scheme as a possible replacement for Python as my hobby language was its apparent lack of a module/library/namespace system. How do people possibly build big programs? I wondered.

Now it turns out most (all?) scheme implementations have features to deal with modules and libraries. Gambit's is particularly nebulous in that it doesn't appear to be documented anywhere. Anyway, here's how it appears to work. I'm sure somebody will correct me if I've got something wrong:

Gambit has the 'namespace' primitive, with which you can declare that certain definitions belong in certain namespaces. Here's an example:


> (namespace ("f#" foo) ("b#" bar baz))

This means (AFAICS): "any further use/definition of the term 'foo' will reference the f# namespace and any use of bar/baz will reference the b# namespace".

e.g.


> (namespace ("f#" foo) ("b#" bar baz))

> (define (foo) "I am foo")  ; defines f#foo

> (foo)
"I am foo"

> (f#foo)
"I am foo"

> (define (bar) "I am bar")

> (b#bar)
"I am bar"
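As far as I can tell, the qualified names are just ordinary identifiers that happen to contain a '#', so you can also define and call them directly without any namespace declaration. A minimal sketch, assuming Gambit's reader (other schemes may not accept '#' inside an identifier); the m1#/m2# prefixes are made up for illustration:

```scheme
;; No (namespace ...) form here: each name is written out in full.
(define (m1#foo) "I am mod1's foo")
(define (m2#foo) "I am mod2's foo")

;; Both definitions coexist because they are two different identifiers.
(m1#foo)
(m2#foo)
```

If that's right, the namespace form is only telling gambit which prefix to put on a short name, which is why it can be wrapped around existing code after the fact.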

This is cool because it allows you to retroactively fit namespaces to scheme code loaded from other files. E.g. If mod1.scm and mod2.scm both defined a procedure 'foo', you could use namespaces to allow both to be used in the same environment thus:


> (namespace ("m1#" foo))
> (load "mod1")    ; contains: (define (foo) "I am mod1's foo")

> (namespace ("m2#" foo))
> (load "mod2")    ; contains: (define (foo) "I am mod2's foo")

> (m1#foo)
"I am mod1's foo"

> (m2#foo)
"I am mod2's foo"

Job done. Now I haven't really used gambit namespaces much, so I'm not in a position to provide a good comparison with other approaches. However, the feature does seem in keeping with the rest of the scheme language. By that I mean rather than a large set of full-blown rich language features, you get a small bunch of simple but very extensible primitives with which to build your own language.

A good example of building a big system over these small primitives is Christian Jaeger's chjmodule library, where he has used namespaces along with 'load' and 'compile-file' (and of course macros) to build a more industrial-strength module system. This includes an 'import' keyword that loads from a search path and a procedure to recursively compile and import modules. Some example code from the README:



$ gsc -i -e '(load "chjmodule")' -

> (import 'srfi-1)
> fold
#
> (fold + 0 '(1 2 3))
6
> (build-recursively/import 'regex-case)
            ; recompiles regex.scm (a dependency) if necessary,
            ; then (re)compiles regex-case.scm if necessary and imports it.
> (regex-case "http://www.xxx.yy" ("http://(.+)" (_ url) url) (else url))
"www.xxx.yy"
> *module-search-path* 
("."
 "mod"
 "gambit"
 "~/gambit"
 "~~"
 "..")

Sweet. I'm guessing it'll also be possible to build the r6rs library syntax in gambit scheme the same way.

Some hardcore Gambit-C features

Somebody asked me about gambit-c the other day, and why I was using that as opposed to some other language or runtime for my own-time coding stuff. Despite the scheme language being all cool, the thing that really made his eyes light up was the C features in gambit (it is called gambit-c for a reason). Here's some cool stuff you can do with gambit:

  1. Scheme compiles to native machine code. The gambit scheme compiler compiles to C, which gcc then compiles to shared libraries or executables. The gsc command wraps this whole process, so you do a 'gsc <myschemefile>' which drops a shared library out of the other end. The (load) procedure in gambit will import either interpreted scheme code or compiled object files into the process, so you're good to go. In addition, the gsc compiler can also be run as an interactive interpreter (gsc -i) which acts just like the normal gambit scheme repl interpreter except you also have access to the compiler from your code. E.g. as well as dropping interpreted scheme code into the repl I can also compile a file and load it into the running repl process without dropping to the command line - cool!
  2. You can embed C code directly into gambit scheme files. (c-include) lets you paste C code into your scheme, and (c-lambda) lets you define lambdas in C. This is really sweet. I thought the python C api was good, but because you have to write your C stuff in separate files it always requires some sort of make/build system, and that's always been just too much of a barrier for me to use it day-to-day. With gambit you can just switch in a few lines of C into your performance hotspot and you're good to go. This also gives you trivial access to C libraries and low-level stuff - e.g. I use it for mmapped files. Having the C in the same file as scheme means the GCC compiler can optimize and inline C code into compiled scheme and vice-versa. The other advantage of this approach is that the gambit-c environment pretty much requires you to have a C compiler in the mix, so as a developer I can rely on it being there when distributing source to other developers, pasting code into emails etc.
  3. You can compile and load C into a running scheme process. Actually this is just a mix of (1) and (2), but it's really cool when you think about what's going on. Make an update to your C code and dynamically re-load it into your running process. I have an emacs keybinding which executes a 'cload' function in the repl:
    
    (define (cload f)
      (compile-file f)
      (load f))
    
    
    I.e. edit the C code, whack the button and it's in the repl process. This keeps the dev loop really tight even when writing C.
  4. You can compile the whole thing into a native binary. This is especially cool and important when you consider that gambit-c isn't currently a popular runtime. It means you can distribute native binaries of your app for windows and mac users so that they can try your app without worrying about dependencies.
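To make the embedded-C point concrete, here's a minimal sketch of the sort of thing I mean. It assumes the file is compiled with gsc (the plain interpreter can't evaluate the C forms), and it uses c-declare for the pasted C, which is the form I know from the Gambit manual. The clamp_int helper and the names clamp and add-ints are made up for illustration:

```scheme
;; Paste a small C helper into the C file that gsc generates.
(c-declare "static int clamp_int(int x, int lo, int hi) { return x < lo ? lo : (x > hi ? hi : x); }")

;; Bind a scheme procedure to the C function by name.
(define clamp
  (c-lambda (int int int) int "clamp_int"))

;; Or write the C body inline: ___arg1 and ___arg2 are the arguments,
;; ___result is the value handed back to scheme.
(define add-ints
  (c-lambda (int int) int "___result = ___arg1 + ___arg2;"))
```

Once compiled and loaded, (clamp 42 0 10) should give 10 and (add-ints 2 3) should give 5, with no separate C file or makefile anywhere.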

(*) N.B. although uncommon outside the lisp world, these compilation features are actually pretty common in lisp/scheme implementations. E.g. I think chicken, bigloo and SBCL provide similar things.

Refactoring and the Repl

I'm still persevering with Gambit scheme, and progressing pretty slowly it has to be said. The first thing I've been missing is refactoring tools for scheme.

I wrote the basic python refactoring functionality in bicyclerepairman a long while ago, and having it as part of my daily toolset has strongly influenced the way I program. For example, I tend to follow the 'bash out some code and then clean it up' style of development. In particular, I have a habit of naming variables and functions badly and then renaming them later as I code.

So my initial thought is: no problem - I'll just knock up a bicyclerepairman for scheme! The problem is that I'm not quite sure how to do automated refactoring with a repl. You see Python has no real repl culture (sure it has a repl, but nobody uses it except for trying out simple expressions). People tend to run their program/unittests from scratch each iteration, which means the entire environment gets re-evaluated on each run.

The challenge with running a repl while you develop is keeping it in sync with your refactored code: E.g. if I rename a function that's used in multiple places, that results in lots of code that needs re-evaluating. Can this be done automatically (e.g. could it be made to work by just re-eval'ing files?). Hmm.. I think I need to talk to somebody with a lot more scheme experience than I have. Unfortunately I don't actually know any experienced schemers, especially not in London or Birmingham; maybe somebody from lshift can help?
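Here's a tiny sketch of the sync problem, using made-up procedures total, sum-list and report. It only shows what goes stale after a rename; it isn't a proposal for how a refactoring tool should handle it:

```scheme
;; First version, evaluated in the repl.
(define (total xs) (apply + xs))
(define (report xs) (total xs))

;; "Rename" total to sum-list in the source, but re-evaluate only
;; the renamed definition, not its callers.
(define (sum-list xs) (apply + xs))

;; report was never re-evaluated, so it still calls the old total,
;; which the repl has not forgotten.
(report '(1 2 3))
```

The call still works, and that's the catch: the repl carries on happily with the stale binding, so a caller I forgot to update goes unnoticed until the next fresh start.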

Scheme is love

I've been battling again with Scheme recently. Having spent the last couple of months playing with various languages, I've come to the conclusion that scheme is the only one that has any real possibility of becoming my next 'general purpose language'. Python held that crown for many years, but its lack of blocks and concurrency caused me to start looking elsewhere and now I'm spoilt.

So, to Scheme. I've not found another language that can offer:

  • functional programming
  • message-passing concurrency (see termite)
  • macros
  • continuations
  • terse syntax
  • hardly any language kludges

...and as somebody who programs for fun in his spare time, these things really do matter to me. The biggest obstacle to full enlightenment is the s-expression aesthetic: To my algol-shaped brain that lisp syntax just looks so damn ugly!

Anyway, I'm finding that the most enjoyable and self-affirming way to develop some scheme skills is (ironically) to re-read Peter Seibel's 'Practical Common Lisp' book with scheme glasses on. Now if there's anyone going to convince me that lisp syntax isn't just a grotty heap of parentheses, it's going to be Peter. His book just radiates lisp-love, and you can't help but be hooked. It says 'Look! You fools! Just look what you're missing!'. I've been translating various examples into scheme, just to test the water.

Java cage rattling

Joel Spolsky rattles the java cage a second time, illustrating that:

  • Java is really bad at functional programming
  • Functional programming is really important for utilizing massively parallel hardware
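For a flavour of the style in question, here's a minimal map/reduce sketch in scheme rather than the Javascript of the article. I'm assuming map/reduce is the sort of thing meant, and the names reduce-left and sum-of-squares are mine:

```scheme
;; Fold a list from the left with a two-argument procedure.
(define (reduce-left f init xs)
  (if (null? xs)
      init
      (reduce-left f (f init (car xs)) (cdr xs))))

;; The "map" step squares each element independently of the others;
;; the "reduce" step combines the results.
(define (sum-of-squares xs)
  (reduce-left + 0 (map (lambda (x) (* x x)) xs)))

(sum-of-squares '(1 2 3 4))   ; => 30
```

Because each squaring is independent of the others, the map step is the part that could in principle be farmed out to many processors.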

In particular I like the way Spolsky uses Javascript for his examples. The conception of this language has always been a bit of an enigma to me, especially given its timing: As far as I can see the netscape people managed to sneak a pretty good language under the idiot radar by slapping the word 'java' on the front of the name and showcasing a crappy C-style 'for' looping construct to make it look a bit like java.

Then they added dynamic typing, higher order functions, closures, prototype based OO, eval...

Lisp aesthetics (and OO message passing)

The Lisp style has been really trying my sense of aesthetic. Yesterday I got really hung up on the fact that I prefer the 'message-passing' approach to OO rather than the 'call a function and pass in the object as an arg' style. The latter just feels clumsy to me.

Then it occurred to me that lisp syntax could easily support a message passing style. E.g. I could do:


(philaccnt 'withdraw: 10)
(meaning "send the 'withdraw' message to the philaccnt object with an argument of 10").

Hmmm... just write a 'class' macro that creates a hash of lambdas and a function 'make-object' that takes the class hash and returns a closure that dispatches the messages to method calls.
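A minimal sketch of the dispatching half of that idea, with some liberties taken: there's no 'class' macro, the "hash" is a plain association list, the messages are ordinary symbols without the trailing colon, and the account methods are made up for illustration:

```scheme
;; A "class" is an association list of (message . method) pairs.
;; Every method takes the object's mutable state as its first argument.
(define account-class
  (list (cons 'balance  (lambda (state) (car state)))
        (cons 'deposit  (lambda (state amount)
                          (set-car! state (+ (car state) amount))
                          (car state)))
        (cons 'withdraw (lambda (state amount)
                          (if (> amount (car state))
                              'insufficient-funds
                              (begin
                                (set-car! state (- (car state) amount))
                                (car state)))))))

;; Return a closure that looks up the message and calls the method.
(define (make-object class state)
  (lambda (message . args)
    (let ((entry (assq message class)))
      (if entry
          (apply (cdr entry) state args)
          (error "unknown message" message)))))

(define philaccnt (make-object account-class (list 100)))

(philaccnt 'withdraw 10)   ; => 90
(philaccnt 'deposit 25)    ; => 115
(philaccnt 'balance)       ; => 115
```

The object is nothing more than a closure over its state, so the "send a message" syntax falls out of ordinary procedure application.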

Could be something to this lisp build-your-own-language malarkey...

Understood vs Learnt

I've been reading 'The Little Schemer' (a new edition of 'The Little LISPer') and 'The Seasoned Schemer' recently, recommended to me by JP (who (of course!) has a signed copy of the original). This is a classic text, with the first edition of 'lisper' appearing in the 70s. The striking thing about these books isn't so much the content but the approach: Rather than being structured like a normal CS textbook with a descriptive chapter and then questions at the end, the whole book is a set of questions and answers from the beginning, all framed in chatty prose.

On each page the questions are on the left and the answers on the right, so you always see the answer to each question posed immediately. Each question/answer adds just enough information for you to be able to do the next (well, most of the time). I was surprised by how well this approach worked for me.

In the foreword, the authors point out that the goal of the 'little schemer' is to teach you to think recursively. Proceeding through the book gave me an important insight that I hadn't properly grokked before: the distinction between having understood something and having learnt it. When reading CS textbooks I have a tendency to read the chapter and then only do a couple of the questions before getting bored and skipping to the next. This is how I proceeded through SICP for example: As soon as I understood the material, I wanted to move on.

The problem with this approach is that my brain hasn't done enough repetition to form the higher-order concepts/patterns of the material learnt, and this makes the material much more heavy going later on.

Funnily enough, the distinction has always been obvious to me when learning music (practicing my French horn), because the difference between understanding and doing in music (as in sport) is a wide chasm. For some reason I'd never really come to terms with this when learning computer science.

Anyway, the little schemer rattles through example after example, Q/A after Q/A. Having the answers next to the questions means you can proceed at impressive speed because you don't need to formulate and complete the whole answer each time - you can just pick out the patterns, the structure and artifacts that you expect to see in the answer, and then check them.

The foreword to the book recommends that you don't read it all in one sitting - in fact it recommends at least 3 sittings. I suspect this is because you need rest time for your brain to construct the appropriate structures to allow you to 'think' recursively rather than merely understanding recursion. This of course allows you to proceed even faster.

Finally, don't be put off by the childishness of the artwork or prose: this shit is hardcore! Lambda expressions are there from the beginning. Higher order functions and currying feature early on and the book blasts through the Y-combinator at breakneck pace towards the end. Personally I had to resort to wikipedia and various tutorials to supplement learning the Y-combinator stuff. The last chapter quickly builds a scheme interpreter capable of evaluating most of the expressions in the book.
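For reference, here's the applicative-order Y-combinator as I understand it. This is the standard textbook formulation, not the book's own derivation, and the factorial example is mine:

```scheme
;; Y takes a procedure that expects "itself" as an argument and
;; returns its fixed point. The inner (lambda (v) ...) wrapper delays
;; the self-application so it doesn't loop forever in an eager language.
(define Y
  (lambda (f)
    ((lambda (x) (f (lambda (v) ((x x) v))))
     (lambda (x) (f (lambda (v) ((x x) v)))))))

;; Factorial written without referring to its own name.
(define factorial
  (Y (lambda (self)
       (lambda (n)
         (if (= n 0)
             1
             (* n (self (- n 1))))))))

(factorial 5)   ; => 120
```

The point is that the recursion comes entirely from the self-application inside Y, not from any named definition calling itself.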

I'm now rattling through 'The Seasoned Schemer'. I'm a quarter of the way through and have just hit continuations...