And another new programming language

Sep 28 2007

Last year I embarked on somwhat of a journey to find a better language for my home projects after getting a bit frustrated by python's lack of blocks and general cruftyness. After a couple of months of trying various different things I settled on Gambit Scheme for my spare-time data indexing project. A minimal core language with uniform syntax and macros. Lots of potentual for adapting and building language features that make sense to me. Sorted.

Then last week I got round to reading Richard Jones's minimal forth code. A language compiler and runtime in a couple of pages of well documented X86 assembler; small enough to read and understand the whole system in a bus journey.

Like Scheme, Forth has the ability to construct new language features, allowing user code to get between the parser and the evaluator to modify the language itself. Also, like Scheme, it has a uniform syntax - words separated by spaces, which makes re-writing code on the fly practical. Both these features mean that a fully fledged programming environment can be bootstrapped up from a minimal core. And the hook is: that minimal core is so much smaller for Forth than it is for Scheme.

One thing lead to another and I got interested in stack languages. Now I'm looking at Factor and wondering...

Factor has ticks in all the right boxes: minimal core, machine code compiler, macros, continuations, lightweight threads and message passing concurrency. It's also the tersest language I've seen - code to do something always seems a fraction of the size I expect it to be. On the other hand it's so radically different to anything else I've programmed in and it makes my head hurt. Could it be a contender to knock Scheme off the top spot?

A poor man’s scheme profiler

Jul 25 2007

Gambit scheme lacks a profiler that can profile scheme with embedded C code. (There's statprof, but unfortunately it doesn't profile embeded C). I needed to do this pretty desperately for my triple indexing stuff so I've written a simple macro which takes wall clock timings of functions and accumulates them.

You replace 'define' with 'define-timed' in the functions you want profiled, and then the time spent in each function is accumulated in a global hash table. It's not pretty, but it's simple.

The macro needs some supporting code (which I keep in a seperate module so that it can be compiled in order to minimise overhead). :


(define *timerhash* (make-table))

;;; call this before running the code
(define (reset-timer) (set! *timerhash* (make-table)))


;;; adds the time spent in the thunk to an entry in the hashtable
(define (accum-time name thunk)
  (let* ((timebefore (real-time))
         (res (thunk)))
    (table-set! *timerhash* name 
                (let ((current (table-ref *timerhash* name '(0 0))))
                  (list (+ (car current) (- (real-time) timebefore))
                        (+ (cadr current) 1))))
    res))

;;; call this afterwards to get the times
(define (get-times)
  (map (lambda (e) (list (first e) 
                    (* 1000 (second e))
                    (third e)))

       (sort-list (table->list *timerhash*)
                  (lambda (a b) (> (second a)
                              (second b))))))

And here's the macro. It basically just wraps the function body in a call to 'accum-time':


(define-syntax define-timed
  (syntax-rules ()
    ((define-timed (name . args) body ...)
     (define (name . args)
       (accum-time 'name (lambda ()
                           body ...))))))

Here's some example output. The first number is the accumulated milliseconds spent in the function, and the second is the number of times it was called.


((handle-query-clause 2182.3294162750244 8)
 (do-substring-search-query 1678.9898872375488 1)
 (run-query 1678.9379119873047 1)
 (main 1678.929090499878 1)
 (handle-substr-search 705.5349349975586 2)
 (convert-ids-to-symbols 400.3608226776123 1)
 (handle-s_o-clause 192.81506538391113 2)
 (lookup-o->s 192.52514839172363 2)
 (lookup-2-lvl-index 192.48294830322266 2)
 (join 163.8491153717041 8)
 (hash-join 163.6500358581543 5)
 (fetch-int32-results 118.37220191955566 655)
 (project 100.7072925567627 3)
 (concat 77.42834091186523 11793)
 (handle-p->so-clause 55.348873138427734 2)
 (bounds-matching-1intersect 51.54275894165039 2)
 (lookup-p->2-lvl-index 50.98700523376465 2)
 (search-32-32-index 42.65856742858887 94)
 (cells-at-positions 35.561561584472656 11697)
 (fetch-32-32-results 1.171112060546875 94)
 (extract-column .2582073211669922 4)
 (get-join-columns .05888938903808594 5)
 (text-search-to-structured-query .032901763916015625 1)
 (columns .0324249267578125 10)
 (column-position .031232833862304688 8)
 (rows .011920928955078125 4))

Hmmm.. I really need to get something to html pretty print the scheme code.

A simple scheme unittest DSL

May 23 2007

One of the first things I wrote when I was in the 'nesting'* phase of learning gambit scheme was a unittest DSL. Part of this was that I wanted an excuse to use r5rs syntax-rules macros, but the real motivation was that I'd been seduced by the idea of using tests for documentation ala Nat Pryce's 'Protest'. Here's what I came up with: (example):


(define-tests bus-tests

  (drive-to-next-stop        ; name of fn/class/symbol being tested
    ("takes bus to next stop"
      (drive-to-next-stop bus)
      (assert-equal 'next-stop (bus 'position)))
    ("doesn't stop off at chipshop on the way"
      ; test code to detect chipshop hasn't been stopped at
      ))

  (pick-up-passengers
    ("picks up passengers from the bus stop"
      test code )
    ("doesn't leave passengers at the stop"
      test code )
    ("waits for old lady running to the stop"
      test code )))

The point is that it's easy to write some lisp to traverse this code and generate documentation from it.

I found when using protest in python that the documentation angle reinforced some healthy habits: When you write tests you naturally think 'how would this look to another person?' 'how can I document the behaviour of this?' which encourages more complete testing. Also when you look at generated documentation it's easy to see which bits you aren't testing because the documentation is missing (which then encourages you to write more tests).

The implementation is a bit clunky and makes use of gambit exceptions as a way of terminating tests early because of assert failures (which is a bit rubbish). What probably should be happening is that the outer macro should be re-writing each assert as an 'if' or something to conditionally execute the rest of the test. (which would portable to other scheme implementations) To be honest I knocked this up as fast as I could so that I could move onto writing other things (I'm developing a data aggregation and indexing tool), but the point of this post is more to convey the idea than the implementation.

That said, the crufty (currently gambit specific) implementation is here - hope this is interesting/useful to somebody.

* nesting as in 'building a nest'

Gambit-C namespaces

Apr 21 2007

One of the first issues I had when evaluating scheme as a possible replacement for Python as my hobby language was its apparent lack of module/library/namespace system. How do people possibly build big programs? I wondered.

Now it turns out most (all?) scheme implementations have features to deal with modules and libraries. Gambit's is particularly nebulous in that it doesn't appear to be documented anywhere. Anyway, here's how it appears to work. I'm sure somebody will correct me if I've got something wrong:

Gambit has the 'namespace' primitive, with which you can declare that certain definitions belong in certain namespaces. Here's an example:


> (namespace ("f#" foo) ("b#" bar baz))

This means (AFAICS): "any further use/definition of the term 'foo' will reference the f# namespace and any use of bah/baz will reference the b# namespace".

e.g.


> (namespace ("f#" foo) ("b#" bar baz))

> (define (foo) "I am foo")  ; defines f#foo

> (foo)
"I am foo"

> (f#foo)
"I am foo"

> (define (bar) "I am bar")

> (b#bar)
"I am bar"

This is cool because it allows you to retroactively fit namespaces to scheme code loaded from other files. E.g. If mod1.scm and mod2.scm both defined a procedure 'foo', you could use namespaces to allow both to be used in the same environment thus:


> (namespace ("m1#" foo))
> (load "mod1")    ; contains: (define (foo) "I am mod1's foo")

> (namespace ("m2#" foo))
> (load "mod2")    ; contains: (define (foo) "I am mod2's foo")

> (m1#foo)
"I am mod1's foo"

> (m2#foo)
"I am mod2's foo"

Job done. Now I haven't really used gambit namespaces much, so I not in a position to provide a good comparison with other approaches, however the feature does seem in keeping with the rest of the scheme language. By that I mean rather than a large set of fully blown rich language features you get a small bunch of simple but very extensible primitives with which to build your own language.

An good example of building a big system over these small primitives is Christian Jaeger's chjmodule library where he has used namespaces along with 'load' and 'compile-file' (and of course macros) to build a more industrial strength module system. This includes an 'import' keyword that loads from a search path and a procedure to recursively compile and import modules. Some example code from the README:



$ gsc -i -e '(load "chjmodule")' -

> (import 'srfi-1)
> fold
#
> (fold + 0 '(1 2 3))
6
> (build-recursively/import 'regex-case)
            ; recompiles regex.scm (a dependency) if necessary,
            ; then (re)compiles regex-case.scm if necessary and imports it.
> (regex-case "http://www.xxx.yy" ("http://(.+)" (_ url) url) (else url))
"www.xxx.yy"
> *module-search-path* 
("."
 "mod"
 "gambit"
 "~/gambit"
 "~~"
 "..")

Sweet. I'm guessing it'll also be possible to build the r6rs library syntax in gambit scheme the same way.

Some hardcore Gambit-C features

Apr 21 2007

Somebody asked me about gambit-c the other day, and why I was using that as opposed to some other language or runtime for my own-time coding stuff. Despite the scheme language being all cool, the thing that really made his eyes light up was the C features in gambit (it is called gambit-c for a reason). Here's some cool stuff you can do with gambit:

Scheme compiles to native machine code
You can embed C code directly into gambit scheme files.

(c-include)

(c-lambda)

You can compile and load C into a running scheme process


(define (cload f)
  (compile-file f)
  (load f))

You can compile the whole thing into a native binary.

(*) N.B. although uncommon outside the lisp world, these compilation features are actually pretty common in lisp/scheme implementations. E.g. I think chicken, bigloo and SBCL provide simililar things.

Getting the hang of the lisp style

Mar 29 2007

Finally, it's starting to feel like I'm getting the hang of the lisp style. The key for me seems to be in re-reading the contents of srfi-1 every time I think I need a loop.

lisp
scheme

The point of macros

Feb 28 2007

Pascal Costanza nails the point of macros.

(and illustrates why lispers actually like lisp's strange syntax so much)

Scheme development environment

Feb 21 2007

I recently had my work laptop nicked while I was in paris, so I've had to reconstruct my linux development environment on another laptop. That reminded me that I intended to document this stuff since I had to dig around a bit for it when I first picked up scheme a few months ago.

Things I use:

Gambit scheme

termite

Emacs
Quack.el
E-tags
GNU Info

Bill Clementson

(add-hook 'scheme-mode-hook
      (lambda ()
        (define-key scheme-mode-map [f1]
          '(lambda ()
         (interactive)
         (ignore-errors
           (let ((symbol (thing-at-point 'symbol)))
             (info "(r5rs)")
             (Info-index symbol)))))))

A unit testing framework.

simple DSL

Protest

Is there anything missing?

Refactoring and the Repl

Oct 08 2006

I'm still perservering with Gambit scheme, and progressing pretty slowly it has to be said. The first thing I've been missing is the lack of refactoring tools for scheme.

I wrote the basic python refactoring functionality in bicyclerepairman a long while ago, and having it as part of my daily toolset has strongly influenced the way I program. For example, I tend to follow the 'bash out some code and then clean it up' style of development. In particular, I have a habit of naming variables and functions badly and then renaming them later as I code.

So my initial thought is: no problem - I'll just knock up a bicyclerepairman for scheme! The problem is that I'm not quite sure how to do automated refactoring with a repl. You see Python has no real repl culture (sure it has a repl, but nobody uses it except for trying out simple expressions). People tend to run their program/unittests from scratch each iteration, which means the entire environment gets re-evaluated on each run.

The challenge with running a repl while you develop is keeping it in sync with your refactored code: E.g. if I rename a function that's used in multiple places, that results in lots of code that needs re-evaluating. Can this be done automatically (e.g. could it be made to work by just re-eval'ing files?). Hmm.. I think I need to talk to somebody with a lot more scheme experience than I have. Unfortunately I don't actually know any experienced schemers, especially not in London or Birmingham; maybe somebody from lshift can help?

Scheme is love

Sep 19 2006

I've been battling again with Scheme recently. Having spent the last couple of months playing with various languages, I've come to the conclusion that scheme is the only one that has any real possibility of becoming my next 'general purpose language'. Python held that crown for many years, but its lack of blocks and concurrency caused me to start looking elsewhere and now I'm spoilt.

So, to Scheme. I've not found another language that can offer:

functional programming
message-passing concurrency (see termite)
macros
continuations
terse syntax
hardly any language cludges

...and as somebody who programs for fun in his spare time, these things really do matter to me. The biggest obstacle to full enlightment is the s-expression aesthetic: To my algol-shaped brain that lisp syntax just looks so damn ugly!

Anyway, I'm finding that the most enjoyable and self-affirming way to develop some scheme skills is (ironically) to re-read Peter Seibel's 'Practical Common Lisp' book with scheme glasses on. Now if there's anyone going to convince me that lisp syntax isn't just a grotty heap of parentheses, it's going to be Peter. His book just radiates lisp-love, and you can't help but be hooked. It says 'Look! You fools! Just look what you're missing!'. I've been translating various examples into scheme, just to test the water.