A poor man’s scheme profiler

Jul 25 2007

Gambit scheme lacks a profiler that can profile scheme with embedded C code. (There's statprof, but unfortunately it doesn't profile embeded C). I needed to do this pretty desperately for my triple indexing stuff so I've written a simple macro which takes wall clock timings of functions and accumulates them.

You replace 'define' with 'define-timed' in the functions you want profiled, and then the time spent in each function is accumulated in a global hash table. It's not pretty, but it's simple.

The macro needs some supporting code (which I keep in a seperate module so that it can be compiled in order to minimise overhead). :


(define *timerhash* (make-table))

;;; call this before running the code
(define (reset-timer) (set! *timerhash* (make-table)))


;;; adds the time spent in the thunk to an entry in the hashtable
(define (accum-time name thunk)
  (let* ((timebefore (real-time))
         (res (thunk)))
    (table-set! *timerhash* name 
                (let ((current (table-ref *timerhash* name '(0 0))))
                  (list (+ (car current) (- (real-time) timebefore))
                        (+ (cadr current) 1))))
    res))

;;; call this afterwards to get the times
(define (get-times)
  (map (lambda (e) (list (first e) 
                    (* 1000 (second e))
                    (third e)))

       (sort-list (table->list *timerhash*)
                  (lambda (a b) (> (second a)
                              (second b))))))

And here's the macro. It basically just wraps the function body in a call to 'accum-time':


(define-syntax define-timed
  (syntax-rules ()
    ((define-timed (name . args) body ...)
     (define (name . args)
       (accum-time 'name (lambda ()
                           body ...))))))

Here's some example output. The first number is the accumulated milliseconds spent in the function, and the second is the number of times it was called.


((handle-query-clause 2182.3294162750244 8)
 (do-substring-search-query 1678.9898872375488 1)
 (run-query 1678.9379119873047 1)
 (main 1678.929090499878 1)
 (handle-substr-search 705.5349349975586 2)
 (convert-ids-to-symbols 400.3608226776123 1)
 (handle-s_o-clause 192.81506538391113 2)
 (lookup-o->s 192.52514839172363 2)
 (lookup-2-lvl-index 192.48294830322266 2)
 (join 163.8491153717041 8)
 (hash-join 163.6500358581543 5)
 (fetch-int32-results 118.37220191955566 655)
 (project 100.7072925567627 3)
 (concat 77.42834091186523 11793)
 (handle-p->so-clause 55.348873138427734 2)
 (bounds-matching-1intersect 51.54275894165039 2)
 (lookup-p->2-lvl-index 50.98700523376465 2)
 (search-32-32-index 42.65856742858887 94)
 (cells-at-positions 35.561561584472656 11697)
 (fetch-32-32-results 1.171112060546875 94)
 (extract-column .2582073211669922 4)
 (get-join-columns .05888938903808594 5)
 (text-search-to-structured-query .032901763916015625 1)
 (columns .0324249267578125 10)
 (column-position .031232833862304688 8)
 (rows .011920928955078125 4))

Hmmm.. I really need to get something to html pretty print the scheme code.

A simple scheme unittest DSL

May 23 2007

One of the first things I wrote when I was in the 'nesting'* phase of learning gambit scheme was a unittest DSL. Part of this was that I wanted an excuse to use r5rs syntax-rules macros, but the real motivation was that I'd been seduced by the idea of using tests for documentation ala Nat Pryce's 'Protest'. Here's what I came up with: (example):


(define-tests bus-tests

  (drive-to-next-stop        ; name of fn/class/symbol being tested
    ("takes bus to next stop"
      (drive-to-next-stop bus)
      (assert-equal 'next-stop (bus 'position)))
    ("doesn't stop off at chipshop on the way"
      ; test code to detect chipshop hasn't been stopped at
      ))

  (pick-up-passengers
    ("picks up passengers from the bus stop"
      test code )
    ("doesn't leave passengers at the stop"
      test code )
    ("waits for old lady running to the stop"
      test code )))

The point is that it's easy to write some lisp to traverse this code and generate documentation from it.

I found when using protest in python that the documentation angle reinforced some healthy habits: When you write tests you naturally think 'how would this look to another person?' 'how can I document the behaviour of this?' which encourages more complete testing. Also when you look at generated documentation it's easy to see which bits you aren't testing because the documentation is missing (which then encourages you to write more tests).

The implementation is a bit clunky and makes use of gambit exceptions as a way of terminating tests early because of assert failures (which is a bit rubbish). What probably should be happening is that the outer macro should be re-writing each assert as an 'if' or something to conditionally execute the rest of the test. (which would portable to other scheme implementations) To be honest I knocked this up as fast as I could so that I could move onto writing other things (I'm developing a data aggregation and indexing tool), but the point of this post is more to convey the idea than the implementation.

That said, the crufty (currently gambit specific) implementation is here - hope this is interesting/useful to somebody.

* nesting as in 'building a nest'

Gambit-C namespaces

Apr 21 2007

One of the first issues I had when evaluating scheme as a possible replacement for Python as my hobby language was its apparent lack of module/library/namespace system. How do people possibly build big programs? I wondered.

Now it turns out most (all?) scheme implementations have features to deal with modules and libraries. Gambit's is particularly nebulous in that it doesn't appear to be documented anywhere. Anyway, here's how it appears to work. I'm sure somebody will correct me if I've got something wrong:

Gambit has the 'namespace' primitive, with which you can declare that certain definitions belong in certain namespaces. Here's an example:


> (namespace ("f#" foo) ("b#" bar baz))

This means (AFAICS): "any further use/definition of the term 'foo' will reference the f# namespace and any use of bah/baz will reference the b# namespace".

e.g.


> (namespace ("f#" foo) ("b#" bar baz))

> (define (foo) "I am foo")  ; defines f#foo

> (foo)
"I am foo"

> (f#foo)
"I am foo"

> (define (bar) "I am bar")

> (b#bar)
"I am bar"

This is cool because it allows you to retroactively fit namespaces to scheme code loaded from other files. E.g. If mod1.scm and mod2.scm both defined a procedure 'foo', you could use namespaces to allow both to be used in the same environment thus:


> (namespace ("m1#" foo))
> (load "mod1")    ; contains: (define (foo) "I am mod1's foo")

> (namespace ("m2#" foo))
> (load "mod2")    ; contains: (define (foo) "I am mod2's foo")

> (m1#foo)
"I am mod1's foo"

> (m2#foo)
"I am mod2's foo"

Job done. Now I haven't really used gambit namespaces much, so I not in a position to provide a good comparison with other approaches, however the feature does seem in keeping with the rest of the scheme language. By that I mean rather than a large set of fully blown rich language features you get a small bunch of simple but very extensible primitives with which to build your own language.

An good example of building a big system over these small primitives is Christian Jaeger's chjmodule library where he has used namespaces along with 'load' and 'compile-file' (and of course macros) to build a more industrial strength module system. This includes an 'import' keyword that loads from a search path and a procedure to recursively compile and import modules. Some example code from the README:



$ gsc -i -e '(load "chjmodule")' -

> (import 'srfi-1)
> fold
#
> (fold + 0 '(1 2 3))
6
> (build-recursively/import 'regex-case)
            ; recompiles regex.scm (a dependency) if necessary,
            ; then (re)compiles regex-case.scm if necessary and imports it.
> (regex-case "http://www.xxx.yy" ("http://(.+)" (_ url) url) (else url))
"www.xxx.yy"
> *module-search-path* 
("."
 "mod"
 "gambit"
 "~/gambit"
 "~~"
 "..")

Sweet. I'm guessing it'll also be possible to build the r6rs library syntax in gambit scheme the same way.

Some hardcore Gambit-C features

Apr 21 2007

Somebody asked me about gambit-c the other day, and why I was using that as opposed to some other language or runtime for my own-time coding stuff. Despite the scheme language being all cool, the thing that really made his eyes light up was the C features in gambit (it is called gambit-c for a reason). Here's some cool stuff you can do with gambit:

Scheme compiles to native machine code
You can embed C code directly into gambit scheme files.

(c-include)

(c-lambda)

You can compile and load C into a running scheme process


(define (cload f)
  (compile-file f)
  (load f))

You can compile the whole thing into a native binary.

(*) N.B. although uncommon outside the lisp world, these compilation features are actually pretty common in lisp/scheme implementations. E.g. I think chicken, bigloo and SBCL provide simililar things.