Factor makes you write better code

I program in Python, JavaScript and Factor on a roughly daily basis. My experience is that I can write functions/methods quicker in Python and JavaScript than I can in Factor, but that my Factor code ends up being of considerably higher quality. By higher quality I mean that it's better factored and easier to pull apart and change. In this post I'm making the claim that Factor forces me to write better code, and I'm going to illustrate this with an example.

(I also use Perl, Ruby, Scheme and Java, but not nearly as often.)

I've recently been writing a trading simulator in my spare time so that I can test my trading ideas on historical data. As part of this project I've written some of the same functionality in both JavaScript and Factor, and this experience gave me a good basis for comparing the languages.

The example I'm going to use to illustrate the comparison is coding a simple moving average (SMA) function.

A simple moving average involves stepping along an array of numbers, generating at each step the average (mean) of the last p elements of the sequence (where p is the period). The output of the function is the sequence/array of averages.

E.g. an SMA with period 4 on a six-element array:

sma([0,1,2,3,4,5],4) => [0,0,0,1.5,2.5,3.5]

(In the JavaScript version I padded the start of the output with zeros, since the first period-1 positions don't have a full window behind them.)

For the JavaScript implementation I built SMA as two nested 'for' loops, with the inner loop summing the last p elements at each step. This isn't the most efficient way of computing a moving average, but it is what I thought of and implemented first:


function sma (arr,period) {
    var out = [];
    // fill initial space with zeros
    for(var i=0;i<period -1;i++) { out.push(0);}  
    // fill rest with averages
    for (var i=period-1; i<arr.length;i++) {
        var sum = 0;
        for (var j=i-(period-1); j<=i; j++){
           sum += arr[j];
        }
        out.push(sum / period) ;   
    }
    return out;
}
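
As an aside on the efficiency point: a common alternative is to keep a running sum, adding the element that enters the window and subtracting the one that leaves it. The sketch below is only an illustration of that idea, not code from the simulator; 'smaRunning' is a made-up name, and it assumes the input is at least as long as the period (for shorter inputs it emits fewer padding zeros than the nested-loop version).

```javascript
// Sketch of a running-sum SMA. smaRunning is a hypothetical name, not part
// of the original project. Assumes arr.length >= period.
function smaRunning(arr, period) {
    var out = [];
    var sum = 0;
    for (var i = 0; i < arr.length; i++) {
        sum += arr[i];                                // element entering the window
        if (i >= period) { sum -= arr[i - period]; }  // element leaving the window
        // pad with zeros until the first full window is available
        out.push(i >= period - 1 ? sum / period : 0);
    }
    return out;
}
```

This does one addition and at most one subtraction per element instead of re-summing the whole window. With floating-point data the running sum can accumulate small rounding differences compared with summing each window from scratch.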

When I went to code the Factor version, the idea of coding up nested loops made my head hurt. Factor's stack-based approach effectively means serial access to state: you have to shuffle the right variables into the right order at the right time. This makes it very hard to write functions that manage more than ~3-4 variables at a time.

JavaScript by comparison has random access to local variables*, and my JavaScript version uses 'arr', 'out', 'i', 'j', 'period' and 'sum', not to mention a bunch of unnamed temporaries like 'arr.length', 'arr[j]', 'period-1' etc.

Shuffling all these variables manually on a stack while mentally keeping tabs on the order and position of each variable is a pretty tough challenge. I suspect the resultant code would be the sort of thing only a compiler could love.

So, faced with this problem, I used my traditional Factor problem-hammer, which is to step away from the screen, walk around a bit and ask myself the question: 'What abstraction could there be that would make this easier?'

I came up with 'map-window', which implements a sliding window across the input sequence and applies a block of code to each subsequence in turn. The code to implement the moving average is then:


[ mean ] map-window

Which is clearly a much cleaner implementation of SMA.

Before I continue I should mention that I could also have written the map-window abstraction in JavaScript (JavaScript has good higher-order function facilities), but the point of this post is that Factor forced me to come up with the approach.
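
To make that concrete, here is a rough sketch of what such a JavaScript version might look like. The names 'mapWindow', 'mean' and 'smaWindowed' are invented for this illustration. Note one deliberate difference from the nested-loop version: instead of padding with zeros, the first few windows are simply shorter, so the early outputs are averages of whatever elements are available.

```javascript
// Hypothetical JavaScript analogue of map-window: call fn on the trailing
// window of at most `width` elements ending at each index.
function mapWindow(arr, width, fn) {
    return arr.map(function (_, i) {
        var start = Math.max(0, i - width + 1);  // clamp so early windows are shorter
        return fn(arr.slice(start, i + 1));
    });
}

// Arithmetic mean of a non-empty array of numbers.
function mean(xs) {
    var sum = 0;
    for (var i = 0; i < xs.length; i++) { sum += xs[i]; }
    return sum / xs.length;
}

// SMA expressed as "map mean over a sliding window".
function smaWindowed(arr, period) {
    return mapWindow(arr, period, mean);
}
```

For example, smaWindowed([0,1,2,3,4,5],4) gives [0,0.5,1,1.5,2.5,3.5]. Swapping 'mean' for another function (a max, a standard deviation) gives a different rolling statistic without touching the windowing code.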

Once I'd had the 'map-window' idea I could easily see how to compute the moving average. I also had an idea of how I could build map-window using 'head' and 'tail', or at least I had enough of an idea to motivate my trying it.

Ok, so here's my full Factor implementation for comparison with the JavaScript:


: window ( seq start window-width -- subseq )
    [ 1+ head ] dip short tail* ; 

: map-i ( seq quot: ( seq i -- elt ) -- seq' )
    [ dup length ] dip with map ; inline

: map-window ( seq window-width quot -- seq )
    '[ _ window @ ] map-i ; inline

: sma ( seq period -- seq' ) 
    [ mean ] map-window ;

To my eyes the Factor implementation is quite a bit more complex than the JavaScript one, at least when consumed in its entirety. This might be because the concept of a for-loop is deeply ingrained in my brain, whereas the Factor implementation invents both map-i and map-window to build sma.

However, the individual parts of the Factor implementation are both generic and composable, and once you know what each bit does the whole thing pretty elegantly describes itself.

A big advantage of all this abstraction is that when you discover an implementation pattern occurring more than once, the chances are that the pattern is already factored out to some extent and is ripe for reuse with very little modification. I find this makes refactoring quicker and easier than with Python and keeps the codebase relatively lean. This in turn means that the codebase doesn't drag as much as it gets bigger. The tradeoff is that I spend more time upfront finding and creating abstractions in the first place.

Of course, if the right abstractions already exist then coding speed improves dramatically. For example, if map-window had already existed then sma would have been a slam dunk. I'd assume that as the Factor library improves, the likelihood of this happening will increase, maybe at the expense of more time required to learn the core vocabularies. Programming in Factor is already more about the libraries than the native language, and I'd imagine this trend will continue, especially when you consider that in a lot of cases the libraries implement the core language.

Aside: I was surprised to discover last year that genuinely new and important stack language abstractions like 'fry' and the cleave/spread combinators were only just being conceived, despite Factor being quite a few years old and stack languages in general being many decades old. When you consider that very few languages actually 'invent' new features, this makes Factor quite an interesting language in itself. Also interesting is that, apart from a small bootstrapping core, the Factor language is actually implemented in libraries, meaning that anybody can build and experiment with new language constructs.

Anyway, I'm digressing, so I ought to sum up. The takeaway is: whereas other languages provide the ability to create good abstractions, Factor pretty much forces you to create good abstractions because it is so bloody difficult to write any code without them.

Update: While writing this post I realised that what I'm doing with map-window is actually very similar to an abstraction in the Factor library called <clumps>, which constructs a virtual array of overlapping subsequences. That's the nature of Factor programming: you keep finding that somebody else has built a similar abstraction to yours and it would have saved you a ton of time if only you'd realised!
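
For readers who don't know Factor, the "overlapping subsequences" idea can be sketched in JavaScript roughly as follows. This is only an analogue and not Factor's API: the names 'clumps' and 'smaFromClumps' are made up here, and the sketch builds real arrays eagerly rather than a virtual sequence. Because only full-length windows are produced, the result is shorter than the input instead of being padded.

```javascript
// Rough JavaScript analogue of the idea behind <clumps> (not Factor's API):
// every run of n consecutive elements, as an array of arrays.
function clumps(arr, n) {
    var out = [];
    for (var i = 0; i + n <= arr.length; i++) {
        out.push(arr.slice(i, i + n));
    }
    return out;
}

// SMA over full windows only: average each clump.
function smaFromClumps(arr, period) {
    return clumps(arr, period).map(function (clump) {
        var sum = clump.reduce(function (a, b) { return a + b; }, 0);
        return sum / period;
    });
}
```

Here clumps([0,1,2,3,4,5],4) gives [[0,1,2,3],[1,2,3,4],[2,3,4,5]], and smaFromClumps([0,1,2,3,4,5],4) gives [1.5,2.5,3.5].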

* Factor actually has support for efficient lexical local variables via the 'locals' vocabulary (library), which is a pretty impressive feat. However, I only tend to use this when the problem I'm solving doesn't factor well (or sometimes temporarily out of desperation when I can't come up with the right abstraction).