Over the last few weeks I've been spending quite a bit of my evening + weekend spare time in rustc trying out ideas and writing code to figure out what's possible and reasonable for real time type inference.
The goal is to get racer to perform perfect rustc type-inferred completion queries in the 100ms ballpark. To give an idea what racer is up against: using rustc to do a type check pass on a sizeable library (I've using the 'rustc_typeck' crate as an example) takes close to 10 seconds, with 8 seconds of that being the actual type checking.
Here's a representative breakdown of a prototype running against rustc_typeck (times are in seconds):
0.182s parse crate text from files into one big AST 0.711s expand macros, insert std::prelude 0.123s read external crates (libraries) 0.164s resolve crate (i.e. resolve all the static paths to point to actual nodes) 0.046s create a type context object (mk_ctxt()) 0.036s collect the types of all the items (fns, traits etc..) 0.010s infer variance 0.018s coherence check 8.077s perform a full type inference pass
Originally I was hoping I could find a way to cache a type-checked ast + side tables and then just incrementally fill in functions and re-typecheck only them. Unfortunately this approach conflicts heavily with the current rustc design, which freezes the ast shortly after the macro expansion stage (the 2nd phase above). I suspect changing the rustc design to accomodate modifying the ast after this stage would be a large and invasive change.
So if racer can't cache data after the expand stage, the other potentual avenue is to try and speed up the rest of the phases by typechecking less code. 'rustc' still needs a fully cohesive crate to compile, but there are some transformations that can be made to the code that will still leave a cohesive crate.
Rust has the nice charactistic that a function signature is enough to type-check a caller without the analysing the body, so the first obvious transformation is to remove function bodies except for the one you want to type check.
Here's some timings running a type-check over the same crate, but this time with all the function bodies removed:
0.031s parse 0.127s expand 0.085s read crates 0.032s resolve crate 0.002s mk_ctxt 0.016s collect_item_types 0.001s infer variance 0.013s coherence check 0.644s type check
This is closer, although still way-off the 100ms goal. A sub-second typecheck is a pretty good start though. There are other transformations racer can try - e.g. pruning private declarations that can't be reached from the target code.
So if we take this approach, racer's ultimate performance will depend on its ability to prune the ast. To get an idea of what the performance could be like if racer became really good at this I pruned an ast manually. This is rust_typeck with the body of one function intact (I used 'typeck::check::check_bounds_are_used'), and only the direct dependencies of this function. I cheated a bit and used rustc to tell me what the dependencies should be; The timings assume the initial ast was parsed, expanded and cached, and doesn't include any time racer would take to prune the tree.
0.045s re-expand the ast based on 0.000s ast map 0.073s read crates 0.008s resolve crate 0.000s mk_ctxt 0.000s collect_item_types 0.000s infer variance 0.003s coherence check 0.149s type check
So there we are. Still almost a couple of hundred millis off the goal, but definitely in right ballpark. I haven't dug much into the actual rustc typechecking code proper yet, so I'm hoping I can get some further (maybe radical) gains there.
I think the next stage for me is to build a prototype that can do interactive type checking of a crate. Something like 'highlight an expression and it tells you the type'. I hope that by building that I'll learn more about rustc and maybe discover some further optimizations or maybe a better approach.