• rustc typeck timings

    (Following up on my comment on the rust IDE rfc)

    Here are some performance numbers for a type inference prototype (demoed here), which uses the rustc libraries to generate resolution and type inference information. It runs from scratch against the source code and rlibs with nothing pre-indexed/loaded, and prints the type of the expression highlighted by the cursor.

    As I mentioned in the comment, it speeds up the parsing and analysis stages by selectively stripping function bodies as the text is loaded by rustc, replacing them with whitespace and newlines so that the coordinates are the same. I cribbed the code to do this from racer, which has some fast routines for iterating over logical blocks of rust source.

    As a target I analysed the rustc_typeck crate.

    On my laptop (2ghz x64 ubuntu):

    1. Phase 1 (parse input): ~50ms (includes stripping the function body text prior to ast parsing)
    2. Phase 2 (expand) : ~132ms
    3. Phase 3 all other checks expect body typeck: ~390ms
    4. Body typeck for single function: - ~19ms

    The results suggest to me in this environment we could get a type-signature level index of this crate achieved in ~ half a second, and then perform the body typecheck on the current local scope after each keypress. We would then need to perform an additional step using the type-signature index to deduce the completion results.

    The steps for a type-based completion query ( e.g. foo.<caret> ) are:

    1. deduce the type of the expression to be completed (i.e. the stuff before the dot)
    2. generate suggestions by resolving the type and finding e.g. methods, fields, trait impls

    Regardless of the level of prior cached indexing, we potentially need to perform the type step (1) on each keypress. This means we should aim to get rustc answering this question in < 100ms (hopefully much less). I suspect this will require some custom compiler work in addition to the current incremental compilation effort.

    We could theoretically do (2) without an index directly from the sourcecode and rlibs, (racer currently attempts to do this from source). However this is complicated especially in the face of macros. A rustc generated index would make this simpler and faster, and the index only needs to be at the type-signature level to perform this specific query (i.e. we don’t need information about the contents of functions except for the local scope we are currently searching from).

    With the field search occurring outside of rustc we can also be more relaxed about consistency and versioning - e.g. we could perform the search on an old index if we don’t have the most up-to-date one generated yet, since updates outside the current scope will be limited.

  • Videos! Racer + Rustc + Typecking

    I’ve been providing some feedback on an RFC Nick Cameron is writing about providing rust compiler support for IDEs. In order to gain some understanding I’ve been experimenting with a prototype tool to perform type analysis on a crate as fast as possible (as you type).

    I wanted to demo this tool and racer to Nick, but unfortunately getting experimental code to compile with rustc is error-prone due to rustc being a fast-moving target. Instead I made a couple of videos to show Nick what I was up to. I thought they might be interesting to other people so with his permission I’m posting them here:

    1. https://www.youtube.com/watch?v=3-f67VHGgVs (~10 mins)

    2. https://www.youtube.com/watch?v=YBD4kDFK9Lk (~6 mins)

    (I’m not very good at speaking on videos so these are a bit mumbly, sorry!)

    As I see them, the big upcoming challenges from the perspective of code completion are:

    1. Creating a useful stable interface to rustc that can be used to extract semantic type and lifetime information
    2. Performing rustc analysis on incomplete crates (so that the results can be used for providing completions + suggestions as you type)
    3. Performing the (incremental) analysis with sub-100ms latencies

    I’m very excited about the prospect of rustc-enabled perfect completions, and am also eager to work on other functionality for racer (I use intellij daily at work and really miss the ‘extract-method’ refactoring when writing rust code).

  • Racer v1.0.0!

    I made a Racer v1.0.0 release!

    This doesn’t represent any particular milestone in Racer’s development, but rather I wanted to draw a line in the sand before Racer embarks on future architectural changes. A bunch of people have made editor + ide plugins that rely on Racer’s under-specified output protocol and so having a release to point at will help a bit with plugin compatibility. I am planning to stick to semver versioning so it is likely there will be a v1.0.1 soon!

    For the future I would like to see Racer mature into a fully fledged library and binary for supporting IDE authors. In addition to better completions, Racer has potentual to provide support for dynamic type-checking, code assistance and refactoring. To deliver this functionality Racer will need to work much closer with rustc and, due to the overheads in running queries through the compiler infrastructure, will likely need to operate as a long running process/library, holding state and caching information. Unfortunately it is also likely that future Racer versions may only build with rust-nightly due to a tighter coupling with unstable rustc internals.

    Racer is an evening/weekends spare time project for me, and I am very grateful to everyone who helped out submitting patches, bugfixes and suggestions over the last year. In particular, thanks to:

    Andrew Andkjar, Antoine Kalmbach, awdavies, Ben Batha, Björn Steinbrink, Björn Zeutzheim, byronyi, Chris Morgan, Corey Farwell, Dabo Ross, Damien R, Darin Morrison, Derek Chiang (Enchi Jiang), Eike Hein, Emil Lauridsen, Henrik Johansson, Heorhi Valakhanovich, inrustwetrust, Jake Kerr, Jakko Sikkar, Johann Tuffe, Jorge Aparicio, Justin Harper, Keiichiro Ui, Kirill Khazan, krzat, Leonids Maslovs, lilydjwg, Marius, Marvel Mathew, mathieu _alkama_ m, Mattias Bengtsson, Michael Gehring, Michael Maurizi, nlordell, oakes, Pyry Kontio, Renato Zannon, rhysd, Ricardo Martins, Ronald Kinard, rsw0x, Saurabh Rawat, Sebastian Thiel, Shougo Matsushita, Siddharth Bhat, Tamir Duberstein, Tom Jakubowski, Victor-Nicolae Savu, Vincent Huang, Vlad Svoka, Wilfred Hughes, Yamakaky, yasushi abe, Zbigniew Siciarz, Ziad Hatahet

  • Rust error stacktraces

    There was a post and discussion on reddit yesterday that reminded me I wanted to write about stack traces in rust at some point.

    I’ve worked on a bunch of java code bases over the years, and one feature I like about the java environment is that you can obtain a stack trace identifying the throw site of any exception raised and a some good information about the path the code took to get there.

    This is really valuable in a production system where a subtle environment change triggers an unexpected violation of an invariant. The violation is unexpected and so the error bubbles all the way up to the generic error handling. Luckily because of the stacktrace you still have context from which to quickly diagnose the error and decide on an appropriate course of action.

    Rust currently supports printing a stack trace to stderr when a thread panic!()s, which is really handy. It is enabled by setting the “RUST_BACKTRACE” environment variable. Panics however are too heavyweight to use for most errors in rust code because they offer limited recovery options, and instead it is generally preferred to return std::result::Result.

    (Aside: I suspect in the long run panics will all-but disappear from rust. My guess is that some nice sugar will come along for Result types that will make e.g. array indexing with Results palatable (maybe something like ‘?’ syntax), and then panics will become relegated to a small corner of the language, used only for out-of-memory errors and maybe stack overflows.)

    So that leaves a problem: How do I get a stacktrace out of an unexpected rust error in production code? The best approach I’ve come up with so far is to combine a custom error with rt::backtrace. Here’s the recipe:

    1. Create a custom error type
    2. #[derive(Debug,Clone)]
      pub struct MyError {
          pub msg: String,
          pub backtrace: String
    3. Add a 'new()' fn. On creation insert a backtrace into the error using rt::backtrace
    4. impl MyError {
          pub fn new(msg: &str) -> MyError {
              MyError { msg: msg.to_string(), backtrace: get_backtrace() }
    5. For other error types used by the program, convert them using the convert::From machinery
      impl From&lt;::std::io::Error> for MyError {
          fn from(e: ::std::io::Error) -> MyError {
    6. Create a Result type alias
    7. pub type Result&lt;T> = ::std::result::Result&lt;T, MyError>;

      Now step 3. ensures that when try!() converts an error from e.g. std::io::Error into MyError ...

      fn connect_to_something() -> Result&lt;String> {
          let mut stream = try!(TcpStream::connect("example.com:1500"));   // try! creates MyError (and backtrace!)

      ... then the backtrace is created with this callsite. Unfortunately the backtrace won't point you to where the original std::io::Error was created, but at least you'll have an idea how connect_to_something() came to be called and the fact that the error originated in the TcpStream::connect. If you build the binary with dwarf debuginfo (rustc -g) then you'll even have a file and line number.

      The downside to this approach is that std::rt::backtrace::write() is unstable, so only works with nightly rust. Also it sounds like the backtrace functionality will be moved out to a separate crate at some point.

  • Racer + Rustc update2

    (This is the second in a possible series of posts about racer + rustc, the first being here)

    Over the last few weeks I've been spending quite a bit of my evening + weekend spare time in rustc trying out ideas and writing code to figure out what's possible and reasonable for real time type inference.

    The goal is to get racer to perform perfect rustc type-inferred completion queries in the 100ms ballpark. To give an idea what racer is up against: using rustc to do a type check pass on a sizeable library (I've using the 'rustc_typeck' crate as an example) takes close to 10 seconds, with 8 seconds of that being the actual type checking.

    Here's a representative breakdown of a prototype running against rustc_typeck (times are in seconds):

    0.182s         parse crate text from files into one big AST
    0.711s         expand macros, insert std::prelude
    0.123s         read external crates (libraries)
    0.164s         resolve crate (i.e. resolve all the static paths to point to actual nodes)
    0.046s         create a type context object (mk_ctxt())
    0.036s         collect the types of all the items (fns, traits etc..)
    0.010s         infer variance
    0.018s         coherence check
    8.077s         perform a full type inference pass

    Originally I was hoping I could find a way to cache a type-checked ast + side tables and then just incrementally fill in functions and re-typecheck only them. Unfortunately this approach conflicts heavily with the current rustc design, which freezes the ast shortly after the macro expansion stage (the 2nd phase above). I suspect changing the rustc design to accomodate modifying the ast after this stage would be a large and invasive change.

    So if racer can't cache data after the expand stage, the other potentual avenue is to try and speed up the rest of the phases by typechecking less code. 'rustc' still needs a fully cohesive crate to compile, but there are some transformations that can be made to the code that will still leave a cohesive crate.

    Rust has the nice charactistic that a function signature is enough to type-check a caller without the analysing the body, so the first obvious transformation is to remove function bodies except for the one you want to type check.

    Here's some timings running a type-check over the same crate, but this time with all the function bodies removed:

    0.031s         parse
    0.127s         expand
    0.085s         read crates
    0.032s         resolve crate
    0.002s         mk_ctxt
    0.016s         collect_item_types
    0.001s         infer variance
    0.013s         coherence check
    0.644s         type check

    This is closer, although still way-off the 100ms goal. A sub-second typecheck is a pretty good start though. There are other transformations racer can try - e.g. pruning private declarations that can't be reached from the target code.

    So if we take this approach, racer's ultimate performance will depend on its ability to prune the ast. To get an idea of what the performance could be like if racer became really good at this I pruned an ast manually. This is rust_typeck with the body of one function intact (I used 'typeck::check::check_bounds_are_used'), and only the direct dependencies of this function. I cheated a bit and used rustc to tell me what the dependencies should be; The timings assume the initial ast was parsed, expanded and cached, and doesn't include any time racer would take to prune the tree.

    0.045s         re-expand the ast based on 
    0.000s         ast map
    0.073s         read crates
    0.008s         resolve crate
    0.000s         mk_ctxt
    0.000s         collect_item_types
    0.000s         infer variance
    0.003s         coherence check
    0.149s         type check

    So there we are. Still almost a couple of hundred millis off the goal, but definitely in right ballpark. I haven't dug much into the actual rustc typechecking code proper yet, so I'm hoping I can get some further (maybe radical) gains there.

    I think the next stage for me is to build a prototype that can do interactive type checking of a crate. Something like 'highlight an expression and it tells you the type'. I hope that by building that I'll learn more about rustc and maybe discover some further optimizations or maybe a better approach.

  • Racer on Rust Beta!

    I'm very pleased to announce that Racer now compiles with the Rust 1.0 beta releases! This means that hopefully Racer should be ready to go for Rust 1.0 when it launches later this month.

    The big blocker for this release was rustc's libsyntax. I'd like to say a big thanks to Erick Tryzelaar for porting libsyntax into the syntex_syntax crate, and for working hard to remove unstable features from libsyntax. Also I'd like to say a special thanks to Tamir Duberstein for helping to remove the dependency on rust unstable features.

    And of course thanks to all the Racer contributors.

    (BTW, a side effect of relying on a separate syntax crate is that build times for Racer have gone up dramatically. This is a shame, but I think a small price to pay for Rust 1.0 compatibility)

  • Rustc noodling for code completion

    Background: Racer is a code completion tool for the Rust programming language. This is going to be a bit of a meandering post as I try to decide what to do about type inference.

    As I wrote in the previous post, I'm pretty convinced that racer should be using rustc to interrogate the type of expressions. This would enable racer to perform more accurate completions (and potentually complex refactorings in the future). Currently racer does some of its own type inference, but getting this right is error prone and I think long-term probably a dead end.

    However the nice thing about racer doing its own type inference is that it can still work in the face of unfinished code, and also be very fast (because it can work incrementally, take shortcuts and avoid doing a lot of the work that rustc does). The downside is that hand-rolling rust type inference is a lot of work and I don't have much spare time.

    Initially I was wondering if racer could approach the type inference problem by isolating the target expression and constructing a smaller representative program around it. The program would be quicker to compile by rustc but have same type expression output. This is essentially the approach racer currently takes with parsing.

    Unfortunately after attempting a few hand written examples I realised this would be a very complex approach. E.g. consider this piece of code from libsyntax parse/parser.rs and imagine you want to construct a program to extract the type of the |p| argument:

                    let tts = try!(self.parse_seq_to_before_end(
                        |p| p.parse_token_tree()

    Racer would need to incrementaly resolve the types of a whole bunch of expressions before it could reasonably construct a representative program to run through rustc in order to find the type of 'p'. I suspect this work is similar in scope to completing racer's existing hand-rolled type analysis, which kind of defeats the point.

    So the other approach is to run the rustc type inference over whole crates, including imports, and somehow get this fast enough for a good user experience. I'm trying to figure out how best to accomplish this. (Ignoring unfinished code for now)

    I started by putting together a spike which attempts to use rustc to run a full type-inference pass on some code. I had to create a copy of the rustc_typeck crate (called mytypeck) in order to get access to the private innards.

    (N.B. rustc changes frequently so it is likely that the mytypeck crate will get out-of-date very fast)

    The approach I currently intend to persue is to maintain an ast and side tables of all the items in a crate, but no function bodies, (a 'skeleton ast' if you will), and then somehow patch the ast with an individual function body and get the side-tables updated in order to obtain the type of an expression contained within. The function body may be need to be artificially constructed by racer to handle unfinished code. The skeleton ast would need to be constantly updated as the item signatures are updated.

    This is all uninformed speculation, but I'm a bit limited with spare time and wanted to get something written down to encourage some chat and input from people with more experience. Am I on the right track?

  • Racer Progress Update 5

    Racer is a code completion tool for the Rust programming language. Time for another progress update!

    1. Cargo support

    The big news is that I recently added some cargo support to racer:

    When searching for crates racer now interrogates the nearest Cargo.lock and Cargo.toml file and then searches git checkouts and crates.io repos in the ~/.cargo directory. This was a bit fiddly but luckily I had the source-code to Daniel Trstenjak's excellent rusty-tags project to refer to. Thanks very much Daniel!

    2. Rust Beta

    Racer relies on some unstable libraries at the moment, which unfortunately means it won't currently build with the Rust Beta channel. Hopefully this situation will be remedied over the next few weeks before 1.0, but in the meantime you need a rust nightly install to build racer.

    (Unstable features used are: collections, core, rustc_private, str_char, path_ext, exit_status. Patches very welcome!)

    3. Performance improvements

    Back in February Renato Zannon noticed that racer was spending a lot of time spawning threads. The spawning was originally designed to protect against libsyntax panics, and preventing crashes has become an important issue recently because there is a lot of demand to provide racer as a library for linking with IDEs. Unfortunately the thread spawning also generates a lot of overhead and removing it more than doubled the performance of racer in a lot of cases.

    The thread spawning is currently removed from the codebase, which means that racer will sometimes panic and fail to produce matches. As an alternative to wrapping libsyntax calls I've been spending time removing panics from Rust's libsyntax itself in the last few weeks. If you want to follow along the PR-in-progress is here.

    4. Type Inference

    Not much progress on the type inference front I'm afraid. I'm now more convinced than ever that racer should be using rustc for type inference rather than continuing to roll its own. The question is how to get there: the rust compiler is currently not the right shape to do incremental type inference. There is plenty of desire in the community to have a compiler library that can support IDEs, so I plan to help out there where I can.

    As a start I'm going to try having racer construct small representative pieces of code to compile (in the same way as it does with using libsyntax to parse), and see where that takes me.

    5. Thanks!

    Once again, a big thanks to all the Racer contributors. Your help and support is much appreciated:
    Andrew Andkjar, awdavies, Ben Batha, Björn Steinbrink, Björn Zeutzheim, Dabo Ross, Darin Morrison, Derek Chiang (Enchi Jiang), Emil Lauridsen, Henrik Johansson, Heorhi Valakhanovich, Jake Kerr, Johann Tuffe, Jorge Aparicio, Justin Harper, Keiichiro Ui, krzat, Leonids Maslovs, Marvel Mathew, mathieu _alkama_ m, Michael Gehring, Michael Maurizi, nlordell, oakes, Pyry Kontio, Renato Zannon, rsw0x, Saurabh Rawat, Sebastian Thiel, Siddharth Bhat, Vincent Huang, Vlad Svoka, Zbigniew Siciarz, Ziad Hatahet.

  • Racer Progress Update 4

    Time for another Racer update! Most of the work on racer in the last few months has been about keeping up with the flood of changes to the rust language and internal apis. Aside from that there were a couple of improvements:

    • Racer got some support for generics (connect() returns IoResult<TcpStream>)
    • It got some support for destructuring
    • Racer got better at handling half-finished code, dangling scopes and match statements
    • It also got a bit better at working on windows (in particular carriage returns and BOMs (byte-order-marks) in files).

    TODO List

    Racer is starting to get some heavier use by people now, which is great, but has thrown up a few deficiencies that need addressing in the coming weeks/months

    • Cargo support

      Racer works well for a small project using the rust standard libraries. Unfortunately using it with a modern cargo project is a bit clunky and sometimes completely unworkable, involving having to fiddle with the RUST_SRC_PATH variable and in some cases make a copy of external sources in a separate directory. Racer needs to be able to read cargo.toml files and locate the sources referenced in them.

    • Racer as a library (+ daemon)

      Some people would like racer to operate as a long-running daemon for performance reasons. Current Racer performance isn't a big deal for me personally (a complete or find-definition is a few hundred ms on my laptop in most cases), but @valloric thinks it will be and he has a lot more experience in this area due to his YouCompleteMe project. The plan is to create a racer library that editors and daemons can embed.

    • Racer only reads code from files on the file system

      This is somewhat connected to the library thing: Currently racer only consumes code from the file system - you cannot pass it text to work on. This simplifies the internals but means that if you're half way through editing a file and you need racer, your editor must save the contents of your file before invoking it. This is currently handled by emacs and vim by saving a temporary file in the current directory. If racer is to work as a library in a long running process it will need the capability to allow editors to pass unsaved code to it directly and not get confused when it doesn't match what's on disk.

    • Type inference + rustc

      This is probably Racer's biggest hurdle: Racer currently does a bunch of custom type inference in order to resolve method and field completions and definitions. Unfortunately this inference is nowhere near 'complete' (ha!), and the rust language is a fast moving target making this a bunch of effort. I could easily spend the next year or two of my spare time replicating rust type inference and getting nothing else done.

      Instead I'd ideally like to lever the rustc compiler to do the type inference. Unfortunately rustc currently isn't designed for type-inferring anything other than an full complete crate, which is slow and also won't work for half-finished programs like the code you typically autocomplete. I need to spend some time researching what can be done in this area and I'm open to any ideas or advice.

      I noticed @sanxiyn has been experimenting with adding an ast node to represent a completion ('ExprCompletion') to enable this to go through the various compiler passes even though the code is 'broken'. This is cool and really promising - I think the meat here is in allowing incomplete programs to get through rustc's type-inference pass. I'm not sure whether having a 'completion' ast node is as important (once you can evaluate the type of an expression, providing completions is possible without the compiler support), but maybe this approach could be generalised to allow the compiler to skip over sections of code that don't parse correctly.

      In the short term I'm wondering if it might be possible/reasonable to generate small programs that represent the inputs to a specific type query but are quicker for rustc to type-infer. Racer currently does this for parsing (constructs a artificial statement similar to the original and then uses rustc's libsyntax) and it works really well there, but parsing is considerably less brittle than a full type-inference pass. I'm unsure of the best place to discuss this sort of thing; Reddit?

    Finally I'd like to thank everybody that has contributed fixes and code to Racer so far:

    Michael Gehring, Johann Tuffe, rsw0x, Darin Morrison, Zbigniew Siciarz, Michael Maurizi, Björn Steinbrink, Leonids Maslovs, Vincent Huang, mathieu _alkama_ m, Ben Batha, Björn Zeutzheim, Derek Chiang (Enchi Jiang), Emil Lauridsen, Henrik Johansson, Heorhi Valakhanovich, Jorge Aparicio, Justin Harper, Keiichiro Ui, Pyry Kontio, Renato Zannon, Saurabh Rawat, Vlad Svoka, Ziad Hatahet, awdavies, nlordell

    I have very limited spare time so it's a real joy to wake up in the morning and find that somebody has submitted a patch to e.g. bring racer up to date with the latest rust. Thanks everyone!

  • Another Racer progress update

    Racer has come on a bit since my last post. Here's a quick rundown of the main changes:

    • Method resolution got better. Racer now hunts around traits for default method implementations. This means that e.g. a completion on a Vec method gives you all the methods available, including 'is_exists()' and other methods implemented in the sub traits (see above). Racer can also resolve the type of 'self' so that it can match methods and fields when you're writing method implementations.
    • More completion information Racer now outputs some contextual info which gets included in matches - see the image above.
      (N.B. unfortunately there's a bug in my emacs bindings somewhere in the interaction with company mode which means that the extra stuff only crops up after you move the cursor. I'm not really sure why this happens, I'll try and fix it at some point.)
    • Module searching got better: Racer does a more accurate job of following 'extern crate' and nested modules. It turned out that modules didn't quite work the way I originally thought they did.
    • Racer now recognises explicit types in let expressions
      let v : Vec<uint> = ...
      This gives you a way to 'tell' Racer the type if it isn't smart enough to deduce it.
    • Internally Racer now uses iterators everywhere instead of callback functions. (I mentioned callbacks as something I didn't like about the design in a previous post).

      I originally thought iterators would solve the 'don't do the whole search if I only need the first match' problem. Unfortunately writing lazy iterators in Rust is non-trivial when the thing you're iterating is nested and recursive (like a code search). For now I've replaced the callbacks with a simpler eager solution: 'collect the results in a vector and return vec.move_iter()'.

      N.B. I did manage to cobble together a lazy iterator in one place by chaining together a bunch of procs in the central 'resolve_name' function; This has improved performance in racer - I might do the same trick for sub-trait searches.

    Basically Racer is now pretty good at navigating and completing anything that does not involve lambdas, destructuring or generics. I'm planning to work on generics next.

    I should say a big thanks to the people who committed patches to racer in the last couple of months, in particular rsw0x who posted a couple of fixes and has also been working on the vim plugin, Jorge Aparicio who got racer integrated with the travis build system and Micheal Gehring who dilligently kept Racer compiling in the face of breaking Rust changes.

    While on the subject of contributors: I've been feeling the need for somewhere to discuss Racer recently, in particular for things that I don't have much experience with - e.g. windows support, vim + other ide plugins. I've created a google group racer-discuss; please join if you're interested in Racer.

All posts

subscribe via RSS