Careful Cutting To Get Faster RSpec Runs with Rails

By Jon Frisby / August 2, 2011

A few months ago, Ruby Inside wrote about using Spork with RSpec 2 and Rails 3 in order to get a more sprightly spec run. Unfortunately, using the techniques in the article with our fledgling codebase's test suite left us with somewhat disappointing results, so I decided to dig deeper and see if I could do better.

Note: This is a guest post by Jon Frisby of Cloudability. See the post footer for more info.

With and Without Spork

First, let's see what things look like with and without Spork running on our raw test suite.

Note: The machine I'm using is a spanking-new 15" MacBook Pro with the 2.2Ghz quad i7 and running Ruby 1.9.2-p180.

"time rspec spec" without Spork running:

Finished in 15.78 seconds
75 examples

real    0m22.334s
user    0m17.952s
sys     0m2.643s

We can see that we're not overly I/O bound (real vs. user+sys), but we are spending a LOT of time doing "real work". Ugh.

"time rspec spec" with Spork running:

Finished in 16.82 seconds
75 examples

real    0m17.340s
user    0m0.214s
sys   0m0.065s

This didn't exactly equate to a big boost! (Editor notes: Most of Spork's win is in running specs over and over rather than once-off speed, but I'll let this slide for now ;-)) Just for clarity, here's our barebones spec_helper.rb file showing what runs in the prefork as well as on each further run:

require 'spork'

Spork.prefork do
  ENV["RAILS_ENV"] ||= 'test'
  require File.expand_path("../../config/environment", __FILE__)
  require 'rspec/rails'

  RSpec.configure do |config|
    config.mock_with :rspec
    config.use_transactional_fixtures = true
  end
end

Spork.each_run do
  # This code will be run each time you run your specs.
  Dir[Rails.root.join("spec/support/**/*.rb")].each {|f| require f}
end

Cutting out MySQL

So, where are we spending out time and what can we do about it? First I'll try cutting MySQL out of the picture and use an in-memory SQLite DB for our tests. This article was a helpful starting point for me, but I discovered quickly that the spec_helper.rb addition needed must be placed in the Spork.each_run block - not the Spork.prefork block.

Unfortunately, the silence_stream call winds up being superfluous because Spork has captured it already, and ActiveRecord isn't writing to STDOUT anymore. This will make the tests a bit noisier when the Spork server is running, but that's not a huge deal.

Our spec_helper.rb file's each_run section now becomes this:

Spork.each_run do
  # This code will be run each time you run your specs.
  load_schema = lambda {
    load "#{Rails.root.to_s}/db/schema.rb" # use db agnostic schema by default
    # ActiveRecord::Migrator.up('db/migrate') # use migrations
  }
  silence_stream(STDOUT, &load_schema)

  Dir[Rails.root.join("spec/support/**/*.rb")].each {|f| require f}
end

"time rspec spec" without Spork running:

Finished in 14.95 seconds
75 examples

real    0m21.866s
user    0m17.826s
sys   0m2.688s

A little bit better - and every win helps - but not huge. Let's see how things look with the Spork server running.

"time rspec spec" with Spork running:

Finished in 14.83 seconds
75 examples

real    0m15.504s
user    0m0.222s
sys 0m0.064s

Things are beginning to look a bit better, but there is still room for improvement.

Cutting out Garbage Collection

The most obvious candidate is garbage collection. Turning off GC outright makes things faster, but at an unacceptable cost: The spec runner bloats from a peak of 81MB to a peak of 418MB. It turns out though that we can make a bit of a trade-off here by explicitly running GC periodically and finding a sweet-spot between too much time spent in GC and too much memory growth.

Our new spec_helper.rb disables GC in general, and forces a run after every 10th test. I didn't go for an exhaustive analysis of the time/space tradeoff here, but converged to this as a "good enough" option after a handful of tests.

Here's the new spec_helper.rb file (a lot more changes this time):

Spork.prefork do
  # Loading more in this block will cause your tests to run faster. However,
  # if you change any configuration or code from libraries loaded here, you'll
  # need to restart spork for it take effect.
  ENV["RAILS_ENV"] ||= 'test'
  require File.expand_path("../../config/environment", __FILE__)
  require 'rspec/rails'

  counter = -1
  RSpec.configure do |config|
    config.mock_with :rspec
    config.use_transactional_fixtures = true

    config.after(:each) do
      counter += 1
      if counter > 9
        GC.enable
        GC.start
        GC.disable
        counter = 0
      end
    end

    config.after(:suite) do
      counter = 0
    end
  end
end

Spork.each_run do
  GC.disable

  # From: http://www.osmonov.com/2011/01/in-memory-sqlite-database-for-testing.html
  load_schema = lambda {
    load "#{Rails.root.to_s}/db/schema.rb" # use db agnostic schema by default
    # ActiveRecord::Migrator.up('db/migrate') # use migrations
  }
  silence_stream(STDOUT, &load_schema)

  Dir[Rails.root.join("spec/support/**/*.rb")].each {|f| require f}
end

"time rspec spec" without Spork running:

Finished in 14.78 seconds
75 examples

real    0m21.384s
user    0m16.421s
sys   0m2.830s

"time rspec spec" with Spork running:

Finished in 13.36 seconds
75 examples

real    0m13.981s
user    0m0.218s
sys   0m0.061s

These runs used 145MB and 142MB of memory respectively. A considerable increase in memory, but not bad given the time win for controlling our GC behavior. The tradeoff time-wise (versus just disabling GC) turns out to be small — about half a second. I can live with that.

But all we've really been doing so far is shaving off some rough corners. We haven't addressed the real issue head-on — the 13+ seconds of "real work" being done.

Taming Devise

My next step was to bust out perftools.rb and do some for-real profiling. I'm going to get a bit hand-wavy, cut to the chase and tell you what I found to be the single biggest culprit: Devise.

As with many in Rails-land, we're using Devise to implement our user account and authentication functionality. Devise utilizes a "work factor" mechanism to make the computational cost of computing a password hash relatively high. This is to make brute-forcing the system that much harder for potential attackers, and it's a very important mechanism.

Most of our models have an association to our User class, and we're using the Devise-recommended choice of bcrypt with the default 10 stretches for hashing passwords right now. Obviously, anything that involves creating users is going to add up relatively fast. In hind-sight, this should have been fairly obvious. D'oh.

So, what do we do in test-mode? In test-mode we don't care is the test data includes easy-to-brute-force passwords. We can't just set stretches to 0, because bcrypt won't let us. One little monkey-patch to Devise later, and we can get past that!

Our updated spec_helper.rb has an extended Spork.prefork section:

Spork.prefork do
  # Loading more in this block will cause your tests to run faster. However,
  # if you change any configuration or code from libraries loaded here, you'll
  # need to restart spork for it take effect.
  ENV["RAILS_ENV"] ||= 'test'
  require File.expand_path("../../config/environment", __FILE__)
  require 'rspec/rails'

  # Don't need passwords in test DB to be secure, but we would like 'em to be
  # fast -- and the stretches mechanism is intended to make passwords
  # computationally expensive.
  module Devise
    module Models
      module DatabaseAuthenticatable
        protected

        def password_digest(password)
          password
        end
      end
    end
  end
  Devise.setup do |config|
    config.stretches = 0
  end

  counter = -1
  RSpec.configure do |config|
    config.mock_with :rspec
    config.use_transactional_fixtures = true

    config.after(:each) do
      counter += 1
      if counter > 9
        GC.enable
        GC.start
        GC.disable
        counter = 0
      end
    end

    config.after(:suite) do
      counter = 0
    end
  end
end

"time rspec spec" without Spork running:

Finished in 6.34 seconds
75 examples

real    0m12.982s
user    0m9.567s
sys   0m2.699s

"time rspec spec" with Spork running:

Finished in 6.32 seconds
75 examples

real    0m6.935s
user    0m0.219s
sys   0m0.058s

Bingo. So now we've gone from around 17s to around 7s. And that is how you make your specs faster.

While Spork can make multiple runs faster, it's not the be-all and end-all of making your spec runs faster generally, so see if there are other wins to be had as well.

Jon Frisby is a jack-of-all-trades coder and is currently the co-founder of Cloudability, which aims to be the Mint.com for cloud spending. He's a veteran of the startup world and even after being co-founder or founder of 4 companies, he insists that he still has some of his sanity left.

Comments

Alan says:
August 2, 2011 at 11:18 pm
After finding myself annoyed by schema load output I spent way too much time playing with STDOUT and silence_stream only to find the following option works great:

ActiveRecord::Schema.verbose = false
load "#{Rails.root.to_s}/db/schema.rb"

No noisy output!
Federico says:
August 2, 2011 at 11:39 pm
Jon, I might be the only one here but would you mind telling us a bit about what you did with perftools.rb? I think seeing how other people has approached this problem might help everyone else debug their long spec runs (26m here :( ).
Alex Sharp says:
August 3, 2011 at 4:52 am
Yep, we discovered the exact same thing with warden and devise. Hey, at least bcrypt is doing it's job!
Kieran says:
August 3, 2011 at 6:18 am
Awesome tip. Using 'rspec spec' instead of 'rake', and putting that devise hack in place, I got our tests down from 1:40.49 to 43.569. Awesome!
Kieran says:
August 3, 2011 at 6:31 am
One note: This breaks user.valid_password?('something') in tests. But then, you shouldn't be testing devise functionality in apps anyway, cause devise already tests it's own stuff.
Kieran says:
August 3, 2011 at 6:59 am
Interestingly, I can't seem to get the in-memory database to work. When I run tests after making the changes, I get this:

/Users/kieran/.rvm/gems/ruby-1.9.2-p290@blog/gems/activerecord-3.0.9/lib/active_record/connection_adapters/sqlite_adapter.rb:295:in `table_structure': Could not find table 'posts' (ActiveRecord::StatementInvalid)

Happens even when db/test.sqlite3 exists, and the schema file is loaded :-S rake db:schema:load has the same problem :-S
Rémy says:
August 3, 2011 at 8:00 am
Hi,

FYI, the tip about minimizing the stretches is documented here: https://github.com/plataformatec/devise/wiki/Speed-up-your-unit-tests

Also, the point of Spork is really not to speed up the running time of your tests but to speed up their starting time after the first load (i.e. after Spork has done its heavy "prefork" job).

By the way, if you're doing continuous testing, you may be interested in Spork + RSpec + Guard (in brief, Guard will run your specs for you on files modification).
Max says:
August 3, 2011 at 8:09 am
So your real speedup came from lower the devise stretches..

Isn't this covered in the Wiki? https://github.com/plataformatec/devise/wiki/Speed-up-your-unit-tests
Joel says:
August 3, 2011 at 6:27 pm
I'd seen the Devise trick before somewhere but completely forgot about implementing it ... and the GC changes are a nice added bonus. So, just those two things took my suite from ~ 18 seconds to ~ 4 seconds.

Awesome stuff. Thanks guys!
Jon Frisby says:
August 3, 2011 at 9:36 pm
@Alan: Aha! Good tip! Thanks!

@Federico: Nothing especially fancy, just a very simple CPU-profile-capture. The sampling-based approach was a bit problematic for me because my test suite runs in such a short amount of time but you should see more robust data from it. Just look at the biggest time-sinks and proceed accordingly.

@Kieran: If you're seeing that big a jump just by switching from "rake spec" to "rspec spec" then it sounds like you've got a LOT of gems / application code loading. I don't envy you that situation! And thanks for the heads-up on Devise breakage. As for in-memory schemas -- if you're using ":memory:" as the file, then db/test.sqlite3 shouldn't ever be created, OR loaded. It sounds like you're loading the schema up at the wrong point in the lifecycle of things -- it must happen after Spork has forked. If you're following my code examples, that shouldn't be a problem. Beyond that, up to you I'm afraid -- I'm not a master of using SQLite, nor am I a master of the Rails object/process lifecycle so I don't have much intuition about what might be going on.

@Remy: Hadn't seen that wiki page but I did attempt using stretches=1 in a test run and while it was a big win, Devise still showed up as a big time-sink. That is why I went through the effort of stubbing out the password hashing.
Jon Frisby says:
August 3, 2011 at 11:22 pm
Just to clarify a couple items...

1) All the test-times are taken after doing two 'primer runs' of the same command, to ensure disk caches are primed and that I'm benchmarking the impact of my change with less impact from external variables.

2) The with-Spork times do not include the overhead of Spork itself, of course.

3) I include both the with/without Spork times because I find myself often editing code that does not get reloaded under Spork -- things like code in the lib/ directory, require'd via initializers. In that scenario, it's not worth it to start Spork, wait for it to come up, then switch tabs and run "rspec spec", it's net better to just do "rspec spec" without Spork running at all.

So of course one has to project this out to one's workflow:

When Spork is appropriate: S + N * I
(S = Spork startup time, N = number of iterations/code-changes to test, I = incremental time to run a test, I.E. "time rspec spec" when Spork is running.)

When Spork is not appropriate: N * C
(N = as above, C = total time to run a test, I.E. "time rspec spec" when Spork is NOT running.)

So you can see that these wind up looking as follows here:

First variant, with Spork: 5 + N * 17
First variant, without Spork: N * 22

Last variant, with Spork: 6 + N * 7
Last variant, without Spork: N * 13

So given a hypothetical situation where I change my code 30 times and want to run tests after each one, I can expect to sit and wait on my machine, slipping out of flow, for:

First variant, with Spork: 515 seconds
First variant, without Spork: 660 seconds

Last variant, with Spork: 216 seconds
Last variant, without Spork: 390 seconds

So for both workflows there's some pretty big wins here although how big is a function of how much you can use Spork.
Jon Frisby says:
August 3, 2011 at 11:23 pm
Gah. #2 should say "do not include the overhead of starting Spork itself".
Bogdan Gusiev says:
August 4, 2011 at 7:15 am
Rspec test suite performance should be split in two tasks.

The first one is run a single spec file in development environment.
The second one is run entire large test suite with 500+ tests.

The benchmarks you've done here don't cover any on these use case. You are optimizing very young project that doesn't have enough code yet.

So IMHO this could not fit well real world optimization problems.
Jon Frisby says:
August 4, 2011 at 8:29 pm
@Bogdan: It's a fair point that there are several major use-cases to testing, and that this isn't quite addressing two of them. A third major use-case is editing or refactoring of cross-cutting concerns, where you may uncover hidden/unexpected assumptions more readily than you would in most code. This article suits that scenario fairly well, as you have the frequent-iteration but the need to run many tests.

However I should note that as the code and test suite has scaled up (the article was written a few weeks ago), the benefits have held up fairly well so far.

The benchmarks in the article are actually quite relevant for larger test suites though: Things like the Devise overhead will amplify and impact larger test suites disproportionately since in many apps, most things will need a User object. The same goes for the GC tweaks, which are apparently knocking about a third off the execution time of the whole test suite as of today (based on a quick comparison with/without GC tweaks).

The single-spec scenario, not being well addressed, deserves a bit of a second look, so let's revisit it here. I've actually moved over to primarily using Cucumber on this project so I'll need to establish a bit of a baseline and then we'll proceed with what things look like when using Spork to run one single Cucumber feature.

Baseline - single Feature, Spork is running, no tweaks to Devise, no GC tweaks, MySQL for the test DB:
5 scenarios (5 passed) 22 steps (22 passed) 0m0.988s
real 0m2.944s user 0m1.508s sys 0m0.260s

Setting the # of stretches in Devise to 1:
5 scenarios (5 passed) 22 steps (22 passed) 0m0.491s
real 0m2.432s user 0m1.517s sys 0m0.255s

Disabling hashing completely in Devise:
5 scenarios (5 passed) 22 steps (22 passed) 0m0.459s
real 0m2.414s user 0m1.516s sys 0m0.259s

Using SQLite for the test DB (using the DB-neutral schema loading option -- not running migrations, run once when Spork forks, and using DatabaseCleaner with the Transaction strategy before each individual test):
5 scenarios (5 passed) 22 steps (22 passed) 0m0.506s
real 0m2.612s user 0m1.525s sys 0m0.255s

Undoing the SQLite3 change, and adding the GC change:
5 scenarios (5 passed) 22 steps (22 passed) 0m0.399s
real 0m2.271s user 0m1.518s sys 0m0.257s

So what do we see?

1) The incremental benefit of going from 1 stretch to none is lost in the noise at this scale even if it is more meaningful when one has dozens or hundreds of tests.

2) SQLite3 is a net-loss for the focused-test-execution route.

3) The GC change winds up being slightly beneficial here, although in practice it's just disabling the GC outright since I don't have 10+ scenarios.

The net result of combining the no-hashing-in-Devise, and GC tweaks is a more-than-20% improvement in single-feature-execution time -- modest, but nice to have.
Ryan Bigg says:
August 5, 2011 at 1:29 am
At first, I was skeptical. How could so few changes result in such a massive speedup?

Then I did them on an application here that has a test suite that took 700 seconds.

It now takes 440. This is a *massive* improvement and with further improvements I'm sure it can be made faster.

Thank you, thank you, thank you. I will be doing this for every application I can now.
David Chelimsky says:
August 5, 2011 at 12:21 pm
Good stuff! I'm going to add an option to rspec-core's config to set the GC frequency: https://github.com/rspec/rspec-core/issues/433.
Jon Frisby says:
August 5, 2011 at 7:40 pm
@Ryan: Glad to hear it helped you, and I'd love to know what else you find as you dig into your test suite's performance characteristics.

@David: Awesome! I look forward to the release that includes it!
Jared Fraser says:
August 7, 2011 at 10:39 pm
This GC approach didn't segfault like some of the other solutions out there (and ones i've built myself), so thats great! Shaved ~70 seconds off a 400 sec rspec suite.

In-memory SQLite actually *added* 20-30 seconds on the total suite.

The devise solution didn't seem to have much of an effect, but still took off about 20 seconds.

Thanks for the article!
Jared Fraser says:
August 7, 2011 at 10:45 pm
Cancel that; segfaults on 1.9.2 but not 1.9.3.
Jon Frisby says:
August 9, 2011 at 7:58 pm
@Jared: The SQLite3 thing seems like there could easily be scenarios where the front-loaded costs of an app are higher than the incremental costs. If you're not actually doing a "lot" of DB operations in your tests -- particularly if you have a large/complex schema, that could easily account for why SQLite is a net loss. As to why the Devise change isn't as big for you, sounds like your models are less tightly coupled to your User model than perhaps mine are.

And in terms of segfaulting, that's strange and a bit beyond my ken but I can report that I've had no problems on an rvm-installed Ruby 1.9.2p180.
Benjamin Quorning says:
August 15, 2011 at 12:08 pm
As Kieran mentioned, I valid_password? may fail now. And when doing integration tests, it did. So I fixed it this way: https://gist.github.com/1146099
zoras says:
August 30, 2011 at 8:33 am
Is there a workaround to use load_schema for schemaless databases like mongodb(mongoid)?