

To run automated tests for your Ruby on Rails webapp, you need not only your latest database structure deployed to the test database (created by rake db:test:prepare), but also some seed data for lookup tables, e.g. zip codes.
Common approaches like adding seed data through Rails migrations are discouraged, and plugins like seed_fu only work for small amounts of seed data. In seed_fu, you can specify a seed method for your ActiveRecord models like so:
User.seed(:login, :email) do |s|
  s.login = "bob"
  s.email = "bob@bobson.com"
  s.first_name = "Bob"
  s.last_name = "Bobson"
end
Running the rake db:seed task provided by seed_fu will add all records defined this way to your test database.
DHH has even standardized a way to load seed data for Rails 3, making the rake db:seed task part of Rails and setting up a file called db/seeds.rb for maintaining your seeding code. Using that file, you can load your seed data however you see fit, e.g. with seed_fu.
How to Deal With Big Amounts of Seed Data
So far, so good. There are ways to load seed data into your Rails test database using Ruby code. But what if, like in our case, you have to seed more than 60,000 Points of Interest and over 16,000 cars? We definitely don’t want to write Ruby code for each of them. The only sane way of handling such amounts of data is database dumps. So I added my own rake db:seed:dump and rake db:seed:load tasks to our Rails 2.3.2 application. As soon as we move to Rails 3, we can call the load task from within db/seeds.rb.
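Calling a rake task from db/seeds.rb could look roughly like the sketch below. It assumes db/seeds.rb is executed through rake, so that the task is already defined. The task body and the LOADED array are hypothetical stand-ins, used only so the snippet runs on its own outside a Rails application.

```ruby
require 'rake'

# Hypothetical stand-in for the real db:seed:load task: it records that it ran
# instead of shelling out to mysql.
LOADED = []

Rake::Task.define_task('db:seed:load') do
  LOADED << 'seed dump loaded'
end

# The line that would go into db/seeds.rb: look the task up by name and run it.
Rake::Task['db:seed:load'].invoke

puts LOADED.inspect
```

Note that invoke runs a task at most once per rake run. A second invoke does nothing unless the task is re-enabled with reenable first.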
Short and sweet (and completely MySQL-specific and dependent on MySQL living in your path 😉), here are my two rake tasks:
namespace :db do
  namespace :seed do
    # SEED_TABLES (the array of table names to dump) lives in db/seed_tables.rb
    require 'db/seed_tables'

    desc "dump the tables holding seed data to db/RAILS_ENV_seed.sql. SEED_TABLES needs to be defined in db/seed_tables.rb!!!"
    task :dump => :environment do
      config = ActiveRecord::Base.configurations[RAILS_ENV]
      dump_cmd = "mysqldump --user=#{config['username']} --password=#{config['password']} #{config['database']} #{SEED_TABLES.join(" ")} > db/#{RAILS_ENV}_seed.sql"
      system(dump_cmd)
    end

    desc "load the dumped seed data from db/RAILS_ENV_seed.sql into the test database"
    task :load => :environment do
      # Always loads into the test database, whatever RAILS_ENV is set to
      config = ActiveRecord::Base.configurations['test']
      system("mysql --user=#{config['username']} --password=#{config['password']} #{config['database']} < db/#{RAILS_ENV}_seed.sql")
    end
  end
end
Note that I use a file called db/seed_tables.rb to define which tables are dumped. It just holds an array of table names like so:
SEED_TABLES = [ "auxilary_services", "background_informations", "pois" ]
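The command strings in the tasks above are built by plain string interpolation, which breaks if a password or table name contains spaces or shell metacharacters. A possible variation is sketched below, escaping each value with Ruby's standard Shellwords module. The helper names (mysql_connection_args, build_dump_command, build_load_command), the optional --host handling and the sample config hash are assumptions for illustration, not part of the original tasks.

```ruby
require 'shellwords'

# Mirrors the array from db/seed_tables.rb so this sketch runs on its own.
SEED_TABLES = %w[auxilary_services background_informations pois]

# Connection flags shared by the mysql and mysqldump clients.
# `config` is assumed to look like one environment entry of config/database.yml.
# Unlike the original tasks, --password and --host are left out when not set.
def mysql_connection_args(config)
  args = ["--user=#{Shellwords.escape(config['username'].to_s)}"]
  args << "--password=#{Shellwords.escape(config['password'].to_s)}" if config['password']
  args << "--host=#{Shellwords.escape(config['host'].to_s)}" if config['host']
  args
end

# Shell command that dumps the given tables to db/<env>_seed.sql.
def build_dump_command(config, tables, env)
  words = ['mysqldump'] + mysql_connection_args(config)
  words << Shellwords.escape(config['database'].to_s)
  words.concat(tables.map { |table| Shellwords.escape(table) })
  "#{words.join(' ')} > db/#{Shellwords.escape(env)}_seed.sql"
end

# Shell command that loads db/<env>_seed.sql into the configured database.
def build_load_command(config, env)
  words = ['mysql'] + mysql_connection_args(config)
  words << Shellwords.escape(config['database'].to_s)
  "#{words.join(' ')} < db/#{Shellwords.escape(env)}_seed.sql"
end

# Hypothetical configuration values, for demonstration only.
sample_config = { 'username' => 'root', 'password' => 'p w', 'database' => 'app_development' }
puts build_dump_command(sample_config, SEED_TABLES, 'development')
puts build_load_command(sample_config, 'development')
```

Inside the rake tasks, the resulting strings would be passed to system just like before.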
Using two basic rake tasks and database dumps eases the pain of handling test data for us. How do you manage your test data? Let us know in the comments!
Hey thanks for the writeup! I’ve been using shell scripts to do the same, but this is a nicer way to handle it.
You should consider making this a Rails plugin.
Great that you like it. Unfortunately it’s still a little too rough for becoming a gem.
Good solution,
I’d like to add --host=#{config['host']} to the command line
In Rails 2.3.8 you don’t need the load task anymore. You can just drop the loading code into the given db/seeds.rb file and run it, or use rake db:setup to create your database and load the seed data from your SQL file.
Can this technique also be used to upload data to a Rails hosting company, like Heroku? Can you give me a little hint as to how I would seed my “categories” table to my database up on Heroku?
Matt, if you add your db/#{RAILS_ENV}_seed.sql file to git and push it, you should be able to run
$ heroku rake db:seed
from your box. I did not try it myself, so it would be great to hear if it works for you.
I got sick of having seed data and then layering rake tasks for environment-specific data on top, so I put together seedbank, which gives you common seeds under db/seeds/*.seeds.rb and seeds for your environment under db/seeds/ENV/*.seeds.rb
This gives me the ability to have an entire working db in place just using:
$ rake db:setup
This will load all the common seeds and my development environment seeds in one go.
@James2m Sounds cool. Is your seedbank open source?
Yep. It’s right there on GitHub. I didn’t need anything fancy like loading SQL files, and I sometimes needed to use factories when large amounts of data had to be created, so extending the seeds.rb approach worked best for me.
Thank you!