We moved our static resources to S3 back in May of this year. The transition was so ‘simple’ and seamless that it’s hard to believe we’ve been using it for over 6 months now. Matthias is now thinking about doing the same and asked me for a how-to, including any pitfalls and caveats that I stumbled upon along the way.
Did you know that the first release notes for S3 date back to March of 2006? It’s pretty safe to say that Amazon’s Simple Storage Service (S3) is a very mature service. S3 – just uttering those 2 characters brings up visions of fluffy cloud coolness and limitless storage arrays. But is it a real business advantage? Half a year later, I’m completely convinced.
New Static Resource Domain and URLs
The first thing you need to do is refactor your site to serve up images, js & css files from a separate, cookieless subdomain. Create a test subdomain as this will be your initial development sandbox (and will ultimately serve in your test environment).
As I wrote last year, such a subdomain is just best practice and will give your users a nice bump in performance. Believe it or not, this is probably the hardest part of preparing to move your static files to a CDN. So once you get this done, you’re home free!
SPIKE: host main CSS file in S3 and monitor
Next, create a test S3 account and upload a single file to it (I used the Firefox plugin S3Fox). We uploaded our main CSS file, as this is served with every page request. Set up a CNAME entry on that test subdomain you created above, pointing to the S3 Bucket Name provided by Amazon (ignore the CloudFront domains for now ;)):
After the DNS smoke clears in an hour or so, try to access your file using the test subdomain. If you set the permissions correctly, you’ll see your file and can check out the cool Amazon S3 headers.
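A minimal sketch of that check from the command line, assuming `curl` is installed. The hostname `static-test.example.com` and the function names are placeholders, not the ones we used. It relies on S3 serving a CNAME'd hostname from the bucket of the same name, so the record should point at `<bucket>.s3.amazonaws.com`.

```bash
#!/bin/bash
# Sketch only: hostnames and function names are illustrative placeholders.

# S3 serves a CNAME'd hostname from the bucket with the same name, so the
# expected CNAME target is "<bucket>.s3.amazonaws.com".
expected_cname_target() {
    printf '%s.s3.amazonaws.com\n' "$1"
}

# Filter raw HTTP response headers (stdin) down to the ones worth a look:
# the x-amz-* headers S3 adds, plus Server, ETag and Cache-Control.
s3_headers() {
    tr -d '\r' | grep -iE '^(x-amz-[a-z0-9-]+|server|etag|cache-control):'
}

# Fetch only the headers for a URL and show the S3-related ones.
# This one needs network access.
check_static_file() {
    curl -sI "$1" | s3_headers
}
```

With the functions loaded, `dig +short CNAME static-test.example.com` should print the value of `expected_cname_target static-test.example.com` (with a trailing dot), and `check_static_file http://static-test.example.com/css/main.css` should list a `Server: AmazonS3` line once the permissions are right.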
The next things I wanted to know about were performance and billing. To get an idea about both of these, I created a Pingdom check to S3 using that CSS resource. For a week’s usage, I was charged 28 cents, and the global response times from the Ireland datacenter were excellent. SPIKE complete!
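If you want a rough feel for response times before setting up a monitoring service, a few timed requests from your own machine can help. This is a stand-in sketch, not a reproduction of the Pingdom check: it only measures from one location, the URL is a placeholder, and the function names are made up for illustration.

```bash
#!/bin/bash
# Sketch only: a local stand-in for a monitoring check, with placeholder names.

# Print the total request time in seconds for one GET of the given URL.
probe_once() {
    curl -s -o /dev/null -w '%{time_total}\n' "$1"
}

# Read one timing per line on stdin and print the mean with three decimals.
mean_seconds() {
    LC_ALL=C awk '{ sum += $1; n++ } END { if (n > 0) printf "%.3f\n", sum / n }'
}

# Probe a URL N times (default 5), pausing a second between requests,
# and print the mean response time.
probe_mean() {
    local url=$1 count=${2:-5} i
    for ((i = 0; i < count; i++)); do
        probe_once "$url"
        sleep 1
    done | mean_seconds
}
```

For example, `probe_mean http://static-test.example.com/css/main.css 10` prints the mean of ten requests. Billing is easier to judge from the account's usage report than from anything you can script locally.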
Upload & sync static resources to S3
The final step for us was to upload all our static files to the test S3 instance, and then ensure that any new static files were correctly uploaded or synced. To accomplish this, I’m using the s3sync and s3cmd tools. We have 3 sources of static content: user generated images on the frontend, CMS generated images from our editors and developer generated files (js, css & images).
Here’s a sample script showing s3sync and s3cmd uploads:
```bash
#!/bin/bash
if [ $# -ne 1 ]
then
  echo "Usage: `basename $0` [cloud-t.ndimg.de|cloud-p.ndimg.de]"
  exit 65
fi

s3cmd put $1:favicon.ico /full/src/path/images/favicon.ico cache-control:max-age=7776000 x-amz-acl:public-read
s3cmd put $1:blank.gif /full/src/path/images/blank.gif cache-control:max-age=7776000 x-amz-acl:public-read
s3sync -rpv --cache-control="max-age=604800" --exclude=".svn(/|$)" /full/src/path/css $1:
s3sync -rpv --cache-control="max-age=604800" --exclude=".svn(/|$)" /full/src/path/js $1:
```
For the user and CMS generated images, we’re using the PHP Zend Amazon S3 service class. I wrote a simple wrapper script around this which I’m more than happy to share (just send me an email).
After a week of poking and prodding, we were satisfied. This test instance was our first guinea pig for ensuring the correct operation of all these new scripts, and it’s now used in our test environment (will wonders never cease). Just ensure you get the deployment order and configuration correct in your continuous integration build. Don’t s3sync your resources until all your tests pass, and don’t deploy your code until after you’ve synced your resources.
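The ordering rule can be sketched as a short chain of steps that stops at the first failure. The three step functions here are hypothetical placeholders for whatever your build actually runs, and the bucket name is an example, not our real configuration.

```bash
#!/bin/bash
# Sketch only: run_tests, sync_static and deploy_code are hypothetical
# placeholders; replace their bodies with your real build steps.

run_tests()   { echo "running tests"; }
sync_static() { echo "syncing static resources to $1"; }
deploy_code() { echo "deploying code"; }

# Run the three steps in order and stop at the first failure, so resources
# are never synced after a failed test run and code is never deployed
# before its resources are in the bucket.
release() {
    local bucket=$1
    run_tests && sync_static "$bucket" && deploy_code
}
```

Calling `release cloud-t.example.com` runs all three steps in order. If `run_tests` returns a non-zero status, neither the sync nor the deployment is attempted.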
In my next post, I’ll tell you how we made the logical step of getting these resources pushed out to Amazon’s CDN, CloudFront. In the meantime, share your S3 experiences with us. What things tripped you up, and which features surprised you?
3 thoughts on “Moving Static Resources to S3”
Initially we also tried s3sync for asynchronous data transfer. Later on we discovered “s3fs” and now use it instead of s3sync. So far it has been very reliable: we have never lost the mount point, it works really well, and you don’t need to care about syncing…