How do you decide who in the company should have access to the test and production environments? Opening it up to everyone is one extreme, which in today’s security-sensitive world is no longer an option. At the other extreme, granting access to only one person, while perhaps more secure (depending on who you’ve entrusted with this ‘honor’), is just as dangerous. People are not gods, and besides spontaneously dying, they also have nasty tendencies to get sick, go on vacation and (heaven forbid) quit their jobs.
So most companies aim for the middle of the road – granting access to a select group of individuals, ensuring there’s some decent logging to monitor usage, and arranging proper coverage for sick leave and vacations. The question I’d like to help you answer is which individuals should have access. And, of course, like any sane individual, I’ll start my answer with ‘It depends…’ – it depends on the degree of configuration management (CM) your environment has.
Wild Wild West
I guess most websites in the world are quite adequately managed by basic FTP access. The developer has full access to the site and makes all the changes directly on production. The only “backup” is probably just the local development copy, and this is probably enough for most hobby sites. If the remote server dies and the developer quits, another website disappears from the world. Plainly speaking, there’s no CM going on here at all.
Really scary are all these “start-up” companies that run their business sites in much the same manner. The contractors they’ve hired to create their web presence are just hired guns – they couldn’t care less if the site falls off the face of the earth once they’ve gotten paid. This is why having a backed-up source code repository is so important. If you’re running your business online, make sure your developers are using a code repository hosted either on a company server or with one of the handful of free online providers. As for backups, Matthias will tell you some more about how to do this next week.
Drifting towards complete Configuration Management
Ok, so you’ve got your code under source control. Developers are working on different projects (branches), committing their changes, and merging their work with other developers. Releases are happening pretty much like before, except now there’s at least some SCM in between. Maybe the developers are even still using FTP or scp to push their changes straight to production. While we’re in a better situation now, we still have a fundamental problem: the production system slowly “drifts” away from what’s in your repository.
Think of a light snow in the evening. While it might be wonderful to watch, the next morning you have two feet of snow to shovel out of your driveway. “Drift” happens whenever human beings have direct access to test or production systems – no matter how disciplined or experienced the developer or sysadmin, they will make changes on the live system and forget to commit them back to the repository. As developers tend to make a lot more changes than sysadmins, we take their access away in order to gain more “control” over the environments.
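To make “drift” a bit more concrete, here is a minimal sketch of how you might measure it. It assumes you have a clean export of the repository (no SCM metadata directories) and a copy of the deployed tree sitting side by side on disk; the function names file_digests and find_drift and the sample file names are made up for illustration, and this is not how any particular CM tool does it.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digests(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def find_drift(repo_export: Path, deployed: Path) -> dict[str, list[str]]:
    """Compare a clean repository export with a copy of the deployed tree.

    Both directory layouts are assumptions made for this sketch.
    """
    expected = file_digests(repo_export)
    actual = file_digests(deployed)
    common = expected.keys() & actual.keys()
    return {
        # Same path in both trees, but the contents differ (e.g. a hotfix).
        "modified": sorted(p for p in common if expected[p] != actual[p]),
        # In the repository, but gone from the live system.
        "missing": sorted(expected.keys() - actual.keys()),
        # On the live system, but never committed.
        "untracked": sorted(actual.keys() - expected.keys()),
    }


if __name__ == "__main__":
    # Build two tiny throwaway trees to demonstrate the comparison.
    with tempfile.TemporaryDirectory() as tmp:
        repo, live = Path(tmp, "repo"), Path(tmp, "live")
        repo.mkdir()
        live.mkdir()
        (repo / "index.html").write_text("<h1>v1</h1>")
        (live / "index.html").write_text("<h1>v1 hotfix</h1>")  # edited in place
        (live / "debug.php").write_text("<?php phpinfo();")      # never committed
        print(find_drift(repo, live))
```

Run something like this from a scheduled job and any non-empty result tells you somebody has been shoveling changes onto the live system by hand. It only compares file contents, so permissions, database schemas and server settings would need their own checks.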
Putting Your Generals on the Frontline Makes for a Short War
Fact is, sysadmins are people too (*gasp*). They also feel pressure to make hotfixes directly on live systems. And they don’t always remember to get these changes back into the CMDB. For mission-critical business systems, every second of downtime costs serious money. Developers generally have zero access, and the most trusted and knowledgeable operations staff in the company are barely given the resources needed to keep the lifeblood of the company circulating 24/7. These are sysadmins who usually have 5+ years of experience and can always be counted on to tell some really horrific stories of random hardware glitches and rogue uninterruptible processes which go through customer data like fat kids through Happy Meals™.
You need to get these senior guys, your generals, off the front line and back to the staging area. They need time to direct their experience, knowledge and energy towards making your site “release proof”. Point them towards some great CM tools. Let them get back to their engineering jobs coming up with ways to prevent some of these outages from ever happening in the first place.
Limiting human access to your critical systems takes an incredible amount of hard work and serious discipline. But if you succeed, you should see a marked decrease in critical incidents and failed changes on your mission-critical systems. If you’d like to learn some more about configuration management, check out my review of the “Visible Ops” book chapter.