Archive for September, 2007

Some people just don’t care about IT

Posted on September 22nd, 2007 in IT | No Comments »

This was a doozy of a week at my job. Given the low morale spread throughout the entire company, from the office to the manufacturing employees, every week seems bad enough. However, things got worse yesterday. Most of us (myself included, I won’t deny it) were just waiting for the day to end so we could collect our paychecks and head home for a nice weekend. But that didn’t happen.

In the afternoon, one of the newly-bestowed ‘supervisors’ (and I use that term very loosely, as I don’t think he’s the right person for the job) gave one of the two IT technicians the pink slip. I don’t know the reasons for the termination, and I really don’t care as it’s none of my business, but it seems to have come out of the blue, without any single or even cumulative problems to back it up. The other IT technician, already angry about other injustices he’d personally witnessed from management, saw this as the straw that broke the camel’s back and immediately handed in his resignation. Two IT technicians on Friday morning, no one left afterwards.

As I said, I don’t want to go into details about what happened. I’m a believer that things happen for a reason, and I’m sure both IT technicians, who were two of only four people I trust in that entire company, will get back on their feet, better than ever, sooner rather than later. But 24 hours after the events took place, I still have a sinking feeling in my stomach about the way management has handled this.

First off, while the main focus of the company I work for isn’t directly associated with computers and electronics (we’re mainly an electronics recycling company), they do play a major role in the company’s growth. The company has a subsidiary to sell used computers and electronics as part of its recycling plan. As such, you’d expect the IT department to be one of the highest-priority and most highly-coveted departments it has.

If you read the beginning and noticed the number of employees I mentioned above, you know that this isn’t the case. As of Friday, the IT department consisted of the two technicians (the one who was fired was a part-timer, but not by choice), one guy who’s being trained as we speak to do part of the technicians’ duties - testing computers, installing software, etc. - and one intern used to do the tedious testing of smaller parts like memory modules and hard drives. So in one fell swoop, half of the department - the actually trained half - is gone.

While I’m mostly their lone programmer, I’m also the only one around who knows how to work with the servers, a mix of Windows 2003 and Linux. Add in two supervisors who were thrown into the mix and are not fit for the job - one is actually the marketing director, the other more geared toward working with industrial equipment than with computers - and you’ll find one very disorganized IT department. So much for being a highly-coveted department, don’t you think?

From what I’ve read around the Internet, it seems like managers at all sorts of companies realize the importance IT plays in their overall strategy, but they fail to treat it as an important piece of the puzzle. And I’m witnessing this first hand. While the electronics recycling section is the bread and butter of my company, the sales of electronic equipment are what will take it to the next level. But nothing is being done to remedy the problems at hand.

Before both technicians left, there were basically one-and-a-half technicians doing the brunt of the work - not enough for the volume management wants to move. With these people gone, I just know I’m going to be called to fill in until some other hapless soul walks into this mess of an IT department. The low volume, in turn, made management angry. They resorted to scare tactics, such as saying that they’re losing about $10,000 a month (which I think is bullshit - a company that probably doesn’t make more than $1 million a year can’t withstand losing more than $100,000 a year for two straight years) and stating that they’re going to close the entire equipment testing operation, leaving the technicians out of jobs. And they wonder why morale is so damn low.

While this is more of a personal piece geared at those we lost at the company, the entire point is that if you have such great resources at your disposal, willing to help the company reach new heights, as I personally know both technicians were, then treat them as such - an integral part of the business plan. You never know when your entire IT department might just up and leave.

Will this fail? Who cares!

Posted on September 14th, 2007 in Programming | No Comments »

I’ve been noticing a trend about programming news on social bookmarking sites like Digg and DZone lately. There are a lot of posts floating around indicating the so-called “warning signs” of imminent software project failure. There are even posts with titles like “How to guarantee your project will fail!” How exciting. Just makes you want to click the link and get increasingly depressed as you read the exact same reasons you’re experiencing in your own projects.

While I do think these posts offer some advice to new programmers or those still in college, I think they are just overkill. There are so many posts saying the same thing that it’s extremely rare and surprising to see a new reason that has a valid point. Also, most of the points in those posts are obvious to anyone who’s been programming even for a short while. Some of the points that have been repeated ad nauseam and are obvious deal-breakers:

  • Setting unrealistic goals (Really? If I set an impossible goal, does that mean I’ll most likely fail?)
  • Adding more people to a delayed project (A manager who thinks that new employees will automatically hit the ground running is one who should be shot - or at least not in a managerial position.)
  • No source control system (Any IT department without any type of backup is just asking for trouble. Hell, any individual programmer working solo on a project should be smart enough to use source control and other ways of backing up their code.)
  • Unmanaged schedule or, worse yet, no schedule at all (Again, really? But winging it is so much fun!)

All of these posts basically state the same reasons over and over again. I’d be glad to send these posts over to someone who’s learning programming now, or to a manager (like mine, unfortunately) who doesn’t have the slightest clue how to manage even the simplest of software projects. But the point is that these articles are written and posted in places where the audience consists largely of professional programmers with many years of experience under their belts. So what’s the point? I’m sure they know more than this, and have experienced at least two or three axed projects because of these same things.

If you really want to help, just point your readers to books like The Mythical Man-Month or Dreaming In Code. These stories of real software failure provide more than enough information on avoiding software disasters. They’ll offer much more insight into the problems of software development than any one post can. In short, those who forget history are condemned to repeat it.

So, seriously, please stop writing these posts. You’ll be better off working on not making your project fail or something.

Capistrano - Like that person you hate, yet end up falling in love with

Posted on September 3rd, 2007 in Open Source, Ruby On Rails, Software | 6 Comments »

One of the reasons I went exploring into Rails was Capistrano, a utility that greatly helps with deploying Rails sites to production servers by automating many of the tedious steps needed to push new changes into production. I know I can’t be the only one who has once or twice pushed some new change into production, only to discover (by myself or through an angry user) that I forgot to bring the database schema up-to-date as well. Capistrano (actually, migrations are the key component here) will never allow that to happen again.

In a nutshell, Capistrano does the following:

  • Logs into your server via SSH
  • Creates a directory structure that’s useful in case you want to roll back some bad code you mistakenly pushed onto the server
  • Uses Subversion (or another SCM) to check out the latest code committed to the repository and download it
  • Automatically runs all migrations to make sure the database is up-to-date
  • Runs other scripts, like making sure your FastCGI processes are restarted and running correctly, or restarting your web server
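The directory structure in that list is what makes rollbacks cheap: each deployment gets its own directory under releases/, and a current symlink points at the active one. The sketch below is not Capistrano's code. It only imitates that layout in a temporary directory using Ruby's standard library, and the helper names (point_current_at, deploy_release, rollback) and release names are made up for illustration.

```ruby
require 'fileutils'
require 'tmpdir'

# Re-point the `current` symlink at the given release directory.
def point_current_at(root, release)
  link = File.join(root, 'current')
  File.delete(link) if File.symlink?(link)
  File.symlink(release, link)
end

# Create a new release directory and make it the active one.
def deploy_release(root, name)
  release = File.join(root, 'releases', name)
  FileUtils.mkdir_p(release)
  point_current_at(root, release)
  release
end

# Point `current` back at the release that precedes the active one.
def rollback(root)
  releases = Dir[File.join(root, 'releases', '*')].sort
  current  = File.readlink(File.join(root, 'current'))
  index    = releases.index(current)
  return current if index.nil? || index.zero?  # nothing older to go back to
  point_current_at(root, releases[index - 1])
  releases[index - 1]
end

# Deploy two releases, roll back once, and report which one is active.
def demo_rollback
  Dir.mktmpdir do |root|
    deploy_release(root, '20070901120000')
    deploy_release(root, '20070903093000')  # pretend this one is the bad push
    rollback(root)
    File.basename(File.readlink(File.join(root, 'current')))
  end
end

puts demo_rollback
```

Because the older release is still sitting on disk, going back is just a matter of moving one symlink; nothing has to be checked out again.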

After nearly finishing my first public Rails site (coming soon, I promise!), I wanted to learn how to use this tool by deploying the first version to my VPS space using Capistrano. I thought this would be a daunting task, and at the beginning it was, thanks to some minor errors, but after that, it was total bliss. After a few hours of tweaking my settings, I finally got it to work, and all deployments from here on out should be as simple as typing cap deploy without any remorse.

I’m going to write how I set up my deployment environment, so if anyone has had similar problems to mine, they can hopefully get past them.

First off, let me write about my app and server environment:

  • Operating System: CentOS 4.5
  • Webserver: Lighttpd 1.4.15
  • Rails Version: 1.2.3
  • Mongrel: 1.0.1
  • Capistrano: 2.0.0

After installing Capistrano on my development computer (simply using gem install capistrano --include-dependencies), I was ready to “capify” my application. To create the files used for deployment, just issue capify . at the root directory of the Rails application. This creates two files: Capfile, which points to the second file, config/deploy.rb, which is the actual deployment configuration file.

The configuration is pretty straightforward. There are some default settings that can be easily changed to reflect your production server setup. I did just that, and that’s where the ‘gotchas’ started pouring in.

After I changed the default values to my own, I wanted to set up the directory structure on my production server. To do that, simply run cap deploy:setup in the root directory of your application. That should prompt you for your SSH password and then create all the directories needed in the directory specified in the deployment file (using the :deploy_to variable). However, when I did that, I got a nice error message: no such file to load -- openssl.

After searching a few minutes in Google, I found my problem: I had compiled Ruby from source without the openssl-devel libraries installed on my system. Without the header files, Ruby compiled without OpenSSL support. After installing the OpenSSL header files and recompiling Ruby (don’t forget to run make clean before recompiling), I was faced with another error message, stating that my server didn’t exist. Then I remembered I’m running SSH on a non-standard port. Capistrano assumes SSH is running on the default port, which is 22. After a few more minutes of searching, I found an option that needed to be added to the deployment file: ssh_options[:port] = xx, where xx is your SSH port number. After these changes, I was golden, as Capistrano asked for my SSH password.
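A quick way to confirm whether a given Ruby build has OpenSSL support, before Capistrano trips over it, is to try loading the library directly. This is only a sketch, and the method name openssl_available? is made up for illustration.

```ruby
# Returns true when this Ruby build can load the OpenSSL library.
def openssl_available?
  require 'openssl'
  true
rescue LoadError
  false
end

if openssl_available?
  puts "OpenSSL support found: #{OpenSSL::OPENSSL_VERSION}"
else
  puts 'No OpenSSL support; install the OpenSSL headers and recompile Ruby.'
end
```

If the second message shows up, the "no such file to load" error is coming from the Ruby build itself rather than from anything in the deployment file.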

After entering it in and seeing some progress in the directory creation process, I was faced with yet another error message, about a user not existing. I was assuming Capistrano was using the user name from my development box to log into the production server. In any case, this was fixed by adding another option in the deployment file: set :user, "production_user", where production_user is the user name with the appropriate permissions to create the directories and files in the production server. I ran cap deploy:setup once more, and all my directories were created. Success! Little did I know that would be only the first steps, and more troubles were looming ahead.

Once I verified the directory structure was created correctly on my production server, I went ahead and ran cap deploy:cold, which deploys my latest working version to the server, runs all migrations, updates all symlinks to the current code, and runs all remaining processes, like spawning all FastCGI processes, for the very first time. I once again ran into a small snag, as I was having permission problems running some scripts on the production server. After some more minutes of searching, I found that there’s a variable that needs to be set to make sure Capistrano runs the scripts as a specific user with adequate permissions. After adding set :runner, "production_user" (once again, where production_user is the user with the correct permissions to run your application scripts) to my deployment file, I got past the permission problems, but then I hit yet another snag: I was missing a file - script/spin.

I found it odd that Capistrano was looking for this file, as it’s not automatically generated by either Rails or Capistrano. But after calmly reading the Capistrano installation instructions (instead of skimming over most of them), I saw that this file is used to create (or recreate) the FastCGI processes on your production server, to ensure that users get served the latest version of your app. There are many different ways to set up your FastCGI processes, depending on which web server you use. Since I use Lighttpd, I’ll be writing about that here. But you can find tons of useful information on the Internet if you use Apache, nginx or any other web server.

To remedy this problem, all I needed to do was to create the script/spin file (with executable permissions - chmod 0755 script/spin) with the following line (where /root_of_app/ is the path you set in the :deploy_to variable in the deployment file):

/root_of_app/current/script/process/spawner -a 127.0.0.1 -i 3 -r 5

This script calls another script called spawner (included in current versions of Rails), which checks whether there are FastCGI processes currently running. If the processes exist, they’re recreated to show the new version of the app. If the processes don’t exist, they’re created. The -a switch indicates the IP address the FastCGI processes bind to. If you don’t use this switch, it will default to 0.0.0.0, which caused me problems later on. The -i switch tells the script to create three FastCGI processes on sequential ports. Finally, the -r switch tells the script to check every five seconds whether these processes are still active. This makes sure that all processes are running smoothly. One switch I didn’t use was the -p switch. By default, the spawner script creates all FastCGI processes starting with port 8000. Using the -p switch, you can specify the first port. In my case, the three FastCGI processes are created on ports 8000, 8001 and 8002. You can change that default if you wish.
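To make the port arithmetic concrete: the instance count and the first port together determine a run of consecutive ports. The helper below is a hypothetical illustration in plain Ruby, not part of the spawner script itself.

```ruby
# List the consecutive ports used by `instances` processes,
# starting at `first_port` (8000 is the default mentioned above).
def spawner_ports(instances, first_port = 8000)
  (0...instances).map { |offset| first_port + offset }
end

p spawner_ports(3)        # => [8000, 8001, 8002]
p spawner_ports(3, 9000)  # => [9000, 9001, 9002]
```

These are the same ports that any proxy or load balancer in front of the application has to be told about.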

After you create the spin script, you’ll need to commit it to your SCM so Capistrano can find it in the production server. Once committed, I re-ran the cap deploy:cold command, and I was greeted with success at the end. My latest version of the application code was sent to the server, all migrations ran, and the spin script created three FastCGI processes on my server. Awesome! My work here with Capistrano was done. After hating Capistrano for a good while, I fixed all the kinks and can now never live without it. I love you, Capistrano.

Feeling good about myself, I immediately fired up my browser and entered my site’s URL. Too bad only 500 - Internal Server Error appeared when I went to the site. Curious, I entered the URL once again, appending :8000 at the end, and lo and behold, the site appeared in all its glory. So the FastCGI processes created with Mongrel were working well. But my web server wasn’t passing the requests to any of the three processes.
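The same kind of check can be done without a browser. The sketch below, meant to be run on the production server itself, simply tries to open a TCP connection to each of the three ports. The method name port_open? is made up for illustration, and the port list assumes the 8000 to 8002 range described earlier.

```ruby
require 'socket'

# Returns true if something accepts TCP connections on host:port.
def port_open?(host, port)
  TCPSocket.new(host, port).close
  true
rescue SystemCallError
  false  # e.g. connection refused: nothing is listening there
end

[8000, 8001, 8002].each do |port|
  state = port_open?('127.0.0.1', port) ? 'listening' : 'not responding'
  puts "port #{port}: #{state}"
end
```

If all three ports report as listening but the site still returns a 500 error, the problem sits between the web server and the application processes, which is where I ended up looking next.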

After looking around for more information, I saw how FastCGI processes, Mongrel and Lighttpd work together. In a nutshell, the request for the site is sent to Lighttpd, the web server. Lighty then needs to process this request and send it over to one of the FastCGI processes, which then generates the page that ends up on the user’s screen. Lighttpd is simply used in this case as a proxy, and lucky for me, it already has some basic proxy functionality built in. However, it needs to send the request somewhere. I saw some tutorials online that set this up, but they always seemed to send the request to only one of the three processes, which wasn’t efficient at all.

Here is where Pound comes into play. Pound is a reverse proxy and load balancer for web servers. Basically, it takes all requests from the web server and passes them along to the processes running the site, load-balancing the requests so that no single process is overworked. After installing Pound on my production server, I had to create a configuration file, by default stored in /usr/local/etc/pound.cfg (your location may vary, depending on whether you compiled and installed the program from source or just installed a package):


ListenHTTP
    Address 127.0.0.1
    Port 7999
    Service
        HeadRequire "Host: .*site.com.*"
        BackEnd
            Address 127.0.0.1
            Port 8000
        End
        BackEnd
            Address 127.0.0.1
            Port 8001
        End
        BackEnd
            Address 127.0.0.1
            Port 8002
        End
    End
End

This configuration makes Pound listen for requests on port 7999 on the local machine (my production server), and forward the site’s requests to one of the three FastCGI processes created by the spawner script I talked about previously. I was surprised at how easily something so powerful could be implemented.
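To picture what spreading requests over three back ends means, here is a toy round-robin picker in plain Ruby. It is an illustration only: the class name BackendPool is made up, and strict rotation is an assumption chosen for simplicity, so Pound’s own selection policy may well differ.

```ruby
# Hypothetical round-robin picker; not how Pound is necessarily implemented.
class BackendPool
  def initialize(backends)
    @backends = backends
    @next = 0
  end

  # Return the next back end in rotation, wrapping around at the end.
  def pick
    backend = @backends[@next]
    @next = (@next + 1) % @backends.size
    backend
  end
end

pool = BackendPool.new(['127.0.0.1:8000', '127.0.0.1:8001', '127.0.0.1:8002'])
4.times { puts pool.pick }
```

The fourth request wraps around to the first back end again, which is exactly what the single-process tutorials I found earlier were missing.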

Now all I needed to do was to configure my web server to direct all site requests to Pound, which in turn passes them along to one of the three FastCGI processes. Skipping all the other default settings and changing all sensitive info that may compromise my site, here are my current Lighttpd settings for the site in question:


$HTTP["host"] =~ "(^|\.)site\.com$" {
  server.document-root = "/home/production_user/railsapps/app_name/current/public"
  server.error-handler-404 = "/dispatch.fcgi"
  server.errorlog = "/var/log/lighttpd/site.error"
  accesslog.filename = "/var/log/lighttpd/site.access"
  proxy.server = ( "" => ( "site" => ( "host" => "127.0.0.1", "port" => 7999, "check-local" => "disable" )))
}

Make sure you load the mod_proxy server module so you can use the proxy.server option mentioned above.

Once I restarted Lighttpd, I entered my site’s URL, crossed my fingers, and… success! I finally had a working site, load-balanced and all. What set out to be a learning process in Capistrano in turn made me use load balancing techniques in my site, which was something I planned on doing, but on another day, thinking it was super-complicated.

From here on out, every time I make a change I want to push to my site, all I need to do is run cap deploy on my development box, and that’s it. Everything will be updated with a simple command. It’s truly worth the time I spent getting it to work. Now I will never have an angry user again because I forgot to update the database schema.

In all, I spent a few hours fixing all the small kinks I encountered along the way. But as all good things go, you need to bust your ass to get things working like you want to. I don’t mind at all, as I learned a whole lot in one day. I hope someone finds some solutions in this writeup.

In case you’re curious, here’s my deployment file, with all the sensitive info changed for obvious reasons:


# Application Name - Anything you want to describe your application
set :application, "app_name"
# The URL of your source code repository, pointing to the latest version
set :repository, "http://svn_repo/trunk"
# Set the user name to connect to the server via SSH
set :user, "production_user"
# Set the user name of the user with permissions to run the application scripts
set :runner, "production_user"
# Set the path where you want your application to be stored
set :deploy_to, "/home/production_user/railsapps/#{application}"
# Option to change the SSH port
ssh_options[:port] = xx
# The URL or IP Address where your application will be stored - Multiple sites can be specified
role :app, "xx.xx.xx.xx"
# The URL or IP Address where your application will be served - Multiple sites can be specified
role :web, "xx.xx.xx.xx"
# The URL or IP Address where your database lives - Multiple sites can be specified
role :db, "xx.xx.xx.xx", :primary => true
# Task to restart the web server
task :restart_web_server, :roles => :web do
  sudo "/etc/init.d/lighttpd restart"
end
# Restart the web server once the deployment is finished
after "deploy:start", :restart_web_server