Chef routinizes the repeatable steps in cloud operations management, and it does so in a way that is almost agnostic to the underlying cloud provider. Chef thus helps make applications almost agnostic to the underlying machines.
Using Chef to manage cloud applications makes cloud computers from one provider, say AWS EC2, fairly easy to substitute with cloud computers from another provider, say HP Cloud.
As my friend Kevin Jackson describes, cloud computing has several economic benefits; it would become even more economical if cloud computers were fungible, meaning a cloud machine from provider X would be practically no different from a cloud machine from provider Y. Fungibility would make cloud computers easily substitutable, driving prices further down by increasing competition and reducing the differentiation between providers of cloud computers. Fungibility, by the way, is not a property of eukaryotic organisms.
Routinizing the repeatable is key to successful operations management and is described in detail in a paper on Integrated Operations by Prof. William Lovejoy of The University of Michigan Business School, Ann Arbor, and I quote:
“If some task is to be repeated many times, it makes sense to find out the best way to perform the task and then require its execution according to that best practice. This means that in stable task environments, stable work routines and policies will be generated over time, and this is efficient. This derives from March and Simon’s (1958) model of organizational learning. The consequences for this are that one will want to consider the relationship between efficiency and discretion allowed workers in a stable environment.”
The ability to routinize the repeatable and provide consistent environments from development, through testing and staging, to production is a key benefit to successful business operations in the cloud.
A Chef run (sudo chef-client) is idempotent: repeat runs produce the exact same resulting machine configuration as the initial run did. Idempotence is the property of certain operations in mathematics and computer science that they can be applied multiple times without changing the result beyond the initial application. The term was introduced by Benjamin Peirce in the context of elements of an algebra that remain invariant when raised to a positive integer power, and literally means the quality of having the same power, from idem + potence (same + power).
Idempotent operations enable consistently reproducible cloud environments for development and production use. They help bring order and reduce chaos in business operations.
OK, I admit this is going to be an incorrect analogy from a biological science perspective, but it does seem to work for some people as a crude example to explain the logic. What you get to accomplish is to give life to, say, a giraffe (Giraffa camelopardalis), a cow (Bos primigenius), a leopard (Panthera pardus), or a person (Homo sapiens), based on the DNA you inject into an embryo. Along similar lines, you create a web server running nginx with a specific configuration, or a proxy running HAProxy, or a database master server running PostgreSQL, or whatever you need, by asking Chef to run an appropriate set of cookbooks on top of a cloud machine running just enough OS, or jeOS (pronounced as juice or jüs).
Chef helps spin up machines just the way you want, with a specific set of software and a specific configuration, building up from bare metal machines loaded with just enough OS.
Let's see in practice how these core concepts pan out in reality. This is best illustrated in the form of a hands-on exercise of creating an infrastructure in the cloud, where we will have the production environment running first on a single server instance, which is useful for rapid prototyping of apps while sharing a single machine among multiple applications to minimize cost. Once you're comfortable with this basic all-in-one configuration, it's relatively simple to scale it out, separating the various roles onto multiple machine instances. I must caution you that a fairly elaborate setup is needed on your linux/unix workstation, but fortunately it's all pretty straightforward, and there's a lot of good documentation available on the internet.
The toolchain to install on your workstation: homebrew, git, chef, librarian, vagrant, and the knife plugins for ec2 and hpcloud.
ruby -e "$(curl -fsSkL raw.github.com/mxcl/homebrew/go)"
brew install git
gem install chef
gem install librarian
gem install vagrant
vagrant box add precise64 http://files.vagrantup.com/precise64.box # latest Ubuntu LTS 12.04 for vagrant
gem install knife-ec2
gem install knife-hp
Follow the Chef Fast Start Guide to set up your workstation. You only need to go through about half that guide, up to step 4. Stop there, as you don't need to configure the workstation as a client. In our case, the chef clients will run on vagrant, ec2, hpcloud, and other cloud providers you select, instead of on your workstation.
If you follow that guide, it will create a ~/chef-repo/.chef/
folder with three files - knife.rb
, validation.pem
and username.pem
. You need to move that .chef
folder and those three files to ~/.chef/
which lets you execute knife
commands from anywhere on the workstation.
My knife.rb
looks something like this. Please note that I am using my own chef server - not opscode hosted chef - so make sure to customize your knife.rb
file to properly connect with your Chef server.
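As a rough sketch (every value below - node name, key paths, organization name, and server url:port - is a placeholder to replace with your own), a knife.rb for a private Chef server looks something like:

current_dir = File.dirname(__FILE__)
log_level                :info
log_location             STDOUT
node_name                "nilesh"                         # your username on the chef server
client_key               "#{current_dir}/nilesh.pem"      # your user's pem file
validation_client_name   "myorg-validator"                # based on your organization name
validation_key           "#{current_dir}/validation.pem"  # the validation pem file
chef_server_url          "https://chef.example.com:443"   # url:port for your chef server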
The values to customize are your username on the chef server, your pem file, your organization name, the validation pem file, and the url:port for your chef server.
Next, create a working folder for this exercise and initialize a git repository:
mkdir -p ~/fungibility
cd ~/fungibility
git init
We will use librarian
to gather the required cookbooks from the Chef community. Download this Cheffile
to your ~/fungibility
folder. This ~/fungibility/Cheffile
will tell librarian
what to bring down to the workstation.
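The linked Cheffile is not shown inline; as a sketch (the community site URL and the cookbook list are assumptions chosen to match the roles defined later in this post), it looks something like:

site 'http://community.opscode.com/api/v1'

cookbook 'apt'
cookbook 'build-essential'
cookbook 'git'
cookbook 'sudo'
cookbook 'users'
cookbook 'vim'
cookbook 'php'
cookbook 'apache2'
cookbook 'mysql'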
You can customize this Cheffile depending on your infrastructure needs. Librarian will download the specified cookbooks into $PWD/cookbooks
and will also create a $PWD/tmp
folder. You can put both these subfolders in your .gitignore
file so as to not clutter your git repo.
cd ~/fungibility
librarian-chef install
echo cookbooks >> .gitignore
echo tmp >> .gitignore
git add .
git commit -m "initial commit"
Create a site-cookbooks
sub-folder to store any custom cookbooks or customization to community cookbooks.
cd ~/fungibility
mkdir -p ~/fungibility/site-cookbooks
touch site-cookbooks/readme.md
echo "store custom cookbooks in here" >> site-cookbooks/readme.md
git add .
git commit -m "added a place for custom cookbooks"
Upload all the cookbooks to the Chef server:
knife cookbook upload -a -o ./cookbooks
knife cookbook upload -a -o ./site-cookbooks
And here is a BE CAREFUL command that nukes all cookbooks, just in case you need to start fresh:
knife cookbook bulk delete -y '.*'
Your development, test, and production environments may differ; for example, the development environment might include debugging tools that are not installed in production. Chef lets you define different environments and assign a node to a particular environment. Let's create a dev
, a stage
and a prod
environment which we can customize and fine tune later.
Create environments/dev.rb
, with the following contents:
name "dev"
description "The development environment"
Create environments/stage.rb
, with the following contents:
name "stage"
description "The staging environment"
Create environments/prod.rb
, with the following contents:
name "prod"
description "The production environment"
Commit the environment files to version control:
git add environments
git commit -m 'Add development, staging, and production environments.'
Upload the environments to the Chef server:
cd ~/fungibility
knife environment from file environments/dev.rb
knife environment from file environments/stage.rb
knife environment from file environments/prod.rb
Chef roles are a way to define certain patterns and processes that exist across nodes in a Chef organization as belonging to a single job function. It helps you define a group of recipes and attributes that should be applied to all nodes that perform a particular function.
Let us start by creating a base
role that would apply to all nodes, a webserver
role, and a db_master
role for the master database.
Create roles/base.rb
containing the following:
name "base"
description "Base role applied to all nodes."
run_list(
"recipe[apt]",
"recipe[git]",
"recipe[build-essential]",
"recipe[sudo]",
"recipe[users::sysadmins]",
"recipe[vim]"
)
override_attributes(
:authorization => {
:sudo => {
:users => ["ubuntu", "vagrant"],
:passwordless => true
}
}
)
The run_list
method defines a list of recipes to be applied to nodes in the base
role. The override_attributes
method overrides the default attributes used by community recipes. For example, this overrides attributes used by the sudo
cookbook so the vagrant
and ubuntu
users can run sudo without manually entering a password.
Create roles/webserver.rb
containing the following:
name "webserver"
description "Web server role"
all_env = [
"role[base]",
"recipe[php]",
"recipe[php::module_mysql]",
"recipe[apache2]",
"recipe[apache2::mod_php5]",
"recipe[apache2::mod_rewrite]",
]
run_list(all_env)
env_run_lists(
"_default" => all_env,
#"dev" => all_env + ["recipe[php:module_xdebug]"],
"dev" => all_env,
"prod" => all_env,
)
The env_run_lists method in this webserver role defines different run lists for different environments. The all_env array defines a common run list for all environments; for the dev environment it could be appended with additional run list items, such as recipe[php::module_xdebug] (shown commented out above).
Create roles/db_master.rb
containing the following:
name "db_master"
description "Master database server"
all_env = [
"role[base]",
"recipe[mysql::server]"
]
run_list(all_env)
env_run_lists(
"_default" => all_env,
"prod" => all_env,
"dev" => all_env,
)
The all_env
array again defines a common run list for all environments.
Commit these three roles, base
, webserver
, and db_master
created under the roles
subfolder.
git add roles
git commit -m "add roles for base, webserver and db_master"
Update the roles in Chef Server
knife role list # list all roles
knife role delete `rolename` # to delete any stale role you may not need
knife role from file roles/base.rb
knife role from file roles/webserver.rb
knife role from file roles/db_master.rb
You need a user account created with sysadmin privileges on every node. You accomplish that by defining a data bag for the users cookbook, with attributes describing your credentials.
It is best to use your existing user credentials from your workstation. Look for your public key under ~/.ssh
for a file named id_dsa.pub
or id_rsa.pub
or similar. That is your public key for your user account $USER
on your workstation. If you don't find one, create a public/private key pair by executing these commands:
echo "Checking for SSH key, generating one if it doesn't exist ..."
[[ -f ~/.ssh/id_dsa.pub ]] || ssh-keygen -t dsa -C your@email.address
echo "Copying public key to your clipboard so you can paste it whereever you like ..."
[[ -f ~/.ssh/id_dsa.pub ]] && cat ~/.ssh/id_dsa.pub | pbcopy
Then create a new data bag
file named after the user you want to create:
mkdir -p data_bags/users
vi data_bags/users/$USER.json
Edit below and paste your public key
as one long string into the ssh_keys
segment below and add this to the $USER.json file. Remember to replace nilesh
with your username
on your workstation.
{
"id": "nilesh",
"ssh_keys": "ssh-dss AAAAB3NzaC1kc3MAAACBANcunES89sbKlIhrtkpnECp7Z4a+BlJHZTHYjBAo/Itw2R4WmuXhbQiEcYdiYR0tZjKmIXzzG5M5wWIzpmvuOaBxThVMKk8Irgu0bzi9eNY/MD+EDTNRhzry8q/IJeh8jDRfSB2exdcMcFAjmiVdKJd5bbql5NkU9uZaxGhV2W8XAAAAFQCVxO/iejN6s/ToaJWfV8IEFaJiqwAAAIAHl3vQcjQ40G+ZLoj8S73fU7/XhX8ushb3fP4ERCFUm54mvkkezUXJGupUgEihZuPNHWZdvjouzD7H1HMf6xLaR/umjzBX3sNhKFwA0I1gFBsxnHEu3QW0JV9ObJdmfz70lm9/y8Cj96T+ErkgRKd7dW7XWeF125cR9yPWmPWsZwAAAIEAvXo9aoAtX9ZS/Z9WmNcdP2IH4/blOnLr8wMDk+r4hUd7nExWFF7ckDwOl5Wlm1iagvUHzkjRHQjyPX9uEs3WAxm7kk6ofnBiFYzfNAGemDgN1D5FkpTeg/cbkYohpr9Zyl9m1N5hV0jBW5faoh/O0KmFInLVi7yIrPHQNjGv/9o= your@email.address",
"groups": [ "sysadmin", "dba", "devops" ],
"uid": 2001,
"shell": "\/bin\/bash"
}
Commit the data bag file to version control:
git add data_bags
git commit -m 'Add sysadmin user data bag item.'
Upload the data bag to the Chef server:
knife data bag list # to list out existing data bag items
knife data bag delete `itemname` # to delete any stale data bag item you may want to get rid of
knife data bag create users
knife data bag from file users data_bags/users/$USER.json
Use an encrypted data bag to store secrets like passwords and encryption keys.
Create an encryption key:
openssl rand -base64 512 | tr -d '\r\n' > ~/.chef/encrypted_data_bag_secret
chmod 400 ~/.chef/encrypted_data_bag_secret
Add this line to ~/.chef/knife.rb
which would copy the encryption key to Chef clients so they can use it for decryption.
encrypted_data_bag_secret "#{current_dir}/encrypted_data_bag_secret"
Set the $EDITOR
environment variable to vi
by adding export EDITOR=vi
to ~/.bash_profile
or ~/.zshenv, depending on your shell. The knife data bag
command will launch this editor and allow you to edit the encrypted data bag contents. Create a new encrypted data bag item for storing MySQL passwords:
knife data bag create --secret-file ~/.chef/encrypted_data_bag_secret secrets mysql
Enter this into vi when it opens, and make sure to select better passwords than these:
{
"id": "mysql",
"dev": {
"root": "dev-my-root-password",
"repl": "dev-my-replication-password",
"debian": "dev-my-debian-password"
},
"prod": {
"root": "secret-root-password",
"repl": "secret-replication-user-password",
"debian": "secret-debian-password"
}
}
This encrypts and sets the passwords for the dev and the prod environments and uploads them to the Chef server. To save the encrypted data bag locally, download it into a file and commit it to version control in JSON format.
mkdir -p data_bags/secrets
knife data bag show secrets mysql -Fj > data_bags/secrets/mysql.json
cat data_bags/secrets/mysql.json # you will see the encrypted version of the mysql data bag item
{
"id": "mysql",
"dev": "vLb86kC71FK6Z860ru/5Nkz3oKTOu/+4fPY2ics3h82mfiZEZTS3KR3QF8LV\nORwCikcK32ahjpwvgYVo3IexpDRh3tyPKWs3tlup7m7dsiDs9TrKbYsL3Ze+\n/9N6cQweV2+MbJmJ7+qqRjmyxEECbg==\n",
"prod": "/2tmCI0ewAmhSz9jr/izKLeqyEPUmq56p+9Ls7sf3Du6++hsryRnRse9ZDst\np+Z0OYKla0zzrknROqUrWCks+rmGuAMjHmUqFP14vSYN9F6znsf0I9EnEsLV\ncnflOuspU130zki7foaJmBo/OtyM5Q==\n"
}
Save that in version control.
git add data_bags
git commit -m "added encrypted data bag for mysql secrets"
Decrypt the data bag if you need to inspect it; just include the --secret-file argument:
knife data bag show secrets mysql --secret-file ~/.chef/encrypted_data_bag_secret
Modify the encrypted data bag if you need to, using the knife data bag edit command:
knife data bag edit --secret-file ~/.chef/encrypted_data_bag_secret secrets mysql
but make sure to commit any secret changes back into git. The next chef run will apply the changes to the nodes.
The best practice is to keep customizations to community cookbooks in a separate site-cookbooks
folder. However, I haven't yet figured out the best way to override the mysql community cookbook with this segment of code, so I resort to a hack and add the following to the top of cookbooks/mysql/recipes/server.rb:
# Customization: get passwords from encrypted data bag
secrets = Chef::EncryptedDataBagItem.load("secrets", "mysql")
if secrets && mysql_passwords = secrets[node.chef_environment]
node['mysql']['server_root_password'] = mysql_passwords['root']
node['mysql']['server_debian_password'] = mysql_passwords['debian']
node['mysql']['server_repl_password'] = mysql_passwords['repl']
end
Git will show this as a local modification to a community cookbook, so it is best to remember that this is a hack and needs a cleaner approach to make the mysql cookbook use the encrypted data bag item for secrets.
git add cookbooks/mysql
git commit -m 'Read MySQL passwords from encrypted data bag.'
Upload the updated mysql cookbook to the Chef server:
knife cookbook upload mysql -o ./cookbooks
This command assumes that you have a key named awsdefault and the corresponding awsdefault-east.pem pemfile saved in the ~/Downloads folder, that the pemfile is marked read-only for you (chmod 400 ~/Downloads/awsdefault-east.pem), that you have a security group named webserver defined on EC2, that you have roles defined and uploaded to the Chef server, and that the environments are also uploaded. The specified AMI ami-9c78c0f5 is the official 64-bit Ubuntu 12.04 EBS image in the us-east-1 region. If you want to use a different EC2 region, select a similar AMI in your desired region from the ubuntu AMI list. Also, you must specify the db_master role before the webserver role.
knife ec2 server create \
-S awsdefault -i ~/Downloads/awsdefault-east.pem \
-G webserver,default \
-x ubuntu \
-d ubuntu12.04-gems \
-E prod \
-I ami-9c78c0f5 \
-f m1.small \
-r "role[base],role[db_master],role[webserver]"
After the provisioning is completed, knife will list some details on your new EC2 instance like this.
...
...
ec2-50-17-75-229.compute-1.amazonaws.com [2012-12-12T01:19:21+00:00] INFO: Chef Run complete in 259.308362 seconds
ec2-50-17-75-229.compute-1.amazonaws.com [2012-12-12T01:19:21+00:00] INFO: Running report handlers
ec2-50-17-75-229.compute-1.amazonaws.com [2012-12-12T01:19:21+00:00] INFO: Report handlers complete
Instance ID: i-604df61e
Flavor: m1.small
Image: ami-9c78c0f5
Region: us-east-1
Availability Zone: us-east-1a
Security Groups: webserver, default
Security Group Ids: default
Tags: {"Name"=>"i-604df61e"}
SSH Key: awsdefault
Root Device Type: ebs
Root Volume ID: vol-66fa4619
Root Device Name: /dev/sda1
Root Device Delete on Terminate: true
Public DNS Name: ec2-50-17-75-229.compute-1.amazonaws.com
Public IP Address: 50.17.75.229
Private DNS Name: ip-10-101-51-86.ec2.internal
Private IP Address: 10.101.51.86
Environment: prod
Run List: role[base], role[db_master], role[webserver]
➜ fungibility git:(master)
At the end of this run, you should see the It works!
page Apache generates when you visit the public URL of your Amazon EC2 instance in a web browser. If you run into any errors during provisioning, you can edit the Chef configuration, upload it to the Chef server, and then re-run the Chef client directly on the EC2 instance:
➜ fungibility git:(master) ssh -i ~/Downloads/awsdefault-east.pem ubuntu@ec2-50-17-75-229.compute-1.amazonaws.com
ec2$ sudo chef-client
Another way, without sshing in directly, is to use knife to do a remote chef run:
knife ssh role:base 'sudo chef-client'
Idempotence comes into play here and makes this the fastest way to apply config amendments, because Chef won't re-install things that are already installed.
This command assumes that you have a key named hpdefault and the corresponding hpdefault.pem pemfile saved in the ~/Downloads folder, that the pemfile is marked read-only for you (chmod 400 ~/Downloads/hpdefault.pem), that you have a security group named webserver defined on HP Cloud, that you have roles defined and uploaded to the Chef server, and that the environments are also uploaded. The specified image 120 is the Ubuntu 12.04 image in HP Cloud, and 102 is the flavor (standard.medium) of machine used. knife hp flavor list
will list all flavors of machines available in the HP cloud. Also, you must specify the db_master role before the webserver role.
knife hp server create \
-S hpdefault -i ~/Downloads/hpdefault.pem \
-G webserver,default \
-x ubuntu \
-d ubuntu12.04-gems \
-E prod \
-I 120 \
-f 102 \
-r "role[base],role[db_master],role[webserver]"
After provisioning, knife will print out some details on your HP instance.
...
15.185.226.228 [2012-12-12T01:39:55+00:00] INFO: Chef Run complete in 132.742629 seconds
15.185.226.228 [2012-12-12T01:39:55+00:00] INFO: Running report handlers
15.185.226.228 [2012-12-12T01:39:55+00:00] INFO: Report handlers complete
Instance ID: 403017
Instance Name: hp15-185-227-146
Flavor: 102
Image: 120
SSH Key Pair: hpdefault
Public IP Address: 15.185.226.228
Private IP Address: 10.2.2.51
Environment: prod
Run List: role[base], role[db_master], role[webserver]
You should see the It works!
page when you visit the public URL of your HP instance in a web browser. You just witnessed what I would refer to as the rudimentary beginnings of fungibility of cloud machines. You now have two instances running an identical configuration on machines from two different providers.
Let’s query the cloud instances using knife.
➜ fungibility git:(master) knife hp server list
Instance ID Name Public IP Private IP Flavor Image Key Pair State
403017 hp15-185-227-146 15.185.226.228 10.2.2.51 102 120 hpdefault active
➜ fungibility git:(master) knife ec2 server list
Instance ID Name Public IP Private IP Flavor Image SSH Key Security Groups State
i-604df61e i-604df61e 50.17.75.229 10.101.51.86 m1.small ami-9c78c0f5 awsdefault default, webserver running
and globally check uptime
, restart apache
and also run a sudo chef-client
on all machines with base
role.
knife ssh role:base 'uptime'
knife ssh role:base 'sudo service apache2 restart'
knife ssh role:base 'sudo chef-client'
# Restart Apache on all webservers
knife ssh role:webserver 'sudo service apache2 restart'
# Check the free disk space on all nodes
knife ssh 'name:*' 'df -h'
Unless you are planning to use the instances for something else, it is a good idea to destroy them so you won't get charged. Enumerate your instances using knife:
➜ fungibility git:(master) knife hp server list
Instance ID Name Public IP Private IP Flavor Image Key Pair State
403017 hp15-185-227-146 15.185.226.228 10.2.2.51 102 120 hpdefault active
➜ fungibility git:(master) knife ec2 server list
Instance ID Name Public IP Private IP Flavor Image SSH Key Security Groups State
i-604df61e i-604df61e 50.17.75.229 10.101.51.86 m1.small ami-9c78c0f5 awsdefault default, webserver running
Delete the server instances, node, and client using knife:
#cleaning up HP
knife hp server delete 403017
INSTANCE=hp15-185-227-146
knife node delete $INSTANCE
knife client delete $INSTANCE
# cleaning up AWS EC2
INSTANCE=i-604df61e
knife ec2 server delete $INSTANCE
knife node delete $INSTANCE
knife client delete $INSTANCE
While deleting, you will notice that there are some minor differences between HP and AWS EC2 in the way knife deletion works.
Now that you have the configuration working on two different cloud providers, let's configure vagrant in a similar fashion so we can use it as a development environment. Create a directory on your workstation for your vagrant VM that will be shared with the vagrant box. Your dev work will reside in subdirectories within this folder.
VMDIR=~/dev/vagrant-vm
mkdir -p $VMDIR
cd $VMDIR
Next, download and save this as $VMDIR/Vagrantfile
to help create and provision the vagrantbox.
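The downloadable Vagrantfile is not shown inline; as a rough sketch assuming Vagrant 1.0-era syntax (the box, host-only IP, orgname, and chef server URL below are placeholders), it would look something like:

Vagrant::Config.run do |config|
  config.vm.box = "precise64"
  config.vm.customize [ "--memory", 2048 ]                   # allocate 2GB of RAM to the VM
  config.vm.forward_port 80, 8080                            # apache will answer at http://localhost:8080
  config.vm.network :hostonly, "33.33.33.10"                 # host-only network, required for NFS shares
  config.vm.share_folder "dev", "/home/vagrant/dev", ".", :nfs => true   # omit :nfs => true on Windows
  config.vm.provision :chef_client do |chef|
    chef.chef_server_url = "https://chef.example.com:443"    # your chef server url:port
    chef.validation_key_path = "~/.chef/validation.pem"
    chef.validation_client_name = "orgname-validator"        # set orgname here
    chef.encrypted_data_bag_secret_key_path = "~/.chef/encrypted_data_bag_secret"
    chef.node_name = ENV['NODE'] || "vagrant-#{ENV['USER']}" # must be unique on the Chef server
    chef.environment = "dev"
    chef.add_role "base"
    chef.add_role "db_master"                                # db_master must come before webserver
    chef.add_role "webserver"
  end
end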
In this Vagrantfile, make sure to set orgname to the orgname you use in Hosted Chef. The node name must be unique among all nodes that use your Chef server. You can override it by exporting a $NODE
environment variable, or you can accept the default vagrant-$USER
. This Vagrantfile uses NFS for shared folders which is useful on a Mac or Linux host. Omit the , :nfs => true
argument on a Windows host. Don’t try to mount a shared directory on /home/vagrant
as it will cause important configuration to be overwritten, such as the .ssh directory (preventing key-based ssh authentication). You can change the amount of memory allocated to the VM with the config.vm.customize [ "--memory", 2048]
setting (currently configured to allocate 2GB). You must specify the db_master
role before the webserver
role.
Next, provision the vagrantbox:
cd $VMDIR
vagrant up
Or, to specify a custom NODE name such as my-cool-vm
:
NODE=my-cool-vm vagrant up
If you need to tweak the Chef scripts and then re-provision over the top of the existing configuration:
cd $VMDIR
vagrant provision # a bug https://github.com/mitchellh/vagrant/issues/1111 ?
vagrant ssh # this is a workaround
sudo chef-client # this is a workaround
To wipe it out and start over:
NODE=vagrant-$USER
cd $VMDIR
vagrant destroy
knife node delete $NODE
knife client delete $NODE
Check that the vagrant VM is set up correctly by opening http://localhost:8080 in your browser to see the It works!
page.
The AWS command line tools expect your access key id
and secret access key
in a specific format.
Create this credentials master file $HOME/.credentials-master.txt
in the following format (replacing the values with your own credentials):
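The standard AWS_CREDENTIAL_FILE format is two key=value lines; the values below are placeholders:

AWSAccessKeyId=YOUR_ACCESS_KEY_ID
AWSSecretKey=YOUR_SECRET_ACCESS_KEY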
Note: The above is the sample content of .credentials-master.txt
file you are creating, and not shell commands to run.
Protect the above file and set an environment variable to tell AWS tools where to find it:
export AWS_CREDENTIAL_FILE=$HOME/.credentials-master.txt
chmod 600 $AWS_CREDENTIAL_FILE
We can now use the command line tools to create and manage the cloud.
iPython is a beautiful interactive shell for python which you can easily install in a virtualenv. Just type
pip install tornado pyzmq ipython
and then run
ipython notebook --pylab inline
This would open http://127.0.0.1:8888/
in a browser window where you can run python interactively. According to the IPython notebook installation notes, MathJax is not installed by default; it can be installed with these steps.
from IPython.external.mathjax import install_mathjax
install_mathjax()
Install pianobar via homebrew:
brew install pianobar
Now you can run your flash-free Pandora player in your terminal.
➜ ~ pianobar
Welcome to pianobar (2012.09.07)! Press ? for a list of commands.
[?] Email: lvnilesh@yahoo.com
[?] Password:
(i) Login... Ok.
(i) Get stations... Ok.
0) Boston Radio
1) Guns N' Roses Radio
2) Kishore Kumar, Mohd. Rafi, Mukesh & Lata Mangeshkar Radio
3) Lata Mangeshkar Radio
4) Led Zeppelin Radio
5) q Michael Jackson Radio
6) Q QuickMix
7) Super Freak Radio
[?] Select station: 5
|> Station "Michael Jackson Radio" (116177894800507788)
(i) Receiving new playlist... Ok.
|> "Wanna Be Startin' Somethin'" by "Michael Jackson" on "Thriller"
|> "Signed, Sealed, Delivered I'm Yours [Alternate Mix]" by "Stevie Wonder" on "The Complete Motown Singles: Volume 10: 1970"
|> "Freak" by "Chic" on "The Definitive Groove Collection: Chic"
|> "Brick House" by "The Commodores" on "Colour Collection"
|> "Thriller" by "Michael Jackson" on "Thriller"
# -05:34/05:59
Also run last.fm via terminal. Open terminal and type:
brew install shell-fm
and create the file ~/.shell-fm/shell-fm.rc containing this.
username = your-username
password = your-password
default-radio = lastfm://user/your-username/your-station-name
# for example: lastfm://user/lvnilesh/personal
and run
➜ ~ shell-fm
Shell.FM v0.8, (C) 2006-2010 by Jonas Kramer
Published under the terms of the GNU General Public License (GPL).
Press ? for help.
Receiving lvnilesh’s Library Radio.
Now playing "Call Me Maybe" by Carly Rae Jepsen.
-00:01
Enjoy!
$$ i\hbar\frac{\partial \psi}{\partial t} = \frac{-\hbar^2}{2m} \left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \right) \psi + V \psi $$
Here is inline $\rm \LaTeX$: the equation of a circle is $x^2 + y^2 = 1$, and here is Euler's number. $$ e = \mathop {\lim }\limits_{n \to \infty } \left( {1 + \frac{1}{n}} \right)^n $$
ruby <(curl -fsSkL raw.github.com/mxcl/homebrew/go)
brew install wget
brew install pyqt # brew installed sip as sip is a dependency
brew install gfortran
brew install gtk
brew install ghostscript
brew install swig
Use a virtual environment for QSTK (so it won't mess up your existing setup). See my other post on setting up a virtualenv, and create a quant virtualenv:
mkvirtualenv quant
cd ~/domains/quant
The rest of the steps take place inside the newly created quant
virtualenv.
Install numpy
from source
pip install -e git+https://github.com/numpy/numpy.git#egg=numpy-dev
Install other dependencies via a requirements.txt file created by pip freeze > requirements.txt
from a working installation.
Cython==0.16
distribute==0.6.28
epydoc==3.0.1
ipython==0.13
lxml==2.3.5
patsy==0.1.0
python-dateutil==1.5
pytz==2012d
pyzmq==2.2.0.1
tornado==2.3
wsgiref==0.1.2
Jinja2==2.6
Pygments==1.5
Sphinx==1.1.3
docutils==0.9.1
readline==6.2.2
six==1.1.0
xlrd==0.8.0
-e git+https://github.com/pydata/pandas.git#egg=pandas-dev
-e git+https://github.com/sympy/sympy.git#egg=sympy-dev
-e git+https://github.com/matplotlib/matplotlib.git#egg=matplotlib-dev
-e git+https://github.com/scipy/scipy.git#egg=scipy-dev
wget http://blog.fungibleclouds.com/downloads/code/requirements.txt
pip install -r requirements.txt
Install statsmodels
from source
pip install -e git+https://github.com/statsmodels/statsmodels.git#egg=statsmodels-dev
Install CVXopt
from source
pip install cvxopt
should work, but there seems to be a bug with cvxopt, so build it from the upstream tarball instead:
cd ~/domains/quant/src
wget http://abel.ee.ucla.edu/src/cvxopt-1.1.5.tar.gz
tar zxvf cvxopt-1.1.5.tar.gz
cd cvxopt-1.1.5/src
python setup.py install
Install QSTK
cd ~/domains/quant/
mkdir QSTK
cd QSTK
svn checkout http://svn.quantsoftware.org/openquantsoftware/trunk .
Install QSDATA
- sample data from the stock market
wget http://www.quantsoftware.org/QSData.zip
unzip QSData.zip
Configure the qstk specific env
variables
cp config.sh local.sh
vi local.sh # edit the $QSDATA env var to point to $QS/QSData/
vi local.sh # edit this to match path of QSTK and QSDATA
$QS : This is the path to your installation (The location of the Bin, Example, Docs) folders.
$QSDATA : This is where all the stock data will be.
source local.sh
Test the env
variables
echo $QS # would show ~/domains/quant/QSTK
echo $QSDATA # would show ~/domains/quant/QSTK/QSData
ipython notebook --pylab inline # This will open your default browser http://localhost:8888
Click on new notebook to create a new tab with new empty notebook. In that new notebook, type this code segment to test your setup
import numpy as np
import pandas as pand
import matplotlib.pyplot as plt
from pylab import *
x = np.random.randn(1000)
plt.hist(x,100)
plt.savefig('test.png',format='png')
Press SHIFT-ENTER to see something like this below.
The class has not started yet, but here are the two recommended readings that I have already ordered.
Active Portfolio Management: A Quantitative Approach for Producing Superior Returns and Controlling Risk by Richard Grinold, Ronald Kahn
All About Hedge Funds: The Easy Way to Get Started by Robert Jaeger
I am looking forward to applying the learnings from this class to my personal portfolio.
sudo easy_install pip
sudo pip install virtualenv virtualenvwrapper
mkdir domains # create a directory to store different virtual environments
Create a temporary text file (say ~/appendthis
) with below text
export WORKON_HOME=$HOME/domains
source /usr/local/bin/virtualenvwrapper.sh
export PIP_VIRTUALENV_BASE=$WORKON_HOME
Append that temp file to ~/.zshenv
(or .profile
or .bashrc
depending on your shell)
cat ~/appendthis >> ~/.zshenv
Exit current shell and start terminal again to see something like this show up:
Linux quant 2.6.32-27-generic #49-Ubuntu SMP Thu Dec 2 00:51:09 UTC 2010 x86_64 GNU/Linux Ubuntu 10.04.1 LTS
Welcome to Ubuntu!
* Documentation: https://help.ubuntu.com/
Last login: Thu Dec 23 14:35:06 2010 from imac.workgroup
virtualenvwrapper.user_scripts Creating /home/nilesh/domains/initialize
virtualenvwrapper.user_scripts Creating /home/nilesh/domains/premkvirtualenv
virtualenvwrapper.user_scripts Creating /home/nilesh/domains/postmkvirtualenv
virtualenvwrapper.user_scripts Creating /home/nilesh/domains/prermvirtualenv
virtualenvwrapper.user_scripts Creating /home/nilesh/domains/postrmvirtualenv
virtualenvwrapper.user_scripts Creating /home/nilesh/domains/predeactivate
virtualenvwrapper.user_scripts Creating /home/nilesh/domains/postdeactivate
virtualenvwrapper.user_scripts Creating /home/nilesh/domains/preactivate
virtualenvwrapper.user_scripts Creating /home/nilesh/domains/postactivate
virtualenvwrapper.user_scripts Creating /home/nilesh/domains/get_env_details
Now you can create any number of python virtual environments. For example, I create myfirstenv
mkvirtualenv myfirstenv # create my first virtual environment named myfirstenv
pip install BLAH # install BLAH
deactivate # deactivate that virtualenv
rmvirtualenv myfirstenv # remove myfirstenv
To work with virtualenv again, simply type:
workon myfirstenv
cd ~/domains/myfirstenv
Wrappers: virtualenvwrapper provides several useful wrappers that can be used as shortcuts
mkvirtualenv (create a new virtualenv)
rmvirtualenv (remove an existing virtualenv)
workon (change the current virtualenv)
add2virtualenv (add external packages in a .pth file to current virtualenv)
cdsitepackages (cd into the site-packages directory of current virtualenv)
cdvirtualenv (cd into the root of the current virtualenv)
deactivate (deactivate virtualenv, which calls several hooks)
Hooks: One of the coolest things about virtualenvwrapper is the ability to provide hooks when an event occurs. Hook files can be placed in ENV/bin/
and are simply plain-text files with shell commands. virtualenvwrapper provides the following hooks:
postmkvirtualenv
prermvirtualenv
postrmvirtualenv
postactivate
predeactivate
postdeactivate
When you are done with that virtualenv, you can just type
rmvirtualenv myfirstenv # this will destroy that virtualenv named `myfirstenv` under ~/domains
Install gcc-4.2. Ruby versions before 1.9 (such as 1.8.7 or REE) do not play well with Apple's LLVM compiler, so you'll need to install the old gcc-4.2 compiler. It's available in the homebrew/dupes repository.
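Assuming the apple-gcc42 formula is still available in that tap, the commands are along these lines:

brew tap homebrew/dupes
brew install apple-gcc42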
Install xquartz. The OS X upgrade will also remove your old X11.app installation, so go grab xquartz from http://xquartz.macosforge.org/landing/ and install it (you’ll need v2.7.2 or later for Mountain Lion).
Install Ruby 1.9. This one is simple.
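With rbenv, for example (use whichever 1.9.3 patch level rbenv install --list offers, and rehash afterwards so the shims pick it up):

rbenv install 1.9.3-p327
rbenv rehash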
Install Ruby 1.8.7. Remember to add the path to the xquartz X11 includes in CPPFLAGS. Here I’m using rbenv, but the same environment variables should work for rvm.
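For example, assuming XQuartz put its headers under /opt/X11/include and picking a 1.8.7 patch level that rbenv offers:

CPPFLAGS="-I/opt/X11/include" rbenv install 1.8.7-p371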
Install ree. Remember to add the path to the xquartz X11 includes in CPPFLAGS and the path to gcc-42 in CC. Here I’m using rbenv, but the same environment variables should work for rvm.
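For example, assuming brew's apple-gcc42 installed the compiler at /usr/local/bin/gcc-4.2:

CC=/usr/local/bin/gcc-4.2 CPPFLAGS="-I/opt/X11/include" rbenv install ree-1.8.7-2012.02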
Enjoy your new Ruby versions
One of my disks, ada5p2, in the tank pool decided to become unavailable. Even though I store critical data on this pool, I have nothing really to worry about because this ZFS pool is configured as raidz2
- a disk pool that can tolerate two simultaneous disk failures.
# zpool status -v tank
pool: tank
state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
see: http://www.sun.com/msg/ZFS-8000-2Q
scrub: scrub in progress for 0h0m, 0.00% done, 73h11m to go
config:
NAME STATE READ WRITE CKSUM
tank DEGRADED 0 0 0
raidz2 DEGRADED 0 0 0
ada1p2 ONLINE 0 0 0
ada2p2 ONLINE 0 0 0
ada3p2 ONLINE 0 0 0
ada4p2 ONLINE 0 0 0
ada5p2 UNAVAIL 3 3.69K 0 cannot open
errors: No known data errors
Without shutting down my storage system, I just yanked the SATA cable from that broken hard disk and hot-replaced it with another of similar size. ZFS will resilver the replacement drive on its own over the next couple of hours, but I was essentially done, without any downtime and without any data errors. ZFS is nice indeed.
A cron job periodically scrubbing the zpools helps. ZFS has a built-in scrub function that checks for errors and corrects them when possible. Running this task regularly is pretty essential to prevent errors that aren't correctable. By default, ZFS doesn't run it periodically; you have to tell it when to scrub. The easiest way to set up periodic scrubbing is crontab, a feature present in all UNIX systems for scheduling background tasks. Start editing the root user's crontab by issuing the command crontab -e
as root
. A crontab entry is a simple set of fields:
* * * * * command to run
- - - - -
| | | | |
| | | | +----- day of week (0-6) (Sunday is 0)
| | | +------- month (1-12)
| | +--------- day of month (1-31)
| +----------- hour (0-23)
+------------- min (0-59)
For example, I want my system to scrub my tank
zpool on Sundays at 04:00 and my twoteebee
zpool on Thursdays at 04:00. The specific commands that I put in my crontab are:
0 4 * * 0 /sbin/zpool scrub tank
0 4 * * 4 /sbin/zpool scrub twoteebee
When I first switched over to blogging using Octopress, I loaded it up on heroku via git, but I was not super satisfied by the site's performance for a worldwide audience. It took a bit of exploring to find a good but cost-effective way to improve performance using a CDN, so here is a writeup explaining my setup that might help others.
If you have a blog but haven’t heard of Octopress, you should check it out. It’s great for anyone who likes writing in the text editor of their choice (I currently like IA Writer, and Writing Kit) instead of some web interface, wants to store the work in git, and is comfortable running a few Terminal commands.
I initially started out hosting my blog using a single Web Dyno, a free service offered by heroku, for my Octopress blog stored in git. The price was certainly right, but Heroku experienced a bit of downtime over the life of my blog there, and I feel strongly about uptime.
An alternative is using Amazon S3, Amazon’s cloud file storage service. Amazon lets you host a static website on S3 with your own domain name. You can also easily use Amazon CloudFront with S3. CloudFront is a CDN (content distribution network) that serves your content from a worldwide server network and helps to make your website faster.
If you’ve never used Amazon Web Services before, it can be a little confusing to get started. First, you need to sign up for an AWS account. When you have your account, log into the AWS Management Console and head to the S3 tab. Then:
Create a bucket called blog.myowndomain.com. You cannot use the bare myowndomain.com, so use a subdomain like www or blog.
Under the properties for this bucket, you’ll need to go to the Website tab, check the box to enable static web hosting, and set your index and error documents. Your index document should probably be index.html. Your error document could be 404.html (an HTML page for file not found (404) errors). Make a note of your endpoint (http://blog.fungibleclouds.com.s3-website-us-east-1.amazonaws.com/). You’ll need it to create custom origin CloudFront distribution.
Create a bucket policy under permissions. Here is my bucket policy.
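The exact policy is not shown inline; a typical public-read policy for a static website bucket looks like this (substitute your own bucket name in the Resource):

{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": { "AWS": "*" },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::blog.myowndomain.com/*"
    }
  ]
}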
In AWS Console, go to the CloudFront tab, and create a new Distribution for the S3 website end point as custom origin. This link on custom origin helps. This will mirror your S3 bucket on CloudFront, for example, (http://d2h7g34rdqpc09.cloudfront.net/index.html) shows the home page of my website exactly as it appears on S3.
CloudFront will cache the contents of your S3 bucket for up to 24 hours. This cache is created from S3 the first time someone hits an asset under your CloudFront URL. This means that CloudFront won’t necessarily reflect changes on S3 immediately. You can manually invalidate/expire objects in CloudFront, but it’s easier to just not use it for anything that will change frequently.
You’ll need to create a DNS CNAME alias record to use your own domain with CloudFront that mirrors your S3 bucket. The way you do this depends on your DNS provider (I use Zerigo, which is cheap, reliable, and easy to use). You need to create a CNAME pointing blog.myowndomain.com to your CloudFront endpoint.
After propagation, a DNS lookup on blog.myowndomain.com should show the CNAME resolving to your CloudFront endpoint.
This action is fairly simple. First you edit the posts you store in source/_posts. I currently prefer iA Writer, so I keep a little executable script I call ia to invoke it from the terminal.
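A minimal version of such a script (assuming iA Writer is installed under its default application name) could be:

#!/bin/sh
# open the given post(s) in iA Writer from the terminal
open -a "iA Writer" "$@"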
Then you generate static HTML for your site.
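With Octopress that is:

rake generate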
and finally you push your incremental updates over to S3 using s3cmd in rsync-like fashion:
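For example (assuming the generated site is in public/ and the bucket is blog.myowndomain.com):

s3cmd sync --acl-public --delete-removed public/* s3://blog.myowndomain.com/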
Set up db to run as your user account
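With a homebrew-installed MySQL of that era, the commands were along these lines:

unset TMPDIR
mysql_install_db --verbose --user=`whoami` --basedir="$(brew --prefix mysql)" --datadir=/usr/local/var/mysql --tmpdir=/tmp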
Start the server
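With the homebrew install, that is:

mysql.server start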
Secure the installation
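MySQL ships a helper script for this:

mysql_secure_installation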
Make sure to let mysql launch on startup
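Homebrew installs a launchd plist you can load (the exact path can vary by version):

cp $(brew --prefix mysql)/homebrew.mxcl.mysql.plist ~/Library/LaunchAgents/
launchctl load -w ~/Library/LaunchAgents/homebrew.mxcl.mysql.plist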
Make sure to check the plist to use the correct user
This nifty controller lets you manage Pandora from a small drop-down window while reducing the performance hit that often accompanies Flash-based apps.
Vendor lock-in is the situation in which you are dependent on a single vendor for a product (i.e., a good or a service) and cannot move to another vendor without substantial costs and/or inconvenience. Lock-in is typically a result of standards controlled by the vendor, thereby granting the vendor some degree of monopoly power that usually leads to better profits for that vendor.
Here is a recent example illustrating the lock-in problem:
A few weeks ago, Google announced a significant price increase for use of its Google App Engine Platform-as-a-Service (PaaS). Google App Engine users knew and expected that Google would increase the price at some point, but what shocked most developers was the jump in price, which increased the cost of using the Google App Engine runtime environment by 100% or more in specific cases. It is a non-trivial exercise to port to another location once an app is deployed on the Google App Engine infrastructure. This led to a big backlash on the App Engine google groups. Google responded with a few adjustments to its pricing, but this incident resurfaced some doubts about the cloud. Hart Singh of flipbook LLC, creators of the flipbook app on Facebook, raised a concern: "My team spent so much time learning app engine but I continue to wonder if we are betting our company on Google…any app we build can only be run on the Google App Engine." Google App Engine requires custom code to run apps in that environment. Customizing takes effort and time and impacts the bottom line.
According to Gartner, cloud computing customers are more concerned about vendor lock-in than about cloud security. So what exactly does lock-in mean in the context of cloud computing? To answer that, let us look at the various types of lock-in:
Horizontal lock-in limits the ability to replace a product with a comparable competing product. If you chose a CRM solution from Oracle earlier, then you will need to migrate your data and code, retrain your users, and rebuild the integrations to your other solutions if you want to move to Microsoft Dynamics CRM. Wouldn't it be nice if you could reuse your garage, cabling, etc., when you switch from a Toyota Prius to a Nissan Leaf? The higher you go up the levels of the cloud computing stack, the stronger the horizontal lock-in.
Moving from one SaaS solution to another in the cloud is no different from moving from one software package to another, provided there is a clear migration path. But PaaS can be a very deep lock-in, especially if code needs to be written to comply with PaaS requirements. IaaS lock-in is much less severe; however, the underlying hypervisors (the containers of virtual machines) differ and can lead to some complexity during migration from one IaaS vendor to another.
Vertical lock-in limits choice in other levels of the cloud services stack. For example, selecting solution A mandates the use of database B, operating system C, hardware vendor D, and/or implementation partner E. Open standards help prevent vertical lock-in by ensuring that hardware, middleware, and operating systems can be chosen independently.
Vertical lock-in is built into SaaS and PaaS offerings, as the underlying infrastructure comes with the service. On the upside, you won't need to worry about managing the underlying layers of the cloud stack. IaaS offers comparatively less vertical lock-in. Application logic and data need proximity to achieve decent performance, so you should almost always procure storage services from the same IaaS provider used for application logic processing.
Inclined lock-in is a tendency to buy as many solutions as possible from one provider, even if such solutions in some of these areas are less desirable. You tend to sometimes select a single vendor not only to make management, training and integration easier with a single throat to choke but also to be able to demand higher discounts. This leads to large and powerful vendors causing a high degree of inclined lock-in.
Generational lock-in becomes an issue when an entirely new generation of technology reaches the market. No technology generation and no platform lives forever. The first three types of lock-in are not too bad if you picked the right solution vendors (generally the ones that turn out to become the market leaders). But even such market leaders at some point reach end of life. You want to be able to replace them with the new generation of technology without it being prohibitively expensive or even impossible.
Vendor lock-in makes you vulnerable. Think defensively before committing
With vendor lock-in comes vulnerability to price increases. So think defensively. Here are our quick defence tactics against cloud vendor lock-in.
1. Avoid vendor lock-in: Ensure your app is able to move easily to another cloud provider as and when needed. In essence, keep your plan B in implementable shape, and prepare plan B before making serious customizations for a specific cloud platform.
2. Analyze the TCO of your language and tools selection: When building your cloud app, think hard about the code selection before you start filling up your git repository. Popular coding languages may not be the most economical for your specific situation. Think of the availability of professionals skilled in the coding language of your choice, both within and outside your organization.
3. Carefully select your code base: Runtimes, scripting environments, and code frameworks are not all similar. Discuss with your dev team members which choice would be most optimal for you.
4. Understand redundancy and cloud architecture: Identify single points of failure (SPOF) in the architecture. Judge the redundancy elements for yourself and consult with the experts.
5. Tread PaaS land carefully: Explore installable PaaS that you can run yourself if need be. Spread the risk among several different PaaS providers that do not depend on a common IaaS provider.
These tactics are the ones we find most used by our cloud clients in attempting to reduce the impact of vendor lock-in to a good degree.
Got other ideas on how you would avoid cloud vendor lock-in? Share via comments.
I use Wake-on-LAN to turn on my sleeping iMac when I am away from it but want to log on using ssh (I maintain one Ubuntu machine on the local network that is always running).
The magic packet format is very simple: it must include 6 times hexadecimal FF, followed by 16 times the target machine’s MAC address.
Here is a Python script that will wake up your target machine remotely.
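The original four-line script is not reproduced here; a minimal Python 3 sketch that builds the magic packet for that MAC and sends it to that address (port 9 is an assumption; any UDP port the target's network interface sees will do) is:

import socket

mac = "01-23-45-67-89-0a".replace("-", "")             # target machine's MAC address
packet = bytes.fromhex("ff" * 6 + mac * 16)            # magic packet: 6 x 0xFF followed by 16 x MAC
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)   # allow broadcast; harmless for unicast sends
sock.sendto(packet, ("192.168.2.109", 9))              # send to the target's IP on port 9

Save it as wakeup.py to match the invocation below.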
This script assumes that the target machine MAC address is 01-23-45-67-89-0a and that your local DHCP server issues an IP address of 192.168.2.109 to your target machine.
You can run this script from another machine on your local network, like so:
$ python wakeup.py
Got better ideas on waking up sleeping machines remotely when needed? Share below via comments.
Similarly, use this app-toggler AppleScript to toggle between Chrome and TextMate.
Now just assign a hotkey to the script file and the hotkey becomes a toggle button.
Got more tips to increase your productivity? Share via comments below.
To show hidden files in the Finder, type:
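The standard pair of commands on OS X is:

defaults write com.apple.finder AppleShowAllFiles TRUE
killall Finder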
To hide hidden files again, type:
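Set the flag back to FALSE and restart the Finder:

defaults write com.apple.finder AppleShowAllFiles FALSE
killall Finder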
Want to get ls colors on terminal on a mac like you may have seen in Ubuntu and some other linux distributions? Just append these two lines to your .bash_profile
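The two lines enable BSD ls colors (the LSCOLORS value is just one common choice):

export CLICOLOR=1
export LSCOLORS=GxFxCxDxBxegedabagaced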
Make sure to restart your SHELL after which the terminal will show ls colors like below.
Got your productivity enhancing tips? Share via comments below.
Enjoy the colors while they last… Soon it will be grey all over :)
Use the title to grab attention: Make them see what you see. You may think that everyone sees things the way you do. But they don't. Readers won't pay attention until they perceive what you perceive. So make your position crystal clear. Use storytelling, personal experiences, or anything that will put the reader in the right position to understand your message.
Use emotion. Emotion brings clarity to your messages while making them personal. Emotion also comes with the triple bonus of adding clarity, giving readers a reason to talk about you, and triggering action you may want — emotion is much better at that than logic is. Emotional messages get attention. Tell a meaningful and personal story. When you make your writing personal, you make it important. Personally interesting or perceptually meaningful information grabs attention and brings clarity.
Offer something - an idea, a new way, a point of view: Offer something to your readers - an idea, a new way of thinking, a new point of view, a new experiment to try… something they can take away from your blog. Keep users engaged. Behavioral economics experts have established that people are generally fond of the four-letter F-word: a preference for FREE seems to be a feature hardwired into human brains. See [Dan Ariely's experiment](http://danariely.com/2009/08/10/the-nuances-of-the-free-experiment/): "Free kisses beat bargain truffles." Give them something free so they keep coming back for more… eventually becoming repeat subscribers.
Write content to align with reader scan preferences: People tend to scan web pages in a pattern different from how they read print. Eye-tracking research indicates the dominant patterns people tend to deploy while reading computer screens. In general, people tend to read blog posts in an F pattern, beginning at the top and going through the first few rows, then scanning down, scanning across a bit again, and then scanning down to skim for anything interesting. The intensity of attention gets weaker (or the ink gets fainter) as readers scan down the post. Keeping this human behavior in mind will help you write better blog posts.
Write in bite-sized chunks using a structured framework whenever feasible: Write small chunks that fit above the fold, or above the scroll. Avoid complex/theoretical writing or marketing hyperbole. Use colloquialisms. Try limiting a blog post to 450 - 675 words with 2 to 3 sections per post. Limit each section to about 2 or 3 paragraphs of no more than 75 words each.
Stick to a manageable schedule for posting: Sticking to a schedule that your readers can predict and that you can manage is very useful. It provides predictability on when readers should expect to see new posts.
What worked for you/didn’t work well in your blog? Chime in below with your comments.
via my friend @sterlizzi CEO of http://www.wearephotographers.com