Chef routinizes the repeatable steps in cloud operations management, and it does so in a way that is almost agnostic to the underlying cloud provider. Chef thus helps make applications almost agnostic to the underlying machines.
Using Chef to manage cloud applications makes cloud computers from one provider, say AWS EC2, fairly easy to substitute with cloud computers from another provider, say HP Cloud.
As my friend Kevin Jackson describes, cloud computing has several economic benefits; it would become even more economical if cloud computers were fungible, meaning a cloud machine from provider X would be practically no different from a cloud machine from provider Y. Fungibility would make cloud computers easily substitutable, driving prices further down by increasing competition and reducing the differentiation between providers of cloud computers. Fungibility, by the way, is not a property of eukaryotic organisms.
Routinizing the repeatable is key to successful operations management and is described in detail in a paper on Integrated Operations by Prof. William Lovejoy of The University of Michigan Business School, Ann Arbor, and I quote:
“If some task is to be repeated many times, it makes sense to find out the best way to perform the task and then require its execution according to that best practice. This means that in stable task environments, stable work routines and policies will be generated over time, and this is efficient. This derives from March and Simon’s (1958) model of organizational learning. The consequences for this are that one will want to consider the relationship between efficiency and discretion allowed workers in a stable environment.”
The ability to routinize the repeatable and provide consistent environments from development, through testing and staging, to production is a key benefit to successful business operations in the cloud.
A Chef run (sudo chef-client) is idempotent: repeat runs produce the exact same resulting machine configuration as the initial run did. Idempotence is the property of certain operations in mathematics and computer science that they can be applied multiple times without changing the result beyond the initial application. The term was introduced by Benjamin Peirce in the context of elements of an algebra that remain invariant when raised to a positive integer power, and literally means the quality of having the same power, from idem + potence (same + power).
Idempotent operations enable consistently reproducible cloud environments for development and production use. They help bring order and reduce chaos in business operations.
OK, I admit this is going to be an incorrect analogy from a biological science perspective, but it does seem to work for some people as a crude example to explain the logic. What you get to accomplish is to give life to, say, a giraffe (Giraffa camelopardalis), a cow (Bos primigenius), a leopard (Panthera pardus), or a person (Homo sapiens), based on the DNA you inject into an embryo. Along similar lines, you create a web server running nginx with a specific configuration, or a proxy running HAProxy, or a database master server running PostgreSQL, or whatever you need, by asking Chef to run an appropriate set of cookbooks on top of a cloud machine running just enough OS, or jeOS (pronounced as juice or jüs).
Chef helps spin up machines just the way you want, with a specific set of software and a specific configuration, building up from bare metal machines loaded with just enough OS.
Let's see in practice how these core concepts pan out in reality. This is best illustrated in the form of a hands-on exercise of creating an infrastructure in the cloud, where we will have the production environment running first on a single server instance, which is useful for rapid prototyping of apps while sharing a single machine among multiple applications to minimize cost. Once you're comfortable with this basic all-in-one configuration, it's relatively simple to scale it out, separating the various roles onto multiple machine instances. I must caution you that a fairly elaborate setup is needed on your linux/unix workstation, but fortunately it's all pretty straightforward, and there's a lot of good documentation available on the internet.
The toolchain to install on your workstation: homebrew, git, chef, librarian, vagrant, and the knife plugins for ec2 and hpcloud.
ruby -e "$(curl -fsSkL raw.github.com/mxcl/homebrew/go)"
brew install git
gem install chef
gem install librarian
gem install vagrant
vagrant box add precise64 http://files.vagrantup.com/precise64.box # latest Ubuntu LTS 12.04 for vagrant
gem install knife-ec2
gem install knife-hp
Follow the Chef Fast Start Guide to set up your workstation. You only need to go through about half that guide, up to step 4. Stop there, as you don't need to configure the workstation as a client. In our case, the chef clients will run on vagrant, ec2, hpcloud, and other cloud providers you select, instead of on your workstation.
If you follow that guide, it will create a ~/chef-repo/.chef/
folder with three files - knife.rb
, validation.pem
and username.pem
. You need to move that .chef
folder and those three files to ~/.chef/
which lets you execute knife
commands from anywhere on the workstation.
My knife.rb
looks something like this. Please note that I am using my own chef server - not opscode hosted chef - so make sure to customize your knife.rb
file to properly connect with your Chef server.
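As a rough sketch (every value below - node name, key paths, organization name, and server url:port - is a placeholder to replace with your own), a knife.rb for a private Chef server looks something like:

current_dir = File.dirname(__FILE__)
log_level                :info
log_location             STDOUT
node_name                "nilesh"                         # your username on the chef server
client_key               "#{current_dir}/nilesh.pem"      # your user's pem file
validation_client_name   "myorg-validator"                # based on your organization name
validation_key           "#{current_dir}/validation.pem"  # the validation pem file
chef_server_url          "https://chef.example.com:443"   # url:port for your chef server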
The values to customize are your username on the chef server, your pem file, your organization name, the validation pem file, and the url:port for your chef server.
Next, create a working folder for this exercise and initialize a git repository:
mkdir -p ~/fungibility
cd ~/fungibility
git init
We will use librarian
to gather the required cookbooks from the Chef community. Download this Cheffile
to your ~/fungibility
folder. This ~/fungibility/Cheffile
will tell librarian
what to bring down to the workstation.
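The linked Cheffile is not shown inline; as a sketch (the community site URL and the cookbook list are assumptions chosen to match the roles defined later in this post), it looks something like:

site 'http://community.opscode.com/api/v1'

cookbook 'apt'
cookbook 'build-essential'
cookbook 'git'
cookbook 'sudo'
cookbook 'users'
cookbook 'vim'
cookbook 'php'
cookbook 'apache2'
cookbook 'mysql'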
You can customize this Cheffile depending on your infrastructure needs. Librarian will download the specified cookbooks into $PWD/cookbooks
and will also create a $PWD/tmp
folder. You can put both these subfolders in your .gitignore
file so as to not clutter your git repo.
cd ~/fungibility
librarian-chef install
echo cookbooks >> .gitignore
echo tmp >> .gitignore
git add .
git commit -m "initial commit"
Create a site-cookbooks
sub-folder to store any custom cookbooks or customization to community cookbooks.
cd ~/fungibility
mkdir -p ~/fungibility/site-cookbooks
touch site-cookbooks/readme.md
echo "store custom cookbooks in here" >> site-cookbooks/readme.md
git add .
git commit -m "added a place for custom cookbooks"
Upload all the cookbooks to the Chef server:
knife cookbook upload -a -o ./cookbooks
knife cookbook upload -a -o ./site-cookbooks
And here is a BE CAREFUL command that nukes all cookbooks, just in case you need to start fresh:
knife cookbook bulk delete -y '.*'
Your development, test, and production environments may differ; for example, the development environment might include debugging tools that are not installed in production. Chef lets you define different environments and assign a node to a particular environment. Let's create a dev
, a stage
and a prod
environment which we can customize and fine tune later.
Create environments/dev.rb
, with the following contents:
name "dev"
description "The development environment"
Create environments/stage.rb
, with the following contents:
name "stage"
description "The staging environment"
Create environments/prod.rb
, with the following contents:
name "prod"
description "The production environment"
Commit the environment files to version control:
git add environments
git commit -m 'Add development, staging, and production environments.'
Upload the environments to the Chef server:
cd ~/fungibility
knife environment from file environments/dev.rb
knife environment from file environments/stage.rb
knife environment from file environments/prod.rb
Chef roles are a way to define certain patterns and processes that exist across nodes in a Chef organization as belonging to a single job function. It helps you define a group of recipes and attributes that should be applied to all nodes that perform a particular function.
Let us start by creating a base
role that would apply to all nodes, a webserver
role, and a db_master
role for the master database.
Create roles/base.rb
containing the following:
name "base"
description "Base role applied to all nodes."
run_list(
"recipe[apt]",
"recipe[git]",
"recipe[build-essential]",
"recipe[sudo]",
"recipe[users::sysadmins]",
"recipe[vim]"
)
override_attributes(
:authorization => {
:sudo => {
:users => ["ubuntu", "vagrant"],
:passwordless => true
}
}
)
The run_list
method defines a list of recipes to be applied to nodes in the base
role. The override_attributes
method overrides the default attributes used by community recipes. For example, this overrides attributes used by the sudo
cookbook so the vagrant
and ubuntu
users can run sudo without manually entering a password.
Create roles/webserver.rb
containing the following:
name "webserver"
description "Web server role"
all_env = [
"role[base]",
"recipe[php]",
"recipe[php::module_mysql]",
"recipe[apache2]",
"recipe[apache2::mod_php5]",
"recipe[apache2::mod_rewrite]",
]
run_list(all_env)
env_run_lists(
"_default" => all_env,
#"dev" => all_env + ["recipe[php:module_xdebug]"],
"dev" => all_env,
"prod" => all_env,
)
The env_run_lists method in this webserver role defines different run lists for different environments. The all_env array defines a common run list for all environments; for the dev environment it could be appended with additional run list items, such as recipe[php::module_xdebug] (shown commented out above).
Create roles/db_master.rb
containing the following:
name "db_master"
description "Master database server"
all_env = [
"role[base]",
"recipe[mysql::server]"
]
run_list(all_env)
env_run_lists(
"_default" => all_env,
"prod" => all_env,
"dev" => all_env,
)
The all_env
array again defines a common run list for all environments.
Commit these three roles, base
, webserver
, and db_master
created under the roles
subfolder.
git add roles
git commit -m "add roles for base, webserver and db_master"
Update the roles in Chef Server
knife role list # list all roles
knife role delete `rolename` # to delete any stale role you may not need
knife role from file roles/base.rb
knife role from file roles/webserver.rb
knife role from file roles/db_master.rb
You need a user account created with sysadmin privileges on every node. You accomplish that by defining a data bag for the users cookbook, with attributes describing your credentials.
It is best to use your existing user credentials from your workstation. Look for your public key under ~/.ssh
for a file named id_dsa.pub
or id_rsa.pub
or similar. That is your public key for your user account $USER
on your workstation. If you don't find one, create a public/private key pair by executing these commands:
echo "Checking for SSH key, generating one if it doesn't exist ..."
[[ -f ~/.ssh/id_dsa.pub ]] || ssh-keygen -t dsa -C your@email.address
echo "Copying public key to your clipboard so you can paste it whereever you like ..."
[[ -f ~/.ssh/id_dsa.pub ]] && cat ~/.ssh/id_dsa.pub | pbcopy
Then create a new data bag
file named after the user you want to create:
mkdir -p data_bags/users
vi data_bags/users/$USER.json
Edit below and paste your public key
as one long string into the ssh_keys
segment below and add this to the $USER.json file. Remember to replace nilesh
with your username
on your workstation.
{
"id": "nilesh",
"ssh_keys": "ssh-dss AAAAB3NzaC1kc3MAAACBANcunES89sbKlIhrtkpnECp7Z4a+BlJHZTHYjBAo/Itw2R4WmuXhbQiEcYdiYR0tZjKmIXzzG5M5wWIzpmvuOaBxThVMKk8Irgu0bzi9eNY/MD+EDTNRhzry8q/IJeh8jDRfSB2exdcMcFAjmiVdKJd5bbql5NkU9uZaxGhV2W8XAAAAFQCVxO/iejN6s/ToaJWfV8IEFaJiqwAAAIAHl3vQcjQ40G+ZLoj8S73fU7/XhX8ushb3fP4ERCFUm54mvkkezUXJGupUgEihZuPNHWZdvjouzD7H1HMf6xLaR/umjzBX3sNhKFwA0I1gFBsxnHEu3QW0JV9ObJdmfz70lm9/y8Cj96T+ErkgRKd7dW7XWeF125cR9yPWmPWsZwAAAIEAvXo9aoAtX9ZS/Z9WmNcdP2IH4/blOnLr8wMDk+r4hUd7nExWFF7ckDwOl5Wlm1iagvUHzkjRHQjyPX9uEs3WAxm7kk6ofnBiFYzfNAGemDgN1D5FkpTeg/cbkYohpr9Zyl9m1N5hV0jBW5faoh/O0KmFInLVi7yIrPHQNjGv/9o= your@email.address",
"groups": [ "sysadmin", "dba", "devops" ],
"uid": 2001,
"shell": "\/bin\/bash"
}
Commit the data bag file to version control:
git add data_bags
git commit -m 'Add sysadmin user data bag item.'
Upload the data bag to the Chef server:
knife data bag list # to list out existing data bag items
knife data bag delete `itemname` # to delete any stale data bag item you may want to get rid of
knife data bag create users
knife data bag from file users data_bags/users/$USER.json
Use an encrypted data bag to store secrets like passwords and encryption keys.
Create an encryption key:
openssl rand -base64 512 | tr -d '\r\n' > ~/.chef/encrypted_data_bag_secret
chmod 400 ~/.chef/encrypted_data_bag_secret
Add this line to ~/.chef/knife.rb
which would copy the encryption key to Chef clients so they can use it for decryption.
encrypted_data_bag_secret "#{current_dir}/encrypted_data_bag_secret"
Set the $EDITOR
environment variable to vi
by adding export EDITOR=vi
to ~/.bash_profile
or ~/.zshenv, depending on your shell. The knife data bag
command will launch this editor and allow you to edit the encrypted data bag contents. Create a new encrypted data bag item for storing MySQL passwords:
knife data bag create --secret-file ~/.chef/encrypted_data_bag_secret secrets mysql
Enter this into vi when it opens, and make sure to select better passwords than these:
{
"id": "mysql",
"dev": {
"root": "dev-my-root-password",
"repl": "dev-my-replication-password",
"debian": "dev-my-debian-password"
},
"prod": {
"root": "secret-root-password",
"repl": "secret-replication-user-password",
"debian": "secret-debian-password"
}
}
This encrypts and sets the passwords for the dev and the prod environments and uploads them to the Chef server. To save the encrypted data bag locally, download it into a file and commit it to version control in JSON format.
mkdir -p data_bags/secrets
knife data bag show secrets mysql -Fj > data_bags/secrets/mysql.json
cat data_bags/secrets/mysql.json # you will see the encrypted version of the mysql data bag item
{
"id": "mysql",
"dev": "vLb86kC71FK6Z860ru/5Nkz3oKTOu/+4fPY2ics3h82mfiZEZTS3KR3QF8LV\nORwCikcK32ahjpwvgYVo3IexpDRh3tyPKWs3tlup7m7dsiDs9TrKbYsL3Ze+\n/9N6cQweV2+MbJmJ7+qqRjmyxEECbg==\n",
"prod": "/2tmCI0ewAmhSz9jr/izKLeqyEPUmq56p+9Ls7sf3Du6++hsryRnRse9ZDst\np+Z0OYKla0zzrknROqUrWCks+rmGuAMjHmUqFP14vSYN9F6znsf0I9EnEsLV\ncnflOuspU130zki7foaJmBo/OtyM5Q==\n"
}
Save that in version control.
git add data_bags
git commit -m "added encrypted data bag for mysql secrets"
Decrypt the data bag if you need to inspect it; just include the --secret-file argument:
knife data bag show secrets mysql --secret-file ~/.chef/encrypted_data_bag_secret
Modify the encrypted data bag if you need to, using the knife data bag edit command:
knife data bag edit --secret-file ~/.chef/encrypted_data_bag_secret secrets mysql
but make sure to commit any secret changes back into git. The next chef run will apply the changes to the nodes.
The best practice is to keep customizations to community cookbooks in a separate site-cookbooks
folder. However, I haven't yet figured out the best way to override the mysql community cookbook with this segment of code, so I resort to a hack and add the following to the top of cookbooks/mysql/recipes/server.rb:
# Customization: get passwords from encrypted data bag
secrets = Chef::EncryptedDataBagItem.load("secrets", "mysql")
if secrets && mysql_passwords = secrets[node.chef_environment]
node['mysql']['server_root_password'] = mysql_passwords['root']
node['mysql']['server_debian_password'] = mysql_passwords['debian']
node['mysql']['server_repl_password'] = mysql_passwords['repl']
end
Git will show this as a local modification to a community cookbook, so it is best to remember that this is a hack and needs a cleaner approach to make the mysql cookbook use the encrypted data bag item for secrets.
git add cookbooks/mysql
git commit -m 'Read MySQL passwords from encrypted data bag.'
Upload the updated mysql cookbook to the Chef server:
knife cookbook upload mysql -o ./cookbooks
This command assumes that you have a key named awsdefault and the corresponding awsdefault-east.pem pemfile saved in the ~/Downloads folder, that the pemfile is marked read-only for you (chmod 400 ~/Downloads/awsdefault-east.pem), that you have a security group named webserver defined on EC2, that you have roles defined and uploaded to the Chef server, and that the environments are also uploaded. The specified AMI ami-9c78c0f5 is the official 64-bit Ubuntu 12.04 EBS image in the us-east-1 region. If you want to use a different EC2 region, select a similar AMI in your desired region from the ubuntu AMI list. Also, you must specify the db_master role before the webserver role.
knife ec2 server create \
-S awsdefault -i ~/Downloads/awsdefault-east.pem \
-G webserver,default \
-x ubuntu \
-d ubuntu12.04-gems \
-E prod \
-I ami-9c78c0f5 \
-f m1.small \
-r "role[base],role[db_master],role[webserver]"
After the provisioning is completed, knife will list some details on your new EC2 instance like this.
...
...
ec2-50-17-75-229.compute-1.amazonaws.com [2012-12-12T01:19:21+00:00] INFO: Chef Run complete in 259.308362 seconds
ec2-50-17-75-229.compute-1.amazonaws.com [2012-12-12T01:19:21+00:00] INFO: Running report handlers
ec2-50-17-75-229.compute-1.amazonaws.com [2012-12-12T01:19:21+00:00] INFO: Report handlers complete
Instance ID: i-604df61e
Flavor: m1.small
Image: ami-9c78c0f5
Region: us-east-1
Availability Zone: us-east-1a
Security Groups: webserver, default
Security Group Ids: default
Tags: {"Name"=>"i-604df61e"}
SSH Key: awsdefault
Root Device Type: ebs
Root Volume ID: vol-66fa4619
Root Device Name: /dev/sda1
Root Device Delete on Terminate: true
Public DNS Name: ec2-50-17-75-229.compute-1.amazonaws.com
Public IP Address: 50.17.75.229
Private DNS Name: ip-10-101-51-86.ec2.internal
Private IP Address: 10.101.51.86
Environment: prod
Run List: role[base], role[db_master], role[webserver]
➜ fungibility git:(master)
At the end of this run, you should see the It works!
page Apache generates when you visit the public URL of your Amazon EC2 instance in a web browser. If you run into any errors during provisioning, you can edit the Chef configuration, upload it to the Chef server, and then re-run the Chef client directly on the EC2 instance:
➜ fungibility git:(master) ssh -i ~/Downloads/awsdefault-east.pem ubuntu@ec2-50-17-75-229.compute-1.amazonaws.com
ec2$ sudo chef-client
Another way, without sshing in directly, is to use knife to do a remote chef run:
knife ssh role:base 'sudo chef-client'
Idempotence comes into play here and makes this the fastest way to apply config amendments, because Chef won't re-install things that are already installed.
This command assumes that you have a key named hpdefault and the corresponding hpdefault.pem pemfile saved in the ~/Downloads folder, that the pemfile is marked read-only for you (chmod 400 ~/Downloads/hpdefault.pem), that you have a security group named webserver defined on HP Cloud, that you have roles defined and uploaded to the Chef server, and that the environments are also uploaded. The specified image 120 is the Ubuntu 12.04 image in HP Cloud, and 102 is the flavor (standard.medium) of machine used. knife hp flavor list
will list all flavors of machines available in the HP cloud. Also, you must specify the db_master role before the webserver role.
knife hp server create \
-S hpdefault -i ~/Downloads/hpdefault.pem \
-G webserver,default \
-x ubuntu \
-d ubuntu12.04-gems \
-E prod \
-I 120 \
-f 102 \
-r "role[base],role[db_master],role[webserver]"
After provisioning, knife will print out some details on your HP instance.
...
15.185.226.228 [2012-12-12T01:39:55+00:00] INFO: Chef Run complete in 132.742629 seconds
15.185.226.228 [2012-12-12T01:39:55+00:00] INFO: Running report handlers
15.185.226.228 [2012-12-12T01:39:55+00:00] INFO: Report handlers complete
Instance ID: 403017
Instance Name: hp15-185-227-146
Flavor: 102
Image: 120
SSH Key Pair: hpdefault
Public IP Address: 15.185.226.228
Private IP Address: 10.2.2.51
Environment: prod
Run List: role[base], role[db_master], role[webserver]
You should see the It works!
page when you visit the public URL of your HP instance in a web browser. You just witnessed what I would refer to as the rudimentary beginnings of fungibility of cloud machines. You now have two instances running an identical configuration on machines from two different providers.
Let’s query the cloud instances using knife.
➜ fungibility git:(master) knife hp server list
Instance ID Name Public IP Private IP Flavor Image Key Pair State
403017 hp15-185-227-146 15.185.226.228 10.2.2.51 102 120 hpdefault active
➜ fungibility git:(master) knife ec2 server list
Instance ID Name Public IP Private IP Flavor Image SSH Key Security Groups State
i-604df61e i-604df61e 50.17.75.229 10.101.51.86 m1.small ami-9c78c0f5 awsdefault default, webserver running
and globally check uptime
, restart apache
and also run a sudo chef-client
on all machines with base
role.
knife ssh role:base 'uptime'
knife ssh role:base 'sudo service apache2 restart'
knife ssh role:base 'sudo chef-client'
# Restart Apache on all webservers
knife ssh role:webserver 'sudo service apache2 restart'
# Check the free disk space on all nodes
knife ssh 'name:*' 'df -h'
Unless you are planning to use the instances for something else, it is a good idea to destroy them so you won't get charged. Enumerate your instances using knife:
➜ fungibility git:(master) knife hp server list
Instance ID Name Public IP Private IP Flavor Image Key Pair State
403017 hp15-185-227-146 15.185.226.228 10.2.2.51 102 120 hpdefault active
➜ fungibility git:(master) knife ec2 server list
Instance ID Name Public IP Private IP Flavor Image SSH Key Security Groups State
i-604df61e i-604df61e 50.17.75.229 10.101.51.86 m1.small ami-9c78c0f5 awsdefault default, webserver running
Delete the server instances, node, and client using knife:
#cleaning up HP
knife hp server delete 403017
INSTANCE=hp15-185-227-146
knife node delete $INSTANCE
knife client delete $INSTANCE
# cleaning up AWS EC2
INSTANCE=i-604df61e
knife ec2 server delete $INSTANCE
knife node delete $INSTANCE
knife client delete $INSTANCE
While deleting, you will notice that there are some minor differences between HP and AWS EC2 in the way knife deletion works.
Now that you have the configuration working on two different cloud providers, let's configure vagrant in a similar fashion so we can use it as a development environment. Create a directory on your workstation for your vagrant VM that will be shared with the vagrant box. Your dev work will reside in subdirectories within this folder.
VMDIR=~/dev/vagrant-vm
mkdir -p $VMDIR
cd $VMDIR
Next, download and save this as $VMDIR/Vagrantfile
to help create and provision the vagrantbox.
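The downloadable Vagrantfile is not shown inline; as a rough sketch assuming Vagrant 1.0-era syntax (the box, host-only IP, orgname, and chef server URL below are placeholders), it would look something like:

Vagrant::Config.run do |config|
  config.vm.box = "precise64"
  config.vm.customize [ "--memory", 2048 ]                   # allocate 2GB of RAM to the VM
  config.vm.forward_port 80, 8080                            # apache will answer at http://localhost:8080
  config.vm.network :hostonly, "33.33.33.10"                 # host-only network, required for NFS shares
  config.vm.share_folder "dev", "/home/vagrant/dev", ".", :nfs => true   # omit :nfs => true on Windows
  config.vm.provision :chef_client do |chef|
    chef.chef_server_url = "https://chef.example.com:443"    # your chef server url:port
    chef.validation_key_path = "~/.chef/validation.pem"
    chef.validation_client_name = "orgname-validator"        # set orgname here
    chef.encrypted_data_bag_secret_key_path = "~/.chef/encrypted_data_bag_secret"
    chef.node_name = ENV['NODE'] || "vagrant-#{ENV['USER']}" # must be unique on the Chef server
    chef.environment = "dev"
    chef.add_role "base"
    chef.add_role "db_master"                                # db_master must come before webserver
    chef.add_role "webserver"
  end
end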
In this Vagrantfile, make sure to set orgname to the orgname you use in Hosted Chef. The node name must be unique among all nodes that use your Chef server. You can override it by exporting a $NODE
environment variable, or you can accept the default vagrant-$USER
. This Vagrantfile uses NFS for shared folders which is useful on a Mac or Linux host. Omit the , :nfs => true
argument on a Windows host. Don’t try to mount a shared directory on /home/vagrant
as it will cause important configuration to be overwritten, such as the .ssh directory (preventing key-based ssh authentication). You can change the amount of memory allocated to the VM with the config.vm.customize [ "--memory", 2048]
setting (currently configured to allocate 2GB). You must specify the db_master
role before the webserver
role.
Next, provision the vagrantbox:
cd $VMDIR
vagrant up
Or, to specify a custom NODE name such as my-cool-vm
:
NODE=my-cool-vm vagrant up
If you need to tweak the Chef scripts and then re-provision over the top of the existing configuration:
cd $VMDIR
vagrant provision # a bug https://github.com/mitchellh/vagrant/issues/1111 ?
vagrant ssh # this is a workaround
sudo chef-client # this is a workaround
To wipe it out and start over:
NODE=vagrant-$USER
cd $VMDIR
vagrant destroy
knife node delete $NODE
knife client delete $NODE
Check that the vagrant VM is set up correctly by opening http://localhost:8080 in your browser to see the It works!
page.
The AWS command line tools expect your access key id
and secret access key
in a specific format.
Create this credentials master file $HOME/.credentials-master.txt
in the following format (replacing the values with your own credentials):
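The standard AWS_CREDENTIAL_FILE format is two key=value lines; the values below are placeholders:

AWSAccessKeyId=YOUR_ACCESS_KEY_ID
AWSSecretKey=YOUR_SECRET_ACCESS_KEY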
Note: The above is the sample content of .credentials-master.txt
file you are creating, and not shell commands to run.
Protect the above file and set an environment variable to tell AWS tools where to find it:
export AWS_CREDENTIAL_FILE=$HOME/.credentials-master.txt
chmod 600 $AWS_CREDENTIAL_FILE
We can now use the command line tools to create and manage the cloud.
iPython is a beautiful interactive shell for python which you can easily install in a virtualenv. Just type
pip install tornado pyzmq ipython
and then run
ipython notebook --pylab inline
This would open http://127.0.0.1:8888/
in a browser window where you can run python interactively. According to the IPython notebook installation notes, MathJax is not installed by default; it can be installed with these steps.
from IPython.external.mathjax import install_mathjax
install_mathjax()
Install pianobar via homebrew:
brew install pianobar
Now you can run your flash-free Pandora player in your terminal.
➜ ~ pianobar
Welcome to pianobar (2012.09.07)! Press ? for a list of commands.
[?] Email: lvnilesh@yahoo.com
[?] Password:
(i) Login... Ok.
(i) Get stations... Ok.
0) Boston Radio
1) Guns N' Roses Radio
2) Kishore Kumar, Mohd. Rafi, Mukesh & Lata Mangeshkar Radio
3) Lata Mangeshkar Radio
4) Led Zeppelin Radio
5) q Michael Jackson Radio
6) Q QuickMix
7) Super Freak Radio
[?] Select station: 5
|> Station "Michael Jackson Radio" (116177894800507788)
(i) Receiving new playlist... Ok.
|> "Wanna Be Startin' Somethin'" by "Michael Jackson" on "Thriller"
|> "Signed, Sealed, Delivered I'm Yours [Alternate Mix]" by "Stevie Wonder" on "The Complete Motown Singles: Volume 10: 1970"
|> "Freak" by "Chic" on "The Definitive Groove Collection: Chic"
|> "Brick House" by "The Commodores" on "Colour Collection"
|> "Thriller" by "Michael Jackson" on "Thriller"
# -05:34/05:59
Also run last.fm via terminal. Open terminal and type:
brew install shell-fm
and create the file ~/.shell-fm/shell-fm.rc containing this.
username = your-username
password = your-password
default-radio = lastfm://user/your-username/your-station-name
# for example: lastfm://user/lvnilesh/personal
and run
➜ ~ shell-fm
Shell.FM v0.8, (C) 2006-2010 by Jonas Kramer
Published under the terms of the GNU General Public License (GPL).
Press ? for help.
Receiving lvnilesh’s Library Radio.
Now playing "Call Me Maybe" by Carly Rae Jepsen.
-00:01
Enjoy!
$$ i\hbar\frac{\partial \psi}{\partial t} = \frac{-\hbar^2}{2m} \left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \right) \psi + V \psi $$
Here is inline $\rm \LaTeX$: the equation of a circle is $x^2 + y^2 = 1$, and here is Euler's number. $$ e = \mathop {\lim }\limits_{n \to \infty } \left( {1 + \frac{1}{n}} \right)^n $$
ruby <(curl -fsSkL raw.github.com/mxcl/homebrew/go)
brew install wget
brew install pyqt # brew installed sip as sip is a dependency
brew install gfortran
brew install gtk
brew install ghostscript
brew install swig
Use a virtual environment for QSTK (so it won't mess up your existing setup). See my other post on setting up a virtualenv, and create a quant virtualenv:
mkvirtualenv quant
cd ~/domains/quant
The rest of the steps take place inside the newly created quant
virtualenv.
Install numpy
from source
pip install -e git+https://github.com/numpy/numpy.git#egg=numpy-dev
Install other dependencies via a requirements.txt file created by pip freeze > requirements.txt
from a working installation.
Cython==0.16
distribute==0.6.28
epydoc==3.0.1
ipython==0.13
lxml==2.3.5
patsy==0.1.0
python-dateutil==1.5
pytz==2012d
pyzmq==2.2.0.1
tornado==2.3
wsgiref==0.1.2
Jinja2==2.6
Pygments==1.5
Sphinx==1.1.3
docutils==0.9.1
readline==6.2.2
six==1.1.0
xlrd==0.8.0
-e git+https://github.com/pydata/pandas.git#egg=pandas-dev
-e git+https://github.com/sympy/sympy.git#egg=sympy-dev
-e git+https://github.com/matplotlib/matplotlib.git#egg=matplotlib-dev
-e git+https://github.com/scipy/scipy.git#egg=scipy-dev
wget http://blog.fungibleclouds.com/downloads/code/requirements.txt
pip install -r requirements.txt
Install statsmodels
from source
pip install -e git+https://github.com/statsmodels/statsmodels.git#egg=statsmodels-dev
Install CVXopt
from source
pip install cvxopt
should work, but there seems to be a bug with cvxopt, so build it from the upstream tarball instead:
cd ~/domains/quant/src
wget http://abel.ee.ucla.edu/src/cvxopt-1.1.5.tar.gz
tar zxvf cvxopt-1.1.5.tar.gz
cd cvxopt-1.1.5/src
python setup.py install
Install QSTK
cd ~/domains/quant/
mkdir QSTK
cd QSTK
svn checkout http://svn.quantsoftware.org/openquantsoftware/trunk .
Install QSDATA
- sample data from the stock market
wget http://www.quantsoftware.org/QSData.zip
unzip QSData.zip
Configure the qstk specific env
variables
cp config.sh local.sh
vi local.sh # edit the $QSDATA env var to point to $QS/QSData/
vi local.sh # edit this to match path of QSTK and QSDATA
$QS : This is the path to your installation (The location of the Bin, Example, Docs) folders.
$QSDATA : This is where all the stock data will be.
source local.sh
Test the env
variables
echo $QS # would show ~/domains/quant/QSTK
echo $QSDATA # would show ~/domains/quant/QSTK/QSData
ipython notebook --pylab inline # This will open your default browser http://localhost:8888
Click on new notebook to create a new tab with new empty notebook. In that new notebook, type this code segment to test your setup
import numpy as np
import pandas as pand
import matplotlib.pyplot as plt
from pylab import *
x = np.random.randn(1000)
plt.hist(x,100)
plt.savefig('test.png',format='png')
Press SHIFT-ENTER to see something like this below.
The class has not started yet, but here are the two recommended readings that I have already ordered.
Active Portfolio Management: A Quantitative Approach for Producing Superior Returns and Controlling Risk by Richard Grinold, Ronald Kahn
All About Hedge Funds: The Easy Way to Get Started by Robert Jaeger
I am looking forward to applying the learnings from this class to my personal portfolio.
sudo easy_install pip
sudo pip install virtualenv virtualenvwrapper
mkdir domains # create a directory to store different virtual environments
Create a temporary text file (say ~/appendthis
) with below text
export WORKON_HOME=$HOME/domains
source /usr/local/bin/virtualenvwrapper.sh
export PIP_VIRTUALENV_BASE=$WORKON_HOME
Append that temp file to ~/.zshenv
(or .profile
or .bashrc
depending on your shell)
cat ~/appendthis >> ~/.zshenv
Exit current shell and start terminal again to see something like this show up:
Linux quant 2.6.32-27-generic #49-Ubuntu SMP Thu Dec 2 00:51:09 UTC 2010 x86_64 GNU/Linux Ubuntu 10.04.1 LTS
Welcome to Ubuntu!
* Documentation: https://help.ubuntu.com/
Last login: Thu Dec 23 14:35:06 2010 from imac.workgroup
virtualenvwrapper.user_scripts Creating /home/nilesh/domains/initialize
virtualenvwrapper.user_scripts Creating /home/nilesh/domains/premkvirtualenv
virtualenvwrapper.user_scripts Creating /home/nilesh/domains/postmkvirtualenv
virtualenvwrapper.user_scripts Creating /home/nilesh/domains/prermvirtualenv
virtualenvwrapper.user_scripts Creating /home/nilesh/domains/postrmvirtualenv
virtualenvwrapper.user_scripts Creating /home/nilesh/domains/predeactivate
virtualenvwrapper.user_scripts Creating /home/nilesh/domains/postdeactivate
virtualenvwrapper.user_scripts Creating /home/nilesh/domains/preactivate
virtualenvwrapper.user_scripts Creating /home/nilesh/domains/postactivate
virtualenvwrapper.user_scripts Creating /home/nilesh/domains/get_env_details
Now you can create any number of python virtual environments. For example, I create myfirstenv
mkvirtualenv myfirstenv # create my first virtual environment named myfirstenv
pip install BLAH # install BLAH
deactivate # deactivate that virtualenv
rmvirtualenv myfirstenv # remove myfirstenv
To work with virtualenv again, simply type:
workon myfirstenv
cd ~/domains/myfirstenv
Wrappers: virtualenvwrapper provides several useful wrappers that can be used as shortcuts
mkvirtualenv (create a new virtualenv)
rmvirtualenv (remove an existing virtualenv)
workon (change the current virtualenv)
add2virtualenv (add external packages in a .pth file to current virtualenv)
cdsitepackages (cd into the site-packages directory of current virtualenv)
cdvirtualenv (cd into the root of the current virtualenv)
deactivate (deactivate virtualenv, which calls several hooks)
Hooks: One of the coolest things about virtualenvwrapper is the ability to provide hooks when an event occurs. Hook files can be placed in ENV/bin/
and are simply plain-text files with shell commands. virtualenvwrapper provides the following hooks:
postmkvirtualenv
prermvirtualenv
postrmvirtualenv
postactivate
predeactivate
postdeactivate
When you are done with that virtualenv, you can just type
rmvirtualenv myfirstenv # this will destroy that virtualenv named `myfirstenv` under ~/domains
Install gcc-4.2. Ruby versions before 1.9 (such as 1.8.7 or REE) do not play well with Apple's LLVM compiler, so you'll need to install the old gcc-4.2 compiler. It's available in the homebrew/dupes repository.
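Assuming the apple-gcc42 formula is still available in that tap, the commands are along these lines:

brew tap homebrew/dupes
brew install apple-gcc42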
Install xquartz. The OS X upgrade will also remove your old X11.app installation, so go grab xquartz from http://xquartz.macosforge.org/landing/ and install it (you’ll need v2.7.2 or later for Mountain Lion).
Install Ruby 1.9. This one is simple.
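With rbenv, for example (use whichever 1.9.3 patch level rbenv install --list offers, and rehash afterwards so the shims pick it up):

rbenv install 1.9.3-p327
rbenv rehash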
Install Ruby 1.8.7. Remember to add the path to the xquartz X11 includes in CPPFLAGS. Here I’m using rbenv, but the same environment variables should work for rvm.
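For example, assuming XQuartz put its headers under /opt/X11/include and picking a 1.8.7 patch level that rbenv offers:

CPPFLAGS="-I/opt/X11/include" rbenv install 1.8.7-p371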
Install ree. Remember to add the path to the xquartz X11 includes in CPPFLAGS and the path to gcc-42 in CC. Here I’m using rbenv, but the same environment variables should work for rvm.
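For example, assuming brew's apple-gcc42 installed the compiler at /usr/local/bin/gcc-4.2:

CC=/usr/local/bin/gcc-4.2 CPPFLAGS="-I/opt/X11/include" rbenv install ree-1.8.7-2012.02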
Enjoy your new Ruby versions
One of my disks, ada5p2, in the tank pool decided to become unavailable. Even though I store critical data on this pool, I have nothing really to worry about because this ZFS pool is configured as raidz2
- a disk pool that can tolerate two simultaneous disk failures.
# zpool status -v tank
pool: tank
state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
see: http://www.sun.com/msg/ZFS-8000-2Q
scrub: scrub in progress for 0h0m, 0.00% done, 73h11m to go
config:
NAME STATE READ WRITE CKSUM
tank DEGRADED 0 0 0
raidz2 DEGRADED 0 0 0
ada1p2 ONLINE 0 0 0
ada2p2 ONLINE 0 0 0
ada3p2 ONLINE 0 0 0
ada4p2 ONLINE 0 0 0
ada5p2 UNAVAIL 3 3.69K 0 cannot open
errors: No known data errors
Without shutting down my storage system, I just yanked the SATA cable from that broken hard disk and hot-replaced it with another of similar size. ZFS will resilver the replacement drive on its own over the next couple of hours, but I was essentially done, without any downtime and without any data errors. ZFS is nice indeed.
A cron job periodically scrubbing the zpools helps. ZFS has a built-in scrub function that checks for errors and corrects them when possible. Running this task regularly is pretty essential to prevent errors that aren't correctable. By default, ZFS doesn't run it periodically; you have to tell it when to scrub. The easiest way to set up periodic scrubbing is crontab, a feature present in all UNIX systems for scheduling background tasks. Start editing the root user's crontab by issuing the command crontab -e
as root
. A crontab entry is a simple set of fields:
* * * * * command to run
- - - - -
| | | | |
| | | | +----- day of week (0-6) (Sunday is 0)
| | | +------- month (1-12)
| | +--------- day of month (1-31)
| +----------- hour (0-23)
+------------- min (0-59)
For example, I want my system to scrub my tank
zpool on Sundays at 04:00 and my twoteebee
zpool on Thursdays at 04:00. The specific commands that I put in my crontab are:
0 4 * * 0 /sbin/zpool scrub tank
0 4 * * 4 /sbin/zpool scrub twoteebee
When I first switched over to blogging using Octopress, I loaded it up on heroku via git, but I was not super satisfied by the site's performance for a worldwide audience. It took a bit of exploring to find a good but cost-effective way to improve performance using a CDN, so here is a writeup explaining my setup that might help others.
If you have a blog but haven’t heard of Octopress, you should check it out. It’s great for anyone who likes writing in the text editor of their choice (I currently like IA Writer, and Writing Kit) instead of some web interface, wants to store the work in git, and is comfortable running a few Terminal commands.
I initially started out hosting my blog using a single Web Dyno, a free service offered by heroku, for my Octopress blog stored in git. The price was certainly right, but Heroku experienced a bit of downtime over the life of my blog there, and I feel strongly about uptime.
An alternative is using Amazon S3, Amazon’s cloud file storage service. Amazon lets you host a static website on S3 with your own domain name. You can also easily use Amazon CloudFront with S3. CloudFront is a CDN (content distribution network) that serves your content from a worldwide server network and helps to make your website faster.
If you’ve never used Amazon Web Services before, it can be a little confusing to get started. First, you need to sign up for an AWS account. When you have your account, log into the AWS Management Console and head to the S3 tab. Then:
Create a bucket called blog.myowndomain.com. You cannot use the bare myowndomain.com, so use a subdomain like www or blog.
Under the properties for this bucket, you’ll need to go to the Website tab, check the box to enable static web hosting, and set your index and error documents. Your index document should probably be index.html. Your error document could be 404.html (an HTML page for file not found (404) errors). Make a note of your endpoint (http://blog.fungibleclouds.com.s3-website-us-east-1.amazonaws.com/). You’ll need it to create custom origin CloudFront distribution.
Create a bucket policy under permissions. Here is my bucket policy.
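The exact policy is not shown inline; a typical public-read policy for a static website bucket looks like this (substitute your own bucket name in the Resource):

{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": { "AWS": "*" },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::blog.myowndomain.com/*"
    }
  ]
}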
In AWS Console, go to the CloudFront tab, and create a new Distribution for the S3 website end point as custom origin. This link on custom origin helps. This will mirror your S3 bucket on CloudFront, for example, (http://d2h7g34rdqpc09.cloudfront.net/index.html) shows the home page of my website exactly as it appears on S3.
CloudFront will cache the contents of your S3 bucket for up to 24 hours. This cache is created from S3 the first time someone hits an asset under your CloudFront URL. This means that CloudFront won’t necessarily reflect changes on S3 immediately. You can manually invalidate/expire objects in CloudFront, but it’s easier to just not use it for anything that will change frequently.
You’ll need to create a DNS CNAME alias record to use your own domain with CloudFront that mirrors your S3 bucket. The way you do this depends on your DNS provider (I use Zerigo, which is cheap, reliable, and easy to use). You need to create a CNAME pointing blog.myowndomain.com to your CloudFront endpoint.
After propagation, a DNS lookup on blog.myowndomain.com should show the CNAME resolving to your CloudFront endpoint.
This action is fairly simple. First you edit the posts you store in source/_posts. I currently prefer iA Writer, so I keep a little executable script I call ia to invoke it from the terminal.
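A minimal version of such a script (assuming iA Writer is installed under its default application name) could be:

#!/bin/sh
# open the given post(s) in iA Writer from the terminal
open -a "iA Writer" "$@"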
Then you generate static HTML for your site.
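With Octopress that is:

rake generate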
and finally you push your incremental updates over to S3 using s3cmd in rsync-like fashion:
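For example (assuming the generated site is in public/ and the bucket is blog.myowndomain.com):

s3cmd sync --acl-public --delete-removed public/* s3://blog.myowndomain.com/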
Set up db to run as your user account
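With a homebrew-installed MySQL of that era, the commands were along these lines:

unset TMPDIR
mysql_install_db --verbose --user=`whoami` --basedir="$(brew --prefix mysql)" --datadir=/usr/local/var/mysql --tmpdir=/tmp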
Start the server
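With the homebrew install, that is:

mysql.server start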
Secure the installation
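MySQL ships a helper script for this:

mysql_secure_installation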
Make sure to let mysql launch on startup
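Homebrew installs a launchd plist you can load (the exact path can vary by version):

cp $(brew --prefix mysql)/homebrew.mxcl.mysql.plist ~/Library/LaunchAgents/
launchctl load -w ~/Library/LaunchAgents/homebrew.mxcl.mysql.plist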
Make sure to check the plist to use the correct user
This nifty controller lets you manage Pandora from a small drop-down window while reducing the performance hit that often accompanies Flash-based apps.
Vendor lock-in is the situation in which you are dependent on a single vendor for a product (i.e., a good or a service) and cannot move to another vendor without substantial costs and/or inconvenience. Lock-in is typically a result of standards controlled by the vendor, thereby granting the vendor some degree of monopoly power that usually leads to better profits for that vendor.
Here is a recent example illustrating the lock-in problem:
A few weeks ago, Google announced a significant price increase for use of its Google App Engine Platform-as-a-Service (PaaS). Google App Engine users knew and expected that Google would increase the price at some point, but what shocked most developers was the jump in price, which increased the cost of using the Google App Engine runtime environment by 100% or more in specific cases. It is a non-trivial exercise to port to another location once an app is deployed on the Google App Engine infrastructure. This led to a big backlash on the App Engine google groups. Google responded with a few adjustments to its pricing, but this incident resurfaced some doubts about the cloud. Hart Singh of flipbook LLC, creators of the flipbook app on Facebook, raised a concern: "My team spent so much time learning app engine but I continue to wonder if we are betting our company on Google…any app we build can only be run on the Google App Engine." Google App Engine requires custom code to run apps in that environment. Customizing takes effort and time and impacts the bottom line.
According to Gartner, cloud computing customers are more concerned about vendor lock-in than about cloud security. So what exactly does lock-in mean in the context of cloud computing? To answer that, let us look at the various types of lock-in:
Horizontal lock-in limits the ability to replace a product with a comparable competing product. If you chose a CRM solution from Oracle earlier, then you will need to migrate your data and code, retrain your users, and rebuild the integrations to your other solutions if you want to move to Microsoft Dynamics CRM. Wouldn't it be nice if you could reuse your garage, cabling, etc., when you switch from a Toyota Prius to a Nissan Leaf? The higher you go up the levels of the cloud computing stack, the stronger the horizontal lock-in.
Moving from one SaaS solution to another in the cloud is no different from moving from one software package to another, provided there is a clear migration path. But PaaS can be a very deep lock-in, especially if code needs to be written to comply with PaaS requirements. IaaS lock-in is much less severe; however, the underlying hypervisors (the containers of virtual machines) differ and can lead to some complexity during migration from one IaaS vendor to another.
Vertical lock-in limits choice in other levels of the cloud services stack. For example, selecting solution A mandates the use of database B, operating system C, hardware vendor D, and/or implementation partner E. Open standards help prevent vertical lock-in by ensuring that hardware, middleware, and operating systems can be chosen independently.
Vertical lock-in is built into SaaS and PaaS offerings, as the underlying infrastructure comes with the service. On the upside, you won't need to worry about managing the underlying layers of the cloud stack. IaaS offers comparatively less vertical lock-in. Application logic and data need proximity to achieve decent performance, so you should almost always procure storage services from the same IaaS provider used for application logic processing.
Inclined lock-in is a tendency to buy as many solutions as possible from one provider, even if such solutions in some of these areas are less desirable. You tend to sometimes select a single vendor not only to make management, training and integration easier with a single throat to choke but also to be able to demand higher discounts. This leads to large and powerful vendors causing a high degree of inclined lock-in.
Generational lock-in becomes an issue when an entirely new generation of technology reaches the market. No technology generation and no platform lives forever. The first three types of lock-in are not too bad if you picked the right solution vendors (generally the ones that turn out to become the market leaders). But even such market leaders at some point reach end of life. You want to be able to replace them with the new generation of technology without it being prohibitively expensive or even impossible.
Vendor lock-in makes you vulnerable. Think defensively before committing
With vendor lock-in comes vulnerability to price increases. So think defensively. Here are our quick defence tactics against cloud vendor lock-in.
1. Avoid vendor lock-in: Ensure your app is able to move easily to another cloud provider as and when needed. In essence, keep your plan B in implementable shape, and prepare plan B before making serious customizations for a specific cloud platform.
2. Analyze the TCO of your language and tools selection: When building your cloud app, think hard about the code selection before you start filling up your git repository. Popular coding languages may not be the most economical for your specific situation. Think of the availability of professionals skilled in the coding language of your choice, both within and outside your organization.
3. Carefully select your code base: Runtimes, scripting environments, and code frameworks are not all similar. Discuss with your dev team members which choice would be most optimal for you.
4. Understand redundancy and cloud architecture: Identify single points of failure (SPOF) in the architecture. Judge the redundancy elements for yourself and consult with the experts.
5. Tread PaaS land carefully: Explore installable PaaS that you can run yourself if need be. Spread the risk among several different PaaS providers that do not depend on a common IaaS provider.
These tactics are the ones we find most used by our cloud clients in attempting to reduce the impact of vendor lock-in to a good degree.
Got other ideas on how you would avoid cloud vendor lock-in? Share via comments.
I use Wake-on-LAN to turn on my sleeping iMac when I am away from it but want to log on using ssh (I maintain one Ubuntu machine on the local network that is always running).
The magic packet format is very simple: it must include 6 times hexadecimal FF, followed by 16 times the target machine’s MAC address.
Here is a Python script that will wake up your target machine remotely.
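The original four-line script is not reproduced here; a minimal Python 3 sketch that builds the magic packet for that MAC and sends it to that address (port 9 is an assumption; any UDP port the target's network interface sees will do) is:

import socket

mac = "01-23-45-67-89-0a".replace("-", "")             # target machine's MAC address
packet = bytes.fromhex("ff" * 6 + mac * 16)            # magic packet: 6 x 0xFF followed by 16 x MAC
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)   # allow broadcast; harmless for unicast sends
sock.sendto(packet, ("192.168.2.109", 9))              # send to the target's IP on port 9

Save it as wakeup.py to match the invocation below.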
This script assumes that the target machine MAC address is 01-23-45-67-89-0a and that your local DHCP server issues an IP address of 192.168.2.109 to your target machine.
You can run this script from another machine on your local network, like so:
$ python wakeup.py
Got better ideas on waking up sleeping machines remotely when needed? Share below via comments.
Similarly, use this app-toggler AppleScript to toggle between Chrome and TextMate.
Now just assign a hotkey to the script file and the hotkey becomes a toggle button.
Got more tips to increase your productivity? Share via comments below.
To show hidden files in the Finder, type:
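The standard pair of commands on OS X is:

defaults write com.apple.finder AppleShowAllFiles TRUE
killall Finder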
To hide hidden files again, type:
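Set the flag back to FALSE and restart the Finder:

defaults write com.apple.finder AppleShowAllFiles FALSE
killall Finder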
Want to get ls colors on terminal on a mac like you may have seen in Ubuntu and some other linux distributions? Just append these two lines to your .bash_profile
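The two lines enable BSD ls colors (the LSCOLORS value is just one common choice):

export CLICOLOR=1
export LSCOLORS=GxFxCxDxBxegedabagaced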
Make sure to restart your SHELL after which the terminal will show ls colors like below.
Got your productivity enhancing tips? Share via comments below.
Enjoy the colors while they last… Soon it will be grey all over :)
Use the title to grab attention: Make them see what you see. You may think that everyone sees things the way you do. But they don't. Readers won't pay attention until they perceive what you perceive. So make your position crystal clear. Use storytelling, personal experiences, or anything that will put the reader in the right position to understand your message.
Use emotion. Emotion brings clarity to your messages while making them personal. Emotion also comes with the triple bonus of adding clarity, giving readers a reason to talk about you, and triggering action you may want — emotion is much better at that than logic is. Emotional messages get attention. Tell a meaningful and personal story. When you make your writing personal, you make it important. Personally interesting or perceptually meaningful information grabs attention and brings clarity.
Offer something - an idea, a new way, a point of view: Offer something to your readers - an idea, a new way of thinking, a new point of view, a new experiment to try… something they can take away from your blog. Keep users engaged. Behavioral economics experts have established that people are generally fond of the four-letter F-word: a preference for FREE seems to be a feature hardwired into human brains. See [Dan Ariely's experiment](http://danariely.com/2009/08/10/the-nuances-of-the-free-experiment/): "Free kisses beat bargain truffles." Give them something free so they keep coming back for more… eventually becoming repeat subscribers.
Write content to align with reader scan preferences: People tend to scan web pages in a pattern different from how they read print. Eye-tracking research indicates the dominant patterns people tend to deploy while reading computer screens. In general, people tend to read blog posts in an F pattern, beginning at the top and going through the first few rows, then scanning down, scanning across a bit again, and then scanning down to skim for anything interesting. The intensity of attention gets weaker (or the ink gets fainter) as readers scan down the post. Keeping this human behavior in mind will help you write better blog posts.
Write in bite-sized chunks using a structured framework whenever feasible: Write small chunks that fit above the fold, or above the scroll. Avoid complex/theoretical writing or marketing hyperbole. Use colloquialisms. Try limiting a blog post to 450 - 675 words with 2 to 3 sections per post. Limit each section to about 2 or 3 paragraphs of no more than 75 words each.
Stick to a manageable schedule for posting: Sticking to a schedule that your readers can predict and that you can manage is very useful. It provides predictability on when readers should expect to see new posts.
What worked for you/didn’t work well in your blog? Chime in below with your comments.
via my friend @sterlizzi CEO of http://www.wearephotographers.com