Deploying the Eucalyptus Management Console on Eucalyptus

The Eucalyptus Management Console can be deployed in a variety of ways, but we’d obviously like it to be scalable, highly available and responsive. Last summer, I wrote up the details of deploying the console with Auto Scaling coupled with Elastic Load Balancing. The CloudFormation service ties this all together by putting all of the details of how to use these services together in one template. This post describes an example that works well on Eucalyptus (and AWS) and may guide you with your own application as well.

Let’s tackle a fairly simple deployment for the first round. For now, we’ll set up a LaunchConfig, AS group and ELB. We’ll also set up a security group for the AS group and allow access only to the ELB. Finally, we’ll set up a self-signed SSL cert for the console. In another post, we’ll add memcached and a CloudWatch alarm to automate scaling the console.

Instead of pasting pieces of the template here, why not open the template in another window. Under the “Resources” section, you’ll find the items I listed above. Notice “ConsoleLaunchConfig” pulls some values from the “Parameters” section such as KeyName, ImageId and InstanceType. Also used is the “CloudIP” parameter, but that gets included in a cloud-init script that is passed to UserData. Also, notice the SecurityGroups section that refers to the “ConsoleSecurityGroup” defined further down.

Right above that is the “ConsoleScalingGroup” which pulls in the launch config we just defined. Next “ConsoleELB” defines an ELB that listens for https traffic on 443 and talks to port 8888 on autoscaled instances. It defines a simple health check to verify the console process is running.

The “ConsoleSecurityGroup” uses attributes of the ELB to allow access only to the ELB security group on port 8888. We also allow for ssh ingress from a provided CIDR via “SSHLocation”.

To automate deploying the console using this CloudFormation template, I wrote a shell script to pass the required values and create the stack. At the top of the script, there are 3 values you will need to set based on your cloud. CLOUD_IP is the address for your cloud front end. SSH_KEY is the name of the keypair you’d like to use for ssh access into the instances (if any). IMAGE_ID must be the emi-id of a CentOS 6.6 image on your cloud. There are other values you may wish to change just below that; those are used to create a self-signed SSL certificate. This cert will be installed in your account and its name passed to the “euform-create-stack” command along with several other values we’ve already discussed.
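If you just want a feel for what the script does without opening it, here is a rough sketch; the variable values, file names and exact euform-create-stack flags are illustrative and may differ from the actual script and your euca2ools version (the real script also passes the uploaded cert name as a template parameter):

#!/bin/bash
# values to set for your cloud
CLOUD_IP=10.111.5.35        # your cloud front end
SSH_KEY=my-ssh-key          # keypair for ssh access to the instances (optional)
IMAGE_ID=emi-xxxxxxxx       # a CentOS 6.6 EMI registered on your cloud

# generate a self-signed cert and install it in your account
openssl genrsa 2048 > console.pem
openssl req -new -key console.pem -out csr.pem -subj "/CN=eucaconsole"   # non-interactive
openssl x509 -req -in csr.pem -signkey console.pem -days 365 -out console.crt
euare-servercertupload -s console-cert --certificate-file console.crt --private-key-file console.pem

# create the stack, passing the parameters the template expects
euform-create-stack --template-file console-template.json \
  -p KeyName=$SSH_KEY -p ImageId=$IMAGE_ID -p CloudIP=$CLOUD_IP \
  console-stack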

If you’ve run this script successfully, you can check the status of the stack by running “euform-describe-stacks console-stack”. Once completed, the output section will show the URL to use to connect to your new ELB front-end.

To adjust the number of instances in the scaling group, you can use euscale-update-auto-scaling-group --desired-capacity=N. Since the template defines max count as 3, you would need to make other adjustments for running a larger deployment.
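For example, to run three console instances (the group name below is a placeholder; euscale-describe-auto-scaling-groups will show the name the stack actually created):

euscale-update-auto-scaling-group --desired-capacity=3 <scaling-group-name>

Raising --max-size the same way works for a quick test, but for a lasting change you’d want to bump the maximum in the template itself.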

Check back again to see how to configure a shared memcached instance and auto-scale the console.

Running the Eucalyptus Management Console on Eucalyptus with the triangle services

At Eucalyptus, we’ve leveraged the existing compute infrastructure to deploy some new services. For example, ELB and our imaging service use workers that run as instances. This is useful because the cloud administrator won’t need to configure new machines to handle these tasks. The workers can be dynamically provisioned on top of existing infrastructure and that’s what cloud is all about! The management console can be deployed on top of Eucalyptus as well. In fact, using ELB and Autoscaling, we can provide a single service endpoint for users and run a scalable back-end. Since Eucalyptus provides RHEL/CentOS packages, I started by installing a CentOS 6 image from http://emis.eucalyptus.com/. This image included cloud-init, so I could very easily provision the console on an instance with user data. Here is the cloud-init script you would supply in user data. The one value that needs to be adjusted for your install is the cloud IP address (10.111.5.35).

#cloud-config
# vim: syntax=yaml
#
# This config installs the eucalyptus and epel repos, then installs and
# configures the eucaconsole package
runcmd:
 - [ yum, -y, install, "http://downloads.eucalyptus.com/software/eucalyptus/nightly/4.0/centos/6/x86_64/eucalyptus-release-4.0-0.1.el6.noarch.rpm" ]
 - [ yum, -y, install, eucaconsole ]
 - [ sed, -i, "s/localhost/10.111.5.35/", /etc/eucaconsole/console.ini ]
 - [ service, eucaconsole, restart ]

Here are some commands you can run with euca2ools to set things up. First, assume the above script is stored in a file called “console-init”

eulb-create-lb -z PARTI00,PARTI01 -l "lb-port=80, protocol=HTTP, instance-port=8888, instance-protocol=HTTP" console-lb

The cloud I used had the 2 clusters shown above. I also set up port 80 on the ELB to talk to port 8888 on the instances. We could also set up port 443 and SSL termination instead. Now, run eulb-describe-lbs console-lb --show-long and you’ll notice the owner-alias and group-name values. That’s the internal security group you’ll need to authorize port 8888 ingress for, which ensures the instances only accept traffic from the ELB on the port the console runs on. Run the euca-authorize command using the owner-alias and group-name (i.e. euca-authorize -P tcp -p 8888 -o euca-internal-276586128672-console-elb -u 641936683417 console-as-group).

euscale-create-launch-config -i emi-22536a68 -t m1.medium --group console-as-group --key dak-ssh-key --monitoring-enabled -f console-init console-launch-config

The launch config needs the CentOS 6 EMI ID. I also used an m1.medium since it has more memory, but still a single CPU. You can certainly dedicate more resources to single instances as you see fit. Specifying an ssh key is optional, but handy for debugging if things go pear-shaped.

euscale-create-auto-scaling-group -l console-launch-config -m 1 --desired-capacity 2 --max-size 4 --grace-period 300 -z PARTI00,PARTI01 --load-balancers console-lb consolegroup

The autoscaling group ties things together. After the last command runs, you should get 2 instances pending. Once those are up, eulb-describe-instance-health console-lb will show you the state of the instances from an end-user perspective. An “InService” instance can handle requests going through the ELB, whereas “OutOfService” instances may still be installing/configuring per cloud-init. The grace period determines how long the scaling group waits for those to be ready. There is a lot more we could do with CloudWatch data and Auto Scaling. For now, this setup will let you manually adjust the number of instances you dedicate to the console scaling group. You can point your browser to the ELB DNS name and see the console login screen!

Extra Credit

Let’s set up SSL termination for the ELB. You can either use certs you already have or generate self-signed ones. Here are the commands to generate self-signed certs:

openssl genrsa 2048 > myssl.pem
openssl req -new -key myssl.pem -out csr.pem
openssl x509 -req -in csr.pem -signkey myssl.pem -days 365 -sha512 -out myssl.crt
chmod 600 myssl.*

Now you have the key and cert you need; the csr.pem file can be discarded. Next, upload the cert:

euare-servercertupload -s myssl --certificate-file myssl.crt --private-key-file myssl.pem

To get the ARN for this cert, run “euare-servercertgetattributes -s myssl”

Now, add the listener to the ELB

eulb-create-lb-listeners console-lb --listener "protocol=HTTPS,lb-port=443,instance-port=8888,instance-protocol=HTTP,cert-id=arn:aws:iam::276586128672:server-certificate/myssl"

Now you can use the console with https! To see details of the ELB, run “eulb-describe-lbs console-lb --show-long”. You might want to remove the port 80 listener. To do that, type “eulb-delete-lb-listeners -l 80 console-lb”.

 

Adventures in memcached integration

As developers, we sometimes run into problems that are somewhat… challenging. That’s part of the fun of writing code though. I like trying to find clever ways to solve a problem. This was the case when trying to integrate memcached into the Eucalyptus Management Console.

Version 4.0 of the console uses Gunicorn, which utilizes separate worker processes to handle requests. To implement any kind of effective caching, we’d need a shared cache, and memcached is a pretty obvious choice. Since we were using Pyramid, Beaker seemed like a natural option. Beaker does have support for memcached, but as the author points out, dogpile.cache is a much better choice as a cache interface library. Dogpile.cache has backends for memcached, redis and others, which allow for some more interesting choices architecturally.

Our application uses boto to talk to both Eucalyptus and AWS. To start with, we wanted to cache image lists since they don’t change often and they can be fairly large. Dogpile.cache has regions you configure (generally for different expiration times). We set up short_term, long_term and others for our application. While working on a prototype for this, I ran into 2 main issues which I’ll cover in detail: pickle doesn’t handle all boto object graphs, and invalidating cached data.

Pickled Botos

We have an array of boto.ec2.image.Image objects that need to be cached. The memcached backend for dogpile.cache can use one of a few python interfaces to memcached. I chose to use python-memcached. It pickles the data before sending it to the memcached server. For those who don’t know, pickling is a way to encode python data and can be used to marshal and unmarshal object graphs. Anyway, some boto objects don’t marshal very well. I ran into this about 2 years ago when working on the first version of the console, which used the JSONEncoder to send json versions of the boto objects to the browser as AJAX responses. I had to write my own JSONEncoder to handle the objects which didn’t marshal properly. The JSONEncoder supports passing your own implementation which handles object conversion, so that made life a little easier. The Pickler also supports this, but the implementation is buried down in the python-memcached package and there is no way to pass your own pickler down from the dogpile.cache layer. (I feel a pull request coming..) What I chose to do instead was to iterate over the image list and make adjustments to the object graphs prior to storing them in the cache. Certainly, this isn’t ideal, but it works for now. In doing this, I was able to delete some values out of the object graph which I don’t care about, which saves time and space in the cache. I also found that (in this case) the boto.ec2.blockdevicemapping.BlockDeviceType object contained a circular reference which was causing the pickler to barf. I trimmed this out during my iteration and pickling worked fine!

The hard part was figuring out which object was causing the problem. I found a stackoverflow article that helped here. It showed how to extend the Pickler to either log what it was operating on, or catch exceptions (as I added for my purposes). Here’s my code:

import pickle

class MyPickler (pickle.Pickler):
  def save(self, obj):
    #print 'pickling object', obj, 'of type', type(obj)
    try:
      pickle.Pickler.save(self, obj)
    except:
      # dump the offending object so it can be inspected (or set a breakpoint here)
      print "--------- object dict = "+str(obj.__dict__)

I found it very helpful to see what object was causing the problem and could insert a breakpoint to inspect that object when the problem occurs. In the memcache.py file of python-memcached, I had to change an import so that cPickle wasn’t used. That’s a native pickler which is much faster, but doesn’t allow me to extend it in this way. This is clearly only a debugging tool and the standard package code should be used in production.

Invalidate == Delete

Each item stored in a cache region has a key generated. When using the @cache_on_arguments decorator, the cache key is created based on the string form of the arguments passed to the cache function. The decorator takes a namespace argument, so I was able to specify an additional key component so that any image values being cached all included “image” in the cache key. By default the key is also run through sha1 to create a digest to get consistent length (and obfuscated) cache keys. This works well and would have been all I had to do except that I couldn’t simply rely on the configured expiration of the cache region. There are cases where we needed to invalidate the set of data in the cache due to changes initiated within the application. In that case, our user would expect to see the new data immediately.

To invalidate, we would need to know the cache key used to refer to the data in the cache and perform a delete on the key. Since the cache key was generated for us, I had no idea what to use for deletion. I could have reverse engineered it, but if something changed in the underlying library, that could be fragile. Fortunately, a cache region can be given a key generator function when it is configured. We could use our own code to generate the cache key and call that again to invalidate the cache. This is the key generator I’m using:

from hashlib import sha1

def euca_key_generator(namespace, fn):
  def generate_key(*arg):
    # generate a key:
    # "namespace_arg1_arg2_arg3..."
    key = namespace + "_" + "_".join(str(s) for s in arg[1:])

    # return cache key
    # apply sha1 to obfuscate key contents
    return sha1(key).hexdigest()

  return generate_key

To use this to invalidate a cache (based on args), I wrote another function:

def invalidate_cache(cache, namespace, *arg):
  key = euca_key_generator(namespace, None)(*arg)
  cache.delete(key)

The namespace and arg list are passed to the key generator as you can see. This is merely a helper function. To invalidate the image cache, I needed to call the above function with the proper arguments. These are the same arguments passed in to the cache function (which uses the decorator).

The work on shared caching is currently in a branch, but will likely be merged into develop over the next month or so.

Size Problem

After beating my head against a wall for a while, I found there is a size limitation on memcached. It will only take values up to 1MB in size unless you recompile it. Fortunately, there is a handy solution. Since values get pickled, they compress really well. The python-memcached library supports compression, but you need to enable it. By default the min_compress_len is zero, which means it never tries to compress the pickled data. In fact, the code silently returns from the set method having done nothing. This is where the frustration came in. I ended up spending some quality time in pdb to figure out that I could configure a dogpile.cache region with a min_compress_len greater than zero to get the underlying code to compress my data. Bingo! My large data set went from 3MB to 650K. This is how I configured my regions:

 long_term.configure(
     memory_cache,
     expiration_time = int(settings.get('cache.long_term.expire')),
     arguments = {
         'url':[memory_cache_url],
         'min_compress_len':1024,
     },
 )

I realize that 650K is not that far from 1MB, so perhaps splitting up the data will be needed at some point. The failure mode is simply a performance hit, though, not a fatal error.

Memcached Debug Tips

I learned a couple of things about monitoring my memcached server while debugging things. Two tips I found that will help are:

  • run memcached from a shell with -vv option. You’ll get useful output about get, set, send and delete operations.
  • telnet into the server using “telnet localhost 11211”. You can run commands like “stats” and “stats items”

Using the Eucalyptus User Console with AWS

At the end of last year, we (Eucalyptus) released version 3.2, which included our user console. This feature finally allowed regular users to log in to a web UI to manage their resources. Because this was our first release, we had a lot of catching up to do. I would say that is still the case, but the point here is that we were able to test all of our features against Eucalyptus. As we add features to the user console that are still under development on the server side, we need the ability to test against the AWS services. Our API fidelity goal means that we are really able to develop against the Amazon implementation and then test against the Eucalyptus version when that becomes ready. Recently we did this for resource tagging. The server side folks have just finished implementing that, so we’ll be able to point the user console at our own server soon.

As a result of this need for testing with Amazon, we have hacked in a way to connect the user console with AWS endpoints. The trick is simple. In the login screen, there are 3 fields, normally used for account, username and password. To connect with Amazon, simply supply endpoint, access key and secret key in that order, in the account/user/password fields, like in the picture below:

[Screenshot: the login screen with an AWS endpoint, access key and secret key entered in the account/user/password fields]

After logging in, you can use all of the features in the user console against your AWS account. One difference is in how images are handled. Because of the very large number of public images (14 thousand at last count), the user console will only show images owned by (or shared with) the AWS account. The picture below shows what you might see on the dashboard. Notice the access key and endpoint appear to the upper right.

[Screenshot: the dashboard after logging in, with the access key and endpoint shown in the upper right]

 

You may notice the large number of snapshots shown. This includes all public snapshots, and may need to be limited to those owned by the user at some point.

The code is currently in the testing branch on github. https://github.com/eucalyptus/eucalyptus/tree/testing

Scripting IAM Part 2: restoring from backup

Last time, we talked about a way to save some IAM resources on a cloud to a text file as a way to back up this information. We captured accounts/users/groups and policies. This post will focus on using the backup we created to restore those same resources to a new cloud (or the same one in a recovery scenario).

Since I’ve become fond of Python for some types of scripting, I decided I’d use python here to parse the text file. Originally, I thought I’d have python execute the euare- commands itself. Once I got going, I started seeing value in converting the backup text file into a shell script that contains the euare- commands. Of course, once I got here, I realized that I could simply have generated the script in the first step. Ah, well.. water under the bridge. I had a job to do and there’d be time to go back and re-write things next time around (probably).

We’ll use the same example set of resources from the last post:


accounts:
my account
user: admin
enduser
user: testuser
policy-name: testpolicy
policy-val: {
 "Statement":[
 {
 "Effect": "Allow",
 "Action": "*",
 "Resource": "*",
 }
]
}
enduser
user: dev1
enduser
user: dev2
enduser
group: developers
user: dev1
user: dev2
policy-name: allow_all
policy_val: {
 "Statement":[
 {
 "Effect": "Allow",
 "Action": "*",
 "Resource": "*",
 }
]
}
endgroup
endaccounts

The script I wrote is pretty brute force. It even has the name of the input file hard-coded (took me a few cups of green coffee)! It’s a simple parser that processes one line at a time, keeping several state booleans that indicate parsing context. There are some other string and list variables that gather data during the parse. When enough data is parsed, euare- commands are emitted to standard out.


#!/usr/bin/env python

def main():
    inAccounts = False
    inUser = False
    inGroup = False
    inPolicy = False

    accountName = None
    userName = None
    groupName = None
    policyName = None
    userList = []
    policyLines = []

    f = open('post.iam', 'r')
    line = f.readline()
    while (line != ""):
        line = line.strip()
        idx = line.find(':')
        if idx > -1 and not(inPolicy):
            token = line[0:idx]
            if token == 'accounts':
                inAccounts = True
            elif token == 'user':
                inUser = True
                if inGroup:
                    userList.append(line[idx+1:].strip())
                else:
                    userName = line[idx+1:].strip()
            elif token == 'group':
                inGroup = True
                groupName = line[idx+1:].strip()
            elif token == 'policy-name':
                policyName = line[idx+1:].strip()
            elif token == 'policy-val':
                policyLines.append(line[idx+1:].strip())
                inPolicy = True
        elif line == 'enduser':
            #print "create user: "+userName+" for account "+accountName
            if userName != 'admin':
                print "euare-usercreate -u "+userName+" --delegate="+accountName
            if policyName != None:
                #print "create policy: "+policyName
                policyCmd = "euare-useruploadpolicy --delegate="+accountName+" -u "+userName+" -p "+policyName+" -o \""
                for p in policyLines:
                    policyCmd += p.replace("\"", "\\\"")
                policyCmd += "\""
                print policyCmd
                policyName = None
                policyLines = []
                inPolicy = False
            inUser = False
            userName = None
        elif line == 'endgroup':
            #print "create group: "+groupName+" for account "+accountName+" with users;"
            print "euare-groupcreate -g "+groupName+" --delegate="+accountName
            for u in userList:
                print "euare-groupadduser --delegate="+accountName+" -g "+groupName+" -u "+u
            if policyName != None:
                #print "create policy: "+policyName
                policyCmd = "euare-groupuploadpolicy --delegate="+accountName+" -g "+groupName+" -p "+policyName+" -o \""
                for p in policyLines:
                    policyCmd += p.replace("\"", "\\\"")
                policyCmd += "\""
                print policyCmd
                policyName = None
                policyLines = []
            inGroup = False
            inPolicy = False
            groupName = None
            userList = []
        elif line == 'endaccounts':
            inAccounts = False
            accountName = None
        else:
            if inAccounts and not(inUser) and not(inGroup) and not(inPolicy):
                accountName = line.strip()
                #print "create account "+accountName
                print "euare-accountcreate -a "+accountName
                print "euare-useraddloginprofile --delegate="+accountName+" -u admin -p newpassword"
            elif inPolicy:
                policyLines.append(line.strip())

        line = f.readline()

if __name__ == "__main__":
    main()

When I run this on the input file, it produces:


euare-accountcreate -a my account
euare-useraddloginprofile --delegate=my account -u admin -p newpassword
euare-usercreate -u testuser --delegate=my account
euare-useruploadpolicy --delegate=my account -u testuser -p testpolicy -o "{\"Statement\":[{\"Effect\": \"Allow\",\"Action\": \"*\",\"Resource\": \"*\",}]}"
euare-usercreate -u dev1 --delegate=my account
euare-usercreate -u dev2 --delegate=my account
euare-groupcreate -g developers --delegate=my account
euare-groupadduser --delegate=my account -g developers -u dev1
euare-groupadduser --delegate=my account -g developers -u dev2
euare-groupuploadpolicy --delegate=my account -g developers -p allow_all -o ""

You’ll notice the euare-useraddloginprofile command, which simply sets a default password for the account’s admin user.
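Applying the output to a target cloud is then just a matter of running the generated commands with that cloud’s admin credentials sourced. Roughly (assuming the parser above is saved as restore_iam.py, a name I’ve made up, and the backup sits in post.iam):

python restore_iam.py > restore_commands.sh
# review restore_commands.sh, then run it against the new cloud
source /path/to/target/eucarc
bash restore_commands.sh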

Scripting IAM Part 1: Extracting Resources for Backup

Amazon has supported the IAM (Identity and Access Management) API for some time now. The new release of Eucalyptus adds IAM support and got me thinking about how somebody could back up IAM settings. What really got me going on an actual solution was the need to save a set of accounts/users/groups and policies in order to restore them to a new Eucalyptus cloud.
My plan was to generate a data file which contains all of the information and can be parsed fairly easily to restore the information to the destination cloud. I considered writing JSON, but I had some time constraints and didn't feel like fiddling around getting the formatting just so. I chose to output some tokens followed by a colon. It looks like this:
accounts:
my account
user: admin
enduser
user: testuser
policy-name: testpolicy
policy-val: {
  "Statement":[
  {
    "Effect": "Allow",
    "Action": "*",
    "Resource": "*",
  }
]
}
enduser
user: dev1
enduser
user: dev2
enduser
group: developers
user: dev1
user: dev2
policy-name: allow_all
policy_val: {
  "Statement":[
  {
    "Effect": "Allow",
    "Action": "*",
    "Resource": "*",
  }
]
}
endgroup
endaccounts

I decided to write a bash script to run the commands and parse their output to produce the above tagged format. What you see below is what I came up with. It assumes you have environment variables set up properly (source the eucarc file). It loops through all accounts, then within each account, all users and groups. For users, it also looks for policies. For groups, it lists the users in the group and looks for policies.


#!/bin/bash
echo "accounts:"
for i in `euare-accountlist | awk '{ print $1 }'`
do
  echo $i
  for u in `euare-userlistbypath --delegate=$i`
  do
    u2=`echo $u | cut -d/ -f2-`
    u3=`basename $u2`
    echo user: $u3
    if [ `euare-userlistpolicies -u $u3 --delegate=$i | wc -l` -gt 0 ]
    then
      for p in `euare-userlistpolicies -u $u3 --delegate=$i`
      do
        echo policy-name: $p
        policy=`euare-usergetpolicy -u $u3 -p $p --delegate=$i`
        echo "policy-val: $policy"
      done
    fi
    echo enduser
  done

  for j in `euare-grouplistbypath --delegate=$i | tail -n +2`
  do
    k=`echo $j | cut -d/ -f2-`
    l=`basename $k`
    echo group: $l
    for gu in `euare-grouplistusers -g $l --delegate=$i | tail -n +3`
    do
      gu2=`echo $gu | cut -d/ -f2-`
      gu3=`basename $gu2`
      echo user: $gu3
    done
    if [ `euare-grouplistpolicies -g $l --delegate=$i | wc -l` -gt 0 ]
    then
      for p in `euare-grouplistpolicies -g $l --delegate=$i`
      do
        echo policy-name: $p
        policy=`euare-groupgetpolicy -g $l -p $p --delegate=$i`
        echo "policy-val: $policy"
      done
    fi
    echo endgroup
  done
done
echo endaccounts
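A minimal run looks something like this (assuming the script above is saved as save_iam.sh, a name I’ve made up; post.iam is the file name the restore script in Part 2 reads):

source eucarc                  # admin credentials for the cloud being backed up
bash save_iam.sh > post.iam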

In the next post, I'll talk about how I used this backup data to restore accounts to a new cloud.

Automating EBS Volume Attach at Boot Time

A few years ago, I found myself attaching volumes to instances with some frequency. The volume often came from a snapshot which contained some test data. Like any lazy programmer, I didn’t want to do this work over and over again! I wrote this little utility which would examine the user data and mount a pre-existing volume, or create a new volume from a snapshot and attach that. Here’s the code:

import java.io.IOException;
import java.util.List;
import java.util.StringTokenizer;

import com.xerox.amazonws.ec2.AttachmentInfo;
import com.xerox.amazonws.ec2.EC2Exception;
import com.xerox.amazonws.ec2.EC2Utils;
import com.xerox.amazonws.ec2.Jec2;
import com.xerox.amazonws.ec2.VolumeInfo;

public class AttachVolume {

	public static void main(String [] args) {
		try {
			String userData = EC2Utils.getInstanceUserdata();
			StringTokenizer st = new StringTokenizer(userData);
			String accessId = st.nextToken();
			String secretKey = st.nextToken();
			String volumeOrSnapId = st.nextToken();

			Jec2 ec2 = new Jec2(accessId, secretKey);
			String volumeId = null;
			if (volumeOrSnapId.startsWith("snap-")) {
				String zone = EC2Utils.getInstanceMetadata("placement/availability-zone");
				// create volume from snapshot and wait
				VolumeInfo vinf = ec2.createVolume(null, volumeOrSnapId, zone);
				volumeId = vinf.getVolumeId();
				List<VolumeInfo> vols = ec2.describeVolumes(new String [] {volumeId});
				while (!vols.get(0).getStatus().equals("available")) {
					System.out.println(vols.get(0).getStatus());
					try { Thread.sleep(2000); } catch (InterruptedException ex) {}	// poll every couple of seconds
					vols = ec2.describeVolumes(new String [] {volumeId});
				}
			}
			if (volumeOrSnapId.startsWith("vol-")) {
				volumeId = volumeOrSnapId;
			}
			// attach volume and wait
			String instanceId = EC2Utils.getInstanceMetadata("instance-id");
			ec2.attachVolume(volumeId, instanceId, "/dev/sdh");
			List<VolumeInfo> vols = ec2.describeVolumes(new String [] {volumeId});
			while (!vols.get(0).getAttachmentInfo().get(0).getStatus().equals("attached")) {
				System.out.println(vols.get(0).getAttachmentInfo().get(0).getStatus());
				try { Thread.sleep(2000); } catch (InterruptedException ex) {}	// poll every couple of seconds
				vols = ec2.describeVolumes(new String [] {volumeId});
			}
		} catch (Exception ex) {
			System.err.println("Couldn't complete the attach : "+ex.getMessage());
			ex.printStackTrace();
			System.exit(-1);
		}
	}
}

Requirements

  • Java Runtime Environment (1.5 or greater)
  • Typica and its dependencies
  • This utility (compiled)

A Few Words About the Code

The first thing you’ll notice is that user data is being parsed. The expectation is that the following items are passed via user data:

  • access id – AWS Access Id
  • secret key – AWS Secret Key
  • volumeOrSnapId – either a volume ID or snapshot ID

The code inspects the last parameter to see if it is a snapshot id. If so, it creates a volume and waits for it to become “available”. Once that’s done, it gets the instance ID from metadata and attaches the volume at a hard-coded device (obviously, this could also be passed in user data, an exercise I’ll leave to the reader).

On Linux machines, I’d often call this from the /etc/rc.local script. I should also note that this works just as well with Eucalyptus due to its API fidelity with Amazon EC2.
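To make that concrete, the launch and the rc.local hook look roughly like this; the paths, jar layout and log file are assumptions, and the IDs and keys are placeholders:

# launch, passing access id, secret key and a volume or snapshot id as user data
euca-run-instances emi-xxxxxxxx -k my-keypair -d "MY_ACCESS_ID MY_SECRET_KEY snap-xxxxxxxx"

# on the instance, appended to /etc/rc.local so the attach happens at every boot
java -cp '/opt/attach-volume/*' AttachVolume >> /var/log/attach-volume.log 2>&1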

There you have it!

Using the Ruby Fog library to connect with Eucalyptus

There is a popular Ruby cloud library called Fog. I’ve seen a number of applications that use it for connecting to AWS EC2. I’ve done some testing and it is pretty simple to get Fog talking to Eucalyptus. The first things you need are your access id and secret key, much like you’d get from EC2. These are available on the credentials tab in Eucalyptus. The other thing you need to specify is the endpoint. In the case of Eucalyptus, that will point to the cloud endpoint for your private cloud (or in this example, the Eucalyptus Community Cloud).

This is an example credentials file, stored in ~/.fog

#######################################################
# Fog Credentials File
#
:default:
  :aws_access_key_id:       IyJWpgObMl2Yp70BlWEP4aNGMfXdhL0FtAx4cQ
  :aws_secret_access_key:   7DeDGG2YMOnOqmWxwnHD5x9Y0PKbwE3xttsew
  :endpoint:                http://ecc.eucalyptus.com:8773/services/Eucalyptus

Notice that the eucalyptus endpoint requires port 8773 and the “/services/Eucalyptus” path.
You can use the Fog interactive tool to test this out. Notice we’re using the AWS compute provider because the Eucalyptus cloud is API compatible with EC2.

# fog
  Welcome to fog interactive!
  :default provides AWS and AWS
>> servers = Compute[:aws].servers
  <Fog::Compute::AWS::Servers
    filters={}
    []
  >
>>

As you can see, there are no servers running. By replacing “servers” with “images”, you can show a list of images available on the ECC.
To start an instance, you can run a command like this:

>> servers = Compute[:aws].servers.create(:image_id => 'emi-9ACB1363', :flavor_id => 'm1.small')
  <Fog::Compute::AWS::Server
    id="i-3D7A079C",
    ami_launch_index=0,
    availability_zone="open",
    block_device_mapping=[],
    client_token=nil,
    dns_name="euca-0-0-0-0.eucalyptus.eucasys.com",
    groups=["default"],
    flavor_id="m1.small",
    image_id="emi-9ACB1363",
    kernel_id="eki-6CBD12F2",
    key_name=nil,
    created_at=Wed Oct 19 15:19:16 UTC 2011,
    monitoring=false,
    placement_group=nil,
    platform=nil,
    product_codes=[],
    private_dns_name="euca-0-0-0-0.eucalyptus.internal",
    private_ip_address=nil,
    public_ip_address=nil,
    ramdisk_id="eri-A97113E4",
    reason="NORMAL:  -- []",
    root_device_name=nil,
    root_device_type=nil,
    state="pending",
    state_reason=nil,
    subnet_id=nil,
    tenancy=nil,
    tags=nil,
    user_data=nil
  >
>> servers = Compute[:aws].servers
  <Fog::Compute::AWS::Servers
    filters={}
    [
      <Fog::Compute::AWS::Server
        id="i-3D7A079C",
        ami_launch_index=0,
        availability_zone="open",
        block_device_mapping=[],
        client_token=nil,
        dns_name="euca-0-0-0-0.eucalyptus.eucasys.com",
        groups=["default"],
        flavor_id="m1.small",
        image_id="emi-9ACB1363",
        kernel_id="eki-6CBD12F2",
        key_name=nil,
        created_at=Wed Oct 19 15:19:16 UTC 2011,
        monitoring=false,
        placement_group=nil,
        platform=nil,
        product_codes=[],
        private_dns_name="euca-0-0-0-0.eucalyptus.internal",
        private_ip_address=nil,
        public_ip_address=nil,
        ramdisk_id="eri-A97113E4",
        reason="NORMAL:  -- []",
        root_device_name=nil,
        root_device_type=nil,
        state="pending",
        state_reason={},
        subnet_id=nil,
        tenancy=nil,
        tags={},
        user_data=nil
      >
    ]
  >

You’ll notice that now the instance (which is in “pending” state) appears in the list of servers.
I won’t show more here, but we’ve established basic connectivity to a Eucalyptus cloud. I hope this helps enable many existing and new applications to work with Eucalyptus!

Migrating an EC2 AMI to Eucalyptus

There have been different instructions for using an image from Amazon’s EC2 on a local Eucalyptus cluster. This seems to be what worked best for me.

The basic steps are: launch an instance of the AMI, run euca-bundle-vol with your Eucalyptus credentials, upload the bundle, and register it. While it would be possible to use the download-bundle/un-bundle method detailed in this post, that only works with images that your account created. The use case I’m addressing here is to get starting images for building some custom images within your private cloud. Another use case is duplicating custom images from private to public cloud for purposes of cloud-bursting. That’ll be covered in another post.

Specifically, I’ll be converting ami-1a837773 (Ubuntu-Maverick-32bit):

ec2-run-instances ami-1a837773 -k dak-keypair

When that boots, scp the credentials zip file that you got from the ECC (or your own cloud) to the instance (i.e. scp -i dak-keypair euca2*.zip ubuntu@50.16.60.6:.). (UPDATE: my image didn’t have zip installed, so I repackaged the zip as a tar.gz.) Because Ubuntu images don’t allow root login, we can only copy files into the user directory. Ideally, we don’t want credentials on the root filesystem because they’ll end up in the bundle. So, the first thing we’ll need to do after logging into the instance is to move the zip file to the /mnt directory (ephemeral store). (There are additional security concerns that may apply. This post at alestic.com covers that well.)

On the instance;

sudo mv euca2*.zip /mnt
cd /mnt
sudo unzip euca2*.zip
source eucarc

To bundle/upload the image, you’ll need the euca2ools. There are some instructions here that help. This Maverick image already has them installed.

If the image has a default kernel specified (as this Maverick one does), that aki id won’t work on Eucalyptus. For the ECC, looking at the list of images shows that many of them specify the eki-6CBD12F2 kernel, so I will also use that when overriding the EC2 kernel. If you run your own Eucalyptus installation, it is easy to get the default kernel id via the management interface on the “Configuration” tab. Take note of the ramdisk id also, since that goes hand-in-hand with the kernel.

In the case of a private Eucalyptus installation, network restrictions probably won't allow the EC2 instance to upload to Eucalyptus directly. One way around that is to download a gzipped image to your local machine and run euca-bundle-image there prior to upload. That is time consuming, and since I'm working with the ECC here, all of the operations can be run on the EC2 instance.
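For reference, that local alternative would look roughly like this (the image URL and file names are placeholders):

wget http://example.com/maverick-i386.img.gz && gunzip maverick-i386.img.gz
euca-bundle-image -i maverick-i386.img -r i386 --kernel eki-6CBD12F2 --ramdisk eri-A97113E4 -d /tmp
euca-upload-bundle -b dak-images -m /tmp/maverick-i386.img.manifest.xml
euca-register dak-images/maverick-i386.img.manifest.xml

Since I'm bundling on the instance instead, here is what I actually ran: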
sudo -E euca-bundle-vol -p Ubuntu-10.10-Maverick-32bit -s 2048 -d /mnt -r i386 --kernel eki-6CBD12F2 --ramdisk eri-A97113E4
euca-upload-bundle -b dak-images -m /mnt/Ubuntu-10.10-Maverick-32bit.manifest.xml
euca-register dak-images/Ubuntu-10.10-Maverick-32bit.manifest.xml

At this point, you should be all set to launch the image.

Footnote: I've tested this with a Maverick S3 backed AMI and a Lucid EBS backed AMI.

A New Adventure

My professional career has been spent at 2 companies, Eastman Kodak and D.O.Tech/directThought (rebranded 9 years in). At Kodak, I worked on blood analyzers (which they spun off to J&J), Photo CD (which was made obsolete by newer technologies) and Picture Maker, which is still going strong after N generations of hardware/software. At directThought, I had the joy of working with a lot of great people and working on some interesting projects. I worked on a Picture Maker-like kiosk/web-app/desktop app combination at Xerox. They even created a new division for that project called Pixography. We had XML templates that described printed products like greeting cards, calendars, business cards, brochures and photo books (to name a few). Java 2D rendered everything for print and preview. We had tight integration between the 3 different apps. That project died in its original form, but lived on in spirit in a custom production printing installation out on the west coast. After that, I worked on some enterprise apps for Pfizer, a payroll application for Paychex, then back to more custom apps for the services arm of Xerox. At that point, I got involved in Amazon Web Services and started kicking the tires on this new service called EC2. During that time, I started my most successful open source project called typica, which still has a lot of users. After Xerox, I helped a number of customers run their apps on AWS’s infrastructure. We were fortunate enough to become an inaugural AWS System Integrator. I was also asked to learn how to write apps for this hot new platform called the iPhone. I’ve had a couple of apps in the app store, and worked on a few more. I also got to go to the only WWDC where Steve didn’t deliver the keynote (because he was getting a new liver). All in all, a pretty great experience with many interesting technologies under my belt.

Now, I feel like it is time for a change. I’ve just accepted a job with Eucalyptus Systems! They build infrastructure that powers clouds. They have a lot of great people working there and I am looking forward to doing my part to help the company grow, if not flourish in this exciting space. Since they just started business last year, I can say I’m now part of a fast growing startup! Very excited!