Deploying the Eucalyptus Management Console on Eucalyptus

The Eucalyptus Management Console can be deployed in a variety of ways, but we’d obviously like it to be scalable, highly available and responsive. Last summer, I wrote up the details of deploying the console with Auto Scaling coupled with Elastic Load Balancing. The CloudFormation service ties this all together by capturing all of the details of how to use these services in a single template. This post describes an example that works well on Eucalyptus (and AWS) and may help guide deployments of your own applications as well.

Let’s tackle a fairly simple deployment for the first round. For now, we’ll set up a launch configuration, an Auto Scaling group and an ELB. We’ll also set up a security group for the Auto Scaling group that allows access only from the ELB. Finally, we’ll set up a self-signed SSL cert for the console. In another post, we’ll add memcached and a CloudWatch alarm to automate scaling the console.

Instead of pasting pieces of the template here, why not open the template in another window? Under the “Resources” section, you’ll find the items listed above. Notice that “ConsoleLaunchConfig” pulls some values from the “Parameters” section, such as KeyName, ImageId and InstanceType. Also used is “CloudIP”, which gets included in a cloud-init script passed as UserData. Finally, notice the SecurityGroups section, which refers to the “ConsoleSecurityGroup” defined further down.

Right above that is the “ConsoleScalingGroup”, which pulls in the launch config we just defined. Next, “ConsoleELB” defines an ELB that listens for HTTPS traffic on port 443 and talks to port 8888 on the auto-scaled instances. It defines a simple health check to verify that the console process is running.
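
If you were doing the ELB piece by hand with euca2ools, it would amount to something like the commands below. This is only an illustrative sketch: the load balancer name, availability zone, certificate ARN and health check values here are placeholders I picked, not values taken from the template.

eulb-create-lb -z PARTI00 -l "lb-port=443, protocol=HTTPS, instance-port=8888, instance-protocol=HTTP, cert-id=<your-cert-arn>" console-lb
eulb-configure-healthcheck console-lb --target TCP:8888 --interval 30 --timeout 5 --healthy-threshold 2 --unhealthy-threshold 2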

The “ConsoleSecurityGroup” uses attributes of the ELB to allow access on port 8888 only from the ELB’s security group. We also allow SSH ingress from a provided CIDR via “SSHLocation”.

To automate deploying the console using this CloudFormation template, I wrote a shell script that passes the required values and creates the stack. At the top of the script, there are three values you will need to set based on your cloud. CLOUD_IP is the address of your cloud front end. SSH_KEY is the name of the keypair you’d like to use for SSH access into the instances (if any). IMAGE_ID must be the EMI ID of a CentOS 6.6 image on your cloud. There are other values you may wish to change just below those; they are used to create a self-signed SSL certificate. This cert is installed in your account and its name is passed to the “euform-create-stack” command along with several other values we’ve already discussed.
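
The script itself isn’t pasted here, but a rough sketch of what it does looks like the following. The parameter names come from the template’s Parameters section described above; the template file name and the exact euform-create-stack flag spellings are assumptions on my part and may differ from the real script, so treat this as an outline rather than a drop-in replacement.

#!/bin/bash
# values you must set for your cloud
CLOUD_IP=192.168.51.5      # address of the cloud front end (example value)
SSH_KEY=my-ssh-key         # keypair for ssh access into the instances
IMAGE_ID=emi-12345678      # a CentOS 6.6 EMI registered on your cloud

# generate a self-signed cert and upload it to your account
openssl genrsa 2048 > console.pem
openssl req -new -key console.pem -subj "/CN=eucalyptus-console" -out csr.pem
openssl x509 -req -in csr.pem -signkey console.pem -days 365 -out console.crt
euare-servercertupload -s console-cert --certificate-file console.crt --private-key-file console.pem

# create the stack, passing the values the template expects
# (the real script also passes the uploaded cert name as a template parameter)
euform-create-stack --template-file console-cf-template.json \
    -p KeyName=$SSH_KEY -p ImageId=$IMAGE_ID -p InstanceType=m1.medium \
    -p CloudIP=$CLOUD_IP console-stack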

If the script runs successfully, you can check the status of the stack by running “euform-describe-stacks console-stack”. Once the stack is complete, the output section will show the URL to use to connect to your new ELB front end.

To adjust the number of instances in the scaling group, you can use “euscale-update-auto-scaling-group --desired-capacity N”. Since the template defines the max count as 3, you would need to make other adjustments to run a larger deployment.
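
For example (the scaling group’s actual name is generated by CloudFormation from the stack and resource names, so the name below is just illustrative; euscale-describe-auto-scaling-groups will show you the real one):

# find the name CloudFormation gave the scaling group
euscale-describe-auto-scaling-groups
# then adjust the number of console instances
euscale-update-auto-scaling-group console-stack-ConsoleScalingGroup-XXXXXXXXXX --desired-capacity 2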

Check back again to see how to configure a shared memcached instance and auto-scale the console.

Running the Eucalyptus Management Console on Eucalyptus with the triangle services

At Eucalyptus, we’ve leveraged the existing compute infrastructure to deploy some new services. For example, ELB and our imaging service use workers that run as instances. This is useful because the cloud administrator doesn’t need to configure new machines to handle these tasks. The workers can be dynamically provisioned on top of existing infrastructure, and that’s what cloud is all about! The management console can be deployed on top of Eucalyptus as well. In fact, using ELB and Auto Scaling, we can provide a single service endpoint for users and run a scalable back end. Since Eucalyptus provides RHEL/CentOS packages, I started with a CentOS 6 image from http://emis.eucalyptus.com/. This image includes cloud-init, so I can very easily provision the console on an instance with user data. Here is the cloud-init script you would supply in user data. The one value that needs to be adjusted for your install is the cloud IP address (10.111.5.35).

#cloud-config
# vim: syntax=yaml
#
# This config installs the eucalyptus and epel repos, then installs and
# configures the eucaconsole package
runcmd:
 - [ yum, -y, install, "http://downloads.eucalyptus.com/software/eucalyptus/nightly/4.0/centos/6/x86_64/eucalyptus-release-4.0-0.1.el6.noarch.rpm" ]
 - [ yum, -y, install, eucaconsole ]
 - [ sed, -i, "s/localhost/10.111.5.35/", /etc/eucaconsole/console.ini ]
 - [ service, eucaconsole, restart ]

Here are some commands you can run with euca2ools to set things up. First, assume the above script is stored in a file called “console-init”

eulb-create-lb -z PARTI00,PARTI01 -l "lb-port=80, protocol=HTTP, instance-port=8888, instance-protocol=HTTP" console-lb

The cloud I used had the two clusters shown above. I also set up port 80 on the ELB to talk to port 8888 on the instances. We could also set up port 443 and SSL termination instead. Now, run “eulb-describe-lbs console-lb --show-long” and you’ll notice the owner-alias and group-name values. That’s the internal security group you’ll need to authorize port 8888 ingress for, which means the instances will only accept traffic on the console port from the ELB. Run the euca-authorize command using the owner-alias and group-name (e.g. euca-authorize -P tcp -p 8888 -o euca-internal-276586128672-console-elb -u 641936683417 console-as-group).

euscale-create-launch-config -i emi-22536a68 -t m1.medium --group console-as-group --key dak-ssh-key --monitoring-enabled -f console-init console-launch-config

The launch config needs the CentOS 6 EMI ID. I also used an m1.medium since it provides more memory while still using a single CPU. You can certainly dedicate more resources to single instances as you see fit. Specifying an SSH key is optional, but it’s handy for debugging if things go pear-shaped.

euscale-create-auto-scaling-group -l console-launch-config -m 1 --desired-capacity 2 --max-size 4 --grace-period 300 -z PARTI00,PARTI01 --load-balancers console-lb consolegroup

The Auto Scaling group ties things together. After the last command runs, you should see 2 instances pending. Once those are up, “eulb-describe-instance-health console-lb” will show you the state of the instances from an end-user perspective. An “InService” instance can handle requests going through the ELB, whereas “OutOfService” instances may still be installing/configuring per cloud-init. The grace period determines how long the scaling group waits for those to become ready. There is a lot more we could do with CloudWatch data and Auto Scaling. For now, this setup lets you manually adjust the number of instances you dedicate to the console scaling group. You can point your browser at the ELB DNS name and see the console login screen!

Extra Credit

Let’s set up SSL termination on the ELB. You may already have your own cert; if not, you can generate a self-signed one. Here are the commands to generate self-signed certs:

openssl genrsa 2048 > myssl.pem
openssl req -new -key myssl.pem -out csr.pem
openssl x509 -req -in csr.pem -signkey myssl.pem -days 365 -sha512 -out myssl.crt
chmod 600 myssl.*

Now you have the key and cert you need; the csr.pem file can be discarded. Next, upload the cert:

euare-servercertupload -s myssl --certificate-file myssl.crt --private-key-file myssl.pem

To get the ARN for this cert, run “euare-servercertgetattributes -s myssl”

Now, add the listener to the ELB

eulb-create-lb-listeners console-lb --listener "protocol=HTTPS,lb-port=443,instance-port=8888,instance-protocol=HTTP,cert-id=arn:aws:iam::276586128672:server-certificate/myssl"

Now you can use the console with https! To see details of the ELB, run “eulb-describe-lbs console-lb --show-long”. You might want to remove the port 80 listener. To do that, type “eulb-delete-lb-listeners -l 80 console-lb”.

 

Adventures in memcached integration

As developers, we sometimes run into problems that are somewhat… challenging. That’s part of the fun of writing code though. I like trying to find clever ways to solve a problem. This was the case when trying to integrate memcached into the Eucalyptus Management Console.

Version 4.0 of the console uses Gunicorn, which uses separate worker processes to handle requests. To implement any kind of effective caching, we needed a shared cache, and memcached is a pretty obvious choice. Since we were using Pyramid, Beaker seemed like an obvious option. Beaker does have support for memcached, but as the author points out, dogpile.cache is a much better choice as a cache interface library. Dogpile.cache has backends for memcached, redis and others, which allows for some more interesting choices architecturally.

Our application uses boto to talk to both Eucalyptus and AWS. To start with, we wanted to cache image lists since they don’t change often and they can be fairly large. Dogpile.cache has regions you configure (generally for different expiration times); we set up short_term, long_term and others for our application. While working on a prototype for this, I ran into two main issues, which I’ll cover in detail: pickle doesn’t handle all boto object graphs, and invalidating cached data.

Pickled Botos

We have an array of boto.ec2.image.Image objects that need to be cached. The memcached backend for dogpile.cache can use one of a few Python interfaces to memcached; I chose python-memcached. It pickles the data before sending it to the memcached server. For those who don’t know, pickling is a way to encode Python data and can be used to marshal and unmarshal object graphs. Anyway, some boto objects don’t marshal very well. I ran into this about two years ago when working on the first version of the console, which used the JSONEncoder to send JSON versions of the boto objects to the browser as AJAX responses. I had to write my own JSONEncoder to handle the objects which didn’t marshal properly. The JSONEncoder supports passing your own implementation which handles object conversion, so that made life a little easier. The Pickler also supports this, but the implementation is buried down in the python-memcached package and there is no way to pass your own pickler down from the dogpile.cache layer. (I feel a pull request coming…) What I chose to do instead was to iterate over the image list and make adjustments to the object graphs prior to storing them in the cache. Certainly, this isn’t ideal, but it works for now. In doing this, I was able to delete some values from the object graph that I don’t care about, which saves time and space in the cache. I also found that (in this case) the boto.ec2.blockdevicemapping.BlockDeviceType object contained a circular reference which was causing the pickler to barf. I trimmed this out during my iteration and pickling worked fine!

The hard part was figuring out which object was causing the problem. I found a Stack Overflow article that helped here. It showed how to extend the Pickler to either log what it was operating on, or catch exceptions (as I added for my purposes). Here’s my code:

import pickle

class MyPickler (pickle.Pickler):
  def save(self, obj):
    #print 'pickling object', obj, 'of type', type(obj)
    try:
      pickle.Pickler.save(self, obj)
    except:
      # show which object graph the pickler choked on
      print "--------- object dict = "+str(obj.__dict__)

I found it very helpful to see which object was causing the problem, and I could insert a breakpoint to inspect that object when the problem occurred. In the memcache.py file of python-memcached, I had to change an import so that cPickle wasn’t used. That’s a native pickler which is much faster, but it doesn’t allow me to extend it in this way. This is clearly only a debugging tool and the standard package code should be used in production.

Invalidate == Delete

Each item stored in a cache region has a key generated. When using the @cache_on_arguments decorator, the cache key is created based on the string form of the arguments passed to the cache function. The decorator takes a namespace argument, so I was able to specify an additional key component so that any image values being cached all included “image” in the cache key. By default the key is also run through sha1 to create a digest to get consistent length (and obfuscated) cache keys. This works well and would have been all I had to do except that I couldn’t simply rely on the configured expiration of the cache region. There are cases where we needed to invalidate the set of data in the cache due to changes initiated within the application. In that case, our user would expect to see the new data immediately.

To invalidate, we would need to know the cache key used to refer to the data in the cache and perform a delete on the key. Since the cache key was generated for us, I had no idea what to use for deletion. I could have reverse engineered it, but if something changed in the underlying library, that could be fragile. Fortunately, a cache region can be given a key generator function when it is configured. We could use our own code to generate the cache key and call that again to invalidate the cache. This is the key generator I’m using:

from hashlib import sha1

def euca_key_generator(namespace, fn):
  def generate_key(*arg):
    # generate a key:
    # "namespace_arg1_arg2_arg3..."
    key = namespace + "_" + "_".join(str(s) for s in arg[1:])

    # return cache key
    # apply sha1 to obfuscate key contents
    return sha1(key).hexdigest()

  return generate_key

To use this to invalidate a cache (based on args), I wrote another function:

def invalidate_cache(cache, namespace, *arg):
  key = euca_key_generator(namespace, None)(*arg)
  cache.delete(key)

The namespace and arg list are passed to the key generator as you can see. This is merely a helper function. To invalidate the image cache, I needed to call the above function with the proper arguments. These are the same arguments passed in to the cache function (which uses the decorator).

The work on shared caching is currently in a branch, but will likely be merged into develop over the next month or so.

Size Problem

After beating my head against a wall for a while, I found there is a size limitation in memcached: it will only take values up to 1MB in size unless you recompile it. Fortunately, there is a handy solution. Since values get pickled, they compress really well. The python-memcached library supports compression, but you need to enable it. By default, min_compress_len is zero, which means it never tries to compress the pickled data. In fact, the code silently returns from the set method having done nothing. This is where the frustration came in. I ended up spending some quality time in pdb to figure out that I could configure a dogpile.cache region with a min_compress_len greater than zero to get the underlying code to compress my data. Bingo! My large data set went from 3MB to 650K. This is how I configured my regions:

long_term.configure(
    memory_cache,
    expiration_time=int(settings.get('cache.long_term.expire')),
    arguments={
        'url': [memory_cache_url],
        'min_compress_len': 1024,
    },
)

I realize that 650K is not that far from 1MB, so splitting up the data may be needed at some point. The failure mode is simply a performance hit, though, not a fatal error.

Memcached Debug Tips

I learned a couple of things about monitoring my memcached server while debugging. Two tips that helped (a sample session follows the list):

  • run memcached from a shell with -vv option. You’ll get useful output about get, set, send and delete operations.
  • telnet into the server using “telnet localhost 11211”. You can run commands like “stats” and “stats items”.
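
Putting those together, a debugging session looks roughly like this (the memcached flag and the stats commands are the ones mentioned above):

# terminal 1: run memcached in the foreground with verbose logging
memcached -vv

# terminal 2: talk to the server directly
telnet localhost 11211
stats
stats items
quit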

IAM Role support for the Eucalyptus Management Console

If you like using IAM Roles in Eucalyptus, there’s a nice option for you to try out. Since Eucalyptus 4.0 was just released with a brand new management console, I thought it would be a good time to let you know about a feature branch you can try out if you don’t mind installing from source. Please find instructions in the README for installing dependencies and running from source; you can install using “python setup.py install”.

If you haven’t seen the new console yet, it adds support for IAM users and groups. This means you can manage users, groups and policies for the account, all from the console. This branch adds support for IAM roles. Alongside users and groups, you’ll be able to create, view and delete roles. To use roles, you assign them to instances or launch configurations (for Auto Scaling), which allows you to grant special privileges to instances. I’ve added the ability to assign a role in both the new instance wizard and the launch configuration wizard. The required IAM instance profile is handled for you by the console.

(The original post included several screenshots showing a few of the changes.)

 

Installing the Eucalyptus Console from source and packages

I previously posted some information about the new User Console we’ve been working on at Eucalyptus Systems. There has been a lot of activity, and we’ve shown it to a lot of users to get feedback. We will be releasing it officially very soon, but until then, you can run it yourself a couple of ways: build from source, which is very easy, or install the nightly builds we provide for RHEL 6 and CentOS 5 and 6.

Did You Get the Package?

Packages are available in the nightly directory here:

http://downloads.eucalyptus.com/software/eucalyptus/nightly/3.2/

Configure the repo like this:

rpm -Uvh http://downloads.eucalyptus.com/software/eucalyptus/nightly/3.2/centos/6/x86_64/eucalyptus-release-3.2.0-0.1.el6.noarch.rpm

You’ll also need the ELRepo repository configured:

rpm -Uvh http://elrepo.org/elrepo-release-6-4.el6.elrepo.noarch.rpm

Next, run

yum install -y eucalyptus-console

Next, you’ll need to configure the cloud location and perhaps some other things. The file is located at /etc/eucalyptus-console/console.ini. The config file has a lot of settings, but there are just a few you need to understand to get going quickly.

clchost – this is the IP or dns name for the eucalyptus cloud you’ll connect to

uiport – defaults to 8888, but you can use a different port if you like

sslcert, sslkey – these values are used to configure SSL, which you don’t need to do for development

usemock – this is important if you don’t have a cloud to talk to. Setting this to true instructs the user console to load mock data, and many features operate on the mock data (though many also don’t work as well). In this mode, the console can be run standalone for simple demos or for working on the browser side, such as when you want to change branding or other look-and-feel items.
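
For a typical install, pointing the console at your cloud is a one-line change. The sed below assumes the shipped default value for clchost is “localhost” (and an example cloud IP); if your console.ini differs, just edit the file directly.

sed -i 's/localhost/192.168.51.5/' /etc/eucalyptus-console/console.ini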

Now, start the service. You’ll see something like this:


# service eucalyptus-console start
Generating self-signed certificate: [ OK ]
Generating cookie secret: [ OK ]
Starting eucalyptus-console: [ OK ]

By default, SSL is enabled. You can connect to the application using https://localhost:8888/ (assuming you’re on the same host, otherwise substitute the right hostname or IP). Skip down to the “Once You’re Running” section below.

Using the Source

A source install is also very easy, but there are some differences you will need to be aware of. First, the code lives on github here: https://github.com/eucalyptus/eucalyptus/tree/maint/3.2/testing

Notice that I’m showing you the maint/3.2/testing branch. That is because we’re putting the very latest fixes there. If you’re a little more risk-averse, you may want to simply run from maint/3.2/master instead. We push changes from testing to master after the code has passed QA to a reasonable degree.

To get started, you’ll need:

  • a git client
  • python 2.6 (or 2.7)
  • boto
  • m2crypto
  • tornado (2.1 or higher)
On CentOS 6,

rpm -Uvh http://elrepo.org/elrepo-release-6-4.el6.elrepo.noarch.rpm

yum install -y git python python-boto m2crypto python-tornado

or, Ubuntu 12.04,

sudo apt-get install git python-boto python-m2crypto python-tornado

Now, grab the source.

git clone git://github.com/eucalyptus/eucalyptus.git
cd eucalyptus
git checkout maint/3.2/testing
git pull origin maint/3.2/testing

For better or for worse, the user console is in a subdirectory of the main eucalyptus source tree. This means you pull down a whole lot more than you really need if you're only interested in the console. The good news is that in the console directory, you have a completely independent project that doesn't have any build or run-time dependencies on the rest (aside from connecting to the cloud). To get the console running, you'll first need to configure one or two things.  The console/eucaconsole/console.ini file has many settings, which we cover in the package install section above.

Once you've set things up to your liking, simply run

cd console
./launcher.sh

You should see a message like "2012-11-10 23:56:43 INFO Starting Eucalyptus Console". If not, there may be other issues you need to fix first. If you have problems and need help, visit #eucalyptus-ui or one of the other #eucalyptus channels on freenode.net.

If you saw this message, that's great! It's very likely you're ready to connect with the browser. Assuming you are running the console on the same machine as your browser, use the appropriate localhost URL like "http://localhost:8888/" if you kept the default port and ssl config (off).

Once You're Running

You should see a login page. If you're using the mock (usemock: True), you can put anything in the fields, or nothing, and simply log in. If you're connecting to a Eucalyptus cloud, you'll need to use the regular web login credentials you'd use to log in to the Eucalyptus admin console (read on...)

* Warning: Science Content *

The user console uses temporary session credentials from the STS service to make calls, so your actual access key and secret key are never passed to the console app. As a result, accounts that want to use the console must have a) a password set up and b) access credentials assigned. These things can be done by the cloud admin via the admin console or using euca2ools like this:

euare-accountcreate -a testuser1
euare-useraddloginprofile --delegate testuser1 -u admin -p euca123
euare-useraddkey --delegate testuser1 -u admin

Another way to do this would be to use the Eucalyptus Admin UI, which you can reach in your web browser at https://<your-cloud-frontend>:8443/, and there are some helpful docs here.

Assuming you were able to login, you should be ready to explore!

Eustore, a set of image tools for your cloud

I want to talk about something new we’re working on at Eucalyptus, but first let me start with a little background. Quite simply, it is a hassle to get an image installed. The current process for Eucalyptus (as we document it) is to download a tarball, untar it, bundle/upload/register the kernel/ramdisk and image itself. That’s about 11 steps. We thought there must be a simpler way to do this.

What we came up with is eustore. In the spirit of euca2ools (euca- and euare- commands), eustore commands give you access to a Eucalyptus image store. That’s store, as in storehouse, not a shop. We have some updated “base” images available on our servers, along with a catalog file that contains metadata about those images. The eustore tools simply give you access to those, and let you issue a single command to download and install an image on your local cloud (or any Eucalyptus cloud you have access to).

The code has been checked in with the euca2ools. To install and use the commands, you’ll need to build from source and tweak the setup.py. Let’s go over that now.

If you don’t have bzr, you’ll need to install it, then grab the code with


bzr branch lp:euca2ools

You’ll find the eustore commands in euca2ools/commands/eustore. The commands still need to be added to setup.py, as does the package, to get them installed with the rest of euca2ools. Here’s a patch you can apply with “patch -p0 < setup.patch” (assuming you copy this into a file named setup.patch):

--- setup.py 2012-01-20 17:17:48.000000000 -0800
+++ setup.py 2012-01-20 17:18:53.000000000 -0800
@@ -161,10 +161,13 @@ setup(name = "euca2ools",
 "bin/euca-unbundle",
 "bin/euca-unmonitor-instances",
 "bin/euca-upload-bundle",
- "bin/euca-version"],
+ "bin/euca-version",
+ "bin/eustore-describe-images",
+ "bin/eustore-install-image"],
 url = "http://open.eucalyptus.com",
 packages = ["euca2ools", "euca2ools.nc", "euca2ools.commands",
- "euca2ools.commands.euca", "euca2ools.commands.euare"],
+ "euca2ools.commands.euca", "euca2ools.commands.euare",
+ "euca2ools.commands.eustore"],
 license = 'BSD (Simplified)',
 platforms = 'Posix; MacOS X; Windows',
 classifiers = [ 'Development Status :: 3 - Alpha',

Once that file is patched, installing euca2ools (+eustore) is as simple as running (as root)

python setup.py install

Once you do this, you’ll have access to two new commands: eustore-describe-images and eustore-install-image. Here are the command summaries:

Usage: eustore-describe-images [options]

Options:
 -h, --help show this help message and exit
 -v, --verbose display more information about images

 

Usage: eustore-install-image [options]

Options:
 -h, --help show this help message and exit
 -i IMAGE_NAME, --image_name=IMAGE_NAME
 name of image to install
 -b BUCKET, --bucket=BUCKET
 specify the bucket to store the images in
 -k KERNEL_TYPE, --kernel_type=KERNEL_TYPE
 specify the type you're using [xen|kvm]
 -d DIR, --dir=DIR specify a temporary directory for large files
 --kernel=KERNEL Override bundled kernel with one already installed
 --ramdisk=RAMDISK Override bundled ramdisk with one already installed

eustore-describe-images lists the images available at emis.eucalyptus.com. You have the ability to change the URL (using the EUSTORE_URL environment variable, which is helpful sometimes). The output looks like this:

centos-x86_64-20111228      centos  x86_64  2011.12.28  CentOS 5      1.3GB root
centos-x86_64-20120114      centos  x86_64  2012.1.14   CentOS 5      1.3GB root
centos-lg-x86_64-20111228   centos  x86_64  2011.12.28  CentOS 5      4.5GB root
centos-lg-x86_64-20120114   centos  x86_64  2012.1.14   CentOS 5      4.5GB root
debian-x86_64-20111228      debian  x86_64  2011.12.28  Debian 6      1.3GB root
debian-x86_64-20120114      debian  x86_64  2012.1.14   Debian 6      1.3GB root
debian-lg-x86_64-20111228   debian  x86_64  2011.12.28  Debian 6      4.5GB root
debian-lg-x86_64-20120114   debian  x86_64  2012.1.14   Debian 6      4.5GB root
ubuntu-x86_64-20120114      ubuntu  x86_64  2012.1.14   Ubuntu 10.04  1.3GB root
ubuntu-lg-x86_64-20120114   ubuntu  x86_64  2012.1.14   Ubuntu 10.04  4.5GB root

To install one of these images on your local cloud, you’d use eustore-install-image like this:

eustore-install-image -i debian-x86_64-20120114 -b myimages

This command installs the named image into the myimages bucket on the cloud you are set up to talk to. As with all euca2ools, you’d first source the eucarc file that came with your cloud credentials. I should point out something about uploading the kernel and ramdisk to your cloud: only the admin can install these. If you have admin credentials, the above command will work fine. If you don’t and want to install an image anyway, you would use the --kernel and --ramdisk options to refer to a kernel ID and ramdisk ID already installed on the cloud. That way, the command will ignore the kernel and ramdisk bundled with the image and refer to the previously uploaded ones.
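
For example, a non-admin user could install the same image against a kernel and ramdisk the cloud admin has already registered. The eki/eri IDs below are placeholders; substitute the ones registered on your cloud.

eustore-install-image -i debian-x86_64-20120114 -b myimages --kernel eki-12345678 --ramdisk eri-12345678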

The project management is happening here: https://projects.eucalyptus.com/redmine/projects/eustore/

It is discussed during the images meetings on IRC (calendar here).

Migrating an EC2 AMI to Eucalyptus

There have been different sets of instructions for using an image from Amazon’s EC2 on a local Eucalyptus cluster. This is what worked best for me.

The basic steps are: launch an instance of the AMI, run euca-bundle-vol with your Eucalyptus credentials, upload the bundle, and register it. While it would be possible to use the download-bundle/un-bundle method detailed in this post, that only works with images that your account created. The use case I’m addressing here is getting starting images for building custom images within your private cloud. Another use case is duplicating custom images from a private to a public cloud for cloud-bursting; that’ll be covered in another post.

Specifically, when converting ami-1a837773 (Ubuntu-Maverick-32bit):

ec2-run-instances ami-1a837773 -k dak-keypair

When that boots, scp the credentials zip file that you got from the ECC (or your own cloud) to the instance (e.g. scp -i dak-keypair euca2*.zip ubuntu@50.16.60.6:.). (UPDATE: my image didn’t have zip installed, so I repackaged the zip as a tar.gz.) Because Ubuntu images don’t allow root login, we can only copy files into the user directory. Ideally, we don’t want credentials on the root filesystem because they’ll end up in the bundle, so the first thing we’ll need to do after logging into the instance is to move the zip file to the /mnt directory (the ephemeral store). (There are additional security concerns that may apply; this post at alestic.com covers that well.)

On the instance:

sudo mv euca2*.zip /mnt
cd /mnt
sudo unzip euca2*.zip
source eucarc

To bundle/upload the image, you’ll need the euca2ools. There are some instructions here that help. This Maverick image already has them installed.

If the image has a default kernel specified (as this Maverick one does), that AKI ID won’t work on Eucalyptus. For the ECC, looking at the list of images shows that many of them specify the eki-6CBD12F2 kernel, so I will also use that when overriding the EC2 kernel. If you run your own Eucalyptus installation, it is easy to get the default kernel ID via the management interface on the “Configuration” tab. Take note of the ramdisk ID also, since it goes hand in hand with the kernel.
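
If you'd rather find those IDs from the command line, listing the registered images and filtering for kernel and ramdisk entries works too (the grep pattern here is just illustrative):

euca-describe-images | grep -E 'eki-|eri-'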

In the case of a private Eucalyptus installation, network restrictions probably won't allow the EC2 instance to upload to Eucalyptus directly. One way around that is to download a gzipped image to your local machine and run euca-bundle-image there prior to uploading. That is time consuming, and since I'm working with the ECC here, all of the operations can be run on the EC2 instance.

sudo -E euca-bundle-vol -p Ubuntu-10.10-Maverick-32bit -s 2048 -d /mnt -r i386 --kernel eki-6CBD12F2 --ramdisk eri-A97113E4
euca-upload-bundle -b dak-images -m /mnt/Ubuntu-10.10-Maverick-32bit.manifest.xml
euca-register dak-images/Ubuntu-10.10-Maverick-32bit.manifest.xml

At this point, you should be all set to launch the image.

Footnote: I've tested this with a Maverick S3-backed AMI and a Lucid EBS-backed AMI.

Connecting to the Eucalyptus Community Cloud with typica

Eucalyptus recently announced a public “cloud” sandbox known as the Eucalyptus Community Cloud. It is a place where you can kick the tires to some degree, and since it supports a subset of the Amazon EC2 API, you can generally point EC2 tools at the ECC. This post will deal with using typica to interact with the ECC from within your Java software.

The first thing to do is follow the ECC link above and create an account. If you already have an account for the Eucalyptus forums, you can log in and apply for an ECC account. Once you get a confirmation e-mail and confirm the account, you’ll be able to log in and get your access id and secret key. To do that, visit the ECC, log in and select “show keys”, which reveals the QueryID (access id) and Secret Key. While you’re here, you should also download credentials. This gives you a zip that contains something we’ll need later.


Jec2 ec2 = new Jec2(props.getProperty("aws.accessId"), props.getProperty("aws.secretKey"), true, "ecc.eucalyptus.com", 8773);
ec2.setResourcePrefix("/services/Eucalyptus");

Let me explain this code. The first line creates a new Jec2 object that is configured to talk to the ECC. The “props” variable came from reading a property file containing the access id and secret key. The next parameter specifies SSL, and then you pass the hostname for the ECC and the port it uses. The second line sets the resource prefix, the URL path Eucalyptus uses for its services. After that, it is business as usual. The EC2 sample code demonstrates some normal operations, and the API docs give a more complete picture.

When running the code, there’s a special option you’ll need as compared to using typica to talk to AWS. Since Eucalyptus clouds are generally installed with self signed SSL certs, you’ll need to specify a file that came with that credentials download in your java options. If you don’t do this, you’ll likely see a “SSLPeerUnverifiedException: peer not authenticated” error.


$ java ... -Djavax.net.ssl.trustStore=<path to files from credentials zip>/jssecacerts ... TestJec2

Amazon EC2 – Boot from EBS and AMI conversion

Amazon recently announced an important new feature for their Elastic Compute Cloud. Previously, each instance was based on an image that could be a maximum of 10 GB in size. So, each machine you brought up could have a root partition up to 10 GB in size and additional storage would need to be added in other ways. The size restriction alone is somewhat limiting. Amazon has not only addressed that, but given users some other very powerful abilities.

Now, you can define an image in an EBS snapshot. That means the size of your root partition can be as large as 1 TB. Yes, that’s 100 times larger than the old 10 GB limit. Beyond the obvious benefit of having larger images, you can also stop instances. Stopping an instance is different than terminating an instance. The distinction is important because stopping an instance is very much like hitting the “pause” button. It doesn’t take a lot to realize that pausing a running instance and being able to start it up again later is very powerful! Instances also tend to boot faster off EBS. As you might expect, if you create a really large volume for a root partition (hundreds of GBs), it will take longer to come up; that’s just because it takes longer to create larger volumes than smaller ones.

Let’s go further and look at how powerful it is to have snapshots as the basis for images. Because you can create EBS volumes from a snapshot, you can mount a volume based on your snapshot (which represents your image) and make modifications to it! This is immensely helpful when trying to make changes to an image. Previously, it was somewhat more awkward to modify an image: you actually had to boot it up and run it. But now, even if there is an error that prevents it from running properly, you can access the image storage and make changes. Very useful!

Of course, judging by the number of public AMIs out there, there are a great number of images backed by S3 that people will want to convert. Toward this end, I came up with a script to convert AMIs from the old style to the new. Here’s the CliffsNotes version.

Use an instance in the same region as your image to do the following,

  • download the image bundle to the ephemeral store
  • unbundle the image (resulting in a single file)
  • create a temporary EBS volume in the same availability zone as the instance
  • attach the volume to your instance
  • copy the unbundled image onto the raw EBS volume
  • mount the EBS volume
  • edit /etc/fstab on the volume to remove the ephemeral store mount line
  • unmount and detach the volume
  • create a snapshot of the EBS volume
  • register the snapshot as an image, and you’re done!

During the private beta for this feature, I created an AMI to handle all of this, so you boot the AMI with a set of parameters and it does the dirty work. The script uses the standard API and AMI tools that Amazon supplies. I’ll roll that out on the public cloud shortly.

Here’s the interesting portion of the script (parsing arguments and setting up environment variables for the tools has been omitted):

Using the AMI ID, get the manifest name and architecture
AMI_DESC=`$EC2_HOME/bin/ec2dim |grep $AMI_ID`
MANIFEST=`echo $AMI_DESC | awk '{ print $3 }'`
ARCH=`echo $AMI_DESC | awk '{ print $7 }'`
MANIFEST_PATH=`dirname $MANIFEST`/
MANIFEST_PREFIX=`basename $MANIFEST |awk -F. '{ print $1 }'`

Download the bundle to /mnt

echo grabbing bundle $MANIFEST_PATH $MANIFEST_PREFIX
/usr/local/bin/ec2-download-bundle -b $MANIFEST_PATH -a $ACCESS_ID -s $SECRET_KEY -k pk.pem -p $MANIFEST_PREFIX -d /mnt

Unbundle the image into a single (rather large) file.


echo unbundling, this will take a while
/usr/local/bin/ec2-unbundle -k pk.pem -m /mnt/$MANIFEST_PREFIX.manifest.xml  -s /mnt -d /mnt

Create an EBS volume of 10 GB. This size is used because it is the largest size for an S3-based AMI. Using the launch options I show at the end of this article, you can increase that at run time. Notice that the availability zone comes from instance metadata. We must wait until the volume is created before moving on.


ZONE=`curl http://169.254.169.254/latest/meta-data/placement/availability-zone`
VOL_ID=`$EC2_HOME/bin/ec2addvol -s 10 -z $ZONE | awk '{ print $2 }'`
STATUS=creating
while [ $STATUS != "available" ]
do
    echo volume $STATUS, waiting for volume create...
    sleep 3
    STATUS=`$EC2_HOME/bin/ec2dvol $VOL_ID | awk '{ print $5 }'`
done

Attach the volume

INST_ID=`curl http://169.254.169.254/latest/meta-data/instance-id`
$EC2_HOME/bin/ec2attvol $VOL_ID -i $INST_ID -d $EBS_DEV

Here’s where we turn the image into a real volume, using our old friend “dd”

echo copying image to volume, this will also take a while
dd if=/mnt/$MANIFEST_PREFIX of=$EBS_DEV

Mount the volume and remove ephemeral store entry from /etc/fstab. This is required because “Boot from EBS” doesn’t use the ephemeral store by default.

mount $EBS_DEV /perm
cat /perm/etc/fstab |grep -v mnt >/tmp/fstab
mv /perm/etc/fstab /perm/etc/fstab.bak
mv /tmp/fstab /perm/etc/

Then, unmount and detach the volume. We’re nearly there.

umount /perm
$EC2_HOME/bin/ec2detvol $VOL_ID -i $INST_ID

Create a snapshot and wait for it to complete.

SNAP_ID=`$EC2_HOME/bin/ec2addsnap $VOL_ID -d "created by createAMI.sh" | awk '{ print $2 }'`
# now, wait for it
STATUS=pending
while [ $STATUS != "completed" ]
do
    echo snapshot $STATUS, waiting for snap complete...
    sleep 3
    STATUS=`$EC2_HOME/bin/ec2dsnap $SNAP_ID | awk '{ print $4 }'`
done

Finally, delete the volume and register the snapshot


$EC2_HOME/bin/ec2delvol $VOL_ID
$EC2_HOME/bin/ec2reg -s $SNAP_ID -a $ARCH -d $DESCR -n $MANIFEST_PREFIX

To run your AMI with a larger root partition, use a command like this (which specifies 100 GB):

ec2-run-instances --key <KEYPAIR> --block-device-mapping /dev/sda1=:100 <AMI_ID>

Amazon Relational Database Service

I’m pretty excited about this new service from Amazon. They’ve taken a lot of the pain out of running a relational database in the cloud. Specifically, they now support managed instances running MySQL. Amazon RDS handles provisioning, operating and scaling your database. RDS does for the database tier what Elastic Load Balancing and Auto Scaling did for the application tier.

Amazon RDS provides APIs and command line tools to manage your database, removing complicated scripts and much of the “muck” that was somewhat tricky before. There are also additional CloudWatch metrics that give information about the state of the database, such as the number of open connections.

There are 4 main groups of commands with RDS.

  • High Level Instance Management
  • Database Configuration
  • Security Group Management
  • Backup and Restore Services

Initially, you might create a DB instance and authorize access to an existing EC2 security group (perhaps the one used by your application-tier Auto Scaling group). Going further, you can get more sophisticated about configuring the database. You can configure a parameter group and set the kinds of things you’d have configured in your my.cnf file. You can also add storage while the database is running. Finally, you’ll want to make use of snapshots to make backups of your database.

To help with monitoring, Amazon RDS provides more than the additional CloudWatch parameters. They’ve added the ability to track access details, so you can request events related to instances and security groups for the past two weeks.

To support relational databases in EC2, there are two new instance types, m2.2xlarge and m2.4xlarge, which both have high memory and higher I/O. This is great news and dovetails nicely with Amazon RDS (not by mistake, either).

  • High-Memory Double Extra Large Instance: 34.2 GB of memory, 13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units each), 850 GB of instance storage, 64-bit platform
  • High-Memory Quadruple Extra Large Instance: 68.4 GB of memory, 26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute Units each), 1690 GB of instance storage, 64-bit platform

I think they’ve done a good job of making database deployment and management easier in their cloud! I’m considering adding support to the typica Java client, and I’d appreciate feedback to help with that decision.

Some folks have been questioning the relevancy of SimpleDB in light of RDS. I think Mitch does a great job on elastician.com of discussing that topic. I have to agree that the scale issue still applies. Some applications can live with the limitations of SimpleDB and gain the advantage of massive scale. That is something that RDS cannot provide. Amazon RDS does give a good set of management APIs for running MySQL in the cloud, and people shouldn’t expect more than that.