Archive Page 3
There have been different instructions for using an image from Amazon’s EC2 on a local Eucalyptus cluster. This seems to be what worked best for me.
The basic steps are, launch an instance of the AMI, run euca-bundle-vol with your Eucalyptus credentials, upload bundle, register. While it would be possible to use the download-bundle/un-bundle method detailed in this post, that only works with images that your account created. The use case I’m addressing here is to get starting images for building some custom images within your private cloud. Another use case is when duplicating custom images from private to public cloud for purposes of cloud-bursting. That’ll be covered in another post.
specifically, when converting ami-1a837773 (Ubuntu-Maverick-32bit)
ec2-run-instances ami-1a837773 -k dak-keypair
When that boots, scp the credentials zip file that you got from the ECC (or your own cloud) (i.e. scp -i dak-keypair euca2*.zip email@example.com:.) (UPDATE: my image didn’t have zip installed, so I repackaged the zip as a tar.gz) Because Ubuntu images don’t allow root login, we can only copy files into the user directory. Ideally, we don’t want credentials on the root filesystem because they’ll end up in the bundle. So, the first thing we’ll need to do after logging into the instance is to move the zip file to /mnt directory (ephemeral store). (There are additional security concerns that may apply. This post at alestic.com covers that well.)
On the instance;
sudo mv euca2*.zip /mnt cd /mnt sudo unzip euca2*.zip source eucarc
To bundle/upload the image, you’ll need the euca2ools. There are some instructions here that help. This Maverick image already has them installed.
If the image has a default kernel specified (as this Maverick one does), that aki id won’t work on eucalyptus. For the ECC, looking at the list of images shows that many of them specify the eki-6CBD12F2 kernel, so I will also use that when overriding the EC2 kernel. If you run your own Eucalyptus installation, it is easy to get the default kernel id via the management interface on the “Configuration” tab. Take note of the ramdisk id also, since that goes hand-in-hand with the kernel.
In the case of a private Eucalyptus installation, network restrictions probably won't allow the EC2 instance to upload to Eucalyptus directly. One way to do that is downloading a gzipped image to your local machine, run euca-bundle-image prior to upload. That is time consuming and since I'm working with ECC here, all of the operations can be run on the EC2 instance.sudo -E euca-bundle-vol -p Ubuntu-10.10-Maverick-32bit -s 2048 -d /mnt -r i386 --kernel eki-6CBD12F2 --ramdisk eri-A97113E4</pre> euca-upload-bundle -b dak-images -m /mnt/Ubuntu-10.10-Maverick-32bit.manifest.xml euca-register dak-images/Ubuntu-10.10-Maverick-32bit.manifest.xml
At this point, you should be all set to launch the image.
Footnote: I've tested this with a Maverick S3 backed AMI and a Lucid EBS backed AMI.
Eucalyptus recently announced a public “cloud” sandbox known as Eucalyptus Community Cloud. It is a place where you can kick the tires to some degree and since they support a subset of the Amazon EC2 API, you can generally point EC2 tools at the ECC. This post will deal with using typica to interact with the ECC from within your Java software.
First thing to do is follow the ECC link above and create an account. If you already have an account to get into the Eucalyptus forums, you can login and apply for an ECC account. Once you get a confirmation e-mail and confirm the account, you’ll be able to login and get your access id and secret key. To do that, visit the ECC, login and select “show keys”, which reveal the QueryID (access id) and Secret Key. While you’re hear, you should also download credentials. This gives you a zip that contains something we’ll need later.
Jec2 ec2 = new Jec2(props.getProperty("aws.accessId"), props.getProperty("aws.secretKey"), true, "ecc.eucalyptus.com", 8773); ec2.setResourcePrefix("/services/Eucalyptus");
Let me explain this code. The first line creates a new Jec2 object, that is configured to talk to the ECC. The “props” variable came from reading a property file containing the access id and secret key. The next parameter specifies SSL. Then, you pass the hostname for the ECC and the port it uses. After that, it would be business as usual. The EC2 sample code demonstrates some normal operations, and the API docs give a more complete picture.
When running the code, there’s a special option you’ll need as compared to using typica to talk to AWS. Since Eucalyptus clouds are generally installed with self signed SSL certs, you’ll need to specify a file that came with that credentials download in your java options. If you don’t do this, you’ll likely see a “SSLPeerUnverifiedException: peer not authenticated” error.
$ java ... -Djavax.net.ssl.trustStore=<path to files from credentials zip>/jssecacerts ... TestJec2
My professional career has been spent at 2 companies, Eastman Kodak and D.O.Tech/directThought (rebranded 9 years in). At Kodak, I worked on blood analyzers (which they spun off to J&J), Photo CD (which was made obsolete by newer technologies) and Picture Maker, which is still going strong after N generations of hardware/software. At directThought, I had the joy of working with a lot of great people and working on some interesting projects. I worked on a Picture Maker-like kiosk/web-app/desktop app combination at Xerox. They even created a new division for that project called Pixography. We had XML templates that described printed products like greeting cards, calendars, business cards, brochures and photo books (to name a few). Java 2D rendered everything for print and preview. We had tight integration between the 3 different apps but that project died in its original form, but lived on in spirit in a custom production printing installation out on the west coast. After that, I worked on some enterprise apps for Pfizer, a payroll application for Paychex, then back to more custom apps for the services arm of Xerox. At that point, I got involved in Amazon Web Service and started kicking the tires on this new service called EC2. During that time, I started my most successful open source project called typica, which is still has a lot of users. After Xerox, I helped a number of customers run their apps on AWS’s infrastructure. We were fortunate enough to be come an inaugural AWS System Integrator. I was also asked to learn how to write apps for this hot new platform called the iPhone. I’ve had a couple of apps in the app store, and worked on a few more. I also got to go to the only WWDC where Steve didn’t deliver the keynote (because he was getting a new liver). All in all, a pretty great experience with may interesting technologies under my belt.
Now, I feel like it is time for a change. I’ve just accepted a job with Eucalyptus Systems! They build infrastructure that powers clouds. They have a lot of great people working there and I am looking forward to doing my part to help the company grow, if not flourish in this exciting space. Since they just started business last year, I can say I’m now part of a fast growing startup! Very excited!
A previous post talked about my need for some local, reliable storage in my home. That project led to investigating some other options. Since I’m a big fan of Amazon S3, it seemed like something I should involve in my storage solution. The Elastician (Mitch Garnaat) and I bought the same hardware and are working through the setup together. Here’s the rundown of the hardware including costs;
My previous post discusses the hardware in more detail and some of the choices. Here’s a picture of inside of the case once things were assembled. The observant among you would notice that one of the drives doesn’t have power. That’s because the case power supply didn’t have 2 SATA power connectors and the adapter cable was on order when this picture was taken. I’ll also point out that this case isn’t ideal for mounting several 3.5″ drives. With adapters, I can fit 4 in there, true. However, shopping around for something more to my liking is something I’d do differently next time. Thinking more about the software to run on the NAS has led to several projects including FreeNAS and OpenFiler. We decided to go with something we’re familiar with, Ubuntu. Ubuntu has instructions on their download page for creating a bootable flash drive. I tried the Mac OS-X method and failed, so I resorted the tool from pendrivelinux.com on the family window box. The Universal USB Installer they have works well and created good, bootable flash drives every time.
Creating a Bootable Flash Drive
To get things going, I needed to connect a mouse/keyboard/monitor. Once I configured the BIOS to boot from the USB HDD, it recognized the bootable flash drive and started bring Ubuntu up. It seems to take “forever” to boot up. I could hit “escape” to watch the console and found that it was timing out on the floppy drive, which I don’t have. I went into the BIOS settings to let it know there wasn’t a floppy drive attached and boot time went WAY down! I let the desktop come up, but since this is an install image, changes made aren’t saved. Having the 2nd flash drive will come in very handy now! Plug it into another USB port before prceeding. Select the “System”->”Administration” menus, then the “Install Ubuntu… ” option. There are steps on the install wizard that require special mention. On step 4, select “erase and use the entire disk”, and select your flash drive (not of the hard drivces!). In step 5, after you’ve entered the required information, select “log in automatically”, which will help when running headless later. Now the most critical part, step 7 has an “advanced” button you need to click. Make sure you select the proper device, because it defaults to /dev/sda (the first hard drive). You need to select /dev/sdd, which is the last device connected (the target flash drive). Let the install proceed and you’ll have a bootable ubuntu image we can start configuring.
Remote Desktop for Administration
Once it was up, I could use the desktop and configure Remote Desktop. Having played with the default VNC server, it seemed like the wrong option. It didn’t run unless I had a monitor attached, so I did some digging and found that tightVNC is a popular alternative. There are a few steps to getting it installed and running at boot, detailed here.
For another means of access, its a good idea to install ssh (“apt-get install openssh-server”)
Configuring the RAID
The Disk Utility also has a menu option to configure the RAID. It uses mdadm, but I heard some folks talking about using lvm. Linux Mag has an article that talks about both. I decided to go with the built-in option.
Run “apt-get install mdadm” in a termal window. You can then use “Disk Utility” (on the “System”->”Administration” menu). One thing I noticed is that if you play around with RAID config or do your own partitioning of the drives, the RAID wizard isn’t really happy about using those drives. If this is the case, select each drive and then “Format Drive”. Select the “Don’t Partition” option to reset the drive state. You’ll find that you can now select the drives in the RAID setup wizard.
I’ve set the drives up in a RAID 0 config. Prior to doing this, I did a performance test on a single drive and got an average read rate of 84MB/sec. Once the RAID was configured and formatted, I ran the same performance test and got a read rate of 155MB/sec, which is approaching double the speed! Now that’s what I was hoping for!
To get the RAID started at boot time, edit the /etc/mdadm/mdadm.conf file and replace the existing “DEVICE” line with these 2 lines;
DEVICE /dev/sda1 /dev/sdb1 ARRAY /dev/md0 devices=/dev/sda1,/dev/sdb1 auto=yes
Next, run “dpkg-reconfigure mdadm” and accept the defaults. Thanks to goldfisch.at for the help.
Now, to get it mounted, add this to the /etc/fstab
/dev/md0 /mdeia/RAID ext4 rw,nosuid,nodev,uhelper=udisks 1 2
I might have been able to say “defaults” in that options column, but I took what was there when I mounted the RAID manually using the disk utility.
Sharing the Storage
Initially, I’m setting up Samba to share with my household machines. I found this article at ubuntu.com to help me. I’m concerned with privacy, not because I don’t trust my family, but because I plan on backing up my laptop and I don’t want others messing with my files.
I created a “data” directory on the RAID drive. If you right-click on that folder, select “sharing options”. It brings up a dialog, and if you click “share this folder”, you’ll get prompted to install some packages (do it!). I discovered that I needed to use “smbpasswd” to set the share password. I’ll probably need to do this for each user I create to access the RAID.
The Amazon S3 Backup
For the Amazon S3 backup part, we’ve tossed around a number of different options. S3sync isn’t bad, but doesn’t allow for threaded uploads, and there’s the issue of how often do we kick it off. We asked, “what about running an S3 based filesystem and doing a RAID 1 on top of that and the RAID 0 local drives?”. That might be OK, but how about traffic control? What block size do we use, and what penalty do we pay for a larger block size when storing small files? Where do we store the local cache? Do we even want a local cache since we have a local disk array? Along those lines, we looked at S3Backer and others.
What is the solution when you don’t really think the available options are great? Right your own! We think that we can write a daemon tied into the file system notification (pynotify) and use boto for the S3 part. Stay tuned… I smell another open source project!
I’ve been shopping around for a packaged NAS solution that is inexpensive. I’ve looked at LG, Netgear, D-link, WD, Cisco and others. Ultimately, I found plenty of complaints about those and they all seem to have some set of limitations that I just didn’t want to have to deal with. Being a “Maker”, I jumped at the chance to build my own NAS and some people recommended I look at using OpenSolaris. I used SunOS back in the day, then Solaris for many years at work, so it seemed like familiar territory.
My requirements are fairly simple. I want to start with a GigE network connection and 2 1TB drives, in a RAID 1 config for fully redundant storage. The option of adding more drives later, and going to a more sophisticated RAID config would be nice. Our house has a Windows 7 machine for family use and my Mac OSX 10.6 laptop. Probably more machines to come later, and I want to support them all. Likely a mix of Windows/Mac and maybe some Linux down the road.
The other day, Amazon.com had some 1TB drives on sale so I jumped at them. They are WD Green drives, so they aren’t ideal for RAID, but they were $56 each. For a more serious RAID box, you should really use a drive intended for that purpose. The big thing, aside from speed is to do with the Time Limited Error Recovery setting, which tells the drive to not spend time trying to recover data itself (which can hold up the controller for up to 2 minute), but to let the host handle things. RAID is good at this, so that’s why the drive ought to be configured for a short timeout.
Once I had those drives, I thought I’d see what I could piece together for an inexpensive system. I found a mini-tower case w/ power supply for $57 and MATX motherboard for $57, 4GB DDR2 RAM for $95 and a Core 2 Duo processor for $67. So far, we’re coming in < $400 before tax. Now, the next day, I realized I forgot to add a boot device. I wanted something more reliable than disc, and quite a bit cheaper. Flash drives fit the bill, so I picked up 2 8GB drives for $14 each. I figure I can boot off one, then script a backup to the other “just in case”. Here’s the list;
Already, I can see that there are some things I might have done differently, like spend more on drives, less on RAM (smarter shopping, perhaps). On the plus side, with those “Green” drives and the power saving features on the motherboard, my NAS will probably consume less power than most. The parts are due to arrive over the next 2 days, so I’ll post more details and some pictures as I go.
UPDATE: The direction has changed since I originally posted this and the project in its new form is being documented here.
Amazon has just come out with yet another service to help build your app on AWS. Their Simple Notification Service is a pub/sub setup where you create topics and users can subscribe. Delivery is via a “push” mechanism, so subscribers won’t need to poll for new messages. Output can be one of several protocols which include http/https/email/email-json or sqs. While the e-mail output can be useful for managing things like users watching a comment or blog post. The other options are clearly geared towards consumption by other software. Imagine the http options being used to implement a web service callback. SQS is clearly helpful for building loosely coupled services in the cloud. Now, SNS can help feed into those services.
typica now supports SNS. Subversion contains the latest code. A release will be coming shortly. (check this space for updates)
Here’s an example of how to use typica to create a topic, subscribe, send a message, then unsubscribe and remove the topic;
NotificationService sns = new NotificationService(props.getProperty("aws.accessId"), props.getProperty("aws.secretKey")); Result<String> ret = sns.createTopic("TestTopic"); String topicArn = ret.getResult(); System.err.println("topicArn: "+topicArn); sns.subscribe(topicArn, "email", "firstname.lastname@example.org"); System.out.println("Waiting till subscription is confirmed."); System.out.println("Check your e-mail, confirm, then press <return>"); System.in.read(); List<SubscriptionInfo> subs = sns.listSubscriptionsByTopic(topicArn, null).getItems(); String subArn = subs.get(0).getSubscriptionArn(); System.err.println("subscriptionArn: "+subArn); sns.publish(topicArn, TEST_MSG, "[SNS] testing..."); sns.unsubscribe(subArn); sns.deleteTopic(topicArn);