Eventually Consistent, or Immediately with SimpleDB

Amazon SimpleDB has been a service that provides a schema-less data store with some fairly simple query abilities. One of the catches has always been that when you put a piece of data in, you might not get it back in a query right away. That time delay is generally very short (like < 1 second), but there are no guarantees of this. The cause of this goes back to the fundamental tradeoffs in highly available and redundant systems, such as those Amazon builds. Werner Vogels does a pretty good job of laying out the tradeoffs in his "Eventually Consistent” blog post, and others he links to. Essentially, it’s the CAP theorem, which talks about how you can have only 2 of Consistency, Availability or Partitioning (which gets at redundancy).
Using SimpleDB has required an understanding of how inconsistent results will affect your application. Mostly, it has been important that the application never rely on data being there immediately. This can cause problems when trying to give the user completely up to date information.

SimpleDB now supports consistent read, put and delete. There is a cost for consistency, which is potentially higher latency. Let’s take a look at the new features.
The simplest improvement is in the Select and GetAttributes calls. Supplying the “ConsistentRead=true” parameter ensures consistent data is returned. Now, SimpleDB is an option for storing application state. A regular Put can be used and consistent read will get the current state, always.
What is far more interesting is what has been done with put and delete. PutAttributes has some optional parameters that define a condition that must be met to allow the put to continue. In the request, you can define an expected value for some attribute, or specify that the attribute simply must exist. One application for consistent put is a counter. Imagine an item that has counter attribute. To increment the counter, simply read the value, then do a conditional put, specifying the new value, but only if the old value is set. The request will fail if another writer got there first. A retry loop is required, as in this pseudo-code

value = read(counter);
while (put counter=value+1, if counter==value fails) {
    value = read(counter);
}

The same can happen for the delete operation. In a future post, I’ll talk about how to use typica to access these new features from Java. (added! https://coderslike.us/2010/03/09/persistent-counters-in-simpledb/)

Advertisements

2 thoughts on “Eventually Consistent, or Immediately with SimpleDB

  1. Hello,

    I wasn’t sure how to contact you, so I’m posting here. I just started looking into your java simple db project (typica) and I was hoping you could answer one question for me. I would like to make one query to get all the attributes for a filtered set of items for one domain. It seems that using the typica simple db api, I’d have to make two calls – one to get the domain, and then the call to get the items with attributes.

    Is there a way to just make one query call and get the items I’m looking for?

    Thanks,
    Rom

    1. Rom,
      You always need to get a Domain object to work with, but you can keep re-using it as much as you like. There is no state held there. You can call one Domain object from separate threads with no side effects.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s