Wednesday, November 20, 2013

Introducing Scalable Walrus Tech Preview

As many of us know, Eucalyptus includes a component called Walrus that implements a significant subset of the S3 API.

However, object storage in Eucalyptus has so far been unable to scale past a single server. DRBD support in the high availability version of Walrus allowed data to be replicated, but it suffered from the same drawback: it could not scale out.

This tech preview, which is based on the Eucalyptus 3.4.0 code base, makes some major architectural changes to the object storage subsystem.

We now have the ability to use third-party distributed stores. For this tech preview, we have chosen RiakCS from Basho. RiakCS is relatively easy to install and maintain, and is designed primarily as an object store.

We have added a new component called the Object Storage Gateway (OSG). You can have multiple instances of the OSG per Eucalyptus cloud. OSGs act as proxies for incoming S3 requests. They handle API validation, translation and, most importantly, integration and checks for Identity and Access Management (IAM), which is the mechanism used for access control and policy enforcement for the rest of the cloud. The OSG also stores bucket and object metadata, which lets it perform access checks quickly without having to query the backend store. In addition, the OSG has logic to handle concurrent PUTs and resolve conflicts. The following figure demonstrates the high-level architecture.
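To make the metadata-based access check a little more concrete, here is a minimal Python sketch. It is not the OSG implementation: the `Gateway` and `BucketMetadata` classes, the `READ`/`WRITE` permission names and the in-memory dictionary standing in for the backend store are all assumptions made up for illustration. The only point it tries to show is that a request can be rejected from locally held metadata before the backend is ever contacted.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class BucketMetadata:
    """Hypothetical per-bucket record kept by the gateway."""
    owner: str
    # Maps an account name to the permissions granted to it.
    grants: Dict[str, Set[str]] = field(default_factory=dict)


class Gateway:
    """Toy stand-in for an OSG: checks access locally, then proxies."""

    def __init__(self, backend: Dict[Tuple[str, str], bytes]):
        self.backend = backend      # stand-in for the backend store
        self.buckets: Dict[str, BucketMetadata] = {}
        self.backend_calls = 0      # counts requests passed to the backend

    def create_bucket(self, account: str, bucket: str) -> None:
        self.buckets[bucket] = BucketMetadata(owner=account)

    def is_allowed(self, account: str, bucket: str, permission: str) -> bool:
        # Decided purely from gateway metadata; the backend is not queried.
        meta = self.buckets.get(bucket)
        if meta is None:
            return False
        if meta.owner == account:
            return True
        return permission in meta.grants.get(account, set())

    def put_object(self, account: str, bucket: str, key: str, data: bytes) -> None:
        if not self.is_allowed(account, bucket, "WRITE"):
            raise PermissionError(f"{account} may not write to {bucket}")
        self.backend_calls += 1
        self.backend[(bucket, key)] = data

    def get_object(self, account: str, bucket: str, key: str) -> bytes:
        if not self.is_allowed(account, bucket, "READ"):
            raise PermissionError(f"{account} may not read from {bucket}")
        self.backend_calls += 1
        return self.backend[(bucket, key)]


gateway = Gateway(backend={})
gateway.create_bucket("alice", "photos")
gateway.put_object("alice", "photos", "cat.jpg", b"meow")
```

In this sketch a `get_object` call from an account with no grant raises `PermissionError` and leaves `backend_calls` unchanged, which is the behaviour the paragraph above describes. Real IAM policy evaluation is considerably richer than a single owner-or-grant lookup.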

Eucalyptus includes a DNS service that responds to queries for Eucalyptus components. The DNS server will return a list of active OSGs in a round robin manner.
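The round robin behaviour can be sketched in a few lines of Python. This is only an illustration of the general technique, not the Eucalyptus DNS code: the `RoundRobinResolver` class, its method names and the IP addresses are hypothetical. Each query returns the full list of active gateways, rotated by one position so that successive clients see a different first entry.

```python
from collections import deque
from typing import Iterable, List


class RoundRobinResolver:
    """Toy resolver that rotates the list of active gateway addresses."""

    def __init__(self, addresses: Iterable[str]):
        self._active = deque(addresses)

    def resolve(self) -> List[str]:
        # Return the current ordering, then rotate so the next query
        # starts with a different gateway.
        answer = list(self._active)
        self._active.rotate(-1)
        return answer

    def mark_down(self, address: str) -> None:
        # Drop a gateway that is no longer active (no-op if unknown).
        if address in self._active:
            self._active.remove(address)


# Example addresses are made up for illustration.
resolver = RoundRobinResolver(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
first = resolver.resolve()
second = resolver.resolve()
```

Because most clients connect to the first address in a DNS answer, rotating the list spreads new connections across the gateways without requiring a separate load balancer in front of them.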

We recommend that you use Nginx or a similar load balancing proxy in front of your RiakCS installation. In addition, make sure that the proxy is configured to pass large responses through correctly.

Instructions for trying out this new feature can be found here.

Note that this tech preview is only designed to showcase scale-out storage integrated with Eucalyptus. It does not currently run instances, so please only register the Cloud Controller (CLC) and one or more Object Storage Gateways (OSGs). Obviously, we don't recommend that you use this tech preview in production. You will need a fully configured RiakCS installation and at least one user defined. RiakCS installation instructions can be found here.

In the 4.0 release time frame (early April), we will be adding support for other distributed stores, such as Swift and Ceph, as well as connecting to Amazon S3 directly.

Enjoy and let us know what you think!

