Tuesday, February 18, 2014

Using Roles in Eucalyptus to control access

One of the cool new features in Eucalyptus 4.0 is support for IAM Roles. IAM roles let users define a way for applications to request temporary security credentials on the user's behalf. A role has a set of access control policies associated with it, so that an application only has access to the services and resources that are defined by the policy. Policies are defined as JSON documents.

For instance, the following IAM policy allows listing the contents of an S3 bucket named "mybucket":

{
  "Statement": [
    {
      "Action": [
        "s3:ListBucket"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::mybucket"
    }
  ]
}
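To make the effect of such a policy concrete, here is a minimal sketch of how a request could be checked against it. This is not how Euare or Amazon IAM evaluate policies (real evaluation also handles wildcards, explicit Deny and conditions); the class and method names are made up for illustration, and matching is exact-string only.

```java
import java.util.Arrays;
import java.util.List;

class PolicySketch {
    /** One simplified policy statement: an effect, a list of actions and one resource ARN. */
    static final class Statement {
        final String effect;
        final List<String> actions;
        final String resource;

        Statement(String effect, List<String> actions, String resource) {
            this.effect = effect;
            this.actions = actions;
            this.resource = resource;
        }
    }

    /** Returns true only if some Allow statement names both the action and the resource. */
    static boolean isAllowed(List<Statement> policy, String action, String resource) {
        for (Statement s : policy) {
            if ("Allow".equals(s.effect)
                    && s.actions.contains(action)
                    && s.resource.equals(resource)) {
                return true;
            }
        }
        return false; // anything that is not explicitly allowed is denied
    }

    /** The example policy from this post: list access to "mybucket" only. */
    static List<Statement> examplePolicy() {
        return Arrays.asList(
            new Statement("Allow", Arrays.asList("s3:ListBucket"), "arn:aws:s3:::mybucket"));
    }

    public static void main(String[] args) {
        List<Statement> policy = examplePolicy();
        System.out.println(isAllowed(policy, "s3:ListBucket", "arn:aws:s3:::mybucket"));
        System.out.println(isAllowed(policy, "s3:DeleteBucket", "arn:aws:s3:::mybucket"));
        System.out.println(isAllowed(policy, "s3:ListBucket", "arn:aws:s3:::otherbucket"));
    }
}
```

Only the first check passes: the action and the bucket both have to match the statement.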

More information on IAM Roles can be found in Amazon's IAM documentation.

The service that is the equivalent of IAM in Eucalyptus is called Euare (I-am, Eu-are, get it?). The API calls we are interested in for IAM roles are:

  • GetRole
  • CreateRole
  • PutRolePolicy
  • AssumeRole

A really interesting thing about the way we have designed access control in Eucalyptus is that IAM accounts and roles can be used for internal communication between Eucalyptus components, in addition to serving user requests. That is, to the service, there is no difference between an API operation requested by an end user, by an application on an end user's behalf, or by another Eucalyptus component.

An example of this usage is in the Storage Controller (SC) code in 4.0. The SC is responsible for implementing EBS functionality in Eucalyptus. In AWS, snapshots are stored in S3; in Eucalyptus, they go through the Object Storage Gateway into either Walrus or a distributed backend like RiakCS. However, we don't want to give the Storage Controller access to do whatever it wants with buckets and objects. That would be bad if, for instance, the SC were compromised or misbehaving.

In addition, we do not want the notion of "special" internal operations that only Eucalyptus components are allowed to use. Such operations are bad from a security point of view, and they add more code paths to test and debug. So we simply use the same functionality that is available to end users.

This is where IAM roles enter the picture.

When Eucalyptus first bootstraps, a role for the S3 access for the SC is created in the "blockstorage" account. For example,

                
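// Build a CreateRole request for the role that the SC will assume.
CreateRoleType createRoleType = new CreateRoleType();
createRoleType.setDelegateAccount(StorageProperties.BLOCKSTORAGE_ACCOUNT);
// The assume role policy controls who is allowed to assume this role.
createRoleType.setAssumeRolePolicyDocument(StorageProperties.DEFAULT_ASSUME_ROLE_POLICY);
createRoleType.setPath("/blockstorage");
createRoleType.setRoleName(StorageProperties.EBS_ROLE_NAME);
// Dispatch the request to the Euare service and wait for the response.
CreateRoleResponseType createRoleResponseType = AsyncRequests.sendSync(euare, createRoleType);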

Note the reference to DEFAULT_ASSUME_ROLE_POLICY.

To use roles, the user that is trying to "assume" a role needs to be granted access to the AssumeRole API operation. Our assume role policy looks like:

{
  "Statement": [
    {
      "Action": [
        "sts:AssumeRole"
      ],
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "s3.amazonaws.com"
        ]
      }
    }
  ]
}

The "sts" prefix stands for Security Token Service. AsyncRequests.sendSync is the way to dispatch messages among components within Eucalyptus. It knows whether a service is local (i.e. on the same host as the client) or remote, but that is a topic for another day.

We then add an access policy for the role that was created earlier. For example,

{
  "Statement": [
    {
      "Action": [
        "s3:CreateBucket",
        "s3:ListBucket",
        "s3:DeleteBucket"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::snapshots"
    }
  ]
}

This policy allows access to the s3:CreateBucket, s3:ListBucket and s3:DeleteBucket operations against the resource arn:aws:s3:::snapshots, which, simply put, means that you can create, list or delete the bucket with the name "snapshots".

We can add other policies to this role to allow specific objects to be accessed. That's it.

To use this role, we use the AssumeRole operation. For example,

AssumeRoleType assumeRoleType = new AssumeRoleType();
// Request credentials that are valid for one hour.
assumeRoleType.setDurationSeconds((int) TimeUnit.HOURS.toSeconds(1));
assumeRoleType.setRoleSessionName("S3Session");
assumeRoleType.setRoleArn(roleArn);
// Send the request to the tokens (STS) service and wait for the response.
AssumeRoleResponseType assumeRoleResponseType = AsyncRequests.sendSync(tokens, assumeRoleType);
// Temporary credentials issued for the assumed role.
CredentialsType credentials = assumeRoleResponseType.getAssumeRoleResult().getCredentials();

This gives us temporary credentials that the SC can now use to interact with the OSG to upload and download snapshots. Notice the call to setDurationSeconds. This means that the credentials will expire in an hour and will have to be renewed.
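Because the credentials expire, whoever holds them needs some renewal logic. The following is a minimal sketch of one way to do that, not the actual SC implementation: the class names, the five minute safety margin and the Supplier that stands in for the AssumeRole request are all assumptions made for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

class RenewingCredentials {
    /** Hypothetical holder for a temporary access key and its expiry time. */
    static final class TempCredentials {
        final String accessKeyId;
        final Instant expiresAt;

        TempCredentials(String accessKeyId, Instant expiresAt) {
            this.accessKeyId = accessKeyId;
            this.expiresAt = expiresAt;
        }
    }

    private final Supplier<TempCredentials> assumeRole; // stands in for the AssumeRole request
    private final Duration renewBefore;                 // safety margin before expiry
    private TempCredentials current;

    RenewingCredentials(Supplier<TempCredentials> assumeRole, Duration renewBefore) {
        this.assumeRole = assumeRole;
        this.renewBefore = renewBefore;
    }

    /** Returns the cached credentials, fetching new ones if none exist or expiry is near. */
    synchronized TempCredentials get(Instant now) {
        if (current == null || !now.isBefore(current.expiresAt.minus(renewBefore))) {
            current = assumeRole.get();
        }
        return current;
    }

    /** Simulates lookups at 0, 30 and 56 minutes against credentials that last one hour. */
    static String demo() {
        Instant start = Instant.parse("2014-02-18T00:00:00Z");
        Instant[] clock = {start};
        int[] issued = {0};
        RenewingCredentials creds = new RenewingCredentials(
            () -> new TempCredentials("key-" + (++issued[0]), clock[0].plus(Duration.ofHours(1))),
            Duration.ofMinutes(5));

        StringBuilder seen = new StringBuilder();
        for (long minutes : new long[] {0, 30, 56}) {
            clock[0] = start.plus(Duration.ofMinutes(minutes));
            if (seen.length() > 0) seen.append(',');
            seen.append(creds.get(clock[0]).accessKeyId);
        }
        return seen.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "key-1,key-1,key-2"
    }
}
```

The first two lookups reuse the same key. The third falls inside the five minute margin before the one hour expiry, so a new key is requested.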

If you are an end user, as opposed to a Eucalyptus component, you can use the Amazon AWS SDK to perform the same operations. The SDK's CreateRole, PutRolePolicy and AssumeRole calls work exactly the same against Eucalyptus as they do against Amazon IAM.

Enjoy and let us know what you think!

Tuesday, December 17, 2013

Object Storage Gateway now in the 4.0 code base

At Eucalyptus, we are furiously working away on the next major release: Eucalyptus 4.0. One of the important features in 4.0 is an object storage proxy called the Object Storage Gateway. The OSG is the user facing component of Eucalyptus object storage and it implements the S3 API. The OSG is also responsible for user request authentication and Identity and Access Management (IAM) policy enforcement. You may have multiple instances of the OSG in Eucalyptus, one on each host, thus achieving scale out.

On the backend, the OSG can interact with Walrus, the single server legacy implementation, or with other distributed object stores like Riak/RiakCS, Ceph, Swift or with S3 itself. In 4.0, we will be testing with RiakCS, but support for other backends will be tested soon. In addition, we will be adding features to support a hybrid storage solution.

Code has now landed in the "testing" branch. You can get it here: https://github.com/eucalyptus/eucalyptus/tree/testing

Some notes:

  • Users will now need to register and configure at least one Object Storage Gateway (OSG). Instructions can be found here.
  • Users will need to pick the backend or "provider" that they want the OSG to use. This can be "walrus" or "s3".
  • The "Walrus" component is now internal only and used by the OSG for single machine setups.
  • We have tested integration with RiakCS as the backend. You can use your RiakCS installation if you pick "s3" as the backend.
  • Eucalyptus will need to know the access/secret key of the admin user for your RiakCS installation.
  • Eucalyptus does not install or configure Riak/RiakCS. That is left as an exercise to the reader. There are some simple install instructions here. You will need at least 5 nodes for a production setup, but fewer nodes are okay if you are just kicking the tires.
  • Not all S3 API features have been implemented for the OSG, such as logging and versioning. We are bringing these over from the legacy Walrus implementation.
  • This code base currently does not support data migration or conversion. So if you have an existing Walrus installation, your Walrus data will not be migrated, sorry!

Again, configuration instructions are available here if you missed the link above.

Please clone it, install it, test it and let us know what you think!

We are also planning on supporting multipart uploads, bucket lifecycle and bucket policies soon. Stay tuned.

For a list of additional features that are planned for 4.0, please see the Eucalyptus Roadmap.

Wednesday, November 20, 2013

Introducing Scalable Walrus Tech Preview

As many of us know, Eucalyptus includes a component called Walrus that implements a significant subset of the S3 API.

However, one of the issues with object storage support in Eucalyptus has been that it did not scale past a single server. DRBD support in the high availability version of Walrus allowed data to be replicated, but it still suffered from the same drawback: the inability to scale out.

In this tech preview that is based off of the Eucalyptus 3.4.0 code base, there are some major architectural changes to the object storage subsystem.

We now have the ability to use third party distributed stores. For the purpose of this tech preview, we have decided on RiakCS from Basho. RiakCS is relatively easy to install and maintain, and is designed primarily as an object store.

We have added a new component called the Object Storage Gateway (OSG). You can have multiple instances of the OSG per Eucalyptus cloud. OSGs act as proxies for incoming S3 requests. They handle API validation, translation and, most importantly, integration and checks for Identity and Access Management (IAM), which is the mechanism used for access control and policy enforcement for the rest of the cloud. The OSG also stores bucket and object metadata and can be used to quickly perform access checks without having to query the backend store. In addition, the OSG has logic to handle concurrent PUTs and resolve conflicts. The following figure demonstrates the high level architecture.





Eucalyptus includes a DNS service that responds to queries for Eucalyptus components. The DNS server will return a list of active OSGs in a round robin manner.
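As a rough illustration of round robin resolution, the sketch below rotates the list of active gateways by one position on every query, so successive clients start with a different OSG. This is not the Eucalyptus DNS code; the class name and the addresses are made up, and the exact rotation scheme is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class RoundRobinResolver {
    private final List<String> activeGateways;
    private int offset = 0;

    RoundRobinResolver(List<String> activeGateways) {
        this.activeGateways = new ArrayList<>(activeGateways);
    }

    /** Returns every active gateway, rotated so that each query starts at the next host. */
    synchronized List<String> resolve() {
        List<String> answer = new ArrayList<>(activeGateways);
        if (!answer.isEmpty()) {
            Collections.rotate(answer, -offset);   // shift left by the current offset
            offset = (offset + 1) % answer.size(); // next query starts one host later
        }
        return answer;
    }

    /** Collects the first address returned by four consecutive queries. */
    static String demo() {
        RoundRobinResolver dns = new RoundRobinResolver(
            Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3")); // hypothetical OSG addresses
        StringBuilder firsts = new StringBuilder();
        for (int query = 0; query < 4; query++) {
            if (query > 0) firsts.append(',');
            firsts.append(dns.resolve().get(0));
        }
        return firsts.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "10.0.0.1,10.0.0.2,10.0.0.3,10.0.0.1"
    }
}
```

Clients that pick the first address in the answer end up spread evenly across the gateways.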

We recommend that you use Nginx or a similar load balancing proxy in front of your RiakCS installation. In addition, make sure that proxy is configured to correctly pass through large responses.

Instructions for trying out this new feature can be found here.

Note that this tech preview is only designed to showcase scale-out storage integrated with Eucalyptus. It does not currently run instances, so please only register the Cloud Controller (CLC) and one or more Object Storage Gateways (OSGs). Obviously, we don't recommend that you use this tech preview in production. You will need a fully configured RiakCS installation and at least one user defined. RiakCS installation instructions can be found here.

In the future, in the 4.0 release time frame (early April), we will be adding support for other distributed stores, such as Swift and Ceph, as well as connecting to Amazon S3 directly.

Enjoy and let us know what you think!



Wednesday, January 23, 2013

Direct attached storage driver in Eucalyptus 3.2

Eucalyptus 3.2 introduces a new user console, better troubleshooting and logging, better reporting and support for EMC VNX SAN arrays.

A lesser known fact is that the EBS implementation in Eucalyptus 3.2 no longer requires loop devices. In 3.2, we open sourced a new driver called DASManager, which stands for Direct Attached Storage.

DASManager allows you to use raw block devices like JBOD arrays, as well as LVM volume groups with Eucalyptus. You are no longer limited to 256 volumes per storage controller host. The maximum capacity is limited by the number of LVM volumes that can be created and the size of your disk.

You do need a separate disk, partition, or a volume group though. Here is a link to the documentation that explains how to configure the Direct Attached Storage driver for EBS storage.

You can always access the latest Eucalyptus documentation here.

Enjoy!

Monday, January 14, 2013

Eucalyptus brought to you by Eucalyptus

This is an article about using Eucalyptus as a continuous delivery framework for automating software build and testing. At Eucalyptus, we use Eucalyptus itself to build and package Eucalyptus, as well as perform basic automated testing against Eucalyptus.

Eucalyptus as a continuous delivery framework

Eucalyptus is an open architecture and an implementation of Infrastructure-as-a-service (IaaS) cloud computing. It allows flexible, scalable and dynamic provisioning of compute and storage resources programmatically on behalf of users and their applications. The flexible and dynamic nature of the Eucalyptus platform makes it a suitable candidate for a variety of use cases where user applications need to scale as needed, based on demand.

One such use case is continuous delivery of software from a developer's desktop to packaged software that can install on hundreds of servers...for instance, Eucalyptus! With Jenkins, Git, Eucalyptus and some scripting, it is possible to set up a framework to package software from multiple development branches so that an end user is able to deploy it and quickly test features and bug fixes. In addition, a quality engineering organization can leverage Eucalyptus to run automated test suites against software as soon as it is built. 

If quality tests pass, packages can be "promoted" so that the end user has a level of confidence in the software they are installing. Software build, packaging and QA never stops. As opposed to a traditional model where one must wait for all code to be "ready" before packaging and QA work can start, continuous delivery and QA allow us to potentially release software at any point in the development cycle. Software packaging is no longer a bottleneck to software delivery. 

From "git commit" to tested software, automatically, in a matter of minutes.

Eucalyptus packaging development and management

Release Engineering @ Eucalyptus is tasked with creating and maintaining Eucalyptus software packages and repositories. Release engineers design the package installation process and develop packaging scripts that will automatically install the hundred or so pieces of third party software that Eucalyptus depends upon in the correct order. In addition, release engineers maintain the infrastructure necessary for this effort. For this purpose, we use our very own Eucalyptus cloud.

Build and Test Automation

Eucalyptus engineering works on two releases at any given point in time: A maintenance release to fix issues for users who are currently running Eucalyptus and the next development or feature release.

The Eucalyptus release engineering process has continuous delivery as a principle: automatically build and test software packages whenever a developer commits code to the mainline branch. Software packages are always available in a state in which they can be installed. When the decision is made to release the software as an "official" supported release, packages can be promoted to final release status.

Daily commits are available as nightly builds that end users can install to test out bug fixes or new features as they are being developed.

Our Eucalyptus cloud is a single front end installation with 14 nodes.

Eucalyptus User Console connected to Release Engineering cloud

Images registered and compute resources available for package build

We use an EC2 plugin with Jenkins to provision instances in Eucalyptus as worker nodes for Jenkins. The plugin is "tweaked" slightly so that idle instances are not terminated, which avoids instance start-up time for the next job. This modification can be turned off if the Eucalyptus scheduler is configured to suspend nodes when idle.

Jenkins configured to deploy to Eucalyptus

A top level Jenkins project is configured to monitor commits to mainline Eucalyptus code repositories. This project, in turn, starts a number of "downstream" tasks, which include building Eucalyptus from source, generating RPM packages, creating package repositories and triggering quality tests that will use packages from repositories.
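The chain of downstream tasks can be pictured as stages that run in order, where a stage only starts if the one before it succeeded. The sketch below models that idea in plain Java. It is not Jenkins configuration, the stage names are invented labels for the tasks described above, and the stop-on-first-failure behaviour is an assumption about how the jobs are wired together.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

class DownstreamChain {
    /** Runs stages in insertion order; a failed stage prevents later stages from starting. */
    static List<String> run(Map<String, BooleanSupplier> stages) {
        List<String> log = new ArrayList<>();
        for (Map.Entry<String, BooleanSupplier> stage : stages.entrySet()) {
            boolean ok = stage.getValue().getAsBoolean();
            log.add(stage.getKey() + (ok ? ":ok" : ":failed"));
            if (!ok) {
                break; // nothing downstream is triggered after a failure
            }
        }
        return log;
    }

    /** Hypothetical stage list; each supplier stands in for a real build or test job. */
    static Map<String, BooleanSupplier> exampleStages(boolean packagingSucceeds) {
        Map<String, BooleanSupplier> stages = new LinkedHashMap<>();
        stages.put("build-from-source", () -> true);
        stages.put("build-rpm-packages", () -> packagingSucceeds);
        stages.put("create-package-repository", () -> true);
        stages.put("trigger-qa-tests", () -> true);
        return stages;
    }

    public static void main(String[] args) {
        System.out.println(run(exampleStages(true)));
        System.out.println(run(exampleStages(false)));
    }
}
```

When packaging fails, the repository and QA stages never run, so quality tests are only ever pointed at packages that were actually built.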

Jenkins project set up to monitor the Eucalyptus git repository
Jenkins project set up to build Eucalyptus RPM packages after Eucalyptus binaries are successfully built from source




Jenkins project set up to create package repository, so that programs like "yum" can install Eucalyptus on Linux

RPM package build jobs running on Eucalyptus compute nodes

Eucalyptus open source and plugin packages are made available at a staging area for quality testing. In addition, product documentation builds are also continuously available.

Eucalyptus package repository staging area

A number of past builds are retained, so that packages corresponding to a specific commit hash can be accessed programmatically.





Packages corresponding to several git commit hashes are retained and can be retrieved

When package repositories are created, another downstream project triggers QA tests against the software. The quality engineering team at Eucalyptus maintains an in-house QA system that runs a number of "test units" to verify functionality. The QA system has an API that is triggered via Jenkins.


Eucalyptus QA triggered by downstream Jenkins project after packages have been built




The Eucalyptus QA system installs Eucalyptus from package repositories on top of a Linux distribution on bare metal and runs Eucalyptus through its paces.



Eucalyptus QA in action!


The Future: Continuous Deployment & Rolling Upgrades

The Eucalyptus cloud that builds packages, product documentation and QA tests is upgraded whenever the next stable release of Eucalyptus is published. With continuous deployment, we would have the option of automatically upgrading our Eucalyptus installation to the latest tested code base. This requires a high degree of confidence in automated QA, an installation process that is future proof and the ability to upgrade Eucalyptus without service interruptions. Rolling upgrades would enable us to update a Eucalyptus installation without downtime. We are not very far from this goal, but some kinks in the software are yet to be worked out.