I've been working on a couple of projects lately in which the goal is to assess, among other things, the financial viability and justification for running applications in an infrastructure-as-a-service environment such as Amazon Web Services (and particularly EC2 and S3), GoGrid or Joyent.
While I have developed the general model for this ROI calculation, one of the most difficult aspects has been working out the EC2 equivalent of traditional server capacity. For example, one of the existing applications currently runs on about 10 servers in a U.S. location and 3 servers in a European location. The servers host the app's database servers, application servers, web servers and other components, and range from single-CPU machines to single-CPU/dual-core, two-CPU/dual-core, four-CPU/dual-core and two-CPU/quad-core.
Now, how do you figure out the equivalent instance types on EC2?
Amazon provides the following information on its instance types:
Instance | EC2 Compute Units | Memory (GB) | Instance Storage (GB) | Platform | I/O Performance | Hourly Price |
Small | 1 (1 Virtual Core x 1 Compute Unit) | 1.7 | 160 | 32-bit | Moderate | $0.10 |
Large | 4 (2 Virtual Core x 2 Compute Unit) | 7.5 | 850 | 64-bit | High | $0.40 |
X-Large | 8 (4 Virtual Core x 2 Compute Unit) | 15 | 1690 | 64-bit | High | $0.80 |
High-CPU Medium | 5 (2 Virtual Core x 2.5 Compute Unit) | 1.7 | 350 | 32-bit | Moderate | $0.20 |
High-CPU XL | 20 (8 Virtual Core x 2.5 Compute Unit) | 7 | 1690 | 64-bit | High | $0.80 |
The columns Memory, Instance Storage, Platform and Price are all quite clear in this chart. I/O Performance leaves room for further explanation, but I'll put that aside. But what is an "EC2 Compute Unit"?
Amazon explains as follows:
The amount of CPU that is allocated to a particular instance is expressed in terms of these EC2 Compute Units. We use several benchmarks and tests to manage the consistency and predictability of the performance of an EC2 Compute Unit. One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.
It so happens that in 2007 AMD and Intel released both dual-core and quad-core models of the Opteron and Xeon chips, respectively. So it's already not clear what an EC2 Compute Unit compares to. But I'll assume they are referring to the dual-core (given the date this information was published and the fact that quad-cores seem too powerful).
Another thing to note is that the clock speed is quite low, but again, let's put that aside because for most common applications that's probably not the bottleneck.
In any case, I understand from this explanation that a High-CPU Extra-Large Instance provides the equivalent of 20 Opteron or Xeon dual-core chips. Impressive for $0.80/hour, but I suspect that's not the case. Perhaps they mean it is equivalent to the same number of cores in those machines, and not sockets (what most people would refer to as CPUs)?
Also, the Amazon site does not explain what "virtual cores" are in the EC2 environment, nor why each virtual core in the High-CPU instances has 2.5 EC2 Compute Units. 2.5? Really?
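To make the literal reading of Amazon's definition concrete, here is a minimal Python sketch that takes the compute-unit counts and hourly prices from the instance table above and converts them into a "GHz-equivalent" range and a price per compute unit. Treating compute units as simply additive (N units = N times 1.0-1.2 GHz) is an assumption made for illustration, not something Amazon states, and the function name `ecu_summary` is hypothetical.

```python
# Figures copied from the instance table above.
# name: (EC2 Compute Units, hourly price in USD)
INSTANCES = {
    "Small": (1, 0.10),
    "Large": (4, 0.40),
    "X-Large": (8, 0.80),
    "High-CPU Medium": (5, 0.20),
    "High-CPU XL": (20, 0.80),
}

# Amazon's stated range for one EC2 Compute Unit, in GHz.
ECU_GHZ_LOW, ECU_GHZ_HIGH = 1.0, 1.2


def ecu_summary(ecu, hourly_price):
    """Return (low GHz, high GHz, USD per compute-unit-hour).

    Assumption: compute units add up linearly, which is the naive
    reading of Amazon's definition and may not hold in practice.
    """
    return (ecu * ECU_GHZ_LOW, ecu * ECU_GHZ_HIGH, hourly_price / ecu)


for name, (ecu, price) in INSTANCES.items():
    low, high, per_ecu = ecu_summary(ecu, price)
    print(f"{name:16s} {low:4.1f} to {high:4.1f} GHz-equivalent, "
          f"${per_ecu:.2f} per compute-unit-hour")
```

On the table's figures, the three standard instances all work out to $0.10 per compute-unit-hour and the two High-CPU instances to $0.04, which is part of why the High-CPU XL rating looks too good to be true.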
To figure out the exact capacity you will get from each instance type, Amazon recommends performing benchmarks. They say:
One of the advantages of EC2 is that you pay by the instance hour, which makes it convenient and inexpensive to test the performance of your application on different instance families and types. One good way to determine the most appropriate instance family and instance type is to launch test instances and benchmark your application.
It's great that you are paying for the instances on an hourly basis, but for a small organization whose existing application is not currently designed to work in an EC2-like environment, conducting such a benchmark is quite a large and expensive task, especially if the whole point is to assess whether EC2 is an economically feasible solution in the first place.
A bit of a chicken & egg situation.
I'm wondering if anyone out there has conducted benchmarks that compare the compute capacity of EC2 instances versus "terrestrial" servers (i.e., physical servers). If you know of such benchmarks, please link to them in the comments below. I realize that the case may be different for different kinds of apps (CPU-intensive, data-intensive, I/O-bound, etc.), but still, such information might be helpful.
UPDATE:
There is an interesting comment to this post from Wes Felter with specifics on the EC2 Compute Units.
In addition, I posted this on the Cloud Computing group on Google Groups, and got some interesting responses. (If you're interested in the topic, follow the thread).
My former colleague from GigaSpaces, Jim Liddle, writes:
Geva,
Have a look at this post from another blog I contribute to: http://www.cloudiquity.com/2009/01/amazon-ec2-instances-and-cpuinfo/.
Here we post the results from /proc/cpuinfo for each of the EC2 instance types.
Also look at: http://www.cloudiquity.com/2009/01/amazon-ec2-network-and-s3-performa... where we have a look at EC2 network and S3 performance.
Hope this helps
Jim
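For anyone who wants to repeat the /proc/cpuinfo check Jim describes on their own instances, a small Python sketch along these lines should do. It only counts how many logical processors report each "model name"; the function name `summarize_cpuinfo` is hypothetical, and the script simply does nothing on systems that lack /proc/cpuinfo or do not report a "model name" field.

```python
import os
from collections import Counter


def summarize_cpuinfo(text):
    """Count logical processors per CPU model in /proc/cpuinfo-style text."""
    models = Counter()
    for line in text.splitlines():
        # Each entry looks like "model name<TAB>: <description>".
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            models[value.strip()] += 1
    return dict(models)


CPUINFO_PATH = "/proc/cpuinfo"  # Linux only

if os.path.exists(CPUINFO_PATH):
    with open(CPUINFO_PATH) as f:
        for model, count in summarize_cpuinfo(f.read()).items():
            print(f"{count} x {model}")
```

Keep in mind that this shows what the hypervisor exposes to the guest, not how much of the underlying physical CPU the instance is actually allowed to use.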
And JL Valente writes:
Please check project zeppelin on sourceforge. It supports exactly what you are looking for in terms of benchmarking thru DMTF lmbench. It was designed for the same reason that there is no other way to verify availability to promise and that a user really gets what it pays for.
My other former colleague from GigaSpaces, Dekel Tankel, reminded me that we actually conducted a benchmark of EC2 at GigaSpaces and he wrote me the following:
Small AMI = Single Core, 1.2 GHz
High CPU Ex. Large = 60-75% of Quad-Core 2.3 Ghz
The rest is somewhere in the area of Dual-Core 2.3 Ghz
Need to remember that due to the hypervisor limitations, workloads with high I/O characteristics will practically perform worse than the above
These numbers don't fit well with the information that Wes posted below, but obviously there may be different results for different apps, depending on how I/O, memory or CPU bound they are.
In any case I see a problem here, as according to the information provided by Amazon the High-CPU-XL instance should provide 20 times the compute capacity of a Small instance. Even according to Wes, who seems to be the kindest to Amazon in his numbers, it is only about 16 times the compute capacity. I realize this is not a very sophisticated measurement, but it's an indication.
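The same back-of-the-envelope comparison can be run on Dekel's figures quoted above. The sketch below assumes, very crudely, that capacity scales as cores times clock speed; that is an assumption for illustration only (it ignores architecture, memory and I/O), and the helper `core_ghz` is hypothetical.

```python
def core_ghz(cores, ghz, fraction=1.0):
    """Crude capacity proxy: cores x clock (GHz) x usable fraction."""
    return cores * ghz * fraction


# Dekel's figures from the GigaSpaces benchmark, as quoted above.
small = core_ghz(1, 1.2)            # Small = single core, 1.2 GHz
xl_low = core_ghz(4, 2.3, 0.60)     # High-CPU XL = 60% of a quad-core 2.3 GHz
xl_high = core_ghz(4, 2.3, 0.75)    # ... up to 75%

ratio_low = xl_low / small
ratio_high = xl_high / small

AMAZON_RATIO = 20 / 1  # compute units: High-CPU XL vs. Small

print(f"Implied by Dekel's figures: {ratio_low:.2f}x to {ratio_high:.2f}x")
print(f"Implied by compute units:   {AMAZON_RATIO:.0f}x")
```

Under that assumption, Dekel's figures put the High-CPU XL at roughly 4.6 to 5.75 times a Small instance, a long way from both Amazon's 20 and Wes's 16.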