The innovation just keeps on coming from the good folks at Amazon Web Services. This week they announced a new pricing model for Amazon EC2 instances: spot pricing. It is the third pricing model Amazon offers for EC2 instances, with On-Demand and Reserved being the other two, and it brings us closer to an efficient, commoditized IT infrastructure market. It also got my mind racing about the possibilities, and about where this goes if taken to its logical conclusion.
James has a very succinct explanation of the key tenets of the new offering:
I had to open my old finance textbook from business school and think of all sorts of possibilities: call options, put options, futures, and other forms of derivatives and hedging techniques. It will be interesting to see if any of those evolve over time. By the way, there already is a real-time ticker for Amazon spot pricing, called Cloud Exchange. But here are some thoughts on issues that are relevant in the shorter term.
At first glance it would seem that Spot Instances are only relevant to a limited set of workloads, namely those that are not time-sensitive and can be easily stopped and re-started. The examples given by Jeff Barr are web crawling, data processing (presumably of the batch, scientific type, as his Pfizer case study implies) and data transformation (the example Jeff gives is media transcoding -- such as changing video formats).
But requesting a spot instance is accessible via API, as are other types of instance provisioning in Amazon, which means that you can programmatically write very interesting logic around which types of instances you want provisioned under what circumstances. I think this means that over time people will realize that spot instances are feasible for a wide variety of workloads. It will require large scale apps to be worth the effort, but those are increasingly commonplace on AWS.
As I said with regard to Reserved Instances, there will be a psychological barrier for a while, but it will be overcome eventually.
I posted a question on Twitter: "If I bid for EC2 spot instances at list price, doesn't it mean I'll always have a full-time instance at the lowest possible price?" That's all I could explain within the 140-character limit, but let me put it another way here.
If my objective is to keep an EC2 instance running for as long as possible at the lowest possible price, I always want to bid the then-current EC2 on-demand price. Here's the logic:
If the current Amazon spot price is lower than my bid, I will be charged the current price regardless of my bid (that's how the system works). So a high bid costs me nothing extra; it only determines how high the spot price can climb before I lose my instance. I am assuming that the spot price will never go above the on-demand price, for a very simple reason: no one will ever bid equal to or more than the on-demand price, because they could simply provision an on-demand instance instead (which does not have the limitations that spot instances do).
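To make the argument concrete, here is a minimal sketch of the billing rule as I understand it. It assumes a simplified model: prices change once per hour, the instance runs in any hour where the spot price is at or below my bid, and I pay the spot price for that hour. The function name, the price series, and the 10-cent on-demand price are all made up for illustration; this is not Amazon's actual algorithm.

```python
from typing import Sequence, Tuple


def simulate_spot(bid_cents: int, spot_prices_cents: Sequence[int]) -> Tuple[int, int]:
    """Return (hours_running, total_cost_cents) for a fixed bid.

    Simplified model (an assumption, not Amazon's real rules): one price
    per hour, and the instance runs whenever the spot price is at or
    below the bid.
    """
    hours = 0
    cost = 0
    for price in spot_prices_cents:
        if price <= bid_cents:
            hours += 1
            cost += price  # charged the spot price, never the bid
    return hours, cost


# Hypothetical numbers, in cents per instance-hour.
ON_DEMAND_CENTS = 10
hourly_spot_prices = [3, 4, 6, 8, 5, 3]

low_bid = simulate_spot(5, hourly_spot_prices)
ceiling_bid = simulate_spot(ON_DEMAND_CENTS, hourly_spot_prices)

print(low_bid)      # hours and cost when bidding 5 cents
print(ceiling_bid)  # hours and cost when bidding the on-demand price
```

In this toy series the ceiling bid keeps the instance up for every hour, and in the hours where both bids are running, both pay exactly the same spot price. The higher bid buys uptime, not a higher rate.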
No one has yet come up with an argument (at least to me on Twitter) that contradicts this logic. Several people said that technically it's possible that spot prices will rise above the on-demand price, but as I already said, that doesn't make sense. On-demand prices are the ceiling for spot prices (except for the caveat below). If you have different thoughts on this, please leave them in the comments below.
Here are two possible issues with this logic as I see them:
- On-Demand Instances are not always available: This was the argument made by Ezra Zygmuntowicz (@ezmobius), co-founder of Engine Yard. It would mean that On-Demand is not a real ceiling for spot prices. Although I've seen some noise about this on Twitter, I don't think there is a real case to be made that there is a problem with getting On-Demand Instances "on-demand". If there is, please correct me in the comments. But more importantly, if there is, that's a big "whoa" moment. It is completely contradictory to the whole Amazon (and cloud computing) model of, er, on-demand computing. It would also raise the biggest fear possible about the cloud computing model of relying on an external provider for your basic computing needs (i.e., servers): that capacity is unavailable. It also contradicts the rationale that Amazon gave for spot pricing in the first place, which is to take advantage of unused EC2 capacity. But I really don't think that it is an actual problem. I have not heard of anyone seriously complaining about lack of availability of on-demand Amazon instances. Finally, going back to theory, if there were such a problem, Amazon would simply raise prices for on-demand and we'd have a new ceiling for spot. We're actually seeing the opposite happening: Amazon keeps lowering prices for on-demand instances.
- There is a hard cap on the desired cost for the job: This makes a lot more sense to me, but I think it is a rare real-life case. This use case assumes that I have a time-insensitive compute problem that is only worth solving, for me, at X cents per instance-hour or less. To make it easier to digest, let's say that X equals 5 cents and the current price of an on-demand small instance (and we'll use that as the unit) is 9 cents per hour. If that's the case, then obviously I want my spot bid to be 5 cents and not the on-demand instance price.
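Putting the main argument and the second caveat together, the bidding rule boils down to one line. The sketch below is only an illustration of that rule; `choose_bid` and its parameter names are hypothetical, and prices are in cents per instance-hour.

```python
from typing import Optional


def choose_bid(on_demand_cents: int, value_cap_cents: Optional[int] = None) -> int:
    """Pick a spot bid under the post's assumptions.

    Bid the on-demand price (the assumed ceiling for spot prices),
    unless the job is only worth running below some hard cap, in
    which case bid the cap instead.
    """
    if value_cap_cents is None:
        return on_demand_cents
    return min(on_demand_cents, value_cap_cents)


# The example from the text: 9-cent on-demand small instance, 5-cent cap.
print(choose_bid(9, value_cap_cents=5))
# No cap: bid the on-demand price.
print(choose_bid(9))
```

If the cap is above the on-demand price, the rule still bids the on-demand price, since under my assumption nothing is gained by bidding higher.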
Barring these two arguments, there is no reason to bid for a spot instance at less than the current on-demand instance price. That's it. I said it, now skewer me in the comments. :-)
Shlomo Swidler suggested on Twitter that someone run a competition to see who can create the logic that keeps a spot instance running the longest. That would be cool, but I would add one condition: at the lowest possible price.
It's going to be fun to see where these innovations take us. Any thoughts? Please share in the comments.
UPDATE: there is a great comment thread to this post with interesting insights from James Watters, Guy Rosen and Shlomo Swidler.