November 07, 2008

Scalable, Low Latency Web Tier on Amazon EC2

Shay Hassidim, deputy CTO at GigaSpaces, posted an impressive write-up of a benchmark the team ran on Amazon EC2. What's nice about it is that they took a standard web app, in this case the Spring PetClinic, and dropped it into the GigaSpaces container, achieving instant low-latency and scalability, with out-of-the-box load-balancing and fail-over. Extremely cool.

The other components in the app include standard and open source components: Jetty, MySQL, Apache load-balancer, JMeter and Ant.

Also, Shay posts a screen shot (I think it's the first-ever public one) of the new GigaSpaces cloud framework. Check it out:

See the full benchmark numbers on Shay's post. And you can sign up for a GigaSpaces pay-per-use EC2 license here.

November 03, 2008

Cloud Computing. Literally.

Last week we made a very exciting announcement: Miwok Airways selected GigaSpaces as the application server for their reservation and pricing engine, which will run on EC2. This is a great case study for cloud computing.

For one thing, you have to love the fact that it is cloud computing used for a business that literally runs in the clouds (the actual meteorological kind). Second, it is an on-demand compute infrastructure for a business that has an on-demand business model in the real world. A perfect fit.

There is a great piece in the LA Times that describes Miwok, but let me give you a brief description from the software application angle. 

The idea is that for so-called ultra-short flights (typically, less than 250 miles), as a traveler you have a terrible dilemma: use commercial airlines or drive your car. I don't need to tell you the hassle and costs involved in both options these days.

Miwok overcomes the hassles of these options by providing you with an on-demand "air taxi" service. You book your flight when you need it. So, say, you want to fly from Santa Monica to Orange County or Palm Springs. You go to the Miwok web site and say when and where, you get pricing and you can book the flight on the spot. The flight you are booking is on a private Cirrus SR22. You can park 100 feet from the airplane itself (at a local airport, not just the major ones) and you don't need to go through security (imagine that!). All of this at the same cost as a commercial flight.

But here's the part I really like: You can connect to other people via Miwok's own social network, or through a Facebook app (and others to come). As the Cirrus can seat 3 passengers, you can split the costs with other passengers who need to make the same trip. So the flight could end up significantly cheaper than a commercial airline.

Think about it: This is the exact opposite of the big airlines' pricing model, where the more people book a flight, the higher the price goes. From a marketing point of view, this has tremendous viral potential.

One of the biggest technology (technology as in software, not aviation) challenges Miwok was facing was developing an extremely sophisticated real-time pricing engine. It needs to take many parameters into account to offer you a price on the spot, including location, path, season, date, time of booking, number of passengers and several other criteria. It needs to be able to grow and shrink on-demand, especially because of the social networking and viral effect.

The architecture Miwok selected uses MySQL and Hibernate for the persistence layer, but the database is not used as the system of record for calculation and reservations. Instead they use GigaSpaces' in-memory data grid, which gives you in-memory speeds and can also grow and shrink dynamically in the EC2 environment. The benefit for Miwok is that, having very little advance knowledge of the traffic they will get, and expecting extreme peaks and troughs in activity, they don't need to pre-plan and invest upfront in the infrastructure. They use GigaSpaces and EC2 and will only pay for hardware and software on a per-use basis -- when and if they actually need it.
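For readers who want to picture the general pattern (memory as the system of record, the database fed in the background), here is a rough sketch in plain Java. It is not the GigaSpaces API and not Miwok's code; `WriteBehindStore`, `PersistentStore` and the sample reservation are names I made up for illustration, and a real data grid also deals with partitioning, replication and failure recovery, which this ignores.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: reads and writes hit memory; a background thread
// copies each change to a slower persistent store afterwards.
class WriteBehindStore {
    /** Stand-in for the persistence layer (assumption: any key/value sink). */
    interface PersistentStore {
        void save(String key, String value);
    }

    // Marker entry that tells the writer thread to stop.
    private static final String[] POISON = new String[0];

    private final Map<String, String> memory = new ConcurrentHashMap<>();
    private final BlockingQueue<String[]> pending = new LinkedBlockingQueue<>();
    private final PersistentStore backing;
    private final Thread writer;

    WriteBehindStore(PersistentStore backing) {
        this.backing = backing;
        this.writer = new Thread(this::drain, "write-behind");
        this.writer.setDaemon(true);
        this.writer.start();
    }

    // The caller returns as soon as memory is updated.
    void put(String key, String value) {
        memory.put(key, value);
        pending.add(new String[] {key, value});
    }

    // Reads never touch the persistent store.
    String get(String key) {
        return memory.get(key);
    }

    // Background loop: persist queued changes in the order they were made.
    private void drain() {
        try {
            while (true) {
                String[] entry = pending.take();
                if (entry == POISON) {
                    return;
                }
                backing.save(entry[0], entry[1]);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Persist everything queued so far, then stop the writer thread.
    void flushAndStop() {
        pending.add(POISON);
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        Map<String, String> fakeDatabase = new ConcurrentHashMap<>();
        WriteBehindStore store = new WriteBehindStore(fakeDatabase::put);
        store.put("reservation-1", "SMO to PSP, 2 seats");
        System.out.println("from memory: " + store.get("reservation-1"));
        store.flushAndStop();
        System.out.println("in database: " + fakeDatabase.get("reservation-1"));
    }
}
```

The point of the sketch is only that the request path never waits on the database; how durable the in-memory copy is before the flush is exactly the part a real product has to solve.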

They also use GigaSpaces XAP (which includes the in-memory data grid) as the container for the business logic, written in Java, and as a bus for integrating the various underlying services involved in generating pricing and booking reservations.

In short, on-demand application scalability for an on-demand air travel service.

Check out Miwok's web site.

Sign up for the GigaSpaces pay-per-use license for Amazon EC2.

January 18, 2008

Excel That Scales: The Movie

Back in June of last year I wrote about our partnership with Microsoft and our plans to work together on a solution for scaling out computations on Microsoft Excel spreadsheets. Since then, both Microsoft and we have released joint material (see here on MSDN) and held joint events promoting the solution. The most up-to-date white paper on the solution can be found here.

But now, Owen Taylor has produced a screencast that describes the Microsoft-GigaSpaces joint "Excel That Scales" solution, in which he walks you through the problem and the solution.

Listen to the presentation.

Synopsis:
In many organizations -- for example in capital markets and oil & gas exploration -- Excel is used widely for complex computations and analytics. Excel is a flexible tool that many people are familiar with, so over time huge investments have been made in creating complex analytical models in Excel. However, it was never designed to be an enterprise-grade analytical tool. As data volumes grow, the need for real-time information intensifies and the number of users who wish to share the same computational logic and data increases, desktop-based Excel spreadsheets can no longer handle the load. Also, the functions they perform are becoming mission-critical, and valuable time and information could be lost in case of failure.

Enter the GigaSpaces solution. It combines the best of both worlds: Excel as the front-end and the power of your data center -- through GigaSpaces as the scale-out, highly-reliable application server -- at the back-end. In other words, the logic and the data are handled server-side with enterprise-grade reliability and performance.

Owen says it better and shows a demo.

August 23, 2007

Scaling Stateful Applications on Amazon EC2

A lot of people have been writing about how cool the Amazon EC2 service is -- and it is. On his blog, Mike Nicholls does a particularly good job of explaining the advantages EC2 gives to start-ups and entrepreneurs in cost-effectively and easily scaling their business.

I agree with every word Mike says, but there is a major issue he does not address. Even if you have a cost-effectively scalable infrastructure, the fact remains that you need to build your app in a way that it can easily scale out across many machines -- and grow (preferably linearly), as needed. This is a particularly big challenge when you are trying to build stateful, low-latency or data-intensive applications (or a combination thereof).

For those who have been following this blog, or GigaSpaces in general, you know that it is exactly that challenge we are addressing with Space-Based Architecture and the GigaSpaces eXtreme Application Platform.

Now, we are marrying the two together -- the scalable cost-effective infrastructure of Amazon EC2 with the linear scalability of Space-Based Architecture, including for stateful, high-performance apps. You can utilize the full benefits of SBA, or just the In-Memory Data Grid.

Dekel Tankel and Alon Lahav in my team at GigaSpaces have been doing a great job on this and just today made available a public AMI (Amazon Machine Image) for the GigaSpaces XAP platform. We're still working on improvements and optimizations, but it is ready for people to start playing around with.

  • You can find the GigaSpaces AMI here.
  • General description in the Amazon Web Services Solution Catalog here.
  • Detailed paper with set up instructions and code examples here (PDF).

We're already working with a couple of beta customers on EC2, including some folks who are building a Web 2.0 social networking type site.

And as Luke Flemmer at Lab49 points out, it's a great environment for testing distributed apps.

So please try it out and let me know what you think (and remember it's kinda beta-ish).

BTW, Jason Carreira and others asked for this on this TheServerSide thread. Well, folks, here it is, as Nati promised...

August 21, 2007

Julian Browne on Space-Based Architecture

Great series of posts by Julian Browne on why and how they used Space-Based Architecture and GigaSpaces on Virgin Mobile's award-winning web site.

And Nati's reaction.

April 30, 2007

Extreme Transactions

In its April 23 issue, InformationWeek's cover says: "Extreme Transactions", which refers to the story entitled: Business At Light Speed

Wall Street's attempt to shave milliseconds off transactions pushes the limits of computer science.

I won't go into the article in great depth, but it basically discusses the trend of the past few years, wherein Wall Street firms on both the buy and sell-side have unprecedented needs for low latency trading execution and market data access due to changes in regulation, the proliferation of electronic trading and the advent of algorithmic trading. So basically you now have thousands and thousands of machines buying and selling stocks and other securities from other machines based on extremely complex (and automated) computer models. So it has become a latency game -- low-latency, that is.

The IW piece focuses on what firms have been doing about it at the network infrastructure level, such as bringing their own data centers physically closer to the exchanges and establishing direct access to market data, as opposed to using third-parties such as Reuters and Bloomberg. All in a grand effort to shave a few milliseconds off.

It's a good story, but one bit especially caught my eye:

Therein lies the rub for ultrafast trading: Once you hit physical limits to data-transmission speeds, where do you go from there? "If anybody knows how to get a signal transmitted faster than the speed of light, I'd like talk with them," says Cummings.

There are two schools of thought on this issue. One is that traders, exchanges, and brokers must shave latency from other parts of the system--in the applications they use, for instance--and that the race will continue.

This is already happening. That's why many Wall Street customers are looking at solutions such as GigaSpaces for their applications and are turning away from existing approaches such as J2EE. It is an attempt to shave off any possible latency in the applications themselves and it is a growing trend.

Those of you who follow this blog or the GigaSpaces Blog know how we achieve such extreme low latency with Space-Based Architecture:

  • Access and process data and events locally and in-memory
  • Collapse the tiers, thereby eliminating network hops
  • Co-locate services with shared memory
  • Scale-out by adding more processing units, as each is self-sufficient (Shared-Nothing Architecture)

We're seeing this approach widely adopted on Wall Street (and in other industries such as telco) with solutions such as GigaSpaces, and in very demanding web applications such as Flickr, Google, Twitter and others using various technologies -- memcached, MapReduce -- which all amount to roughly the same idea of caching and partitioning. BTW, for a great blog summarizing a lot of these architectures see this from Peter Van Dijck.

Extreme Transaction Processing and Extreme Analytics Processing are growing areas and we're going to see interesting developments in the coming months and years.

UPDATE: Nati Shalom blogs on the topic over at the GigaSpaces Blog. For some reason the URL in Nati's comment below isn't working, but you can find his post here.

March 26, 2007

It's the architecture, stupid!

I haven't posted in a while due to extensive travel during the past three weeks: San Diego for the CMP Exchange Solution Provider show, London for QCon and last week in Las Vegas for TheServerSide Java Symposium. More on some of these in future posts, but the Oracle-Tangosol acquisition news that came out on Friday (and had been anticipated by us for some time -- it's a small world...) is the big thing everyone's talking about.

Nati posted an excellent analysis on this on the GigaSpaces blog. His post seems to have resonated well with others, such as Patrick Logan and John Powers of Digipede.

So to re-emphasize some of Nati's points in my own words:

  • The Oracle acquisition of Tangosol is a strong validation of a new emerging category of middleware software by a major vendor
  • It was certainly an excellent move for Tangosol (as a non-VC backed company, a lot of people there, and especially Cameron, are to get a big fat check). Furthermore, because the market is heating up and getting more competitive with well-funded companies such as GigaSpaces, this was the right time for Tangosol to do this
  • It was a pretty good move for Oracle, and shows that they have a fair grasp of where the world is going, but it leaves much to be desired. Nati discusses this in detail in the post referenced above, and also touches on it in When You Need More Than Just a Data Grid.

The Tangosol approach all along has been that in order to solve performance and scalability problems, you need to solve the data problem -- i.e., move from a centralized, remote, disk-based database to a distributed, local, in-memory cache (aka Data Grid). That's fine.

The GigaSpaces approach has been all along that by only addressing the data bottleneck, you are merely taking an aspirin, not fundamentally curing your chronic migraines. In other words, the crux of the issue lies in the architecture -- n-tier architecture to be exact. Without a complete paradigm shift, you will not find the ultimate solution to the needs of the fast-growing category of what Gartner calls Extreme Transaction Processing (XTP), of real-time analytics, of high-performance SOA and of massive web applications of all sorts.

Beyond the many GigaSpaces customers who are proof that this approach is being accepted, look at the architectures of Google, Amazon, eBay, MySpace, LiveJournal and other Web stalwarts. They have all come to the same conclusion -- with different nuances. They have all realized that the level of scalability, reliability and performance they need -- while keeping cost and complexity down -- will not come from a J2EE app server + database + messaging. It will not come from an n-tier architecture. Instead, they moved to a scale-out architecture, which aims for a shared-nothing approach.

So what is the GigaSpaces approach?

We call it Space-Based Architecture (SBA). I will not go into it in great detail here, because it has been explained extensively in our various blogs and white papers, but it rests on the following principles:

First:

  • Collapse the tiers into a single process
  • Co-locate the services in a single process
  • Manage state and other in-flight data in memory

You have essentially created a self-sufficient process ("Processing Unit" in GigaSpaces parlance). No more network hops. No more database calls.
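To make the idea of a collapsed, self-sufficient process a bit more tangible, here is a minimal sketch in plain Java. It is an illustration only, not the GigaSpaces Processing Unit API: the class name, the `BOOK <flight>` request format and the seat-count state are all assumptions of mine. What it shows is that request handling, business logic and state sit in one process and talk through ordinary method calls.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a self-sufficient process: the request handler,
// the business logic and the state all live in the same JVM.
class ProcessingUnit {
    // State is held in memory instead of a remote database.
    private final Map<String, Integer> seatsLeft = new HashMap<>();

    ProcessingUnit(Map<String, Integer> initialSeats) {
        seatsLeft.putAll(initialSeats);
    }

    // What would be the web tier: parse and validate the request, then
    // call the business logic directly (a method call, not a network hop).
    synchronized String handle(String request) {
        String[] parts = request.trim().split("\\s+");
        if (parts.length != 2 || !parts[0].equals("BOOK")) {
            return "ERROR bad request";
        }
        return book(parts[1]) ? "OK " + parts[1] : "FULL " + parts[1];
    }

    // What would be the business tier: it reads and updates local state.
    private boolean book(String flight) {
        int left = seatsLeft.getOrDefault(flight, 0);
        if (left == 0) {
            return false;
        }
        seatsLeft.put(flight, left - 1);
        return true;
    }

    public static void main(String[] args) {
        ProcessingUnit unit = new ProcessingUnit(Map.of("SMO-PSP", 1));
        System.out.println(unit.handle("BOOK SMO-PSP"));
        System.out.println(unit.handle("BOOK SMO-PSP"));
    }
}
```

The layers are still separate in the code; they just are not separate machines.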

Now:

  • Scale-out these self-sufficient processes across your hardware infrastructure (cheap, standard hardware, mind you)
  • Have the middleware partition and load-balance incoming requests across the many processes
  • Dynamically manage the environment from a single-point-of-access (but not a single-point-of-failure) with SLAs for response times and reliability
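The partitioning step in the list above can be sketched roughly as follows. This is a simplified illustration, not how GigaSpaces routes requests: `PartitionRouter` and the routing key are hypothetical, each "unit" is just a function in the same JVM, and the sketch leaves out rebalancing, fail-over and SLA-driven management entirely. The only idea it demonstrates is that a stable hash of a routing key sends every request for the same key to the same unit, so units never need to share state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: route each request to one of N independent units
// based on a hash of its routing key (shared-nothing partitioning).
class PartitionRouter {
    private final List<Function<String, String>> units;

    PartitionRouter(List<Function<String, String>> units) {
        if (units.isEmpty()) {
            throw new IllegalArgumentException("need at least one unit");
        }
        this.units = List.copyOf(units);
    }

    // floorMod keeps the index non-negative even for negative hash codes.
    int partitionFor(String routingKey) {
        return Math.floorMod(routingKey.hashCode(), units.size());
    }

    // The same key always lands on the same unit.
    String route(String routingKey, String request) {
        return units.get(partitionFor(routingKey)).apply(request);
    }

    public static void main(String[] args) {
        List<Function<String, String>> units = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            final int id = i;
            units.add(request -> "unit-" + id + " handled " + request);
        }
        PartitionRouter router = new PartitionRouter(units);
        System.out.println(router.route("customer-42", "BOOK SMO-PSP"));
        System.out.println(router.route("customer-42", "CANCEL SMO-PSP"));
    }
}
```

Note that plain modulo hashing reshuffles most keys when the number of units changes, which is one reason you want the middleware, rather than application code, to own this step.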

The resulting architecture is:

  • Linearly scalable -- because there is no dependency among the Processing Units, the law of diminishing returns does not apply. Each additional unit added provides the same throughput as the one before it
  • Low latency -- because network hops between the tiers and the services that make up the application have been eliminated, and because data and events are accessed locally and in memory
  • Simple -- because you have a single clustering model to manage high-availability, load-balancing and partitioning across your entire environment

Until the Oracles of the world acknowledge that the architecture their products assume is not viable for this class of applications, the headaches will keep coming back to plague them and their customers.

December 20, 2006

Response to the Java Posse

The Java Posse podcast guys were kind enough to mention us again, specifically our 5.2 release. One of the comments they make, though, is that we seem almost afraid to mention that GigaSpaces is a Jini/JavaSpaces implementation, as we don't mention it in the press release and it is not featured on our web site. They say it's a shame for Jini/JavaSpaces fans everywhere.

Well, first, I take that as a compliment. So thanks, Java Posse.

Second, let's make it clear: we are damn proud of being a Jini/JavaSpaces implementation, and associated with the technology and the community around it.

The reason that we do not emphasize it in our marketing material is straightforward. We get more bang for the buck by talking about our features, benefits and customer results than we do by talking about the underlying technology, which sometimes ignites silly religious wars.

Another reason is that, frankly, our product goes well beyond Jini-JavaSpaces (as Patrick Logan notes), so we don't want to sell ourselves short. We definitely take advantage of the underlying Jini-JavaSpaces capabilities, but not necessarily as a programming model. That's why we prefer talking about Space-Based Architecture.

Our biggest contribution to the Jini/JavaSpaces community is creating a robust, enterprise-grade, commercially successful product.

In the meantime, and despite Sun having originally mis-marketed J/JS ("Your toaster will talk to your microwave oven and together they'll order a pizza"), the technology is clearly growing in popularity.

Speaking of mis-marketing J/JS: Fuzzy rightfully complains and Dan Creswell sets the record straight.

October 27, 2006

The Tyranny of Tiers

Ted Neward wrote an excellent piece in his new MSDN column about "layering." Essentially, Ted is making the same argument that is one of our key tenets at GigaSpaces -- separation of concerns in the architecture should be logical, not physical, as is the case with most tier-based systems. The physical separation creates huge latencies and is a scalability bottleneck. This idea of "virtualized Tiers" is the basis of our Space-Based Architecture and what lies behind our slogan: Write Once Scale Anywhere.

A while back, Nati Shalom wrote an excellent white paper about this entitled Space-Based Architecture and the End of Tier-Based Computing (registration required). By the way, it was one of the all-time most downloaded white papers on TheServerSide. And Nati has been presenting this idea at various conferences.

We have been talking about "virtualized tiers." Ted gives it a name, which I think clearly establishes the difference between a "virtual tier" -- a "layer" in Ted's terminology, and a physical tier. I think I'll adopt Ted's version from now on.

From Nati's paper:

Tier-based computing has hit a wall when it comes to supporting performance-intensive applications. Whether the challenge is to grow these applications beyond a few hundred concurrent users or a few dozen parallel processes, issues like complexity, scalability, load-balancing, and synchronicity get in the way. The solution is not to improve the tier-based approach but to move beyond it — to a service-oriented architecture built on shared spaces within a grid computing framework.

The power of spaces comes from not having to separate various parts of an application into discrete physical runtimes — and then wiring those together in complex, hard-to-scale, and performance-consuming tangles of middleware. A space doesn’t care if an application has been “tiered.” Whether it has or not, the same program code will instantiate multiple times on the same machine or on multiple machines automatically — and even dynamically - in response to runtime parameters like CPU utilization.
Instances communicate through the space, just as if they were talking to middleware. In fact, they can use the same middleware APIs they always have — except now the middleware’s role has become virtualized, meaning that all physical message and data exchanges are handled transparently by the space.

And here’s the best part: Spaces make migrating today’s tier-based applications to tomorrow’s service oriented and grid architectures an evolutionary migration — with immediate performance boosting benefits.
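If you have never seen the "instances communicate through the space" idea from the excerpt in code, here is a toy version in plain Java. To be clear, this is neither the JavaSpaces API nor GigaSpaces: `MiniSpace` is a made-up, single-JVM stand-in that matches entries by exact class only, whereas a real space matches on templates and is distributed, transactional and fault-tolerant. It only shows the shape of the interaction: one party writes an entry, another takes it, and neither knows about the other.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical toy "space": entries are written in and taken out by type.
class MiniSpace {
    private final Map<Class<?>, BlockingQueue<Object>> entries = new ConcurrentHashMap<>();

    private BlockingQueue<Object> queueFor(Class<?> type) {
        return entries.computeIfAbsent(type, t -> new LinkedBlockingQueue<>());
    }

    // Put an entry into the space; the writer does not know who will take it.
    void write(Object entry) {
        queueFor(entry.getClass()).add(entry);
    }

    // Remove and return an entry of exactly this class, waiting up to the
    // timeout. Returns null if nothing arrives in time.
    <T> T take(Class<T> type, long timeoutMillis) {
        try {
            Object entry = queueFor(type).poll(timeoutMillis, TimeUnit.MILLISECONDS);
            return type.cast(entry);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MiniSpace space = new MiniSpace();

        // Worker: takes a String task and writes back an Integer result.
        Thread worker = new Thread(() -> {
            String task = space.take(String.class, 1000);
            if (task != null) {
                space.write(Integer.valueOf(task.length()));
            }
        });
        worker.start();

        // Client: writes a task and waits for the result to appear.
        space.write("price SMO-PSP");
        Integer result = space.take(Integer.class, 1000);
        worker.join();
        System.out.println("result = " + result);
    }
}
```

Because the worker only depends on the space, you could start more workers without touching the client, which is the property the excerpt is getting at.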