June 18, 2007

Scalable SOA

Last week I had the pleasure of speaking to Dave Linthicum from the Linthicum Group ("SOA for the real world"). Dave is a prolific columnist, blogger and podcaster on SOA, and is one smart cookie.

What triggered my conversation with Dave were two of his most recent posts on his InfoWorld blog: Scalable SOA Solutions Continue to Emerge and Why SOA Governance Needs to do a Better Job with Data, both of which hit the nail on the head. In the first, Dave is saying something we've been saying at GigaSpaces for a while now, but he puts it really well:

Making solutions scale is nothing new. However, the SOA technology and approaches recently employed are largely untested with higher application and information and service management traffic loads. SOA implementers were happy to get their solutions up-and-running, however in many cases scalability is simply not a consideration within the SOA, nor was load testing, or other performance fundamentals. We are seeing the results of this neglect now that SOA problem domains are exceeding the capacity of their architectures and the technology in many instances.

As I previously wrote in The Law of Unintended Consequences, scalability is an even bigger problem when it comes to SOA, because you have even less information about how much and when your service will be consumed.

And as I wrote in SOA Governance - Who Cares? the large vendors are focused on selling their SOA governance products and relying on either Web Services or their same old J2EE app servers for the service implementation itself.

In his data post, Dave continues to nail it, bringing up the issue of not dealing with data and the service logic holistically in most SOA environments. Dave is obviously a guy who is out there in the real world.

At GigaSpaces, we are already seeing the need for a new approach to implementing services, especially with our capital markets customers and their very high-throughput, low-latency stateful front office trading and real-time analytics apps. And that is what Space-Based Architecture and our eXtreme Application Platform are all about.

We have recently posted on our award-winning Wiki a great example and explanation of implementing high-performance, scalable SOA using SBA, in the context of our new API, the OpenSpaces framework. Check it out. (BTW, it's still a work in progress.)

Update: Check out this presentation from Guy Nirpaz at JavaPolis on Space-Based Architecture and Scalable SOA.

Update 1: Nati Shalom writes about this topic here.

April 30, 2007

Extreme Transactions

In its April 23 issue, InformationWeek's cover says "Extreme Transactions", which refers to the story entitled Business At Light Speed:

Wall Street's attempt to shave milliseconds off transactions pushes the limits of computer science.

I won't go into the article in great depth, but it basically discusses the trend of the past few years, wherein Wall Street firms on both the buy side and the sell side have unprecedented needs for low-latency trade execution and market data access due to changes in regulation, the proliferation of electronic trading and the advent of algorithmic trading. So basically you now have thousands and thousands of machines buying and selling stocks and other securities from other machines based on extremely complex (and automated) computer models. So it has become a latency game -- low-latency, that is.

The IW piece focuses on what firms have been doing about it at the network infrastructure level, such as bringing their own data centers physically closer to the exchanges and establishing direct access to market data, as opposed to using third parties such as Reuters and Bloomberg. All in a grand effort to shave a few milliseconds off.
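
For a rough sense of why physical proximity buys milliseconds, here is a back-of-the-envelope sketch of propagation delay over fiber. It assumes light travels through fiber at about two thirds of its speed in a vacuum; the distances are hypothetical and are not taken from the article.

```python
# Back-of-the-envelope propagation delay over optical fiber.
# Assumption: signals travel at roughly 2/3 of c in fiber.
SPEED_OF_LIGHT_KM_PER_S = 299_792.458
FIBER_VELOCITY_FACTOR = 2 / 3  # assumed, order-of-magnitude figure


def one_way_delay_ms(distance_km: float) -> float:
    """Minimum one-way propagation delay in milliseconds over fiber."""
    speed_km_per_s = SPEED_OF_LIGHT_KM_PER_S * FIBER_VELOCITY_FACTOR
    return distance_km / speed_km_per_s * 1000.0


# Hypothetical distances, for illustration only.
for km in (1, 50, 1000):
    print(f"{km:>5} km -> {one_way_delay_ms(km):.3f} ms one way")
```

Under these assumptions, a link of 1,000 km costs about 5 ms each way before any switch, router or application has touched the message, while a link of a kilometer or so costs a few microseconds. Once the data center sits next to the exchange, most of what is left to shave is in the software.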

It's a good story, but one bit especially caught my eye:

Therein lies the rub for ultrafast trading: Once you hit physical limits to data-transmission speeds, where do you go from there? "If anybody knows how to get a signal transmitted faster than the speed of light, I'd like talk with them," says Cummings.

There are two schools of thought on this issue. One is that traders, exchanges, and brokers must shave latency from other parts of the system--in the applications they use, for instance--and that the race will continue.

This is already happening. That's why many Wall Street customers are looking at solutions such as GigaSpaces for their applications and are turning away from existing approaches such as J2EE. It is an attempt to shave off any possible latency in the applications themselves, and it is a growing trend.

Those of you who follow this blog or the GigaSpaces Blog know how we achieve such extreme low latency with Space-Based Architecture:

  • Access and process data and events locally and in-memory
  • Collapse the tiers, thereby eliminating network hops
  • Co-locate services with shared memory
  • Scale out by adding more processing units, as each is self-sufficient (Shared-Nothing Architecture)

We're seeing this approach widely adopted on Wall Street (and in other industries such as telco) with solutions such as GigaSpaces, and in very demanding web applications such as Flickr, Google, Twitter and others using various technologies -- memcached, MapReduce -- which all amount to roughly the same idea of caching and partitioning. BTW, for a great blog post summarizing a lot of these architectures, see this from Peter Van Dijck.

Extreme Transaction Processing and Extreme Analytics Processing are growing areas and we're going to see interesting developments in the coming months and years.

UPDATE: Nati Shalom blogs on the topic over at the GigaSpaces Blog. For some reason the URL in Nati's comment below isn't working, but you can find his post here.

March 26, 2007

It's the architecture, stupid!

I haven't posted in a while due to extensive travel during the past three weeks: San Diego for the CMP Exchange Solution Provider show, London for QCon and last week in Las Vegas for TheServerSide Java Symposium. More on some of these in future posts, but the Oracle-Tangosol acquisition news that came out on Friday (and had been anticipated by us for some time -- it's a small world...) is the big thing everyone's talking about.

Nati posted an excellent analysis on this on the GigaSpaces blog. His post seems to have resonated well with others, such as Patrick Logan and John Powers of Digipede.

So to re-emphasize some of Nati's points in my own words:

  • The Oracle acquisition of Tangosol is a strong validation of a new emerging category of middleware software by a major vendor
  • It was certainly an excellent move for Tangosol (as a non-VC backed company, a lot of people there, and especially Cameron, are to get a big fat check). Furthermore, because the market is heating up and getting more competitive with well-funded companies such as GigaSpaces, this was the right time for Tangosol to do this
  • It was a pretty good move for Oracle, and shows that they have a fair grasp of where the world is going, but it leaves much to be desired. Nati discusses this in detail in the post referenced above, and also touches on it in When You Need More Than Just a Data Grid.

The Tangosol approach all along has been that in order to solve performance and scalability problems, you need to solve the data problem -- i.e., move from a centralized, remote, disk-based database to a distributed, local, in-memory cache (aka Data Grid). That's fine.

The GigaSpaces approach all along has been that by only addressing the data bottleneck, you are merely taking an aspirin, not fundamentally curing your chronic migraines. In other words, the crux of the issue lies in the architecture -- n-tier architecture to be exact. Without a complete paradigm shift, you will not find the ultimate solution to the needs of the fast-growing category of what Gartner calls Extreme Transaction Processing (XTP), of real-time analytics, of high-performance SOA and of massive web applications of all sorts.

Besides the many GigaSpaces customers who are proof that this approach is being accepted, look at the architectures of Google, Amazon, eBay, MySpace, LiveJournal and other Web stalwarts. They have all come to the same conclusion - with different nuances. They have all realized that the level of scalability, reliability and performance they need -- while keeping cost and complexity down -- will not come from a J2EE app server + database + messaging. It will not come from an n-tier architecture. Instead, they moved to a scale-out architecture, which aims for a shared-nothing approach.

So what is the GigaSpaces approach?

We call it Space-Based Architecture (SBA). I will not go into it in great detail here, because it has been explained extensively in our various blogs and white papers, but it is built on the following principles:

First:

  • Collapse the tiers into a single process
  • Co-locate the services in a single process
  • Manage state and other in-flight data in memory

You have essentially created a self-sufficient process ("Processing Unit" in GigaSpaces parlance). No more network hops. No more database calls.
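
To make the idea concrete, here is a minimal sketch of such a self-sufficient unit. It is a toy, not the GigaSpaces Processing Unit API: the `Order` and `ProcessingUnit` classes, the two "services" and the order fields are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    order_id: str
    symbol: str
    quantity: int
    status: str = "NEW"


@dataclass
class ProcessingUnit:
    """Toy self-sufficient unit: services and state share one process."""
    orders: dict = field(default_factory=dict)     # in-flight state, in memory
    positions: dict = field(default_factory=dict)  # symbol -> net quantity

    # "Service" 1: validation, a local call rather than a remote one.
    def validate(self, order: Order) -> bool:
        return order.quantity > 0 and bool(order.symbol)

    # "Service" 2: booking, which updates the same in-memory state.
    def book(self, order: Order) -> None:
        current = self.positions.get(order.symbol, 0)
        self.positions[order.symbol] = current + order.quantity
        order.status = "BOOKED"

    def handle(self, order: Order) -> Order:
        """The whole request path runs in-process, with no hop between steps."""
        self.orders[order.order_id] = order
        if self.validate(order):
            self.book(order)
        else:
            order.status = "REJECTED"
        return order


unit = ProcessingUnit()
unit.handle(Order("o1", "ACME", 100))
unit.handle(Order("o2", "ACME", -5))
print(unit.positions)
```

What would be separate tiers in an n-tier design (validation, business logic, data access) are plain method calls over local dictionaries here. The sketch leaves out anything a real unit would need for durability or failover.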

Now:

  • Scale out these self-sufficient processes across your hardware infrastructure (cheap, standard hardware, mind you)
  • Have the middleware partition and load-balance incoming requests across the many processes
  • Dynamically manage the environment from a single-point-of-access (but not a single-point-of-failure) with SLAs for response times and reliability
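
The partitioning step in the list above can be sketched as a simple key-based router. This is one common way to do it, assumed here for illustration; the `Unit` and `Router` classes and the modulo-hash scheme are hypothetical and do not describe how GigaSpaces routes requests.

```python
import zlib


class Unit:
    """Stand-in for a self-sufficient processing unit (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.state = {}  # this unit's private, in-memory partition

    def handle(self, key, value):
        self.state[key] = value
        return self.name


class Router:
    """Toy content-based router: the same key always lands on the same unit."""

    def __init__(self, units):
        self.units = list(units)

    def unit_for(self, key: str) -> Unit:
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        index = zlib.crc32(key.encode("utf-8")) % len(self.units)
        return self.units[index]

    def route(self, key, value):
        return self.unit_for(key).handle(key, value)


router = Router(Unit(f"pu-{i}") for i in range(4))
for account in ("alice", "bob", "carol", "alice"):
    print(account, "->", router.route(account, "request"))
```

Because every request for a given key reaches the same unit, that unit can keep the key's state locally and never has to consult its neighbors. A plain modulo hash reshuffles most keys when the number of units changes, so a real system would need something more careful, such as consistent hashing.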

The resulting architecture is:

  • Linearly scalable -- because there is no dependency among the Processing Units, the law of diminishing returns does not apply. Each additional unit added provides the same throughput as the one before it
  • Low latency -- because network hops between the tiers and the services that make up the application have been eliminated, and because data and events are accessed locally and in memory
  • Simple -- because you have a single clustering model to manage high-availability, load-balancing and partitioning across your entire environment
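
The linear-scalability claim can be illustrated with Amdahl's law, used here as a toy model rather than a measurement. The `shared_fraction` parameter is the share of each request that must pass through a resource common to all units; the 10% figure is hypothetical.

```python
def speedup(units: int, shared_fraction: float) -> float:
    """Amdahl's law: speedup from `units` when `shared_fraction` of the work
    must go through a shared (serialized) resource."""
    return 1.0 / (shared_fraction + (1.0 - shared_fraction) / units)


# Columns: units, speedup with nothing shared, speedup with 10% shared.
for n in (1, 2, 8, 32):
    print(n, round(speedup(n, 0.0), 2), round(speedup(n, 0.10), 2))
```

With nothing shared, the speedup equals the number of units, which is the linear case described above. With even 10% of the work serialized behind a shared database or message broker, the speedup can never reach 10 no matter how many units are added. How close a real deployment gets to the linear column depends on how little its units actually share.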

Until the Oracles of the world acknowledge that the architecture their products assume is not viable for this class of applications, the headaches will keep coming back to plague them and their customers.

October 27, 2006

The Tyranny of Tiers

Ted Neward wrote an excellent piece in his new MSDN column about "layering." Essentially, Ted is making the same argument that is one of our key tenets at GigaSpaces -- separation of concerns in the architecture should be logical, not physical, as is the case with most tier-based systems. The physical separation creates huge latencies and is a scalability bottleneck. This idea of "virtualized tiers" is the basis of our Space-Based Architecture and what lies behind our slogan: Write Once Scale Anywhere.

A while back, Nati Shalom wrote an excellent white paper about this entitled Space-Based Architecture and the End of Tier-Based Computing (registration required). By the way, it was one of the all-time most downloaded white papers on TheServerSide. And Nati has been presenting this idea at various conferences.

We have been talking about "virtualized tiers." Ted gives it a name that I think clearly establishes the difference between a virtual tier (a "layer" in Ted's terminology) and a physical tier. I think I'll adopt Ted's version from now on.

From Nati's paper:

Tier-based computing has hit a wall when it comes to supporting performance-intensive applications. Whether the challenge is to grow these applications beyond a few hundred concurrent users or a few dozen parallel processes, issues like complexity, scalability, load-balancing, and synchronicity get in the way. The solution is not to improve the tier-based approach but to move beyond it — to a service-oriented architecture built on shared spaces within a grid computing framework.

The power of spaces comes from not having to separate various parts of an application into discrete physical runtimes — and then wiring those together in complex, hard-to-scale, and performance-consuming tangles of middleware. A space doesn’t care if an application has been “tiered.” Whether it has or not, the same program code will instantiate multiple times on the same machine or on multiple machines automatically — and even dynamically - in response to runtime parameters like CPU utilization.
Instances communicate through the space, just as if they were talking to middleware. In fact, they can use the same middleware APIs they always have — except now the middleware’s role has become virtualized, meaning that all physical message and data exchanges are handled transparently by the space.

And here’s the best part: Spaces make migrating today’s tier-based applications to tomorrow’s service oriented and grid architectures an evolutionary migration — with immediate performance boosting benefits.
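
For readers who have not met a space before, here is a minimal in-process sketch of instances communicating through one. The `write` and `take` names follow the tuple-space model, but this `Space` class is a toy written for illustration; it is not the GigaSpaces or JavaSpaces API, and it runs in a single process with one worker thread.

```python
import threading


class Space:
    """Toy in-process tuple space (illustrative only)."""

    def __init__(self):
        self._entries = []
        self._cond = threading.Condition()

    def write(self, entry: dict) -> None:
        with self._cond:
            self._entries.append(dict(entry))
            self._cond.notify_all()

    def _match(self, template: dict):
        for entry in self._entries:
            if all(entry.get(k) == v for k, v in template.items()):
                return entry
        return None

    def take(self, template: dict, timeout: float = 1.0):
        """Remove and return the first entry matching the template, or None."""
        with self._cond:
            self._cond.wait_for(lambda: self._match(template) is not None, timeout)
            entry = self._match(template)
            if entry is not None:
                self._entries.remove(entry)
            return entry


space = Space()


def worker():
    # The worker knows only the space, not who wrote the task.
    task = space.take({"type": "task"})
    space.write({"type": "result", "id": task["id"], "value": task["x"] * 2})


t = threading.Thread(target=worker)
t.start()
space.write({"type": "task", "id": 1, "x": 21})
result = space.take({"type": "result", "id": 1})
t.join()
print(result)
```

The producer and the worker never reference each other. Each matches entries in the space by template, which is why more copies of the same worker code could be started without changing either side.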