Rajeev Motwani

Rajeev, my friend from my Berkeley days, and a Professor of Computer Science at Stanford, passed away unexpectedly Thursday night.  He was brilliant, smart, funny, loving, caring...  He touched so many lives here in Silicon Valley. 

A few of us, who came from the various IITs and had joined Berkeley's PhD and Master's programs in 1985, found Rajeev to be like an elder brother to us.  I learnt Computer Science theory from him, and at the other end, I acquired a taste for King Crimson from him too.  Such was the varied scope of his influence.  Even though he could ill afford it, he would spend on his "younger" brothers whenever they needed it, no questions asked. 

We miss you Rajeev, and our thoughts are with the family and friends you have left behind. 

With all the aaSes out there, can smart-alecky titles be far behind?

Whatever cloud computing unleashes, it has definitely made x-aaS a common term.  And with that has come the veritable treasure trove of funny blog postings.  As an example, I just finished reading "Get your SaaS off my Cloud."  The posting is interesting and worth a read.  But after I read it, I was going to blog about it, so I went to look at the title, and I said, wow, another pun.  Hence the title of my posting too :)

Substantively, the assertion in the posting is: do not confuse software delivery with infrastructure delivery and do not call it a cloud, or at least "the cloud."  Fair enough.  But the semantics of what "the cloud" is and isn't is just that -- semantics.  All the clients I talk to couldn't care less what something is called -- as long as it delivers an "easy consumption model" and takes away some pain (procurement, skills, time to value, ...).  For me, that can happen at any layer -- IaaS, PaaS, SaaS, BPaaS (Business Process as a Service). 

IBM's Amazon Offerings in Information Management

In keeping with our view that cloud in general, and Amazon in particular, represents a transformative approach to IT -- from a development, IT or LOB point of view -- we in Information Management have recently made available some of our leading products on AWS.  You can find them on the IBM partner page on AWS.  Information on how to use them can also be obtained from IBM's developerWorks Amazon pages.

I want to highlight some interesting aspects.  One, our two flagship databases -- DB2 and IDS -- are available free of charge for development, and with pricing from 10's of cents to a dollar+/machine/hr for production.  There are many things in our databases that make them the best for an Amazon-like environment -- their "autonomic" support, their understanding of the dynamic nature of virtualization, etc. -- but most importantly, as James Governor put it in "IBM in the Amazon Cloud: on pricing and billing innovation", it is also about flexible, pay-go pricing.

Second, some of our newer products are being made available (near-simultaneously) on Amazon as well as under a traditional software licensing model.  A great example of this, and of course my favorite for those of you who know me, is Mashup Center, available here.

IBM's middleware services on Amazon

Just today we announced the availability of some of our middleware, free for developers, on Amazon.  You can get to these either from the Amazon catalog, or from our developerWorks site.  I am a firm believer that cloud will be a mixture of the economics of hardware, middleware and service offerings.  This announcement (along with others such as this) is among the first few steps in IBM's full participation in the cloud phenomenon. 

As you know, I have been formulating over the last few months a few points of view re. cloud, and I will continue to articulate them from a vendor-neutral perspective, but this was significant enough (for the importance of Amazon, and for us working through some internal business process/model issues) that I wanted to give a "shout out" to my database team and our developer relations team under Dave Mitchell for making this happen.

Cloud Substitutability: Is it Really Important?

Recently I attended a cloud interoperability forum organized by Stephen O'Grady and Dave Berlind (thanks to both of them for corralling such disparate voices).  There I got into a mini-argument with Tim Bray, who was asserting that the most important thing from an interoperability perspective is substitutability, i.e. there should be no vendor (cloud provider) lock-in.  His point was that CIOs say that if there is lock-in, they are not moving stuff to the cloud. 

My argument against it is simple: there are many dimensions to why an organization may choose to move some of its workloads to the cloud.  Substitutability, or no vendor lock-in (see my blog), is just one of them, and I would assert not even the most important one for now.  Cloud economics, security, loss of control, network effects, bandwidth, ecosystem, integration needs, etc. are all important.  Every CIO is being asked to do more with less, and consequently, substitutability or lack thereof is just one of the dimensions of the complex puzzle of moving stuff to the cloud.

Cloud: Good for large number of small problems or small number of large problems?

First, to all my fans out there (hello, anyone really there, or have you all given up?):  Ok, Ok, I get it -- I cannot call myself a blogger and be absent for two months.  Sorry!  But Mr. Regularity is back on the scene (I know, this regularity term has been hijacked by some over-the-counter medicine folks, but so be it).

So what's been keeping me busy?  Clouds and Mashups, my twin passions.  I will try to alternate the two in my postings this year.  Let me first begin with cloud. 

I want to run a hypothesis by you all.  I see two sets of workloads in the cloud.  A large number of small problems (example, salesforce.com, where all queries/transactions come with a tenant-id attached) or a small number of large problems (example, google with its bigtable usage).  Sometimes we tend to get overexcited about the latter -- infinite scalability, 1000's of nodes, a computer science student's dream.  But as salesforce.com has shown, one can make a handy billion dollars by efficiently solving a large number of small problems too.  The nice thing about managing a large number of small problems is that one does not, up and down in the stack, need to manage everything as one large server, one large storage or one large database.  The right "scaleout" models can be built at different layers of the stack, giving one a lot of flexibility.  That is why salesforce.com can run on Oracle, whereas google is custom top to bottom.
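
To make the "tenant-id attached" idea concrete, here is a minimal sketch of one possible scaleout model at the database layer: every query carries a tenant id, and that id alone decides which of several ordinary databases serves it.  This is purely illustrative and not a description of how salesforce.com actually does it; the shard names, the `shard_for` and `route` helpers, and the hashing scheme are all assumptions of mine.

```python
import hashlib

# Hypothetical pool of ordinary, independent databases (names made up).
SHARDS = ["db-0", "db-1", "db-2", "db-3"]


def shard_for(tenant_id: str, shards=SHARDS) -> str:
    """Map a tenant id to one shard using a stable hash."""
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then pick a shard.
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


def route(query: dict) -> str:
    """Every query carries its tenant id, so routing needs nothing else."""
    return shard_for(query["tenant_id"])


if __name__ == "__main__":
    q = {"tenant_id": "acme", "sql": "SELECT * FROM leads"}
    print(q["tenant_id"], "->", route(q))
```

Since each small problem stays inside one shard, no single layer has to behave like one large database; capacity can be added by adding shards (moving existing tenants between shards is outside this sketch).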

If we think this way, then we immediately understand the infrastructural need, which is (the size of the problem)*(number of {concurrent} problems).  It is clear that for google, this translates to multiple hundreds of thousands of nodes.  One reason the size of the problem for google is large is the # of bits (~PB); another is that the amount of computation needed for analytics is elastic -- the larger, the better -- and can therefore easily be ~1000 nodes/problem. 

For salesforce, my suspicion is that the total infrastructural need is considerably less.  Now you might say, salesforce is about transactional apps, and transactional apps are not very "infrastructure" intensive.  Whatever the merits of that argument, I find this way of looking at cloud workloads to be quite worthwhile.
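
The back-of-the-envelope estimate, (size of the problem)*(number of {concurrent} problems), can be sketched in a few lines.  The ~1000/problem figure is from the text above (read here as nodes per problem); every other number, and the `Workload` class itself, is an assumption made up for illustration, not measured data about google or salesforce.com.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """A cloud workload described by the two factors of the estimate."""
    name: str
    nodes_per_problem: float   # "size of the problem", measured in nodes
    concurrent_problems: int   # how many such problems run at once

    def infrastructure_need(self) -> float:
        # (size of the problem) * (number of concurrent problems)
        return self.nodes_per_problem * self.concurrent_problems


# Illustrative numbers only; apart from ~1000 nodes/problem, all are assumed.
few_large = Workload("small number of large problems",
                     nodes_per_problem=1000, concurrent_problems=300)
many_small = Workload("large number of small problems",
                      nodes_per_problem=0.01, concurrent_problems=100_000)

if __name__ == "__main__":
    for w in (few_large, many_small):
        print(f"{w.name}: ~{w.infrastructure_need():,.0f} nodes")
```

With these made-up inputs, the few-large workload needs hundreds of times more nodes than the many-small one, even though the latter serves far more problems.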

Mashup Video

I have always made the point that enterprise mashups will succeed for two primary reasons:

  1. They create a partnership between LOBs and IT, enabling IT to do what it does best -- unlock information -- and the business users to do what they do best -- build exactly what they need.  In addition, they can help justify the ROI of projects, so that IT can focus on building those apps that have a proven ROI.
  2. Guaranteeing security, access control and rights, while balancing them with flexibility (creativity and control), will make them "permissible" within the enterprise.

Watch this video since it says this much better than I would. 

Cloud Standards

So I got a chance to host a session in MR's Cloud Summit at the Computer History Museum yesterday on Cloud Standards.  I began by asserting that a lot of innovation/standardization activity goes through the following three phases:
[Slide 1]

Fundamentally, for the first 5 years, rapid innovation does not need standards, and standards might inhibit adoption.  In the next phase, there is a distinct fear of lock-in; sometimes that fear leads to standards, sometimes the need for interoperability does, but eventually the industry consolidates in this phase and standards typically emerge (such as SQL).  In the next phase, there is another period of growth, but vendors find new ways to create lock-ins (e.g., stored procedure languages in the database world).

So the question is, what about cloud?  My hypothesis in the pitch was the following:
[Slide 4]

Different layers of the cloud infrastructure are in different phases.  The bottom is more amenable to standards (such as OVF) than one version of the middle tier is -- the version that is pushing for a new "stack" (such as force.com, Google app engine, Amazon S3 etc.).  However, another version of the middle tier, which is taking traditional 3-tier apps and moving them to the cloud, is already well past the standards phase, because these applications have made their decisions on Oracles, DB2's, WebSpheres and BEA's. 

It was a lively discussion and I will need to refine my hypothesis and theory, but I would welcome comments.

And more on what I saw and heard at the summit later.

SaaS advantage -- in testing?

With all the news about google's chrome, and the release of the nerdy (in a good way :) comic, what really struck me was pages 9 through 11.  Got me and Carl Kessler talking. 

SaaS = testing advantage?  In what environments? 

  1. Public data, ok yes billions of web pages in every shape and form ==> thorough testing
  2. But private data? Private "user" behavior?

Are there some written/unwritten rules on what is permissible?  We had always looked at privacy and data mining.  What about privacy and "software" testing?

BTW, I am blogging this from 37K ft above the ground, on AA's gogo wifi.  Pretty cool, high time.

As with any issue, there is no problem finding people on both sides

Take SaaS and cloud for example.  The die-hards say, this is the world that will take over and kill all existing IT deployments (over time, of course, they are not silly enough to say it will happen at the speed of thought :).  The contrarians will say, "been there, done that, remember 2000 and all the business model up-endings?"  As always, the breathless excitement of new stuff dominates the ink.


So I give you, without commentary, two "contrarian" views on SaaS and Cloud.  Make sure you read the comments which are almost uniformly "contrary to the contrarian approaches."

First, on SaaS, the CEO of Lawson, in an interview.  Central thesis: "SaaS model does not compute from a profit perspective."

Second, the Bits blog by David Gallagher in NYTimes here.  Central thesis: (1) Millions of email addresses make bad things more likely compared to one's corporate mail "address completion." (2) Live documents in the google docs model have more potential for mischief.