Tuesday, April 14, 2009

Interoperability in cloud computing

Interoperability, as defined by the IEEE, is "the ability of two or more systems or components to exchange information and to use the information that has been exchanged."

Whether syntactic or semantic, it is a very important aspect of software systems. I have seen many organizations lack interoperability between their internal applications, where, for example, interacting from one system to another requires real effort and ends up specific to the domain of the business needs. In these organizations you might notice a number of different representations of the same business entity across different applications/modules. Interoperability can be achieved via different techniques and at different levels. Using a shared database is one solution, where one application interacts with another application's database. That works at the persistence level; what about the logic level, away from persistence?

Vendors are exposing their services and data more rapidly now, making most if not all of their services available programmatically for developers and software systems to (re)use. This could be a very rich environment for building interoperable systems. Vendors must start to think this way about their internal applications as well: building modules with a well-defined set of APIs will promote interoperability between applications.

A basic, yet very critical, candidate for interoperability is business entities, or reference data: more specifically, master reference data. Such an entity is shared between systems, and it could easily be standardized, moved somewhere central, and made available to other systems as a standard data source. This is one proposed building block for an interoperable PaaS, delivered as part of the cloud.
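To make the idea concrete, here is a minimal sketch of a central master-reference-data registry that several applications consult instead of each keeping its own copy of the same business entity. All names here (the registry class, the "user" entity type) are illustrative assumptions, not an existing API.

```python
# Minimal sketch of a central master-reference-data registry.
# Everything here is hypothetical, for illustration only.

class MasterDataRegistry:
    """One shared, canonical store of reference entities."""

    def __init__(self):
        self._entities = {}  # (entity_type, key) -> attributes

    def publish(self, entity_type, key, attributes):
        """Register the canonical representation of an entity."""
        self._entities[(entity_type, key)] = dict(attributes)

    def lookup(self, entity_type, key):
        """Any application resolves the same entity the same way.
        A copy is returned so callers cannot mutate the canonical record."""
        return dict(self._entities[(entity_type, key)])


registry = MasterDataRegistry()
registry.publish("user", "fred", {"name": "Fred", "email": "fred@example.com"})

# Two different applications now see one and the same representation of "fred",
# instead of each maintaining its own divergent copy.
crm_view = registry.lookup("user", "fred")
billing_view = registry.lookup("user", "fred")
assert crm_view == billing_view
```

The point is not the trivial dictionary, but the direction of the dependency: both applications depend on one shared, standardized source rather than on each other.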


Cloud computing is a style of computing in which dynamically scalable and often virtualised resources are provided as a service over the Internet. Users need not have knowledge of, expertise in, or control over the technology infrastructure "in the cloud" that supports them.

- www.wikipedia.org

The cloud can be delivered to the user in a variety of ways, all as a service:
  • Infrastructure as a Service (IaaS), such as Amazon Web Services. Amazon delivers its cloud services through a well-defined set of web service APIs; however, it requires some knowledge of the infrastructure you are building: Linux or Windows, MySQL or Oracle, SimpleDB or a relational database. Basically, Amazon delivers an environment for your servers that is easy to use, configure, and deploy.
  • Platform as a Service (PaaS), a container where your application lives. It sits a level above IaaS and comes with some limitations. Google App Engine is one example: easy to use, deploy, and configure. Moreover, you don't care about the underlying infrastructure, such as the operating system or the web server, as long as your requirements are met. Building an application with Google App Engine takes a very small amount of effort to get it up and running. The limitations lie in the technology: some APIs are closed (I/O, for example, in some cases), porting to another PaaS is not trivial, and so on.
  • Software as a Service (SaaS) has been around for a long time, sometimes under the name Application Service Provider (ASP); it is the delivery of software as a service. Usually it is built on top of a PaaS, but not necessarily.

A classified map of cloud computing providers is available at http://saaslink.googlepages.com/Laird_CloudMap_Sept2008.png.

There is noticeable effort from the community to bring cloud computing to the next level:

Please note that these attempts to standardize parts of cloud computing are not finalized yet, and some of them are still debated; for example, Amazon.com, Microsoft, Google, and Salesforce.com, the big four in the cloud computing business, are not signatories. Check Steven Martin's blog post; it is a bit old, but it will give you the idea.

Achieving interoperability between applications has been a challenge for a long time, and it will stay that way. Working with cloud computing is no different: achieving interoperability will not be an easy task.

One rich area is interoperability among reference data, and it could be supported by the cloud; PaaS is perhaps the most obvious place to do such work.

As part of the container, providers can start to integrate a master reference data repository for hosted applications to (re)use.

Master reference data repository in PaaS: an interoperability bridge

Before moving into the details of this section, I want to point out that these are some ideas of mine. I don't know whether something like this is already available in any PaaS on the market, but what I do know is that it is a vital need, and once it is there it will change a lot of what we can do.

PaaS provides a high-level solution stack, with a number of layers on top abstracting the underlying infrastructure. It gives you a high-level, container-like environment, with a variety of APIs, sometimes for deployment, and for consuming other services provided under the same umbrella.

As part of this service, providers can expose a layer with a predefined representation of master reference data. These entities are not to include sensitive data, and they will obey the security rules defined by the owner of the data. This kind of data representation needs a special data structure, one supporting dynamic schema manipulation.
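The dynamic-schema requirement can be sketched as an entity type whose set of properties is mutable at runtime, so the shared model can evolve without redeploying the applications that use it. The class and property names below are illustrative assumptions, not a real PaaS API.

```python
# Sketch of a data structure supporting dynamic schema manipulation.
# All names are hypothetical, for illustration only.

class DynamicEntityType:
    """An entity type whose schema can be changed at runtime."""

    def __init__(self, name, properties):
        self.name = name
        self.properties = set(properties)

    def add_property(self, prop):
        """Extend the schema without redeploying any application."""
        self.properties.add(prop)

    def validate(self, record):
        """A record may only use properties the schema declares."""
        return set(record) <= self.properties


user_type = DynamicEntityType("User", ["username", "email"])
assert user_type.validate({"username": "fred", "email": "fred@example.com"})
assert not user_type.validate({"username": "fred", "phone": "555-0100"})

user_type.add_property("phone")  # the shared schema evolves dynamically
assert user_type.validate({"username": "fred", "phone": "555-0100"})
```

A real repository would also need versioning and per-owner security rules, as noted above; this only illustrates the schema-manipulation part.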

Examples of such entities are User, which is shared among virtually all software applications, as well as Address, Work Address, Friends/Buddies, and many others.

The repository itself is a data and metadata store, created and manipulated dynamically, and exchanged in a standard format such as XML or JSON; the exact format does not matter as long as it is well defined and well understood by users and developers. Your class can extend another class if you are working within the same environment and with the same tools, as in, for example, Google App Engine with Java support (hopefully): Java in the backend, Bigtable as the database, and GWT on the frontend. In that case Google could go beyond what I describe in this article: they could ship a User form in GWT, and you could extend it and worry only about the additional data you add to that form and to the User object they provide as part of the repository.
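The extension idea can be sketched independently of GWT: an application takes the provider's base User representation and layers only its own fields on top, then exchanges the result in a standard wire format. The base fields, helper function, and the extra field below are all illustrative assumptions.

```python
import json

# Hypothetical base User entity, as a platform repository might supply it.
base_user = {"username": "fred", "email": "fred@example.com"}

def extend_entity(base, extra):
    """Layer application-specific fields over the shared base entity,
    refusing to shadow any field the shared model already defines."""
    merged = dict(base)
    for key, value in extra.items():
        if key in merged:
            raise ValueError("cannot shadow shared field: " + key)
        merged[key] = value
    return merged

# The application only defines its additional data.
app_user = extend_entity(base_user, {"loyalty_tier": "gold"})

# Exchanged as JSON; XML would serve equally well, as noted above.
wire_format = json.dumps(app_user, sort_keys=True)
```

Refusing to shadow shared fields is one possible design choice: it keeps the provider-owned part of the entity authoritative while still letting each application extend it.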


  1. Every software domain goes through an evolution before arriving at a standard. In this case, each container is still innovating a lot, and in some ways having these competing models will make the tradeoffs clearer, and the best designs will emerge naturally, later, when the time is right. Perhaps the time is right now, perhaps not; it's a question of where to draw the optimum line, because it will never be perfect.

    The right data model defines the logic, so in theory there should be no need to define a logic model. The only point is that logic to achieve the same goal should not live in multiple places. At the same time, it's impossible to prevent bad programming.

    As for the predefined reference of a master data standard, perhaps some of the existing standards for social networking, like OpenSocial, can be used for this. Another way to see this type of information is not as part of the data but as part of the request or "context", in which case it should also include App, Domain, and Security, and possibly IP and many others, like Time.

    When it comes to data standards, you don't mention GData. That said, there are many tradeoffs with any data format, so no single one will solve all problems. They are either verbose and thus easy to read but slow to send, or terse. They are either low-level and extensible ("Object", "Property") or high-level and brittle ("User", "Email").

  2. Thanks, valid points. I do believe it's hard to force good programming, but you can promote it through the adoption of general components that can be modified and extended over time. These would evolve into a mature set of components: easier said than done!

    OpenSocial is an interesting yet still-evolving standard; however, it is social-centric, built around social things like friends and the activities you can do with friends. I hope for something more on the business side as well: business entities, etc.

    GData as a standard representation of data is good; WebDAV might also suit master reference data representation, since in the end this is not transactional data.

    My proposal is to build a layer of data representing master reference data as a starting point for interoperability in the cloud, with dynamic schema support, allowing users to modify and extend this model in their applications. For example, I expect to get a User object representation from Google for some user Fred; I can then refer to it from another XML document hosted in some other application, either within the Google cloud (App Engine) or somewhere else. This could trigger a new wave of interoperability between applications. In the same way that I use Google authentication today, it would be great if I could use other types of entities.

    In Google Base, there is a predefined set of item types the user can pick from, or he/she can create his/her own. Imagine if this were moved and made available through App Engine. I know it is available through APIs; however, it is there to be used as-is, not for interoperability. With some slight modification, it might fit my proposed model.

  3. Yes, I agree - business data are fundamentally different, because they need transactions, integrity, complex access control layers, etc.

    I don't necessarily agree that there should be a "User" object. All that is needed is that a property of some object in the schema is tied to the username; in the Google or App Engine space that would be one's email before the @, but it could just as well be something else. That way, companies can have Employee, Contractor, etc. If inheritance is already implemented anyway, then it's less advantageous.

    Google Base is an overlooked gem; the marketers really could have hit upon a better name.
