Recently in Cloud Computing Category
Much has been written
about the severe Amazon EC2 outage last week. This made me think about the
tools needed for deploying high-availability applications in a cloud
environment. Java Enterprise Edition is complex to use, but it remains a
popular choice among enterprise application developers today, and it has a huge
installed base needing some form of cloud readiness. Application platform
frameworks, like Spring, provide the runtime middleware container for both
custom and packaged applications that run on a cloud service such as a PaaS
(Platform-as-a-Service). Programming languages like Java, C#, C++,
Ruby or Python can be extended at runtime by APIs, embedded declarative
clauses or metadata patterns provided by a framework like Spring. Optimal
allocation of system resources (memory, threads, connection pools etc.),
quality-of-service (reliability, availability etc.) and connectivity
(messaging, networks & databases) are managed on behalf of the application
by the framework. With the huge investment in Java code today, many firms are
adopting Spring's Model-View-Controller (MVC) for web applications, plug-ins
for the Eclipse IDE and the many web service add-ons available. The beauty of
the framework and its relevance to cloud is that Spring separates all business
logic from application (logical or physical) infrastructure. Now we can have a
real application bus where we combine virtualized or non-virtualized
applications with structured or unstructured data. Those applications will be
exposed to both managed and unmanaged mobile devices (but that's another blog
post!). We would build applications by taking blocks (objects) of code using
Spring and populating the business logic using open source lifecycle tools
existing outside of the cloud. Dynamic languages like Ruby can be used
for large-scale web-based front-ends with the power of Java EE under the hood,
extending the life of existing legacy applications in a painless way. With the
framework in place, cloud advances in multi-tenant governance, horizontal
scaling or cloud transaction processing can take place without causing major
application reconstruction.
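
To make the separation of business logic from infrastructure a little more concrete, here is a minimal sketch in plain Java. It does not use Spring itself; the names (`OrderDemo`, `OrderStore`, `InMemoryOrderStore`, `OrderService`) are hypothetical, and the wiring done by hand in `main` stands in for what a container would normally do from configuration or annotations.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this is plain Java, not Spring's API.
class OrderDemo {

    // Infrastructure contract: the business logic only ever sees this interface.
    interface OrderStore {
        void save(String orderId, double total);
    }

    // One possible infrastructure implementation (in-memory, for local runs).
    // A cloud deployment could swap in a store backed by a remote service.
    static class InMemoryOrderStore implements OrderStore {
        final List<String> saved = new ArrayList<>();

        @Override
        public void save(String orderId, double total) {
            saved.add(orderId + "=" + total);
        }
    }

    // Business logic: knows the pricing rule, not where orders end up.
    static class OrderService {
        private final OrderStore store;

        // The dependency is handed in from outside (constructor injection).
        OrderService(OrderStore store) {
            this.store = store;
        }

        double placeOrder(String orderId, int quantity, double unitPrice) {
            double total = quantity * unitPrice;
            store.save(orderId, total);
            return total;
        }
    }

    public static void main(String[] args) {
        // Hand-wiring here stands in for what a container would do.
        InMemoryOrderStore store = new InMemoryOrderStore();
        OrderService service = new OrderService(store);
        System.out.println(service.placeOrder("A-1", 3, 2.5));
        System.out.println(store.saved);
    }
}
```

The point of the sketch is that `OrderService` would not need to change if the store moved from local memory to a cloud-hosted backend; only the object passed to its constructor would.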
Beyond the hype around cloud computing and its variants, there is growing acceptance of the notion that the buying criteria of end customers have moved from a discussion of how the technology is chosen and provided to a discussion of what value the technology is providing. This provides the new frame of reference for customers and IT providers alike. We are finally getting to fully connect the business objectives and requirements with IT services delivery. Given IBM's announcement today of a new data center in Auckland and a Cloud Computing application research center in Hong Kong, there is growing evidence that organizations are increasingly interested in moving much of their data center operations to off-premise providers. When we set aside details such as massively scalable processing, virtualization, service orientation and always-on access, we have nothing short of a major evolution of business itself. The key will be in the pricing strategies for such services, because smart IT shops can always reverse engineer a provider's cost structure for standard cloud platform offerings as they consider the option of "private clouds" (a term I utterly detest). IBM refers to "private clouds" as those configurations where customers have dedicated computing and resources for their own business. They are compute and infrastructure technology stacks, nothing more magical than that. The providers' profit margin and value will result from their ability to optimize everything behind the service boundary while hiding complexity from the end customer. That is something worth paying for!
The Apache Hadoop Core is an open source platform enabling developers to write and run applications across clusters of commodity computers and process vast amounts of data. Yahoo, a primary investor in and developer of Hadoop, today released its own tested distribution of the code that powers its search engines, ad systems and webmail services. There is a growing move toward design patterns that leverage the parallelism inherent in distributed systems such as Hadoop and Google's MapReduce. Applications can be developed on single servers and then deployed on massively distributed cloud infrastructure without knowing the details of such distribution. Once these applications are deployed, they can act as their own service provider to other systems. Indeed we see that scenario with Amazon's EC2 (with native Hadoop support), IBM's Blue Cloud Initiative and Google today. Note that Hadoop is not a relational database engine like Oracle or SQL Server. It enables large-scale data mining for useful applications such as fraud detection and rich media indexing. I think this release is significant because it allows developers to take advantage of all the work put in to improve Hadoop over the years. Yahoo's change log file has over 8,400 lines and contains a wealth of knowledge gained from real production experience. Can Yahoo gain cloud credibility by giving it away? I think so; it gives everyone a living benchmark.
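
For readers who have not seen the MapReduce pattern, here is a rough sketch of its three phases (map, shuffle, reduce) as a word count in plain Java. It runs in a single process and does not use the Hadoop API; the class and method names (`WordCountSketch`, `map`, `shuffle`, `reduce`, `run`) are my own illustrative choices. In a real cluster, the map and reduce calls would be spread across many machines, which is the part the framework hides from the developer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Single-process illustration of the pattern; not the Hadoop API.
class WordCountSketch {

    // Map phase: turn one line of input into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by their key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse the values for one key into a single count.
    static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Driver: runs the three phases in order over all input lines.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(pairs).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("the cloud is the platform", "cloud scale")));
    }
}
```

Because each `map` call looks at only one line and each `reduce` call at only one key, the work can be split across machines without the application code knowing how it was distributed.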

