Ladies and gentlemen – It’s big, it’s data, it’s big data. It was the talk of the town at CloudCon Expo in San Francisco this week. CloudCon Expo was a great event, and it was great to meet and talk with so many peers, partners, customers and people from across the industry. One of the highlights for me was participating in the CEO-only keynote panel, the “POWER Panel: The Business of Cloud & Big Data,” alongside other industry thought leaders. From industry legend Marten Mickos, CEO of Eucalyptus (former CEO of MySQL AB), and Ron Bodkin, CEO of Think Big Analytics, to cloud pioneer Chris Kemp, CEO of Nebula and OpenStack evangelist, and Lawrence Guillory, CEO of Racemi, the message was clear, cutting-edge and undeniable – Big Data is here, the cloud is ready, and here is how to do it. We are still at the beginning of this crossover, and enterprises and developers alike are eager for the lessons our industry has to share.
As a cloud pioneer, I have had the privilege of building some of the largest cloud businesses at HP and Rackspace, of watching founding work like OpenStack go on to change the industry, and now of growing our success story at Codero. These are exciting times: cloud technology is still getting better, and Big Data is one of the most intriguing use cases for the cloud on the scene. As I told the audience, cloud and Big Data have a very natural intersection; however, I was careful to point out that the public cloud is not the end-all-be-all answer to Big Data hosting. The entire panel agreed on the following points.
Big Data – perhaps you’ve heard it’s the “next” big thing. It’s actually the “now” big thing, gaining prominence as corporate, client, and personal information grows exponentially day by day (much of it as cloud data). The opportunity of Big Data and predictive analytics lies in a company’s ability to increase actionable knowledge of its business, market, and customers. Everywhere you look, companies are realizing the value of this immense source of information at their fingertips. It is data we all want and expect in our interactions and smooth experiences; companies want it, and it is changing how we operate, how we solve problems, and how we think. From commercial applications like Walmart, which processes billions of data points each day, to the scientific end of the spectrum, where it is being utilized in ongoing genome research, Big Data is here to stay.
The technologies that drive Big Data are diverse, with many standard and emerging platforms in the field. One of the technologies many are familiar with is Hadoop, a leading open-source framework with a large installed base. Hadoop and the other leading frameworks are constructed with the enterprise in mind, distributing processing across many nodes with built-in tolerance of node failure. Big Data infrastructure is all about scale – a wide base of computing that is fast, fault-tolerant, and highly available – after all, you are aggregating and analyzing mission-critical company data. That brings up another critical point: Big Data requires big performance because of the sheer amount of data these systems are expected to handle. That’s why you see systems enabled with flash storage, large amounts of RAM, and specialized configurations.
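For readers new to frameworks like Hadoop, the programming model they popularized can be sketched in a few lines. This is a hypothetical, single-machine toy – a real Hadoop cluster runs the map, shuffle, and reduce phases in parallel across many nodes and transparently re-runs tasks on failed nodes – but the shape of the computation is the same:

```python
from collections import defaultdict

# Toy illustration of the MapReduce model Hadoop popularized (not Hadoop's
# actual API): "map" emits (key, value) pairs, a shuffle groups them by key,
# and "reduce" aggregates each group. On a cluster, each phase is spread
# across many nodes, and failed tasks are simply re-executed elsewhere.

def map_step(document):
    """Emit (word, 1) for every word in one input split."""
    for word in document.split():
        yield (word.lower(), 1)

def shuffle(mapped_pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups

def reduce_step(key, values):
    """Aggregate all the counts emitted for one word."""
    return (key, sum(values))

documents = ["big data is here", "the cloud is ready for big data"]
pairs = [pair for doc in documents for pair in map_step(doc)]
counts = dict(reduce_step(k, v) for k, v in shuffle(pairs).items())
# counts["big"] == 2, counts["data"] == 2
```

Because each map and reduce task is independent, the framework can scale the same program from two documents to petabytes simply by adding nodes – which is exactly why the underlying infrastructure matters so much.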
We at Codero have always believed that a hybrid approach – using traditional dedicated hosting infrastructure alongside cloud computing technologies – is the right answer, contrary to many pundits.
- Big Data requires strong I/O performance and a highly reliable infrastructure.
- For strong performance and scalability reasons, you need to have the choice to deploy infrastructure that meets the specific needs of your Big Data application.
It’s a very specialized and custom proposition that requires an architecture that combines the best of cloud computing and traditional hosting.
In doing so, you can leverage “cloud servers” to scale quickly on demand to meet sudden spikes in processing, while keeping your Big Data on a dedicated (single-tenant) backend for cost reasons. We have seen that at constant workload levels, or for storing data at rest, dedicated solutions are as much as 3X less expensive than the public cloud.
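The arithmetic behind that hybrid pattern can be sketched with a back-of-envelope comparison. The prices below are hypothetical placeholders (not Codero’s actual rates), but they show why carrying the steady baseline on flat-rate dedicated servers and renting cloud servers only for spikes beats running everything metered by the hour:

```python
# Illustrative cost sketch (all prices hypothetical): the dedicated backend
# carries the always-on baseline, and cloud servers are rented only during
# processing spikes.

DEDICATED_NODE_MONTHLY = 300.0  # hypothetical flat monthly price per dedicated server
CLOUD_NODE_HOURLY = 1.25        # hypothetical hourly price per cloud server
HOURS_PER_MONTH = 730

def hybrid_monthly_cost(baseline_nodes, spike_nodes, spike_hours):
    """Dedicated nodes run 24/7 at a flat rate; cloud nodes run only during spikes."""
    return (baseline_nodes * DEDICATED_NODE_MONTHLY
            + spike_nodes * spike_hours * CLOUD_NODE_HOURLY)

def all_cloud_monthly_cost(baseline_nodes, spike_nodes, spike_hours):
    """Everything is metered hourly, including the always-on baseline."""
    return (baseline_nodes * HOURS_PER_MONTH * CLOUD_NODE_HOURLY
            + spike_nodes * spike_hours * CLOUD_NODE_HOURLY)

# 10 always-on nodes, plus 20 burst nodes for 40 hours of spikes per month:
hybrid = hybrid_monthly_cost(10, 20, 40)        # 3000 + 1000 = 4000
all_cloud = all_cloud_monthly_cost(10, 20, 40)  # 9125 + 1000 = 10125
```

With these illustrative numbers the all-cloud bill is roughly 2.5X the hybrid bill; the gap grows as the baseline workload becomes steadier, which is the scenario where dedicated infrastructure shines.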
As we review the thought process that goes into Big Data architectures, keep in mind there are various flavors and varieties of Big Data and even cloud out there. So we are staying general, while being descriptive of these emerging constructs.
Considering Big Data Needs and Risks
Certainly, a pure cloud solution has significant advantages, but it also has weaknesses and a cost structure you need to consider. Points to weigh for Big Data in a pure cloud situation include:
- Integration with other applications
Big Data is not all that different from a number of cutting-edge technologies in that it requires critical, enterprise-grade, high-performance systems and holds sensitive business data. Today’s CIO faces the same questions over and over again when contemplating the cloud.
- What is the definition of cloud? Are we actually talking about cloud – and which flavor – or is it simply hosting?
- Are the cloud models and providers mature enough to work with my requirements?
- How are we defining requirements – performance, SLAs, multi-tenancy, geo-balancing, capacity, enterprise-readiness, regulations, etc.?
- Are we ready for the cloud? – Every situation is unique.
Lining it up: Big Data and Hybrid Cloud
In general, some apps are quite natural for pure cloud environments, while in other situations hybrid and even dedicated hosting will win out. Big Data falls into this second category because, as a general rule, when systems are put into the cloud there should never be a compromise of services, performance or reasonable control. Big Data infrastructure demands, among other things, flexible and configurable networking, workload and throughput balancing, and highly redundant systems, while maintaining the ability to scale easily and quickly. This list screams hybrid cloud environment, and that is reinforced by the following characteristics:
- Bare metal performance of dedicated servers
- Oriented as a service to systems and applications
- Not directly intended for public consumption
- Proprietary and Confidential data
- High-performance requirements
- High Availability infrastructure requirements
- Tight IT management
In the big picture, with rare exception, Big Data needs follow one or more of these general descriptions. A hybrid infrastructure solution will:
- Offer highest performance due to bare metal dedicated servers
- Offer highest configurability and flexibility in design
- Have the ability to quickly add capacity due to automation, and use of cloud as needed
- Be more secure due to dedicated servers
- Reduce risk
- Offer better management and reporting
A pure cloud environment simply cannot deliver on Big Data requirements – in terms of price advantage, infrastructure capabilities, and business needs – as effectively as a hybrid solution. We believe Big Data is the next killer app for hybrid clouds made of automated dedicated hosted servers and virtualized cloud servers.
Making the Case – Hybrid Clouds are Ideal for most Big Data
The risks of putting mission-critical systems into environments that cannot meet present or future needs are too great, and when you look at it, hybrid environments are the best option. In the enterprise and across the world, mission-critical applications – and especially Big Data systems – have to be built on the architecture that best suits their needs. The case is clear: hybrid architecture delivers on all the specialized requirements that real-world Big Data systems entail. Hybrid cloud architectures are also best suited to meet business requirements, including risk, management, compliance, and security, and they add considerable cost savings as well – without sacrificing a thing.
As always, let me know your thoughts, and if you have any questions by emailing me here.
About: Codero Hosting delivers a hybrid cloud offering with unparalleled support, featuring the best of cloud capabilities and dedicated hosting performance along with custom-built cost advantages. Chat with us for more info. Emil Sayegh is the President and Chief Executive Officer (CEO) of Codero Hosting. Emil joined Codero in January 2012, after launching and pioneering successful cloud computing and hosting businesses for Hewlett-Packard and Rackspace.