When you move vital applications and services to the cloud, you can realize significant benefits, including real cost savings. But your business can’t afford guesswork, uncertainty, or surprises when it comes to service levels. So how do you gain confidence in your cloud deployments?
Nimsoft’s CTO Mark Rivington presented ‘Managing Your Cloud with Confidence’ recently at Cloud Expo Europe, the leading European event dedicated to cloud computing and virtualization. Here we’d like to take a closer look at what he shared.
First, it’s important to start with an understanding of the different types of cloud models, and the different monitoring requirements each presents.
Infrastructure as a Service
When discussing IaaS, it’s important to understand that two groups need to monitor these environments: the consumer of the service and the provider of the service. The following sections offer more information on each group.
Monitoring IaaS (Service Provider)
For the service provider delivering IaaS, many of the monitoring requirements above apply. In addition, service providers need to have capabilities for:
- Providing clients with self-service monitoring access to their cloud instances, with configurations modifiable through specific APIs.
- Making performance and availability data accessible through the cloud service provider’s portal, through direct access, and through portal-to-portal integration.
- Delivering tiered levels of monitoring services, with varying price points.
- Implementing monitoring as part of the provisioning process, with templates available for each specific service tier and configuration, so appropriate policies are automatically applied at instantiation.
- Orchestrating configuration entirely through external automation or provisioning systems.
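To make the provisioning point concrete, here is a minimal sketch of applying a tier-specific monitoring template at instantiation. The tier names, polling intervals, check names, and the `provision` function are all hypothetical, chosen for illustration only; they do not reflect any particular product's API.

```python
from dataclasses import dataclass, field

# Hypothetical tier templates: names, intervals, and checks are
# illustrative assumptions, not values from any real product.
TIER_TEMPLATES = {
    "basic": {"interval_s": 300, "checks": ["ping"]},
    "standard": {"interval_s": 60, "checks": ["ping", "cpu", "memory"]},
    "premium": {"interval_s": 15, "checks": ["ping", "cpu", "memory", "disk"]},
}


@dataclass
class Instance:
    name: str
    tier: str
    monitoring: dict = field(default_factory=dict)


def provision(name: str, tier: str) -> Instance:
    """Create an instance and attach the monitoring template for its tier."""
    if tier not in TIER_TEMPLATES:
        raise ValueError(f"unknown service tier: {tier!r}")
    template = TIER_TEMPLATES[tier]
    instance = Instance(name=name, tier=tier)
    # Copy the template so per-instance changes never alter the shared one.
    instance.monitoring = {
        "interval_s": template["interval_s"],
        "checks": list(template["checks"]),
    }
    return instance


vm = provision("web-01", "standard")
print(vm.monitoring)
```

Because monitoring is attached inside the provisioning call itself, an external automation system can drive it without any manual follow-up step.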
Private Cloud
When your organization runs a private cloud, you effectively need a combination of capabilities that addresses the requirements outlined for both consumers and service providers monitoring IaaS environments. Plus, you need traditional data center monitoring for the infrastructure that underpins the private cloud.
In addition, many organizations today choose to run their private clouds on converged infrastructure stacks like Vblock and FlexPod, which gives rise to a need to monitor these offerings. Following is an overview of the capabilities required to monitor Vblock:
- Discovery and deployment. As with other virtualized environments, to effectively and efficiently monitor Vblock platforms, you need automated discovery and configuration capabilities, including templates that ensure the monitoring deployment is tailored to the specific use case.
- Operational. It’s critical that you get the visibility needed to identify underuse and overcommitment of resources within Vblock, and to do efficient root cause analysis when issues arise.
- Chassis. All aspects of the Vblock chassis need to be monitored.
- Computing. All facets of performance of Cisco UCS blades and elements need to be tracked.
- Storage. To track Vblock, it’s vital to monitor storage systems, which may include EMC CLARiiON, Symmetrix, and Celerra.
- Networking and interconnects. Tracking network performance is also critical, which means you need to monitor Cisco routers, SAN switches, and Nexus switches.
Ultimately, it’s vital not only that you can track the performance of these various areas, but also that you get a cohesive view of the performance of the entire stack.
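One simple way to picture a cohesive view is a roll-up in which each layer takes the worst status of its components and the stack takes the worst status of its layers. The sketch below assumes a three-level severity scale and made-up readings; the `Status` enum and `roll_up` function are hypothetical names, not part of any Vblock or vendor tooling.

```python
from enum import IntEnum


class Status(IntEnum):
    OK = 0
    WARNING = 1
    CRITICAL = 2


def roll_up(layers):
    """Return (overall status, per-layer status, layers driving the overall status)."""
    # Each layer is as unhealthy as its worst component.
    per_layer = {
        name: max(statuses, default=Status.OK) for name, statuses in layers.items()
    }
    # The stack is as unhealthy as its worst layer.
    overall = max(per_layer.values(), default=Status.OK)
    culprits = [
        name
        for name, status in per_layer.items()
        if status == overall and overall != Status.OK
    ]
    return overall, per_layer, culprits


# Illustrative readings only; the layer names mirror the list above.
readings = {
    "chassis": [Status.OK],
    "computing": [Status.OK, Status.WARNING],
    "storage": [Status.OK, Status.CRITICAL],
    "networking": [Status.OK],
}
overall, per_layer, culprits = roll_up(readings)
print(overall.name, culprits)  # CRITICAL ['storage']
```

Returning the layers that drive the overall status, rather than only a single stack-wide value, gives root cause analysis a starting point.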
Keys to a “Well-behaved” Cloud Monitoring Solution
Regardless of the environment, there are many key capabilities that ensure a monitoring solution is practical and well-aligned with the dynamic nature of cloud environments. Following are some of the most vital characteristics of a “well-behaved” cloud monitoring solution:
- Zero touch configuration and deployment. Given the dynamic nature of virtualized cloud environments, organizations need monitoring to be applied automatically, with no manual intervention, as new instances come online.
- Registration and graceful deregistration of agents. The agents used for monitoring any virtual element need to register when they come online and deregister when the system is taken offline, without generating any spurious errors or alerts.
- Consistent policy application. When instantiation occurs, current monitoring policies need to be applied. This requires having a central mechanism for ensuring policy updates are automatically propagated where needed.
- Management server integration. Your collection of monitoring data should be integrated with a central management server, which enables centralized reporting.
- Secure connections. If monitoring connections need to be established with an organization’s data center, those transmissions should be secured using such mechanisms as secure sockets layer (SSL) encryption.
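The registration and deregistration behavior described above can be sketched with a context manager that registers an agent when an instance comes online and always deregisters it on shutdown, so a planned teardown never looks like an outage. The `Registry` class below is a hypothetical in-memory stand-in for a central management server, and its method names are assumptions made for this example.

```python
from contextlib import contextmanager


class Registry:
    """In-memory stand-in for a central management server (hypothetical)."""

    def __init__(self):
        self.agents = {}
        self.alerts = []

    def register(self, agent_id, policy):
        # Apply the current policy at the moment the agent comes online.
        self.agents[agent_id] = {"policy": policy}

    def deregister(self, agent_id):
        self.agents.pop(agent_id, None)

    def check_heartbeats(self, reporting):
        # Alert only for agents that are still registered but not reporting.
        for agent_id in self.agents:
            if agent_id not in reporting:
                self.alerts.append(f"agent {agent_id} missed heartbeat")


@contextmanager
def monitored(registry, agent_id, policy):
    """Register on entry and always deregister on exit, even after an error."""
    registry.register(agent_id, policy)
    try:
        yield
    finally:
        registry.deregister(agent_id)


registry = Registry()
with monitored(registry, "vm-1", policy="standard"):
    pass  # the instance does its work here

# vm-1 was deregistered on exit, so its silence raises no alert.
registry.check_heartbeats(reporting=set())
print(registry.alerts)  # []
```

An agent that disappears without deregistering would still be in the registry, so the same heartbeat check would flag it, which is the distinction between a graceful shutdown and a real failure.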



Next-generation monitoring solutions like eG Enterprise build the baseline for a new combination of requirements that spans ITIL Service Management (Capacity and Availability) and ITIL Security Management. I see many vendors (and prospective users) treating cloud architecture much like an electric utility that must generate, transmit, and distribute electricity. The electric utility has an SLA for delivery but no knowledge of how the electricity is used (freezer vs. life support equipment).
Cloud computing WILL require knowledge of how users expect to receive the cloud computing benefit in order to support the Confidentiality, Integrity, and Availability (CIA) security elements. An example is authorized users accessing and using their data from a cloud provider. If non-authorized users gain access, normal operational monitoring would see only an increase in volume, with no awareness of a confidentiality breach.
Next-generation monitoring will require a platform that extends both Service and Security Management, and leaders like eG will be able to meet this challenge.
The ability to monitor what is happening at every layer of every component in an end-to-end IT service infrastructure across private and public clouds, and to automatically isolate which layer of which component is the source of an anomaly (i.e., the root cause), goes to the purpose of the Event Management process as defined by the IT Infrastructure Library: to detect Events, make sense of them, and determine the appropriate control action. Whether a cloud provider views this as being in their interest is another story; it will be up to the customer to insist on the transparency that they require. This can go against the appeal of cloud computing in the first place (offload complexity to the service provider), but just because the Event is in the cloud doesn’t mean that it’s not wreaking havoc with your business, and getting buried in yet more data and having to make sense of the madness yourself by talking to your trusted partners doesn’t sound like much fun to me.
I say “Trust but Verify,” if you know what I mean. Don’t confuse collecting data and presenting information via fancy dashboards with really knowing what is happening. That requires real monitoring intelligence, which is what many products are lacking.