Background

Monitoring elastic cloud services


What is an elastic service?

Note: From here onwards, we use the single term "services" to refer to both services and cloud applications.

We consider that any elastic service provides a set of elasticity capabilities that allow the service to be reconfigured dynamically at run-time. By leveraging cloud computing, an elastic cloud service can change its structure/topology at run-time so as to maintain predefined performance, quality and cost requirements.

To illustrate the concept of an elastic service, let us consider the following example. We have a data-as-a-service (DaaS) application for an M2M cloud platform-as-a-service, for which we have user-defined elasticity requirements w.r.t. service run-time performance and cost (e.g., overall response time < X, and cost/client/h < Y$).
Conceptually, we consider any cloud service as composed of:

  • service units: functional entities of a cloud service, i.e., components that provide some functionality,
  • service topologies: logical groupings of service units, e.g., a data or business tier,
  • virtual machines (VMs): the underlying cloud infrastructure, each VM running one or more instances of service units.

Figure 1: Elastic Cloud Service

Returning to our example, the DaaS from Figure 1 has two service topologies, Data End and Event Processing, both supporting horizontal scaling by adding and removing VMs. The Data End service topology includes two service units: a Data Node holding data, and a Data Controller managing it. The Event Processing service topology also contains two service units: a Load Balancer, which distributes client requests, and Event Processing, which interacts with the Data End and processes data.
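The composition described above can be sketched as a small data model. This is only an illustration: the class names, the VM identifiers and the number of VMs per unit are assumptions made for the example, not part of the prototype or of any tool's API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VirtualMachine:
    vm_id: str  # hypothetical identifier, e.g. as assigned by the cloud provider


@dataclass
class ServiceUnit:
    name: str
    # VMs currently hosting instances of this unit; this list changes at run-time
    vms: List[VirtualMachine] = field(default_factory=list)


@dataclass
class ServiceTopology:
    name: str
    units: List[ServiceUnit] = field(default_factory=list)


@dataclass
class CloudService:
    name: str
    topologies: List[ServiceTopology] = field(default_factory=list)

    def all_vms(self) -> List[VirtualMachine]:
        """Flatten the structure down to the VMs used by the whole service."""
        return [vm for t in self.topologies for u in t.units for vm in u.vms]


def build_daas() -> CloudService:
    """Build the example DaaS; VM ids and counts are made up for illustration."""
    data_end = ServiceTopology("Data End", [
        ServiceUnit("Data Controller", [VirtualMachine("vm-1")]),
        ServiceUnit("Data Node", [VirtualMachine("vm-2")]),
    ])
    event_processing = ServiceTopology("Event Processing", [
        ServiceUnit("Load Balancer", [VirtualMachine("vm-3")]),
        ServiceUnit("Event Processing",
                    [VirtualMachine("vm-4"), VirtualMachine("vm-5")]),
    ])
    return CloudService("DaaS", [data_end, event_processing])


daas = build_daas()
print([t.name for t in daas.topologies])  # ['Data End', 'Event Processing']
print(len(daas.all_vms()))                # 5
```

Keeping the VM list on the service unit, rather than the other way around, reflects the idea that units and topologies are the stable part of the structure while VMs come and go.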
At run-time, in response to the user-defined elasticity requirements, service unit instances are added and removed dynamically, which triggers the allocation and deallocation of virtual machines at the virtual infrastructure level. To enforce such requirements at the virtual infrastructure level, i.e., to decide when to add or remove a VM, of what type, and under which pricing scheme, the requirements need to be linked and mapped to the run-time service structure. Monitoring data from the run-time view must also be linked back to the user-defined elasticity requirements.
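One way to picture this link between requirements and monitoring data is the minimal sketch below. It assumes that each requirement names the service element it targets and that monitoring data has already been aggregated per element. The metric names and the threshold values standing in for X and Y are hypothetical placeholders.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Requirement:
    target: str                             # service, topology or unit name
    metric: str                             # hypothetical metric name
    compare: Callable[[float, float], bool]  # e.g. operator.lt for "<"
    threshold: float


def violated(requirements: List[Requirement],
             snapshot: Dict[str, Dict[str, float]]) -> List[Requirement]:
    """Return the requirements not satisfied by the monitoring snapshot.

    snapshot maps a service element name to its current metric values.
    """
    violations = []
    for req in requirements:
        value = snapshot.get(req.target, {}).get(req.metric)
        if value is None:
            continue  # no monitoring data linked to this element yet
        if not req.compare(value, req.threshold):
            violations.append(req)
    return violations


# Placeholder thresholds standing in for X and Y from the example.
requirements = [
    Requirement("DaaS", "response_time_ms", operator.lt, 200.0),
    Requirement("DaaS", "cost_per_client_per_hour", operator.lt, 0.05),
]
snapshot = {"DaaS": {"response_time_ms": 250.0,
                     "cost_per_client_per_hour": 0.03}}

print([r.metric for r in violated(requirements, snapshot)])
# ['response_time_ms']
```

A controller would still have to translate a violation into a concrete action (which VM type, which pricing scheme); that decision logic is outside the scope of this sketch.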

Challenges in monitoring and analyzing elastic cloud services

At run-time, elasticity controllers use the service's elasticity capabilities to fulfill the elasticity requirements on the cost, quality and performance of individual service units or of the whole service.
In Figure 2 we depict an example cloud service configuration in which each virtual machine runs a single instance of a service unit. In this case, when scaling the service unit, the controller adds or removes virtual machines, each running an Event Processing instance.

Figure 2: Scaling Event Processing Service Unit

Thus, at run-time (Figure 3), the set of virtual machines used by the service unit changes: new machines are deployed and running machines are destroyed, depending on run-time performance and cost.

Figure 3: VMs used by Event Processing Service Unit during its lifetime

Challenges:

From the previous example, we can see the following challenges in monitoring elastic cloud services:
  • The VMs used by service units change during the service's run-time, so monitoring information must be collected, kept and associated with the service unit, not just with the VM.
  • Owners of cloud services might be interested in different monitoring information at different levels (topology or unit).
In conclusion, the VM monitoring paradigm, in which the virtual machine is the core monitored entity, is not sufficient for supporting elasticity controllers. We need more than monitoring information associated with VMs, possibly grouped in clusters, because several service topologies could have different service units deployed over virtual machines in the same or in different virtual clusters (e.g., high network performance clusters). With this in mind, we have developed MELA for monitoring and analyzing the elasticity behavior of cloud applications.
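To make the first challenge concrete, the sketch below keeps monitoring history at the service unit level while the underlying VMs change. It is not MELA's API. The class, its methods, the metric name and the choice of averaging over active VMs are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Set


class ServiceUnitMonitor:
    """Keeps unit-level metric history independently of VM lifetimes."""

    def __init__(self, unit_name: str) -> None:
        self.unit_name = unit_name
        self.active_vms: Set[str] = set()
        # metric name -> unit-level values, one per aggregation round
        self.history: Dict[str, List[float]] = defaultdict(list)

    def vm_added(self, vm_id: str) -> None:
        self.active_vms.add(vm_id)

    def vm_removed(self, vm_id: str) -> None:
        # The VM disappears, but the unit-level history is retained.
        self.active_vms.discard(vm_id)

    def record(self, metric: str, samples: Dict[str, float]) -> None:
        """Aggregate per-VM samples (vm_id -> value) into one unit-level value."""
        values = [v for vm_id, v in samples.items() if vm_id in self.active_vms]
        if values:
            self.history[metric].append(mean(values))


monitor = ServiceUnitMonitor("Event Processing")
monitor.vm_added("vm-1")
monitor.record("cpu_usage", {"vm-1": 80.0})
monitor.vm_added("vm-2")                      # scale out
monitor.record("cpu_usage", {"vm-1": 60.0, "vm-2": 40.0})
monitor.vm_removed("vm-1")                    # scale in
monitor.record("cpu_usage", {"vm-2": 30.0})

print(monitor.history["cpu_usage"])  # [80.0, 50.0, 30.0]
```

The history covers the whole lifetime of the unit even though no single VM was present for all three rounds. The same aggregation step could be repeated from units to topologies to serve owners interested in a different level.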

Elastic Cloud Service prototype

We have implemented an elastic cloud service that we use to validate our tools. The current prototype is available on GitHub. Detailed videos explaining how to configure and deploy the prototype are available on the DaaSM2M project wiki.