Skip to main content


Testing the robustness of controllers for self-adaptive systems


Self-adaptive systems are software-intensive systems endowed with the ability to respond to a variety of changes that may occur in their environment, goals, or the system itself by adapting their structure and behaviour at run-time in an autonomous way. Controllers are complex components incorporated in self-adaptive systems, which are crucial to their function since they are in charge of adapting the target system by executing actions through effectors, based on information monitored by probes. However, although controllers are becoming critical in many application domains, so far very little has been done to assess their robustness. In this paper, we propose an approach for evaluating the robustness of controllers for self-adaptive software systems, aiming to identify faults in their design. Our proposal considers the stateful nature of the controller and identifies a set of robustness tests, which includes the provision of mutated inputs to the interfaces between the controller and the target system (i.e. probes). The feasibility of the approach is evaluated on Rainbow, a framework for architecture-based self-adaptation, and in the context of the case study.


One of the main traits of a self-adaptive software system, compared to any other kind of system, is its ability to deliver its services in spite of changes that may occur in the system, its environment, or even in its goals. A key component that enables self-adaptive systems to handle changes at run-time (e.g. repairing anomalies and improving operation) is a controller that relies on a feedback control loop for managing adaptations [1] by executing actions through system-level effectors on the target system, based on information monitored by probes. In the context of complex software systems, these controllers typically implement the traditional sense-plan-act architectures. An example of such controllers is the MAPE-K model, which includes four distinct operational stages, namely monitoring, analysis, planning, and execution [2]. Despite major achievements in the area, existing approaches in autonomic systems and self-adaptation do not systematically address the need to determine if a self-adaptive system can deliver a service that can justifiably be trusted when facing changes (i.e. that it will be resilient[3]). This lack of assurances is an issue that has hampered the widespread adoption of self-adaptive systems, which are often regarded as unreliable by industry. A major problem associated with the provision of evidence is the combinatorial nature of the stateful aspects of a controller and the changes that may affect the system being controlled. Since the different operational stages in the feedback control loop should be functionally independent from each other, a change might have a different impact on the controller depending on the state of the controller. Moreover, if the controller is expected to act upon a change when it occurs, there is a wide range of issues that needs to be considered when producing the appropriate action, including the place in which the change has occurred, the type and the frequency of the change, and whether it can be anticipated [4]. These factors have to be considered regarding the provision of assurances about the services to be delivered by the target system. Hence, novel techniques need to be devised in order to uncover potential faults in the controller.

The present paper describes an approach for evaluating the robustness of controllers for self-adaptive systems by abstracting away, in a first instance, from the state of the target system being controlled. The rationale behind this is the fact that the complexity associated with these controllers is such that we need first to devise novel means for evaluating the core logic that enables adaptation, before exploring the ensemble target system plus controller. Moreover, if the robustness evaluation is performed on the ensemble, some of the controller faults could be masked by the target system, or their effects upon the system could be more difficult to analyze. Hence, the decision to define an approach can be used in the robustness evaluation of different controllers, assuming that the core logic of the different operational stages is basically the same on the different controllers [5]. In such a way, we restrict the robustness tests in our approach to the inputs of the controller, which are characterized by the probes. Although the proposed approach abstracts away from the target system, we need to consider the stateful aspects of the controller, which are related to its different operational stages.

The primary contribution of this paper is the definition of an approach for evaluating the robustness of controllers, which is part of a bigger initiative that is looking into the resilience evaluation of self-adaptive systems. Our proposal considers the stateful nature of the controller by defining how the controller interface should be tested according to a target system changeload [6, 7], and the operational stage of the controller. To achieve our goal, the approach defines a set of mutation rules that should be applied to the inputs of the controller, a tailored version of a classification of the different controller failure modes, and an experimental setup and testing procedure that is specific to self-adaptive systems. A preliminary evaluation [8] of the feasibility of our approach was carried out using the Rainbow framework, which consists of a controller that supports architecture-based self-adaptation [5], and in the context of a simplified version of the case study [9]. Experimentation using Rainbow is very convenient, since its software has been widely available, its structure facilitates access to its internal components, its design is amenable to the injection of faults, and the logs Rainbow produces are suitable for analyzing the effects of the injected faults upon the controller.

The present paper extends our preliminary study by reporting on an exhaustive evaluation of the approach on a full-fledged deployment of Rainbow/, including extensive tests carried out on a comprehensive set of probes implemented using different technologies.

The rest of this paper is structured as follows. The ‘Background and related work’ section provides some background on self-adaptive systems and related work in the area of robustness testing. The ‘Case study’ section introduces the case study, which is used throughout the paper for illustrating the proposed approach. The ‘Methods’ section describes our approach that is focused specifically on evaluating the robustness of controllers for self-adaptive systems. The ‘Experimental evaluationExperimen-tal evaluation’ section presents the experimental results obtained from the evaluation of our approach. Finally, the ‘Conclusions’ section concludes the paper and indicates future research directions.

Background and related work

The run-time management of increasingly complex software-intensive systems has become a central concern in Software Engineering over the last few years [10, 11]. Specifically, a major issue in the area concerns achieving conformance to functional and non-functional requirements in a dependable and cost-effective manner despite the influence of changes that may affect the system, its environment, and system goals.

One of the seminal works addressing this concern was IBM’s autonomic computing initiative [2], which introduced a layer implementing what is known as the MAPE-K control loop to monitor, analyze, plan, and execute adaptation (with a knowledge base that supports the different activities in the control loop) for the purpose of managing a target system. In particular, some successful approaches that rely on this closed-loop control paradigm for self-adaptation exploit architectural models for high-level reasoning about the target system under management [5, 12]. In particular, Rainbow [5] is a framework which provides a base of reusable infrastructure that can be applied to a wide range of systems through customization. The framework defined by Rainbow includes mechanisms for monitoring a target system and its environment (using the observations for updating the architectural model of the target system), detecting opportunities for improving the system’s quality of services (QoS), and deciding the best course of adaptation based on the state of the system. The ‘Testing procedure’ section provides further details about the Rainbow framework, which is used for the experimental validation of our approach.

Resilience evaluation in self-adaptive systems

Despite the fact that research in the field of autonomic and self-adaptive systems is relatively new, there are already some contributions regarding their provision of assurances. However, the applicability of these contributions has been focused on the ensemble target system plus controller. To the best of our knowledge, no approaches have been proposed regarding the evaluation of controllers, although there is already some ground work that put forward the need to follow this direction [10, 11].

One of the areas that are related to that of resilience evaluation is that of resilience benchmarking, which encompasses techniques from previous efforts in performance benchmarking [13], dependability benchmarking [14], and security benchmarking [15], due to its inherent relation to performance, dependability and security. Compared to established benchmarks, a resilience benchmark may be specified following the same basic approach, but comprising a wide-ranging changeload (which will include, but will be not limited to, faults), as well as resilience metrics [6].

Other approaches deal with resilience evaluation through quantitative analysis using probabilistic model checking [16], considering the system environment as the only source of change and leaving out changes that are internal to the system. The cited approaches quantitatively measure resilience in the self-adaptive system when facing changes either internal or external to the system. However, they do not deal with an additional source of problems from the perspective of resilience, which are robustness issues addressed by the techniques presented in the current paper.

Robustness testing

Robustness testing consists in stimulating a system with erroneous input conditions with the goal of triggering internal errors. This allows testers to differentiate systems according to the number and type of errors uncovered and provides developers with information to solve or wrap the identified problems [17].

Ballista [18] uses a set of tests that combine acceptable and exceptional values on calls to kernel functions of operating systems. The parameter values used in each invocation are randomly extracted from a set of predefined tests and for each parameter a set of values of a certain data type is associated. Each operating system is classified in terms of its robustness and according to a predefined scale (the CRASH scale [18]) that distinguishes several failure modes.

Initially, Ballista was developed for POSIX APIs (including real-time extensions). Further work has been developed to adapt it to Windows operating systems [19]. In that study, the authors present the results of executing Ballista-generated exception handling tests over several functions and system calls in Windows 95, 98, CE, NT, 2000, and Linux. The authors were able to trigger system crashes in Windows 95, 98, and CE. The other systems also revealed robustness problems, but not complete system crashes.

MAFALDA (Microkernel Assessment by Fault Injection Analysis and Design Aid [20] is a tool that enables the characterisation of the behaviour of microkernels in the presence of faults. Fault injection is performed at two levels: in the parameters of system calls and in the memory segments holding the target microkernel. However, only the former is relevant when the goal is robustness testing.

The robustness testing techniques have been applied not only at the operating system level but also at the middleware layer and targeting different types of systems. The problem of robustness testing of high availability middleware is discussed in [21]. The paper presents a testing framework that integrates previous testing techniques (e.g. scenario-based testing and test result classification). The case study conducted on OpenAIS (an open implementation of the Application Interface Specification (AIS) provided by the Service Availability Forum) showed that simple techniques can identify robustness problems. However, the implementation of more complex techniques is required since these are able to find faults not detected by the simple ones.

Ballista was also adapted to be applied to middleware systems. In particular, the authors in [22] studied the robustness of various CORBA ORB implementations. In this case, the failure modes were adapted to better characterize the CORBA context, and the authors were able to reveal several issues in the middleware being tested.

In [23], we propose an experimental approach for the robustness evaluation of JMS middleware. The technique is applied successfully to three major JMS middleware providers exposing serious robustness problems, including severe security issues, which also highlights the importance of the application of robustness testing to real-world systems.

The abovementioned works implement robustness testing approaches that do not consider the state of the system under test. In [24], the impact of state on robustness testing of a safety-critical operating system (OS) is investigated by including the OS state in test cases definition. Although system-specific, results show that the state can play an important role in testing since they are able to cover more cases when compared to the traditional approaches.

An approach for robustness testing method of stateful Web services, modelled with Symbolic Transition Systems, is presented in [25]. A test case generation method is proposed using unusual values and replacement and additions of operation names. States are transversed using different operations and starting from a system specification which, depending on the system being tested, may not always be available. The authors assume that messages sent and received are only SOAP messages and suggest that a Web service could be considered as a grey box from which any type of message could be observed, increasing the potential of the technique.

In [8], we present an approach to evaluate the robustness of controllers for self-adaptive software systems, aiming at the identification of design faults. The approach is based on a set of robustness tests that include the provision of mutated inputs to the interfaces between the controller and the target system (i.e. probes). The feasibility of the approach is evaluated in the context of, a case study implemented using the Rainbow framework for architecture-based self-adaptation.

Case study

To illustrate our approach for robustness testing, we use the case study [9], which is implemented using Rainbow, and is able to reproduce the typical infrastructure for a news website. It has a three-tier architecture consisting of a set of servers that provide contents from back-end databases to clients via front-end presentation logic. Architecturally, it is a web-based client-server system that satisfies an N-tier style, as illustrated in Figure 1. The system uses a load balancer to balance requests across a pool of replicated servers, the size of which can be adjusted according to service demand. A set of client processes makes stateless requests, and the servers deliver the requested contents (i.e. text, images and videos).

Figure 1
figure1 system architecture.

The main objective for is to provide content to customers within a reasonable response time while keeping the cost of the server pool within a certain operating budget. It is considered that from time to time, due to highly popular events, experiences spikes in requests that it cannot serve adequately, even at maximum pool size. To prevent losing customers, the system can provide minimal textual contents during such peak times, instead of not providing service to some of its customers. Concretely, there are two main quality objectives for the self-adaptation of the system: (1) performance, which depends on request response time, server load, and network bandwidth, and (2) cost, associated to the number of active servers.

In the case of, Rainbow is capable of analysing trade-offs among the different objectives and executes different adaptations according to the particular run-time conditions of the system. For instance, when response time becomes too high, the system should increment server pool size if it is within budget to improve its performance; otherwise, servers should be switched to textual mode (start serving minimal text content) if cost is near budget limit.


Our approach for robustness evaluation of controllers in a self-adaptive software system considers the model depicted in Figure 2. The environment consists of all non-controllable elements that determine the operating conditions of the system (e.g. hardware, network, physical context, etc.). Regarding the system itself, we distinguish two main subsystems: a target system, which interacts with the environment by monitoring relevant variables associated with operating conditions, and a controller that manages the target system, driving adaptation whenever it is required. Concretely, the controller carries out its function by (1) monitoring the target system and its environment by means of probes that provide information about the value of relevant variables, (2) deciding if the current state of the target system and environment demands adaptation, and if this is the case, (3) applying a sequence of control actions through system-level effectors.

Figure 2

Self-adaptive software system.

In this work, we focus exclusively on the robustness of the controller, i.e. we modify the probes’ inputs into the controller with the intent of evaluating how robust is the controller regarding changes that may affect its interface when exceptional input is provided. The controller is considered as a stateful entity, regarding evaluation purposes, since for the same input, the controller’s internal state may influence its output. In order to tackle this issue, we consider input mutation during the different operation stages of the controllera to create an appropriate context for evaluating its robustness.

The key elements of our approach are as follows: changeload, which is a set of representative change scenarios, where changes are based on controller input mutations; failure mode classification, that characterizes the run-time behaviour of the controller while the target system is running in the presence of the changeload; and robustness tests, the mutation rules that are applied to the input probes into the controller.

In the following, in addition of describing the key elements of our approach, we also present how robustness tests are performed at the controller interface by mutating the inputs provided by the probes. To exemplify the principles of our approach, we have instantiated it into Rainbow [5].

Changeload model

This section describes the proposed model for the changeload, presenting the definitions adopted for the fundamental concepts that form the basis of its structure.

Definition 1(Change type)

A change type is a tuple (src,m,A) that characterizes a change, where

  • src identifies the source probe type where a mutation rule is applied.

  • m identifies the mutation rule applied on probe input.

  • A = 〈a1,…,a n 〉 (possibly empty) is a vector of attributes that holds specific information about the mutation rule.

Example 1

In, consider the change ‘Set an invalid timestamp date on a response time probe (type ClientProxyProbeT)’. A possible change type definition for this would be

invalidDateCPP _ CT = ( ClientProxyProbeT, TSInvalidDate, date )

Definition 2(Change).

Given a set of change types CT, a change is a tuple (ct,srcinst,V A ,ti,d) that corresponds to an instantiation of a change type, where

  • ct = (src,m,A)  CT determines the change type to be instanced as a change.

  • srcinst is the probe instance that is the source of change (i.e. in which the input is mutated).

  • V A  = 〈vA 1,…,v An 〉 is a vector of attribute values instantiating the attributes in A.

  • ti R 0 +

    determines the time instant in which the change is triggered.

  • d R +

    is the duration associated with the change.

It is worth observing that while some specific changes may be transient, impacting the controller’s input during a particular amount of time, in the definition above, duration can be considered equal to if the change is permanent.

Example 2.

If we consider the change type described in Example 1, a possible instantiation of it could be

invalidDateCPP _ CT, ClientProxyProbe1, ʹ2/29/1985ʹ , 10, 2 )

The systematic identification and classification of change types is fundamental to support the definition of change scenarios, which is discussed in the next paragraphs.

The main base concept in our changeload model is the scenario. A scenario is a postulated sequence of events that captures the state of the system and its environment, system goalsb, and changes affecting all the aforementioned elements. It is defined in terms of state (system and environment) and changes applied to that state.

Definition 3(Scenario).

A scenario is a tuple (wl, oc, C), where

  • wl represents the workload, that is, the amount and type of work assigned to the system (not necessarily static).

  • oc are the operational conditions of the system (including software and hardware resources needed for the system to perform its service).

  • C is a set of changes applied to controller input in the presence of the workload and operational conditions.

Based on the definition above, a change scenario is one which includes a non-empty set of changes (C ≠ ).

Definition 4(Changeload).

A changeload is a set of change scenarios.

Controller failure modes

The robustness of a controller for a self-adaptive system can be classified according to an adapted version of the CRASH scale [18], which distinguishes the following failure modes:

1. Catastrophic: the whole controller crashes or becomes corrupted (this might include the OS or machine on which the controller is running). No output is produced.

2. Restart: the controller execution hangs and may not issue any output commands, or send always the same command, within the worst case execution time associated with the adaptation cycle. The controller needs to be externally re-booted.

3. Abort: abnormal behaviour in the controller occurs due to an exception raised at run-time inside of the controller.

4. Silent: the controller fails to acknowledge an error, for instance by signalling an exception, which causes the controller to continue operating improperly.

5. Hindering: the controller fails to return a correct error code, which may hinder error recovery. The difference between a silent failure and this case is that, here, an error is acknowledged by the controller but the returned error code is incorrect.

In particular, it is worth observing that the tailored version of the CRASH scale for controllers in self-adaptive software systems includes a specific adaptation which is related with time (2).

Robustness tests

The basis of the proposed approach for evaluating the robustness of controllers for self-adaptive software systems relies on stimulating the interface of the controller, which consists of probes that monitor both the target system and its environment (see Figure 2). For evaluating how robust is the controller, regarding changes that may affect its interface, the probes’ inputs into the controller are modified according to a comprehensive set of mutation rules. Moreover, since the inputs of these probes may affect the different stages of a MAPE-K control loop, the evaluation needs to consider the controller as stateful. Although for evaluating the robustness of a controller we are able to abstract away from the application (target system), we nevertheless use the application to drive the evaluation.

Mutation rules

The set of robustness tests performed is automatically generated by applying a set of predefined mutation rules to the messages sent by probes, which characterizes the monitoring stage of the controller. Although concrete message formats and additional elements may exist depending on the case, the basic input supplied by probes to the controller typically consists of three basic elements: (1) an identifier of the variable being monitored, (2) the actual value for the variable, and (3) a timestamp that provides a temporal context for the variable being monitored. For example, in the case of Rainbow, the kind of input received by the controller consists of simple messages encoded as text strings with the following format:

[ timestamp ] variable_name : variable_value

Based on this general description of probe input, we propose a set of rules (Table 1), which have been defined based on previous works on robustness testing [18, 20, 26], and explore limit conditions that are typically the source of robustness problems.

Table 1 Mutation rules for probes

Probe usage categories

The effect of applying mutation rules on the outputs generated by the probes may manifest in different ways (or not manifest at all) in the controller, depending on its internal state. This results from the stateful nature of the controller, which may use different inputs and in a different way, depending on its operation stage (i.e. analysis, planning, or execution). Changes in the internal operation stage of the controller are also induced by input obtained from probes.

Table 2 distinguishes different probe categories, according to their use in the different operation stages of the controller. Different robustness issues may arise in the controller, depending on the particular stage/probe in which mutation rules are applied, even if the set of mutation rules applied are the same. The same probe can belong to different usage categories and be used during different stages in the controller. We consider the controller to be a gray box, while its different operation stages are black boxes on which probe mutation is applied. For the time being, we assume that each of these black boxes are stateless, even if that is not the case as far as the target system is concerned. The different stages in the controller are sequential, while monitoring is transversal to all of them.

Table 2 Probe categories

Testing procedure

As discussed previously, inputs to the Rainbow controller are delivered with the use of probes, which provide important system and environment information such as experienced response time, network latency, or server load. Robustness testing focuses on the controller’s input points (i.e. the probe information). Therefore, a complete robustness experiment must include a set of tests that focuses precisely on the information provided by each of the input probes.

Figure 3 represents the complete experimental procedure and, as we can see in the figure, each experiment includes several tests, each one focusing on a given probe. For each probe (which, at run-time, is continuously delivering information to the controller under test), we apply a single change for each probe data sample. However, we apply (in the subsequent probe data samples) the same change for a given period of time, which potentially gives us the possibility of further disturbing the system under test.

Figure 3

Robustness testing procedure.

Each robustness test focuses on a single mutation rule type, and having identified the three major controller operational stages (analysis, planning, and execution), we must execute the tests with the controller in each of these stages, as it allows us to cover more cases and potentially disclose more robustness problems. Therefore, in each test, we must drive the system from an initial state to a target state by submitting the system to a workload (i.e. changeload) for a given amount of time (ramp-up period in Figure 3). This target state is the one in which the system should be in order to start testing and can correspond to any entry point to any of the three controller stages previously mentioned. With the controller in the target state, we can start applying the changes (of the same type) specified within the changeload during a change period (see Figure 3) and while the controller is on the target state. This period of time should be set to the typical time required to transition from the target controller state for the test to the next state. After this probe mutation, there is a keep time period which is the time required for the system to reach a final state, which marks the end of the current test, and corresponds to the completion of the controller’s execution stage. At most, the keep period should be set to the worst case execution duration found in the adaptation strategies’ specification. The observation period is composed by the change period and the keep time and is used to register any potential deviations from expected controller behaviour.

Experimental evaluation

The aim of our experiments is to assess the validity of our approach to evaluate controller robustness in self-adaptive systems. In particular, we evaluate the robustness of Rainbow’s controller (i.e. Rainbow master) on an implementation of the case study described in the ‘Case study’ section.

The Rainbow framework

In this paper, we focus on Rainbow [5], an architecture-based platform for self-adaptation, which provides a substantial base of reusable infrastructure through customization, which aims to reduce the cost of self-adaptive system development. Rainbow has distinctive features: an explicit architecture model of the target system, a collection of adaptation strategies, and utility preferences to guide adaptation.

The framework defined by Rainbow includes mechanisms for (Figure 4): monitoring a target system and its environment (using the observations for updating the architectural model of the target system), detecting opportunities for improving the system’s QoS, deciding the best course of adaptation based on the state of the system, and effecting the most appropriate changes.

Figure 4

The Rainbow framework.

Rainbow’s component-and-connector architectural model of the target system is one of the main elements used in its decision-making process, using it to update monitored system information and reason about appropriate adaptation mechanisms for a particular situation.

The main components of the framework are as follows:

  • Architecture evaluator: It evaluates the model upon update to ensure that the system is operating within an acceptable range. If the evaluator determines that the system is not operating within the accepted range, it triggers the adaptation.

  • Adaptation manager: It chooses a suitable strategy based on the current state of the system (reflected in the architectural model).

  • Strategy executor: It executes the strategy chosen by the adaptation manager on the running system via system-level effectors.

  • Model manager: It updates the architecture model using the information observed in the system via probes.

Experimental setup

For our experimental setup, we deployed Rainbow and the corresponding implementation of across seven different machines (Figure 5): znns0-3 are the four content servers running Apache v2.2.16, znndb is a common backend database running mySql v14.14d5.1.61, from which the different servers extract the contents, and znnproxy is the proxy machine that runs the load balancing software (Apache running mod_proxy_balancer v2.1). The controller is deployed in a separate machine (znnmaster). All machines run Debian Linux v6.0.4 and have 512 MB of memory. Moreover, an additional machine znnclient running JMeter v2.5.1 generates the traffic during the execution of the system.

Figure 5
figure5 experimental setup.

To build the changeload used for our experiments, we identified the following:

1. Workload and operating conditions for our change scenarios, which is characteristic of a slashdot-type effect, based on a sample collected by Juric [27], previously used for a general evaluation of the effectiveness of Rainbow in [9]. In this case, scenarios have been scaled down to a duration of 5 min, which is enough to drive the controller through its different operational stages and apply the robustness tests.

2. Sets of probes for the different controller operational stages. The three last columns of Table 3 indicate the set of probes used during the analysis, planning, and execution stages of the controller. This information was identified by inspecting the specification of architecture models and adaptation strategies. Specifically, the use of a probe during the analysis stage can be determined by checking whether the constraints specified for the architecture model are defined over variables updated by a given probe. Moreover, an analogous process can be followed to identify probes used during the planning stage, which update information in variables used to specify applicability conditions of adaptation strategies. Finally, probes used during execution are identified by inspecting the predicates included in the code of the adaptation strategy itself.

3. Set of changes to be applied on our set of probes. Table 3 also indicates the set of non-applicable mutation rules to each of the probes, which are determined by the type of probe implemented, as well as the data type and value range of the variables they update. Regarding probe implementation type, probes ServerLoadProbeT and ServerFidelityProbeT are implemented in Perl, whereas the ClientProxyProbeT is implemented in Java. In both cases, the length of strings is unrestrained, therefore mutation rule MsgOverflow (A6) is not applicable to any of the probes. In the particular case of Perl probes, the null data type does not exist, disallowing the applicability of mutation MsgNull (A1). Regarding data types, all of the studied probes update numerical variables, disallowing the applicability of mutation rule VVBoolPredefined (D16). The only exception is the ServerLoadProbeT, which is not associated with a simple data type and reports a message with a custom format in the variable value, therefore preventing the use of mutation rules D3 to D16. Finally, the variables updated by some of the probes do not have a value range explicitly defined. In the case of probe ClientProxyProbeT there is an implicit lower bound of zero, due to the semantics of the information contained in the variable (e.g. negative times would make no sense), but there is no upper bound, discarding the use of rules D12 and D14 that involve the maximum value range.

Table 3 Probe use per controller stage and applicable mutation rules for

Results and Discussion

Each change scenario of the changeload results from combining the workload and operating conditions with a single change type based on an applicable mutation rule. In our changeload, each mutation rule gives way to up to three change scenarios (i.e. applied during the analysis, planning, and execution stages, respectively), which are triggered at the time instant in which the controller enters the corresponding stage, and their duration is permanent. Overall, we run 209 robustness tests using our experimental setup (33 × 3 applicable mutation rules for ClientProxyProbeT, 21 × 2 applicable mutation rules for ServerLoadT, 34 × 2 applicable mutation rules for ServerFidelityProbeT,).

Table 4 details the experimental results obtained from the tests that apply the change scenarios based on each of the identified applicable mutation rules at each one of the controller stages. To begin with, 108 out of the 209 conducted tests uncovered robustness issues (51.6%). Moreover, one of the first observations that can be made is that no catastrophic, restart, nor hindering failures were identified during the tests. Although these types of failures have not been identified during our tests, these are still needed since they portray relevant behaviours of the controller. Specifically, only 2.7% of the issues uncovered correspond to abort failures, which only occur on tests based on the mutation MsgNull (in this case, in the ClientProxyProbeT probe type, which is the only one implemented in Java). Specifically, this abort case consists of the same java.lang.NullPointerException, which is unhandled in each of the three stages of the controller during the parsing of probe response with a regular expression matcher. It is worth mentioning that additional unhandled exceptions have been detected during the course of the experiments. However, these have not been considered in the results table since they have been originated outside of the controller (concretely, on the response time probe itself).

Table 4 Robustness issues uncovered by the experiments

Silent failures are by far the most frequent failure type discovered during the tests (97.3%). These mostly correspond to incorrect updates (or the lack thereof) of property values in the architecture model of the target system, which are not acknowledged by the controller. In the case of the probes implemented in Perl (ServerLoadProbeT and ServerFidelityProbeT), when incorrect input is received by the controller, the update is ignored in all cases, and the property in the model is not updated. In contrast, in the Java probe (ClientProxyProbeT), properties are updated with clearly incorrect values (such as negative values in the case of the ClientProxyProbeT with mutations VVNumAbsoluteMinusOne or VVNumMin) or not updated in some other cases (e.g. mutations MsgNonPrintable or VNRemove).

As it can be observed, mutations that pertain the overall probe response message and the variable value (first and fourth group in Table 4, respectively) present the highest concentration of silent failures. In contrast, mutations that concern timestamps present silent failures only in cases in which the concrete element is removed (mutations TSEmpty and TSRemove). This is a consequence of the way in which the Rainbow master processes inputs from the probes. Messages sent from the probes are parsed in such a way that only the presence of a timestamp in the message is assessed, but its concrete value is not checked syntactically nor semantically. Regarding mutations that affect variable names, the ClientProxyProbeT probe shows silent failures only in the case in which the variable name is removed (mutation VNRemove). Again, this happens because only the presence of a variable name in the message coming from the Java probe is assessed, and further syntactical or semantical checks are not performed on this part of the message. However, this does not prevent the correct update of values in the architectural model of the system inside of the controller in the case of Java probes, which uses a unique probe identifier to update the value in the correct place in spite of incorrect variable names or timestamps in probe input. In contrast, we can observe that further silent failures occur in the case of Perl probes with mutations (VNSwap, VNInvalidFormat, and VNNotExist), since in this case, the variable name in the message is effectively used to carry out the appropriate update in the architecture model.

In spite of the similarity of failure patterns across probes, we have been able to observe that there are slight differences in them, also directly related with their type of implementation: (1) all instances of abort failures are given when mutating the Java probe, and (2) silent failures when mutating Perl probes always stops the updates of property values in the architecture model, in contrast with the Java probe, in which incorrect updates of values in the architectural model can also appear.

It is also worth mentioning that in spite of the similar failure patterns for the same probe across different controller stages, the specific failure instances discovered in the different controller stages are different. An instance of this is the mutation of the Java probe with the MsgNull, which results in the properties of the architecture model being updated with null values in tests conducted during the analysis stage. However, in the planning and execution stages, the last valid value on the model becomes frozen when the mutation rule is applied on the probe, and this can lead to completely different effects when considering the ensemble controller plus target system.

Summarizing, although in general terms Rainbow master is fairly robust, experimental results have shown that our approach has been able to uncover a relevant set of robustness issues in the controller. Although in this particular case the identified pattern of robustness issues at the different stages of the controller differs only to a limited extent, this can be attributed to the particular architecture of Rainbow, which uses its model manager as a safeguard for the logic in the rest of the components used throughout the different operational stages. Moreover, the obtained results align with previous research, which has shown that robustness testing may disclose a small number of different issues, despite of their potentially high relevancy to the particular system being tested [23].


In this paper, we have presented a novel approach for testing the robustness of controllers for self-adaptive software systems. The approach consists in mutating the inputs provided by probes to the controller, according to a set of mutation rules and a target system’s changeload, and taking into account the stateful nature of the controller. The proposal also includes an experimental setup and testing procedure specific to self-adaptive systems, as well as an adapted version of the CRASH failure scale that characterizes the different failure modes of a controller for self-adaptive software systems. We have evaluated the feasibility of our approach using Rainbow as a controller, which is based on an architecture-based self-adaptation framework, and in the context of the case study, which reproduces the typical infrastructure for a news website.

Our experimental results have shown that the proposed approach has been able to discover a relevant number of controller failures that might impact negatively on the resilience of the self-adaptive system. However, despite the relevant number of failures uncovered, our approach has been unable to identify any catastrophic, restart, or hindering failures in the controller. Although this might be related to the restricted observability of the controller’s internal behaviour, other factors such as the architectural robustness of the controller might be a plausible explanation for such results. Indeed, the obtained results align with previous research on robustness testing, which has shown that these techniques may disclose a narrow range of different issues, despite of their potential relevance to the particular system being tested [23]. Regarding discovered failures, most of them correspond to silent ones and are distributed in similar patterns across the different probes and controller operational stages. However, it is worth observing that even if the failure categories coincide, the specific issues discovered are different between probes implemented with different technology, i.e. Java and Perl. This is also true in some specific cases in which mutations on the same probe in different operational stages of the controller result in different kinds of silent failures.

Concerning future work, there are different lines of research that we intend to exploit based on the groundwork setup by this paper:

  • Employ different controllers and additional case studies for assessing our approach in terms of its efficiency in uncovering faults in the controller of a self-adaptive software system

  • While the focus of this paper was the evaluation of the controller, there is also the need for considering the self-adaptive system in its entirety, and this would inevitably lead to new challenges, such as the necessity to consider the full state of the target system when evaluating the robustness of the entire system, i.e. the controller plus the target system

  • Develop a framework for resilience evaluation of self-adaptive software systems based on our technique for evaluating the robustness of controllers. This work will be based upon previous work conducted on resilience evaluation of self-adaptive software systems [7, 16] and will enable us to explore how robustness issues in the controller can influence the resilience of the overall self-adaptive system

  • Extend our robustness evaluation approach into the internal components of the controller that implement the MAPE-K loop. The idea is to test the interfaces between its components, in contrast with just focusing on the interface between the controller and the target system

A long-term goal is to perform the type of evaluation described in this paper at run-time rather than development time since the structure of a self-adaptive software system is expected to evolve during run-time.


a Specifically, during analysis, planning, and execution. Monitoring is transversal to the rest of the activities in the MAPE-K loop.

b For the sake of simplicity, in this paper, we abstract away from system goals, which are not required to deal with robustness evaluation of the controller.


  1. 1.

    Brun Y, Serugendo GDM, Gacek C, Giese H, Kienle H, Litoiu M, Müller H, Pezzè M, Shaw M: Engineering self-adaptive systems through feedback loops. In Software engineering for self-adaptive systems. Heidelberg: Springer-Verlag; 2009:48–70.

  2. 2.

    Kephart JO, Chess DM: The vision of autonomic computing. Computer 2003, 36: 41–50. 10.1109/MC.2003.1160055

  3. 3.

    Laprie JC: From dependability to resilience. In DSN Fast abstracts. IEEE Computer Society; 2008.

  4. 4.

    Andersson J, de Lemos R, Malek S, Weyns D: Modeling dimensions of self-adaptive software systems. In Software engineering for self-adaptive systems. Heidelberg: Springer-Verlag; 2009:27–47.

  5. 5.

    Garlan D, Cheng SW, Huang AC, Schmerl BR, Steenkiste P: Rainbow: architecture-based self-adaptation with reusable infrastructure. IEEE Comput 2004, 37(10):46–54.

  6. 6.

    Almeida R, Vieira M: Benchmarking the resilience of self-adaptive software systems: perspectives and challenges. In 6th international symposium on software engineering for adaptive and self-managing systems (SEAMS 2011). Honolulu, 23–24 May 2011; 2011:190–195.

  7. 7.

    Cámara J, Vieira M, Almeida R, Ventura R, de Lemos R: Architecture-based resilience evaluation for self-adaptive systems. Computing 2013. doi:10.1007/s00607–013–0311–7 doi:10.1007/s00607-013-0311-7

  8. 8.

    Cámara J, de Lemos R, Laranjeiro N, Ventura R, Vieira M: Robustness evaluation of controllers in self-adaptive software systems. In 6th Latin American symposium on dependable computing (LADC 2013). Rio de Janeiro, 1–5 Apr 2013; 2013:411–420.

  9. 9.

    Cheng SW, Garlan D, Schmerl BR: Evaluating the effectiveness of the rainbow self-adaptive system. In 4th international workshop on software engineering for adaptive and self-managing systems (SEAMS 2009). Vancouver, 18–19 May 2009; 2009:132–141.

  10. 10.

    Cheng BH, de Lemos R, Giese H, Inverardi P, Magee J: Software engineering for self-adaptive systems: a research roadmap. In Software engineering for self-adaptive systems. Edited by: Cheng BH, de Lemos R, Giese H, Inverardi P, Magee J. Heidelberg: Springer-Verlag; 2009:1–26.

  11. 11.

    de Lemos R, Giese H, Müller HA, Shaw M, Andersson J, Litoiu M, Schmerl B, Tamura G, Villegas NM, Vogel T, Weyns D, Baresi L, Becker B, Bencomo N, Brun Y, Cukic B, Desmarais R, Dustdar S, Engels G, Geihs K, Göschka KM, Gorla A, Grassi V, Inverardi P, Karsai G, Kramer J, Lopes A, Magee J, Malek S, Mankovskii S, et al.: Software engineering for self-adaptive systems: a second research roadmap. In Software engineering for self-adaptive systems II Dagstuhl Castle 24–29 October 2010. Lecture notes in computer science, volume 7475. Heidelberg: Springer-Verlag; 2012.

  12. 12.

    Oreizy P, Gorlick MM, Taylor RN, Heimbigner D, Johnson G, Medvidovic N, Quilici A, Rosenblum DS, Wolf AL: An architecture-based approach to self-adaptive software. IEEE Intell Syst 1999, 14: 54–62. 10.1109/5254.769885

  13. 13.

    Gray J: Benchmark handbook: for database and transaction processing systems. San Francisco: Morgan Kaufmann Publishers Inc; 1992.

  14. 14.

    Kanoun K, Spainhower L: Dependability benchmarking for computer systems. Hoboken, NJ, USA: Wiley-IEEE Computer Society Press; 2008.

  15. 15.

    Madeira H, Vieira M: Towards a security benchmark for database management systems. In Proceedings of the 2005 International conference on dependable systems and networks, DSN ’05, 28 June–1 July 2005. Yokohama, Japan; 2005:592–601.

  16. 16.

    Cámara J, de Lemos R: Evaluation of resilience in self-adaptive systems using probabilistic model-checking. In 7th international symposium on software engineering for adaptive and self-managing systems (SEAMS 2012). Zürich, 4–5 June 2012; 2012:53–62.

  17. 17.

    Mukherjee A, Siewiorek D: Measuring software dependability by robustness benchmarking. Trans Softw Eng 1997, 23(6):366–378. 10.1109/32.601075

  18. 18.

    Koopman P, DeVale J: Comparing the robustness of POSIX operating systems. In Proceedings of the 29th annual international symposium on fault-tolerant computing, FTCS ’99. Madison, 15–18 June 1999; 1999:30–30.

  19. 19.

    Shelton C, Koopman P, DeVale K: Robustness testing of the microsoft win32 API. In International conference on dependable systems and networks DSN. New York, June 2000; 2000:261–270.

  20. 20.

    Rodríguez M, Salles F, Fabre JC, Arlat J: MAFALDA: Microkernel Assessment by Fault Injection and Design Aid. In Proceedings of the third European dependable computing conference on dependable computing. Prague, 15–17 September 1999; 1999:143–160.

  21. 21.

    Micskei Z, Majzik I, Tam F: Robustness testing techniques for high availability middleware solutions. In Proceedings of the international workshop on engineering of fault tolerant systems. Luxembourg, 12–14 June 2006; 2006.

  22. 22.

    Pan J, Koopman P, Siewiorek DP, Huang Y, Gruber R, Jiang ML: Robustness testing and hardening of CORBA ORB implementations. In The 2001 international conference on dependable systems and networks (DSN 2001). Goteborg, 1–4 July 2001; 2001:141–150.

  23. 23.

    Laranjeiro N, Vieira M, Madeira H: Experimental robustness evaluation of JMS middleware. In IEEE international conference on services computing (SCC 2008). Honolulu, 7–11 July 2008; 2008:119–126.

  24. 24.

    Cotroneo D, Di Leo D, Natella R, Pietrantuono R: A case study on state-based robustness testing of an operating system for the avionic domain. In Computer safety, reliability, and security. Proceedings 30th international conference, SAFECOMP 2011, Naples, 19–22 Sept 2011 Lecture notes in computer science, vol 6894. Heidelberg: Springer; 2011:213–227.

  25. 25.

    Salva S, Rabhi I: Stateful web service robustness. In Fifth international conference on internet and web applications and services (ICIW). Barcelona, 9–15 May 2010; 2010:167–173.

  26. 26.

    Vieira M, Laranjeiro N, Madeira H: Benchmarking the robustness of web services. In 13th IEEE Pacific Rim Dependable Computing Conference (PRDC 2007). Melbourne, 17–19 December 2007; 2007:322–329.

  27. 27.

    Juric M: Slashdotting of mjuric/universe. 2004. . Accessed 23 May 2013

Download references


This research is co-financed by the Foundation for Science and Technology via project CMU-PT/ELE/0030/2009 and by FEDER via the ‘Programa Operacional Factores de Competitividade’ of QREN with COMPETE reference: FCOMP-01-0124-FEDER-012983.

Author information

Correspondence to Javier Cámara.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

All authors read and approved the final manuscript.

Authors’ original submitted files for images

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Reprints and Permissions

About this article

Cite this article

Cámara, J., de Lemos, R., Laranjeiro, N. et al. Testing the robustness of controllers for self-adaptive systems. J Braz Comput Soc 20, 1 (2014).

Download citation


  • Robustness testing
  • Controller
  • Self-adaptive system
  • Autonomic system