In this section, we describe the proposed exception handling model in terms of the abstractions and primitives of the Guardian model. Concomitantly, we present our implementation of these abstractions and primitives, the EH-SCA framework, as an extension to the Apache Tuscany platform. Then, in the next section, we present the EH-SCA programming model, which leverages aspect-oriented programming techniques to simplify the definition of handlers and their association with specific points of an SCA system.
In the current version of EH-SCA, the components that signal, raise, and handle exceptions must be written in Java. Making EH-SCA language-independent would require a more in-depth understanding of Tuscany’s internals and a larger implementation effort. Since we organize EH-SCA in terms of the primitives of the Guardian model, which is language-independent, its design does not rely on the specifics of the Java language. At the same time, an SCA application whose components employ multiple technologies can still use EH-SCA, since Apache Tuscany can make these components communicate employing different technologies, such as WS-BPEL, JSON-RPC, and Web Services. In scenarios where third-party services must be integrated in a fault-tolerant manner by using EH-SCA, service components written in Java are responsible for raising and handling the exceptions. This does not limit the applicability of the proposed model and implementation: since the composed services cannot be modified, the composition of third-party services can be performed by using EH-SCA-based service components as proxies for them.
Figure 5 presents the overall structure of a fault-tolerant composition that uses EH-SCA. The Guardian is an instance of a new implementation type, implementation.guardian, responsible for implementing the exception handling model. It manages exception handling contexts, exception propagation, and exception resolution. Each participant component (P1-N) is connected to the guardian by means of a guardian member (GM1-N). EH-SCA implements guardian members as policies (Sect. 2.2) and exceptions as regular Java types. In the remainder of this section, we present the elements of EH-SCA.
Exception representation
We define exceptions as classes that extend the GlobalException class. These exceptions are global, i.e., they flow between different service components. Instances of GlobalException carry the information of which participant has raised the exception, the context in which this exception was raised (the signaling context), as well as the context in which the exception should be handled (target context). The model also predefines some membership global exceptions, such as JoinException and LeaveException. A JoinException is raised when a new participant joins a group (Sect. 3.2). In a similar way, a LeaveException is raised when a participant leaves the group. These exceptions are useful to maintain compatibility with the Guardian model.
A program raises a global exception by calling the gthrow() method (Sect. 3.2). This method has the same name as the primitive of the Guardian model responsible for signaling a global exception. In our Java implementation, local exceptions are raised by using the throw statement. We do not impose any constraints on how exceptions are represented within a component since programming languages employ different approaches for exceptions. Some of them, such as C, have neither exceptions nor any similar concept.
The signatures of the services in component interfaces should explicitly indicate the exceptions they raise. During composition, the compiler should check whether clients of these services handle these exceptions and, if they do not, produce an error message. In other words, global exceptions are checked exceptions. A number of modern programming languages, such as C#, Scala, and Go, do not use checked exceptions because of their well-known maintainability issues [24]. Nonetheless, we consider that explicitly indicated exceptions provide useful documentation. In addition, they represent part of the contract that users of a service must be aware of. Furthermore, due to compiler support, they can improve system reliability by emphasizing the need for errors to be handled. Finally, if checked exceptions are employed only at the component interface level, changes to such interfaces do not necessarily imply global changes as would be the case for finer-grained checked exceptions.
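As an illustration, the following minimal sketch shows how such an exception hierarchy might look in Java. Only the class names GlobalException, JoinException, and LeaveException come from the description above; the fields, constructors, accessors, and the BackupService interface are assumptions made for the example and are not taken from the EH-SCA sources.

```java
// Illustrative sketch only: field, constructor, and accessor names are assumptions.
class GlobalException extends Exception {        // extends Exception, hence checked
    private final String signaler;               // participant that raised the exception
    private final String signalingContext;       // context in which it was raised
    private final String targetContext;          // context in which it should be handled

    GlobalException(String message, String signaler,
                    String signalingContext, String targetContext) {
        super(message);
        this.signaler = signaler;
        this.signalingContext = signalingContext;
        this.targetContext = targetContext;
    }

    String getSignaler() { return signaler; }
    String getSignalingContext() { return signalingContext; }
    String getTargetContext() { return targetContext; }
}

// Predefined membership exceptions named in the text.
class JoinException extends GlobalException {
    JoinException(String participant, String context) {
        super(participant + " joined the group", participant, context, context);
    }
}

class LeaveException extends GlobalException {
    LeaveException(String participant, String context) {
        super(participant + " left the group", participant, context, context);
    }
}

// Hypothetical service interface: the throws clause makes the global
// exception part of the contract that clients must handle.
interface BackupService {
    void store(String record) throws GlobalException;
}
```

Because GlobalException extends Exception rather than RuntimeException, the Java compiler rejects a client of BackupService.store() that neither catches nor declares the exception, which is the interface-level checking discussed above.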
Exception handling contexts and coordination
The guardian group is the central entity responsible for mediating the interaction between participants of a composition when errors occur during its execution. The guardian group provides an interface to enable and disable raising, signaling, and handling exception contexts. In EH-SCA, the guardian group is itself a service component.
Each participant is associated with a set of exception handling contexts, enabled dynamically, at runtime, and explicitly, by the application. Nonetheless, applications can enable and disable contexts in a nonintrusive way, without the need to modify the source code of preexisting service components. When one or more global exceptions are raised within a context, all the participants where that context is enabled are involved in coordinated error recovery. In the proposed model, contexts are always associated with sets of participants and may be nested, similarly to Java’s try blocks (although the latter define static scopes). As discussed in the previous section, for SCA systems, it does not make sense to define finer-grained, intracomponent exception handling contexts since components might be implemented using radically different technologies and languages.
The guardian group (or simply guardian) element was implemented as an implementation type called “implementation.guardian” using the Tuscany SPI (Sect. 2.2). Figure 6 depicts the pseudoschema of this new implementation type. Note that the configuration is done through the “guardianProperties” element (line 2). The recovery_rules (line 3) and resolution_trees (line 4) attributes allow the definition of the recovery rules and resolution trees, respectively.
Figure 7 shows the GuardianPrimitives interface implemented by all guardian service components. This interface defines operations that are used to establish communication between participants and guardian members, and between guardian members and their respective guardians, where a participant is implemented as a service component. It is important to stress that the implementation of GuardianGroup is internal to EH-SCA; applications do not need to implement this interface.
The first two methods control exception contexts. The enableContext() method adds and enables a context c in a last-in-first-out (LIFO) context list associated with a participant. The removeContext() method removes the last added context from the same list. The Context class implements the concept of exception context, aggregating a name and a list of exceptions that can be handled in the context.
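The LIFO discipline can be sketched as follows. This is a simplified stand-in, not the EH-SCA code: the ContextList class, the canHandle() and current() helpers, and all signatures are assumptions introduced for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Simplified stand-in for the Context class: a name plus the
// exception types that can be handled in the context.
class Context {
    private final String name;
    private final List<Class<? extends Exception>> handledExceptions;

    Context(String name, List<Class<? extends Exception>> handledExceptions) {
        this.name = name;
        this.handledExceptions = List.copyOf(handledExceptions);
    }

    String getName() { return name; }

    // True when ex is an instance of one of the listed exception types.
    boolean canHandle(Exception ex) {
        for (Class<? extends Exception> type : handledExceptions) {
            if (type.isInstance(ex)) return true;
        }
        return false;
    }
}

// Hypothetical per-participant context list with LIFO behavior.
class ContextList {
    private final Deque<Context> contexts = new ArrayDeque<>();

    // Adds c as the innermost (most recently enabled) context.
    void enableContext(Context c) { contexts.push(c); }

    // Removes and returns the most recently enabled context;
    // throws NoSuchElementException when no context is enabled.
    Context removeContext() { return contexts.pop(); }

    // Innermost enabled context, or null when none is enabled.
    Context current() { return contexts.peek(); }
}
```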
The gthrow() and propagate() methods control the flow of global exceptions. The first one was explained previously and is used by a participant to throw a global exception ex to a set of participants specified in participantList. The invocation of gthrow() causes the suspension of all the participants listed in participantList, as well as the interruption of the invoker participant. The propagate() method determines whether the global exception ex should be handled in the current context or propagated to an upper level context. In other words, the method compares the current context with the target context specified in ex. Since the guardian does not have control over the exception flow inside the participant, the existence of the propagate() method is necessary.
Finally, the checkExceptionStatus() method allows a participant to check for any pending global exception that needs to be handled. It is executed periodically by the participants. If there are any global exceptions to be delivered, they are raised within the participant. Otherwise, the method simply returns.
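To make the interplay of these primitives concrete, the sketch below shows one way a participant might use them. It is an assumption-laden illustration: GuardianStub, PendingGlobalException, ParticipantSketch, the deliver() method, and every signature are hypothetical and do not reproduce the GuardianPrimitives interface of Fig. 7.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a global exception carrying its target context.
class PendingGlobalException extends Exception {
    final String targetContext;

    PendingGlobalException(String message, String targetContext) {
        super(message);
        this.targetContext = targetContext;
    }
}

// Hypothetical, single-participant view of the guardian primitives.
class GuardianStub {
    private final Deque<String> contexts = new ArrayDeque<>();                 // LIFO context list
    private final Deque<PendingGlobalException> pending = new ArrayDeque<>();  // exceptions awaiting delivery

    void enableContext(String name) { contexts.push(name); }
    void removeContext() { contexts.pop(); }

    // Assumed entry point used by the guardian to queue an exception for this participant.
    void deliver(PendingGlobalException ex) { pending.add(ex); }

    // Raises the oldest pending exception, or simply returns when there is none.
    void checkExceptionStatus() throws PendingGlobalException {
        PendingGlobalException ex = pending.poll();
        if (ex != null) throw ex;
    }

    // True when the innermost enabled context is not the exception's target context.
    boolean propagate(PendingGlobalException ex) {
        return !ex.targetContext.equals(contexts.peek());
    }
}

class ParticipantSketch {
    // Performs `steps` units of work inside `context`, polling the guardian after each one.
    static String run(GuardianStub guardian, String context, int steps)
            throws PendingGlobalException {
        guardian.enableContext(context);
        try {
            for (int i = 0; i < steps; i++) {
                // ... one unit of application work would go here ...
                guardian.checkExceptionStatus();   // periodic poll
            }
            return "completed";
        } catch (PendingGlobalException ex) {
            if (guardian.propagate(ex)) {
                throw ex;                          // targets another context: rethrow outward
            }
            return "handled in " + context;        // handler logic would run here
        } finally {
            guardian.removeContext();              // runs after the catch block
        }
    }
}
```

The explicit propagate() check in the catch block reflects the point made above: the guardian cannot observe the exception flow inside a participant, so the participant must ask whether the exception belongs to its current context.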
Handler attachment
In the proposed model, exception handlers can be attached to local or global contexts. Local contexts are the contexts that the underlying programming language implements. In EH-SCA, a local context corresponds to a block of statements, the only kind of exception handling context that Java supports, by means of try-catch blocks, where the catch blocks are local handlers. Local handlers address internal exceptions. Handling these exceptions does not require a coordinated approach.
In broad terms, we consider that global handlers can be attached to sets of service components taking part in a composition. At the same time, participants of a composition can have multiple global exception contexts associated with them. For each context, it is possible to attach a number of exception handlers. In fact, there are no bounds on the number of contexts per service component, nor on the number of exception handlers per global context. When a global exception is signaled by a method, it is passed on to the guardian. The latter, based on its recovery rules (Sect. 3.4), decides which exception will be raised in each participant of the composition and the context where this will happen. An exception handler is triggered in a service component if it has an exception handler attached to the selected context and targeting the raised exception.
In EH-SCA, a handler is any subclass of the AbstractHandler class. The latter defines the execute() method, which receives a single argument of type GlobalException and implements the handler logic. Components in an application that uses EH-SCA should extend the HandlerContainer class. This class implements methods for managing the handlers attached to the contexts enabled in a service component. We more carefully describe the implementation of exception handlers in EH-SCA in Sect. 4.
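The relationship between handlers, contexts, and containers can be sketched as follows. Only the names AbstractHandler, HandlerContainer, GlobalException, and execute() come from the text; the attach() and dispatch() methods, the constructor argument of AbstractHandler, and the example exception and handler classes are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal stand-in for the GlobalException class.
class GlobalException extends Exception {
    GlobalException(String message) { super(message); }
}

// Hypothetical application-level global exception.
class ReplicaUnavailableException extends GlobalException {
    ReplicaUnavailableException() { super("replica unavailable"); }
}

// Simplified stand-in: recording the targeted exception type in the
// constructor is an assumption of this sketch.
abstract class AbstractHandler {
    private final Class<? extends GlobalException> handledType;

    protected AbstractHandler(Class<? extends GlobalException> handledType) {
        this.handledType = handledType;
    }

    boolean targets(GlobalException ex) { return handledType.isInstance(ex); }

    // Handler logic, receiving the global exception to be handled.
    abstract void execute(GlobalException ex);
}

// Simplified stand-in for the class that participant components extend.
class HandlerContainer {
    private final Map<String, List<AbstractHandler>> handlersByContext = new HashMap<>();

    void attach(String contextName, AbstractHandler handler) {
        handlersByContext.computeIfAbsent(contextName, k -> new ArrayList<>()).add(handler);
    }

    // Runs the first handler attached to contextName that targets ex;
    // returns false when no such handler exists.
    boolean dispatch(String contextName, GlobalException ex) {
        List<AbstractHandler> handlers = handlersByContext.get(contextName);
        if (handlers == null) return false;
        for (AbstractHandler handler : handlers) {
            if (handler.targets(ex)) {
                handler.execute(ex);
                return true;
            }
        }
        return false;
    }
}

// Hypothetical handler that counts how often it was triggered.
class FailoverHandler extends AbstractHandler {
    int invocations = 0;

    FailoverHandler() { super(ReplicaUnavailableException.class); }

    @Override
    void execute(GlobalException ex) { invocations++; }
}
```

In this sketch a handler runs only when it is attached to the selected context and targets the delivered exception, mirroring the triggering condition stated above.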
Exception propagation
Exception propagation is a difficult issue in service component architectures. As pointed out in Sect. 2.1, exceptions must be propagated in nonstandard ways because SCA systems are intrinsically dynamic due to user needs, heterogeneous technologies, administrative issues, and the distributed setting in which they run. As a consequence, SCA systems require more flexible policies for exception propagation, to make it possible to deal with situations such as the enabling and disabling of exception handling contexts at runtime or simply the unavailability of certain service components.
In the proposed model, recovery rules determine the exception propagation paths in an application. They establish the destination of an exception raised in a set of contexts associated with a set of participants. These rules also determine the exception that will be handled by each participant of a guardian group involved in coordinated exception handling. To better cope with the dynamism of SCA systems, recovery rules can be enabled and disabled at runtime. Hence, the propagation paths in an application can be modified dynamically, orthogonally to its structure, as required during its execution. To the best of our knowledge, this is the first exception handling model to provide this kind of flexibility to software developers.
In the EH-SCA framework, recovery rules are defined by an XML-based language whose pseudoschema is shown in Fig. 8. The rule element (line 2) defines a named (via the name attribute) rule and the exception to which that rule is applied (via signaled_exception attribute). Each rule can select one or more sets of participants in order to signal a new exception. This is accomplished via the participant element with a regular expression assigned to the match attribute (line 5).
Each participant has a dot-separated identifier defined by the elements in its context list. The same structure is used to build the regular expression associated with the match attribute, where the character “*” can be used as a wildcard, meaning that the context in which the participant is currently executing does not matter. Also, the keyword SIGNALER can be used to select the participant that has raised the referred exception.
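A minimal sketch of this selection step is shown below. It assumes that “*” matches any run of characters (including dots) and that SIGNALER appears as the whole match expression; neither assumption is confirmed by the text, and the ParticipantMatcher class and the identifiers used in the usage note are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper; EH-SCA's actual matching logic may differ.
class ParticipantMatcher {

    // Translates a match expression into a regex: literal text is quoted
    // and each "*" becomes ".*" (assumed wildcard semantics).
    static Pattern compile(String matchExpression) {
        String[] literals = matchExpression.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) regex.append(".*");
            regex.append(Pattern.quote(literals[i]));
        }
        return Pattern.compile(regex.toString());
    }

    // Returns the identifiers selected by the expression, in list order.
    static List<String> select(String matchExpression,
                               List<String> participantIds,
                               String signalerId) {
        if (matchExpression.equals("SIGNALER")) {
            return List.of(signalerId);
        }
        Pattern pattern = compile(matchExpression);
        List<String> selected = new ArrayList<>();
        for (String id : participantIds) {
            if (pattern.matcher(id).matches()) selected.add(id);
        }
        return selected;
    }
}
```

Under these assumptions, the expression `*.BACKUP` applied to the identifiers `MAIN.BACKUP`, `MAIN.PRIMARY`, and `INIT.MAIN.BACKUP` selects the first and the third.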
The exception that should be raised in the selected participants is determined in the throw_exception element (line 4), where the exception class is specified via class attribute and the context where the exception will be handled via the target_context attribute. The min_participants_joined and max_participant_joined optional attributes represent, respectively, the minimum and maximum number of participants that must join the guardian group for the exception to be delivered to the selected participants. Finally, the affected_participants optional element (line 9) yields a subset of the selected participants, for example, the first (FIRST keyword) or the last (LAST keyword) in the selected participant list. The order of participants in the list is determined by the order in which the guardian receives the requests for association.
Exception resolution
The proposed model, analogously to previous approaches [26], uses exception resolution trees to determine which exception represents a set of exceptions raised concurrently in a certain context. In summary, the exceptions that can be raised in a system are organized as the nodes of a tree. When two or more exceptions are raised in a given context, the tree is looked up to find a node E that has all of the raised exceptions as its children. E is then called a “resolved exception” and it is the exception sent to all the participants involved in coordinated error recovery. The resolved exception may, however, undergo a transformation before it is delivered to each participant of the composition. This transformation is useful to allow independently developed exception handlers to work as a unit because they may have been defined in terms of different exception types. As a consequence, after resolution, a number of different exceptions can be delivered to the various handlers. The transformation of resolved exceptions is defined by means of recovery rules, more specifically, the throw_exception element of Fig. 8.
The model supports the definition of various resolution trees, associated with different levels of an application. The usage of levels allows the establishment of semantic relationships among different sets of exceptions. The set of resolution trees in an application is defined using an XML-based language, as shown in Fig. 9. Currently, our implementation supports only one exception level. The resolution_tree element (line 2) defines a resolution tree for a given level, defined by the exception_level attribute. The tree itself consists of a hierarchy of exception types built using the exception element (line 3).
The guardian is responsible for finding a resolved exception when two or more exceptions are concurrently raised in a composition. Exception resolution uses the types of the raised exceptions, organized in a type hierarchy (using the exception_resolution element). Considering the types of the concurrently raised exceptions and the exception type hierarchy, exception resolution finds the lowest common ancestor of the types of all the exceptions. EH-SCA employs the algorithm of Bender and Farach [3] to find the lowest common ancestor. Considering the exception type hierarchy of Fig. 10, at initialization time, EH-SCA performs three steps. Initially, the tree is traversed depth-first. Each node is appended to a vector, E, each time it is visited. Therefore, for the hierarchy of Fig. 10, we would obtain the following:
$$E = \{ \mathsf{N0,N1,N3,N1,N4,N1,N0,N2,N5,N2,N6,N2,N0} \}$$
Afterward, we compute the depth of each node in the tree. For each position that a node occupies in E, we record its depth in the corresponding position of the L vector:
$$L = \{ 0,1,2,1,2,1,0,1,2,1,2,1,0 \}$$
For example, E[2]=N3 and L[2]=2, the depth of node N3. Finally, we build the R vector, which contains the position of the first occurrence of each node in E. Each position in R corresponds to a node in the exception type hierarchy. The order of the nodes in R is the order in which the nodes would be traversed in a breadth-first search.
$$R = \{ 0,1,7,2,4,8,10 \}$$
In the event that two exceptions are concurrently raised, EH-SCA first obtains the values stored at the corresponding positions of R. It then uses these values as indexes for vector L. The lowest depth appearing in L between these two indexes (inclusive) is the depth of the resolved exception. EH-SCA obtains the resolved exception by inspecting the corresponding position in vector E. For example, for N3 and N4, R yields positions 2 and 4; the lowest depth in L between these positions is 1, at position 3, and E[3]=N1 is the resolved exception. Both the time required to construct the E, L, and R vectors and the time to perform exception resolution are linear in the number of nodes in the tree.
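The three vectors and the lookup can be sketched in Java as follows. The sketch assumes that nodes are identified by integer indexes in breadth-first order (N0 = 0, ..., N6 = 6) and that the tree is given as an array of child lists; the ResolutionTree class and its method names are hypothetical. It performs the linear scan over L described above and does not implement the constant-time range-minimum structure that the algorithm of Bender and Farach [3] also provides.

```java
// Hypothetical sketch of resolution via an Euler tour; not the EH-SCA code.
class ResolutionTree {
    private final int[][] children;  // children[i] = child indexes of node i
    private final int[] euler;       // vector E: node visited at each tour position
    private final int[] depth;       // vector L: depth of the node at each tour position
    private final int[] first;       // vector R: first tour position of each node
    private int cursor = 0;

    // Node 0 is assumed to be the root of the tree.
    ResolutionTree(int[][] children) {
        this.children = children;
        int n = children.length;
        euler = new int[2 * n - 1];
        depth = new int[2 * n - 1];
        first = new int[n];
        java.util.Arrays.fill(first, -1);
        visit(0, 0);
    }

    // Depth-first traversal that records a node on entry and after each child.
    private void visit(int node, int d) {
        record(node, d);
        for (int child : children[node]) {
            visit(child, d + 1);
            record(node, d);
        }
    }

    private void record(int node, int d) {
        if (first[node] < 0) first[node] = cursor;
        euler[cursor] = node;
        depth[cursor] = d;
        cursor++;
    }

    // Lowest common ancestor of the raised exception types.
    int resolve(int... raised) {
        if (raised.length == 0) throw new IllegalArgumentException("no exceptions raised");
        int lo = Integer.MAX_VALUE;
        int hi = Integer.MIN_VALUE;
        for (int node : raised) {
            lo = Math.min(lo, first[node]);
            hi = Math.max(hi, first[node]);
        }
        int best = lo;
        for (int i = lo + 1; i <= hi; i++) {   // linear scan for the minimum depth
            if (depth[i] < depth[best]) best = i;
        }
        return euler[best];
    }

    int[] eulerTour() { return euler.clone(); }
    int[] depths() { return depth.clone(); }
    int[] firstOccurrences() { return first.clone(); }
}
```

For a hierarchy in which N0 has children N1 and N2, N1 has children N3 and N4, and N2 has children N5 and N6, constructing `new ResolutionTree(new int[][] {{1, 2}, {3, 4}, {5, 6}, {}, {}, {}, {}})` reproduces the E, L, and R vectors listed above, and `resolve(3, 5)` returns 0, that is, N0.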
Guardian member
In EH-SCA, the guardian member is an infrastructure element responsible for mediating communication between participants of a composition and the corresponding guardian. It is not part of the exception handling model, strictly speaking, but it is necessary to ease the burden on software developers. We implemented guardian members in Apache Tuscany SCA as a new policy type (Sect. 2.2), more specifically, a new intent: guardianExceptionHandling. An intent is an abstract policy that can affect component interactions but does not include any deployment information. Figure 11 presents the definition of the guardianExceptionHandling intent. The constraints attribute (line 3) specifies that the intent applies to the references of a Java component. When guardianExceptionHandling is enabled for a participant, an interceptor is created in the invocation chains associated with the methods of the references to which the intent is associated. This interceptor is responsible for invoking the guardian member corresponding to the participant.
Figure 12 shows how guardian members fit within the structure of an EH-SCA service composition. To obtain the instance of the guardian member associated with each participant (an instance of type GuardianMemberImpl), each interceptor employs the GuardianMemberFactoryImpl factory. The internal usage of factories ensures that a single guardian member is created for each service component participant of a fault-tolerant composition.
The usage of a policy avoids the need to explicitly declare the guardian members in the SCDL file. From the developer's perspective, a service component participant communicates directly with the guardian (another service component). However, this communication is mediated by an interceptor that hides the guardian member logic, so the defined communication model is preserved: participants ↔ guardian members, guardian members ↔ guardian group.
As an alternative to using a policy to implement guardian members, we could have implemented guardian members as service components. In this scenario, GuardianMemberImpl would need to implement a hypothetical GuardianMember interface which extends GuardianPrimitives (Sect. 3.2). Moreover, it would have a reference to the guardian service component. This approach has two drawbacks. The first one is that there is a runtime overhead due to the need to manage extra service components. A policy is much cheaper than a component. The second one is that developers of service-oriented applications would then need to be aware of guardian member components, declaring them in the SCDL file. By using policies to represent the latter, developers only indicate the participants of a fault-tolerant composition, associating them with the guardianExceptionHandling intent.