Open Access

Terminating constraint set satisfiability and simplification algorithms for context-dependent overloading

Journal of the Brazilian Computer Society 2013, 19:107

https://doi.org/10.1007/s13173-013-0107-9

Received: 24 August 2012

Accepted: 4 March 2013

Published: 9 April 2013

Abstract

Algorithms for constraint set satisfiability and simplification of Haskell type class constraints are used during type inference in order to allow the inference of more accurate types and to detect ambiguity. Unfortunately, both constraint set satisfiability and simplification are in general undecidable, and the use of these algorithms may cause non-termination of type inference. This paper presents algorithms for these problems that terminate on any given input, based on the use of a criterion that is tested on each recursive step. The use of this criterion eliminates the need for imposing syntactic conditions on Haskell type class and instance declarations in order to guarantee termination of type inference in the presence of multi-parameter type classes, and allows program compilation without the need for compiler flags for lifting such restrictions. Undecidability of the problems implies the existence of instances for which the algorithm incorrectly reports unsatisfiability, but we are not aware of any practical example where this occurs.

Keywords

Haskell, Constraint set satisfiability, Constraint set simplification, Termination

1 Introduction

Haskell’s type class system [5, 18] extends the Hindley-Milner type system [16] with constrained polymorphic types, in order to support overloading. Type class constraints may occur in types of expressions involving overloaded names (or symbols), and restrict the set of types to which quantified type variables may be instantiated, to those types for which these type constraints are satisfied, according to types of definitions that exist in a relevant context.

A type class declaration specifies the name and parameters of the class, and the principal types of the names that can then be overloaded in instance definitions. For example, the standard declaration of type class Eq, with parameter \(a\), specifies the principal types of (==) and (/=). Function (==) has type \(\forall a.\, Eq \,a \Rightarrow a \rightarrow a \rightarrow Bool\), where constraint Eq \(a\) indicates that type variable \(a\) cannot be instantiated to an arbitrary type, but only to a type that has been defined as an instance of class Eq.

An instance of a type class specifies instance types for type class parameters, and gives definitions of the overloaded names specified in the class. The type of each overloaded name in an instance definition is obtained by substituting type class parameters with corresponding instance types. For example, the following instance declarations specify definitions of the equality operator for types Int and for polymorphic lists, respectively:

For a base type, like Int, a corresponding predefined operation is provided. The definition of equality for lists of elements of an arbitrary type uses the equality test for elements of this type. Constraint Eq \(a\) must be specified as the context for the head Eq [\(a\)] of the instance declaration. A context is a set of type class constraints, and constraint \(\pi \) is the head of a qualified constraint \(P\Rightarrow \pi \), where \(P\) is a set of type class constraints.

As an aside, type classes in Haskell may also contain default definitions of the overloaded names, in order to avoid repeating the same definitions in instances.
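The class and instance declarations discussed above are not reproduced in this copy. The following is a minimal sketch of what they presumably look like, following the standard Haskell definitions. Prelude's own Eq is hidden so that the declarations compile as written, and `eqInt` is a hypothetical name standing in for the predefined equality on Int.

```haskell
import Prelude hiding (Eq (..))
import qualified Prelude as P

infix 4 ==, /=

-- Class Eq with parameter a; both members have type a -> a -> Bool.
class Eq a where
  (==), (/=) :: a -> a -> Bool
  -- Default definition, so that instances only need to define (==).
  x /= y = not (x == y)

-- Equality on the base type Int delegates to a predefined operation.
instance Eq Int where
  (==) = eqInt

-- Equality on lists uses equality on the elements, hence the context Eq a.
instance Eq a => Eq [a] where
  []       == []       = True
  (x : xs) == (y : ys) = x == y && xs == ys
  _        == _        = False

-- Stand-in for the primitive Int equality (hypothetical name).
eqInt :: Int -> Int -> Bool
eqInt = (P.==)
```

The default definition of (/=) illustrates the aside above: neither instance has to repeat it.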

Class constraints introduced on the types of overloaded symbols occur also on the types of expressions defined in terms of these symbols. For example, consider the following function that tests list membership:

The principal type of elem is \(\forall a.\, Eq \,a \Rightarrow a \rightarrow \mathtt{{[}}a\mathtt{{]}} \rightarrow Bool\). Constraint Eq \(a\) occurs in the type of elem due to the use of the equality operator (==) in its definition.
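The definition of elem referred to above is missing from this copy. A plausible version, written here as an assumption rather than as the authors' exact listing, is the usual recursive membership test. Prelude's elem is hidden to avoid a name clash.

```haskell
import Prelude hiding (elem)

-- List membership; the use of (==) on elements introduces the constraint Eq a.
elem :: Eq a => a -> [a] -> Bool
elem _ []       = False
elem x (y : ys) = x == y || elem x ys
```

If the type signature is removed, a Haskell compiler infers the same constrained type from the use of (==).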

Haskell restricts type classes to have a single parameter but the extension to multi-parameter type classes, called Haskell+mptcs in the sequel, is widely used.

Type inference for constrained type systems relies on constraint set simplification, which, for the case of type classes, essentially amounts to performing (so-called) context reduction. Constraint set simplification yields equivalent constraint sets, and is useful for providing simpler types for expressions. Context reduction simplifies constraints by substituting constraints or removing resolved constraints according to available instance definitions, besides removing duplicate constraints or substituting constraints according to the class hierarchy.

As an example, context Eq [\(t\)] is reduced to Eq \(t\), for any type \(t\), in the presence of instance Eq [\(a\)] with context Eq \(a\).
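The effect of this reduction can be observed in ordinary Haskell code. In the small sketch below (the function name `sameList` is hypothetical), the body needs equality at type [t], yet the signature only mentions Eq t, which is accepted because the list instance reduces Eq [t] to Eq t.

```haskell
-- The comparison requires Eq [t]; context reduction with the instance
-- (Eq a => Eq [a]) simplifies this constraint to Eq t.
sameList :: Eq t => [t] -> [t] -> Bool
sameList xs ys = xs == ys
```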

Improvement [13] is also a process of simplification of constrained types, but it is of a different nature, and is used in type inference to avoid ambiguity and to infer more informative types. Improvement is fundamentally based on constraint set satisfiability: it is a process of transforming a constraint set \(P\) into a constraint set obtained by applying a substitution \(S\) to \(P\) so that the set of satisfiable instances of \(P\) is preserved.

The mechanism of functional dependencies and other alternatives have been proposed to deal with improvement [4, 7, 10, 11, 14], for detection of ambiguity and for specialization of constrained types in the presence of multi-parameter type classes. We do not discuss improvement specifically in this paper, but focus on constraint set satisfiability, which is only used for the implementation of improvement or any alternative approach.

Unfortunately, both constraint set satisfiability and simplification are in general undecidable problems [6], and the use of computable functions for solving these problems may cause non-termination of type inference.

This paper presents algorithms for constraint set satisfiability and simplification that use a termination criterion which is based on a measure of the sizes of types in type constraints. The sequence of constraints that unify with a constraint axiom in recursive calls of the function that checks satisfiability or simplification of a type constraint is such that either the sizes of types of each constraint in this sequence are decreasing or there exists at least one type parameter position with decreasing size.

The use of this criterion eliminates the need for imposing syntactic conditions on Haskell type class and instance declarations in order to guarantee termination of type inference in the presence of multi-parameter type classes, and allows program compilation without the need of compiler flags for lifting such restrictions.

The use of a termination criterion implies that there exist well-typed programs for which the presented algorithm incorrectly reports unsatisfiability. However, practical examples where this occurs are expected to be very rare. The algorithms have been implemented and tested by using a prototype front-end for Haskell, available at the mptc github repository. The algorithm works as expected when subjected to examples mentioned in the literature, Haskell libraries that use multi-parameter type classes and many tests, including those used by the most commonly used Haskell compiler [19], GHC, involving all pertinent GHC extensions.

Restrictions imposed on class and instance declarations in Haskell, in Haskell+mptcs and in GHC, and GHC compilation flags used to avoid these restrictions [20], are summarized in Sect. 2. Section 3 reviews entailment and satisfiability relations on type class constraints. Section 4 gives a definition of a computable function that returns the set of satisfiable substitutions of a given constraint set \(P\), when it terminates. Subsection 4.1 defines a termination criterion and redefines this computable function in order to use this criterion. Section 5 defines a constraint set simplification computable function, based on the same termination criterion. Section 6 concludes.

2 Restrictions over class and instance declarations

This section summarizes the restrictions imposed on class and instance declarations in Haskell, Haskell+mptcs and in GHC, and GHC compilation flags used to avoid these restrictions.

By default, GHC follows the Haskell language specification (i.e., the Haskell 98 report [8]), which imposes the following restrictions.
  1. Each class declaration must have exactly one parameter.

  2. The head of a qualified constraint in an instance declaration must have the form \(C(T\,\overline{\alpha })\), where \(C\) denotes a class name, \(T\) a type constructor and \(\overline{\alpha }\) a sequence of distinct type variables. Such overbar notation is used extensively in this paper: \(\overline{x}\) denotes a possibly empty sequence of elements in the set \(\{x_1, \ldots , x_n\}\), for some \(n\ge 0\).

  3. Each constraint in a context \(P\) of an instance declaration \(P\Rightarrow C\,\overline{\tau }\) must have the form \(C\,a\), where \(a\) is a type variable occurring in \(\overline{\tau }\).

Restriction 1 allows only single-parameter type classes, but multi-parameter type classes are widely used by programmers and in Haskell libraries and are supported in many Haskell implementations. For example, consider type class Map parameterized by the key and element types, and the type class Collection, parameterized by the type constructor and the type of elements of the collection, partly sketched below:
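The sketch referred to above did not survive in this copy. The following hypothetical reconstruction shows what a two-parameter Collection class could look like; the member names `empty`, `insert` and `toList`, and the list instance, are assumptions made for illustration and are not taken from the paper.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}

-- Collection is parameterized by the type constructor c and the element type a.
-- Member names are hypothetical.
class Collection c a where
  empty  :: c a
  insert :: a -> c a -> c a
  toList :: c a -> [a]

-- Lists form a collection for any element type.
instance Collection [] a where
  empty  = []
  insert = (:)
  toList = id
```

A class Map parameterized by key and element types would be declared along the same lines.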

The declaration instance Show (Tree Int) where ... is an example of an instance declaration that does not follow restriction (2), because the head of the constraint (which has an empty context) consists of type constructor Tree applied to Int, not to a type variable.

Flag -XFlexibleInstances can be used by GHC users to avoid enforcing condition (2), i.e., to allow the head of a constraint in an instance declaration to be arbitrarily nested. The following is an example that does not follow restriction (3), since \(s\,a\) is not just a type variable: instance Show (\(s\,a\)) \(\Rightarrow \) Show (Sized \(s\,a\)) ...

Instances that do not follow these restrictions are common in Haskell programs, especially in the presence of multi-parameter type classes.

Flag -XFlexibleContexts can be used by GHC users to avoid restriction (3). With the use of this flag, contexts are restricted as follows:
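The two instance declarations mentioned above can be made concrete as follows. The data type definitions for Tree and Sized are assumptions of this sketch, since the paper only shows the instance heads; the language pragmas correspond to the GHC flags -XFlexibleInstances and -XFlexibleContexts.

```haskell
{-# LANGUAGE FlexibleInstances, FlexibleContexts #-}

-- Hypothetical data types; only the instance heads appear in the text.
data Tree a = Leaf | Node (Tree a) a (Tree a)
newtype Sized s a = Sized (s a)

-- Violates restriction (2): Tree is applied to Int, not to a type variable.
instance Show (Tree Int) where
  show Leaf         = "Leaf"
  show (Node l x r) = "(" ++ show l ++ " " ++ show x ++ " " ++ show r ++ ")"

-- Violates restriction (3): the context constrains (s a), not a type variable.
instance Show (s a) => Show (Sized s a) where
  show (Sized x) = "Sized " ++ show x
```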
  1. No type variable can have more occurrences in a constraint of a context than in the head.

  2. The sum of the number of occurrences of type variables and type constructors in a context must be smaller than in the head.

This restriction is known as the Paterson Condition. In some cases, it is still over-restrictive. As an example, consider the following code:

This instance of Show is rejected by GHC because it has more occurrences of type variable \(f\) in a constraint than in the head. Flag -XUndecidableInstances, which lifts all restrictions (including those related to the use of functional dependencies), is needed to compile this code. With this flag, termination is ensured by imposing a depth limit on a recursion stack [20].
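The code referred to above is missing from this copy. A well-known instance with the described property, given here as an assumption and not necessarily the authors' example, is the Show instance for a fixpoint type: variable f occurs twice in the constraint Show (f (Fix f)) but only once in the head Show (Fix f).

```haskell
{-# LANGUAGE FlexibleContexts, UndecidableInstances #-}

-- Fixpoint of a type constructor (hypothetical example type).
newtype Fix f = In (f (Fix f))

-- Rejected without UndecidableInstances: f occurs more often in the
-- context constraint than in the instance head.
instance Show (f (Fix f)) => Show (Fix f) where
  show (In x) = "In (" ++ show x ++ ")"
```

Removing UndecidableInstances from the pragma should make GHC reject the instance.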

3 Constrained polymorphism and type class constraints

The Haskell type class system is based on the more general theory of qualified types [12], which extends the Hindley-Milner type system with constrained types.

The syntax of types with type class constraints is defined in Fig. 1, where meta-variable usage is also indicated. For simplicity, and following common practice, kinds are not considered explicitly in type expressions, and type applications are assumed to be well kinded. Function types \(\tau _1\rightarrow \tau _2\) are constructed as the curried application of the function type constructor to two arguments, and are written as usual in infix notation.

Fig. 1 Constrained types and context

The union of constraint sets \(P\) and \(Q\) is denoted by \(P, Q\) and a slight abuse of notation is made by writing simply \(\pi \) for the singleton constraint set \(\{\pi \}\).

Function \({ tv}\) is overloaded, yielding the set of free type variables of types, constraints or constraint sets, and is defined as usual. Sequence \(\overline{\alpha }\) used in the context of a set denotes of course the set of type variables in the sequence. The set of constraint axioms \(\Theta \) is induced by class and instance declarations of a program. Each instance declaration instance \(P\Rightarrow \pi \) where ... introduces an axiom scheme \(\forall \,\overline{\alpha }.\,P\Rightarrow \pi \), where \(\overline{\alpha } = { tv}(P\Rightarrow \pi )\).

For simplicity and to avoid clutter, in this paper constraint axioms introduced by type class declarations are not considered, since they add no additional problems with respect to termination of constraint set satisfiability and simplification algorithms.

The entailment relation for type class constraints is defined in Fig. 2. Rule (Mono) expresses the property of monotonicity, (Trans) of transitivity, (Subst) of closure under type substitution (cf. [12]), (Inst) defines entailment according to a constraint axiom and (Conj) deals with sets with more than one constraint.

Fig. 2 Type class constraint entailment

A type substitution \(S\) is a (kind-preserving) function from type variables to types, and extends straightforwardly to constraints, and to sets of types and sets of constraints. For convenience, a substitution is often written as a finite mapping [\(\alpha _1\mapsto \tau _1,\ldots ,\alpha _n\mapsto \tau _n\)], which is also abbreviated as [\(\overline{\alpha }\mapsto \overline{\tau }\)]. Juxtaposition \(S^{\prime } S\) is used as a synonym for function composition, \(S^{\prime }\circ S\), the domain of a substitution \(S\) is defined by \({ dom}(S)=\{\alpha \mid S(\alpha )\not =\alpha \}\) and the restriction of \(S\) to \(V\) is given by \(S|_V(\alpha ) = S(\alpha )\) if \(\alpha \in V\), otherwise \(\alpha \).

3.1 Constraint set satisfiability

Constraint set satisfiability is central to the interpretation of constrained types and is closely related to simplification and improvement. Following [13], \(\lfloor P \rfloor _\Theta \) denotes the set of satisfiable instances of constraint set \(P\), with respect to constraint axioms \(\Theta \):
$$\begin{aligned} \lfloor P \rfloor _\Theta = \{\,S P \,\mid \, \Theta \, \Vdash S P\,\} \end{aligned}$$
Equality of constraint sets is considered modulo type variable renaming. That is, constraint sets \(P\) and \(Q\) are considered to be equal by considering also that a renaming substitution \(S\) can be applied to \(P\) so as to make \(S\,P\) and \(Q\) equal. A substitution \(S\) is a renaming substitution if for all \(\alpha \in { dom}(S)\) we have that \(S(\alpha )=\beta \), for some type variable \(\beta \not \in { dom}(S)\).

If \(S P \in \lfloor P \rfloor _\Theta \) then \(S\) is called a satisfying substitution for \(P\).

Subscript \(\Theta \) will not be used hereafter because satisfiability is always considered with respect to a set of global constraint axioms \(\Theta \).

For any substitution \(S\) and constraint set \(P\) we have that \(\lfloor S P \rfloor \subseteq \lfloor P \rfloor \). The reverse inclusion, \(\lfloor P\rfloor \subseteq \lfloor S P \rfloor \), does not always hold; when it does, it allows us to characterize improvement of the set of constraints \(P\) to an equivalent but simpler or more informative constraint set \(S P\), such that \(\lfloor S P \rfloor = \lfloor P \rfloor \). Substitution \(S\) is called an improving substitution for \(P\) if applying \(S\) to \(P\) preserves the set of satisfiable instances, that is, if \(\lfloor S P \rfloor = \lfloor P \rfloor \).

The next section presents constraint set satisfiability algorithms, including an algorithm that uses a criterion for guaranteeing termination on any given input. This termination criterion is used in Sect. 5, to define a constraint set simplification algorithm.

4 Computing constraint set satisfiability

Figure 3 presents a computable function that, given any constraint set \(P\), returns, if it terminates, the set of satisfying substitutions for \(P\). The definition uses judgements of the form \(\Theta \vdash ^\mathtt{sats }P \leadsto \mathbb S \), meaning that \(\mathbb S \) is the set of satisfying substitutions for \(P\), with respect to constraint axioms \(\Theta \). The following function is used:
$$\begin{aligned} { sats}(\pi ,\Theta ) = \{ \begin{array}[t]{ll} (S|_{{ tv}(\pi )}, S P, \pi _0) \,\mid \, (\forall \,\overline{\alpha }.\,P_0 \Rightarrow \pi _0) \in \Theta , \\ S_1\,=\,[\overline{\alpha }\mapsto \overline{\beta }],\; \overline{\beta } \text{ fresh }, \\ (P \Rightarrow \pi ^{\prime }) = S_1\, (P_0 \Rightarrow \pi _0),\\ S={mgu}(\pi = \pi ^{\prime }) \} \end{array} \end{aligned}$$
where function mgu gives a most general unifier for a pair of constraints, written as an equality. That is, \({ mgu}(C\,\overline{\tau } = C\,\overline{\tau }^{\prime })\) gives a substitution \(S\) such that \(S\,\overline{\tau } = S\,\overline{\tau }^{\prime }\) and, for any \(S^{\prime }\) such that \(S^{\prime }\,\overline{\tau } = S^{\prime }\,\overline{\tau }^{\prime }\), it holds that \(S^{\prime }= S^{\prime \prime }\circ S\), for some \(S^{\prime \prime }\).\(^1\)

Fig. 3 Constraint set satisfiability

Let \(\mathbb S \) be the returned set of satisfying substitutions for a given constraint set \(P\). Since \(S \in \mathbb S \) implies \({ dom}(S) \subseteq { tv}(P)\) (because if \(S\) is in \({ sats}(\pi ,\Theta )\) then \({ dom}(S) \subseteq { tv}(\pi )\)), the only possible satisfying substitution to be returned for the empty set of constraints is the identity substitution (\(id\)), as defined by rule SEmpty. Rule SInst computes the set \(\mathbb S _0\) of satisfying substitutions \(S\in \mathbb S _0\) for a given constraint \(\pi \), by determining the set of constraint axioms \(\forall \,\overline{\alpha }.\,P_0\Rightarrow \pi _0\) in \(\Theta \) such that \(\pi \) unifies with \(\pi _0\), and composing these substitutions with those obtained by recursively computing the set of satisfying substitutions for contexts \(S\,P_0\). Rule SConj deals with sets of constraints. The following examples illustrate the use of these rules.

\(\mathtt{{B}}\), \(\mathtt{I}\) and \(\mathtt{F}\) are used in the sequel as abbreviations of \({ Bool}\), \({ Int}\) and \({ Float}\), respectively.

Example 1

Consider \(P = \{ A\,a\,b,\, D\,b \}\) and
$$\begin{aligned} \Theta = \{ A\,\mathtt{I}\, \mathtt{{[}}\mathtt{I}\mathtt{{]}}, A\,\mathtt{I}\; \mathtt{{[}}\mathtt{{B}}\mathtt{{]}}, C\,\mathtt{I}, \forall b.\, C\, b \Rightarrow D\, \mathtt{{[}}b\mathtt{{]}}\} \end{aligned}$$
Satisfiability of \(P\) with respect to \(\Theta \) yields a set of substitutions \(\mathbb S \) given by:
$$\begin{aligned} \frac{ \begin{array}{l} \Theta \vdash ^\mathtt{sats }A\,a\,b \leadsto \mathbb S _0\\ \mathbb S = \bigl \{ S^{\prime } S \mid \begin{array}[t]{l} S \in \mathbb S _0, \, S^{\prime } \in \mathbb S _1, \Theta \vdash ^\mathtt{sats }\, S(D\,b) \leadsto \mathbb S _1 \bigr \} \end{array} \end{array}}{\Theta \vdash ^\mathtt{sats }A\,a\,b, \{ D\,b\}\, \leadsto \, \mathbb S } {\mathtt{{SConj}}} \end{aligned}$$
Then:
$$\begin{aligned} \frac{ \begin{array}{l} \Delta _0 = \{ (S_1, \emptyset , A\,\mathtt{I}\, [\mathtt{I}]), (S_2, \emptyset , A\,\mathtt{I}\, \mathtt{{[}}\mathtt{{B}}\mathtt{{]}}) \}\\ \mathbb S _0 = \bigl \{ S^{\prime }S \mid \begin{array}[t]{l} (S,Q,\pi ^{\prime }) \in \Delta _0,\, S^{\prime } \in \mathbb S ^{\prime }, \\ \Theta \vdash ^\mathtt{sats }Q \leadsto \mathbb S ^{\prime } \bigr \} \end{array} \end{array}}{\Theta \vdash ^\mathtt{sats }A\,a\,b \leadsto \mathbb S _0 } \mathtt{{SInst}} \end{aligned}$$
where \(S_1=[a \mapsto \mathtt{I},\, b\mapsto \mathtt{{[}}\mathtt{I}\mathtt{{]}}]\), \(S_2=[a \mapsto \mathtt{I},\, b\mapsto \mathtt{{[}}\mathtt{{B}}\mathtt{{]}}]\).
Then, by rule SConj, the set of satisfying substitutions for \(S_1(D\,b)= D\, \mathtt{{[}}\mathtt{I}\mathtt{{]}}\) and \(S_2(D\,b)= D\, \mathtt{{[}}\mathtt{{B}}\mathtt{{]}}\) must be computed, and are given respectively by:
$$\begin{aligned} \frac{\begin{array}{l} \Delta _1 = \{ (S_1^{\prime }|_\emptyset , \{C\,\mathtt{I}\}, D\, \mathtt{{[}}b\mathtt{{]}}) \} \\ \mathbb S _1^1 = \bigl \{ S^{\prime }S \mid \begin{array}[t]{l} (S,Q,\pi ^{\prime }) \in \Delta _1,\, S^{\prime } \in \mathbb S ^{\prime },\\ \Theta \vdash ^\mathtt{sats }Q \leadsto \mathbb S ^{\prime } \bigr \} \end{array} \end{array}}{\Theta \vdash ^\mathtt{sats }D\,\mathtt{{[}}\mathtt{I}\mathtt{{]}} \leadsto \mathbb S _1^1} \mathtt{{SInst}} \end{aligned}$$
where \(S_1^{\prime } = [b_1\mapsto \mathtt{I}]\), \(b_1\) is a fresh type variable, \(S_1^{\prime }|_\emptyset = { id}\), and
$$\begin{aligned} \frac{\begin{array}{l} \Delta _2 = \{ (S_2^{\prime }|_\emptyset , \{ C\,\mathtt{{B}}\}, D\, \mathtt{{[}}b\mathtt{{]}}) \}\\ \mathbb S _1^2 = \bigl \{ S^{\prime }S \mid \begin{array}[t]{l} (S,Q,\pi ^{\prime }) \in \Delta _2, S^{\prime } \in \mathbb S ^{\prime },\\ \Theta \vdash ^\mathtt{sats }Q \leadsto \mathbb S ^{\prime } \bigr \} \end{array} \end{array}}{\Theta \vdash ^\mathtt{sats }D\, \mathtt{{[}}\mathtt{{B}}\mathtt{{]}} \leadsto \mathbb S _1^2 } \mathtt{{SInst}} \end{aligned}$$
where \(S_2^{\prime } = [b_2\mapsto \mathtt{{B}}]\), \(b_2\) is a fresh type variable, \(S_2^{\prime }|_\emptyset = { id}\). Now, \(\mathbb S _1^1 = \{ { id}\}\) and \(\mathbb S _1^2 = \emptyset \). Thus, \(\mathbb S = \{S_1\}\).
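The function sats and the rules SEmpty, SInst and SConj described above can be sketched in Haskell as follows. This is an illustrative reading of the prose, not the authors' implementation: the data types, the fresh-name counter and in particular the `fuel` argument (which cuts off recursion, since the function of Fig. 3 may not terminate) are assumptions of this sketch. Value `example1` encodes \(P\) and \(\Theta \) of Example 1.

```haskell
import qualified Data.Map as M

-- Types: variables, constructors and applications (kinds are ignored).
data Type = TVar String | TCon String | TApp Type Type deriving (Eq, Show)

-- A constraint C t1 ... tn, and an axiom scheme P0 => pi0 whose type
-- variables are all implicitly quantified.
data Constraint = Constraint String [Type] deriving (Eq, Show)
data Axiom = Axiom [Constraint] Constraint

type Subst = M.Map String Type

apply :: Subst -> Type -> Type
apply s t@(TVar a) = M.findWithDefault t a s
apply _ t@(TCon _) = t
apply s (TApp t u) = TApp (apply s t) (apply s u)

applyC :: Subst -> Constraint -> Constraint
applyC s (Constraint c ts) = Constraint c (map (apply s) ts)

-- compose s2 s1 acts like applying s1 first and then s2.
compose :: Subst -> Subst -> Subst
compose s2 s1 = M.map (apply s2) s1 `M.union` s2

tv :: Type -> [String]
tv (TVar a)   = [a]
tv (TCon _)   = []
tv (TApp t u) = tv t ++ tv u

tvC :: Constraint -> [String]
tvC (Constraint _ ts) = concatMap tv ts

-- Most general unifier of two types, with occurs check.
unify :: Type -> Type -> Maybe Subst
unify (TVar a) t
  | t == TVar a   = Just M.empty
  | a `elem` tv t = Nothing
  | otherwise     = Just (M.singleton a t)
unify t u@(TVar _) = unify u t
unify (TCon c) (TCon d) | c == d = Just M.empty
unify (TApp t1 t2) (TApp u1 u2) = do
  s1 <- unify t1 u1
  s2 <- unify (apply s1 t2) (apply s1 u2)
  return (compose s2 s1)
unify _ _ = Nothing

-- mgu of two constraints: same class name and arity, arguments unified in turn.
mgu :: Constraint -> Constraint -> Maybe Subst
mgu (Constraint c ts) (Constraint d us)
  | c == d && length ts == length us = go M.empty (zip ts us)
  | otherwise = Nothing
  where
    go s [] = Just s
    go s ((t, u) : rest) = do
      s' <- unify (apply s t) (apply s u)
      go (compose s' s) rest

-- sats(pi, Theta): for each axiom whose freshly renamed head unifies with pi,
-- yield (S restricted to tv(pi), S P, original head). n makes names fresh.
sats :: Int -> Constraint -> [Axiom] -> [(Subst, [Constraint], Constraint)]
sats n p theta =
  [ (M.filterWithKey (\a _ -> a `elem` tvC p) s, map (applyC (compose s s1)) ctx0, h0)
  | Axiom ctx0 h0 <- theta
  , let s1 = M.fromList [(a, TVar (a ++ "_" ++ show n)) | a <- concatMap tvC (h0 : ctx0)]
  , Just s <- [mgu p (applyC s1 h0)] ]

-- Satisfiability in the style of Fig. 3. The fuel argument is an addition of
-- this sketch: it bounds the recursion depth, since the original may loop.
sat :: Int -> Int -> [Axiom] -> [Constraint] -> [(Subst, Int)]
sat _ n _ [] = [(M.empty, n)]                                  -- SEmpty
sat fuel n theta (p : ps) =                                    -- SConj
  [ (compose s' s, n2)
  | (s, n1)  <- satOne fuel n theta p
  , (s', n2) <- sat fuel n1 theta (map (applyC s) ps) ]

satOne :: Int -> Int -> [Axiom] -> Constraint -> [(Subst, Int)]
satOne 0 _ _ _ = []
satOne fuel n theta p =                                        -- SInst
  [ (M.filterWithKey (\a _ -> a `elem` tvC p) (compose s' s), n')
  | (s, q, _) <- sats n p theta
  , (s', n')  <- sat (fuel - 1) (n + 1) theta q ]

-- Example 1: P = {A a b, D b} with the four axioms of Theta.
example1 :: [Subst]
example1 = map fst (sat 10 0 theta [Constraint "A" [TVar "a", TVar "b"], Constraint "D" [TVar "b"]])
  where
    i     = TCon "I"
    b     = TCon "B"
    list  = TApp (TCon "[]")
    theta = [ Axiom [] (Constraint "A" [i, list i]), Axiom [] (Constraint "A" [i, list b])
            , Axiom [] (Constraint "C" [i])
            , Axiom [Constraint "C" [TVar "b"]] (Constraint "D" [list (TVar "b")]) ]
```

Under these assumptions, `example1` contains a single substitution, mapping a to I and b to [I], which corresponds to \(\mathbb S = \{S_1\}\) above.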

The example below, extracted from [3], illustrates non-termination of the computation of the set of satisfying substitutions by the function defined in Fig. 3. We use \(T^2\,\tau \) to abbreviate \(T(T\,\tau )\) and similarly for other indices greater than \(2\).

Example 2

Let \(\Theta = \{ \forall a,b.\,\{C\, a\, b\} \Rightarrow C\,(T^2\, a)\, b \}\) and consider computing satisfiability of \(\pi = C\,a\, (T\, a)\) with respect to \(\Theta \).

We have that \(\pi \) unifies with the head of constraint axiom \(\forall a,b.\,(C\, a\, b)\, \Rightarrow \, C\, (T^2\, a)\, b\), giving substitution \(S= [a \mapsto T^2\,a_1,\, b_1 \mapsto T^3\,a_1]\). We must then recursively compute the set of satisfying substitutions of constraint \(S(C\,a_1\,b_1) = C\,a_1\,(T^3\,a_1)\). This constraint also unifies with \(\forall a,b.\,(C\, a\, b) \Rightarrow C\, (T^2\, a)\,b\), giving substitution \(S_1= [a_1 \mapsto (T^2\,a_2), b_2 \mapsto (T^3 a_1 = T^5\,a_2)]\). Again, we must recursively compute the set of satisfying substitutions of constraint \(S_1(C\,a_2\,b_2) = C\,a_2\,(T^5\,a_2)\), and the process goes on forever.

The following theorems state, respectively, correctness and completeness of the constraint set satisfiability algorithm presented in Fig. 3, with respect to the entailment relation.

Theorem 1

(Correctness of \(\vdash ^\mathtt{sats }\)) If \(\Theta \vdash ^\mathtt{sats }P \leadsto \mathbb S \) then \(\Theta \Vdash S\,P\), for all \(S \in \mathbb S \).

Proof

By induction over the derivation of \(\Theta \vdash ^\mathtt{sats }P \leadsto \mathbb S \). The only interesting case is for rule SInst. Let \(\pi = C\,\overline{\tau }\) and \(\Delta =sats(\pi ,\Theta )\). If \(\Delta = \emptyset \), the theorem holds trivially. Thus, assume \(\Delta \ne \emptyset \) and let \((S,Q,C\,\overline{\tau }_{0})\in \Delta \). By the definition of \(sats\), this means that \(\forall \,\overline{\alpha }.\,P_0\Rightarrow \,C\,\overline{\tau }_{0}\in \Theta \), where \(\overline{\alpha }={ tv}(P_{0}\,\Rightarrow \,C\,\overline{\tau }_{0})\), and \(P^{\prime }\,\Rightarrow \,C\,\overline{\tau } = [\,\overline{\alpha }\,\mapsto \,\overline{\beta }\,]P_{0}\,\Rightarrow \,C\,\overline{\tau }_{0}\). By rule Inst we have that \(\Theta ,\,P_{0}\,\Vdash \,C\,\overline{\tau }_{0}\) is provable. We also have that \(\Theta \,\vdash ^\mathtt{sats }\,Q\leadsto \,\mathbb S _{0}\), where \(Q\,=\,S\,[\,\overline{\alpha }\,\mapsto \,\overline{\beta }\,]\,P_{0}\), and thus, by the induction hypothesis, we have that (1) \(\Theta \,\Vdash \,S^{\prime }\,Q\) holds for all \(S^{\prime }\in \mathbb S _{0}\). Also, since \(\Theta ,\, P_{0}\Vdash \,C\,\overline{\tau }_{0}\) is provable, we have, by rule Subst, that (2) \(\Theta ,\,S_{0}\,P_{0}\Vdash \,S_{0}\,C\,\overline{\tau }_{0}\), where \(S_{0}\,=\,S^{\prime }\,S\,[\,\overline{\alpha }\,\mapsto \,\overline{\beta }\,]\). From (1) and (2) we have, by rule Trans, that \(\Theta \,\Vdash \,S_{0}\,C\,\overline{\tau }_{0}\) is provable. Since \(S\,\overline{\tau }\,=\,S\,[\,\overline{\alpha }\,\mapsto \,\overline{\beta }\,]\,\overline{\tau }_{0}\), this means that \(\Theta \,\Vdash \,S^{\prime }\,S\,C\,\overline{\tau }\) is provable. \(\square \)

Theorem 2

(Completeness of \(\vdash ^\mathtt{sats }\)) If \(\Theta \Vdash S\,P\) then there exist \(S^{\prime }\in \mathbb S \) and \(S^{\prime \prime }\) such that \(S^{\prime \prime }\,S^{\prime }\,P = S\,P\), where \(\Theta \vdash ^\mathtt{sats }P \leadsto \mathbb S \).

Proof

Induction over \(S\,P\) in \(\Theta \,\Vdash \,S\,P\).

4.1 Termination

The algorithm presented in Fig. 3 is modified in this section in order to ensure termination on any given input. The basic idea is to associate a value to each constraint head of the set of constraint axioms that is unified with some constraint in the recursive process of computing satisfiability, and require that the value associated to a constraint head always decreases in a new unification that occurs during this process. Computation stops if this requirement is not fulfilled, with no satisfying substitution found for the original set of constraints. Values in this decreasing chain are a measure of the size of types in constraints that unify with each constraint head axiom: the size of each constraint in this chain is decreasing or there exists a position of a type argument in the constraint such that the type’s size is decreasing.

Let the constraint value \(\eta (\pi )\) of a constraint \(\pi \), which gives the number of occurrences of type variables and type constructors in \(\pi \), be defined as follows:
$$\begin{aligned} \eta (C\, \tau _1 \cdots \tau _n)&= \sum \limits _{i=1}^n \eta (\tau _i)\\ \eta (T)&= 1\\ \eta (\alpha )&= 1\\ \eta (\tau _1\, \tau _2)&= \eta (\tau _1) + \eta (\tau _2) \end{aligned}$$
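The definition of \(\eta \) translates directly into a recursive function. In the sketch below, the data type `Type` and the representation of a constraint by its list of argument types are assumptions made for illustration.

```haskell
-- Types: variables, constructors and applications.
data Type = TVar String | TCon String | TApp Type Type

-- eta of a type: number of occurrences of type variables and constructors.
eta :: Type -> Int
eta (TVar _)   = 1
eta (TCon _)   = 1
eta (TApp t u) = eta t + eta u

-- eta of a constraint C t1 ... tn, given its argument types;
-- the class name C itself is not counted.
etaC :: [Type] -> Int
etaC = sum . map eta
```

For instance, a list type such as [[I]] is the constructor [] applied to [] applied to I, so its value is 3.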
A finite constraint-head-value function \(\Phi \) is used to map constraint heads \(\pi _0\) of \(\Theta \) to pairs \((I,\Pi )\), as follows.

The first component \(I\) is a tuple \((v_0,...,v_n)\), where \(v_0\) is the least \(\eta (S\pi ^{\prime })\) of all constraints \(\pi ^{\prime }\) that have unified with \(\pi _0\) during the satisfiability test for \(\pi \), where \(S={ mgu}(\pi _0^{\prime },\pi ^{\prime })\). Each \(v_i\), \(1\le i \le n\), is the least \(\eta (\tau _i)\) where \(\tau _i\) is a type belonging to some \(S\pi ^{\prime }\) that has unified with \(\pi _0\).

We let \(I.v_i\) denote the \(i\)-th value of \(I\) and, similarly, \(\Phi (\pi _0).I\) and \(\Phi (\pi _0).\Pi \) denote respectively the first and second components of \(\Phi (\pi _0)\).

The second component \(\Pi \) of \(\Phi (\pi _0)\) contains constraints \(\pi ^{\prime }\) that unify with \(\pi _0\) and have constraint values equal to \(v_0\). This allows distinct constraints with equal constraint values to unify with \(\pi _0\) (cf. Example 6 below).

Consider a recursive step in a test of satisfiability where a constraint \(\pi \) unifies with a constraint head \(\pi _0 = C\,\tau _1\,\ldots \,\tau _n\), with \(S={ mgu}(\pi _0,\pi )\). Let \(\Phi (\pi _0)=((v_0,...,v_n),\Pi )\) and \(\eta (S\pi )=n_0\). \(\Phi (\pi _0)\) is then updated as follows. If \(n_0 < v_0\) then only the value \(v_0\) is updated, to \(n_0\). In the case that \(n_0 = v_0\) and \(S\pi \not \in \Pi \), \(\Phi (\pi _0)\) is updated to \(((v_0,...,v_n),\Pi \cup \{S\pi \})\), i.e. we include \(S\pi \) in the set of constraints that have the same value \(v_0\). Finally, if \(n_0 > v_0\), we set \(v_0\) to \(-1\) and for each \(\tau _i\) such that \(\eta (\tau _i) \ge v_i\), we update \(v_i\) with \(-1\), otherwise \(v_i\) is updated with \(\eta (\tau _i)\). In subsequent steps for constraints \(\pi ^{\prime }\) that unify with \(\pi _0\), with \(S^{\prime }\) as a unifying substitution, it is required that \(\eta (S^{\prime }\tau _i) < v_i\) for some \(i\); if there is no such \(i\), a failure of the termination criterion is detected.

Let \(f[x\mapsto y]\) denote the usual function updating notation for \(f^{\prime }\) given by \(f^{\prime }(x^{\prime }) = y\) if \(x^{\prime }=x\), otherwise \(f(x^{\prime })\).

We define \(\Phi [\pi _0,\pi ]\) as updating of \(\Phi (\pi _0) = (I,\Pi )\) as follows, where \(I = (v_0, v_1,\ldots , v_n)\), \(\pi = C\,\tau _1\,\cdots \,\tau _n\), \(n_0 = \eta (\pi )\):
$$\begin{aligned} \Phi [\pi _0,\pi ] = \left\{ \begin{array}{ll} \Phi [\pi _0\mapsto ((n_0,v_1,\ldots ,v_n),\Pi )] &{} \text{if } n_0 < I.v_0 \\ \Phi [\pi _0\mapsto (I,\Pi \,\cup \,\{\pi \})] &{} \text{if } n_0 = I.v_0 \text{ and } \pi \not \in \Pi \\ \Phi [\pi _0\mapsto (I^{\prime },\Pi )] &{} \text{if } n_0 > I.v_0 \text{ and } \exists \,i.\,(I^{\prime }.v_i \ne -1) \\ \textit{Fail} &{} \text{otherwise} \end{array}\right. \end{aligned}$$
where, for \(i=0,\ldots ,n\),
$$\begin{aligned} I^{\prime }.v_i = \left\{ \begin{array}{ll} -1 &{} \text{if } i = 0 \text{ or } I.v_i \le \eta (\tau _i)\\ \eta (\tau _i) &{} \text{otherwise} \end{array}\right. \end{aligned}$$
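The update of a single entry \(\Phi (\pi _0)\) can be sketched as follows. The sketch follows the prose description above, in which a position is set to \(-1\) when \(\eta (\tau _i) \ge v_i\). The record `Entry`, the use of `maxBound` for \(\infty \), and passing the argument sizes \(\eta (\tau _1),\ldots ,\eta (\tau _n)\) as a list are assumptions of this sketch; `Nothing` stands for Fail.

```haskell
-- Phi(pi0) = (I, Pi): vals holds I = [v0, v1, ..., vn]; same holds Pi,
-- the constraints that unified with pi0 and have value v0.
data Entry c = Entry { vals :: [Int], same :: [c] } deriving (Eq, Show)

-- A large enough integer constant, playing the role of infinity.
infinity :: Int
infinity = maxBound

-- Initial entry for a constraint head with n type arguments.
initial :: Int -> Entry c
initial n = Entry (replicate (n + 1) infinity) []

-- update e c sizes: c is the unified constraint and sizes its argument
-- values [eta t1, ..., eta tn]. Nothing represents Fail.
update :: Eq c => Entry c -> c -> [Int] -> Maybe (Entry c)
update (Entry (v0 : vs) ps) c sizes
  | n0 < v0                    = Just (Entry (n0 : vs) ps)
  | n0 == v0 && c `notElem` ps = Just (Entry (v0 : vs) (c : ps))
  | n0 > v0 && any (/= -1) vs' = Just (Entry (-1 : vs') ps)
  | otherwise                  = Nothing
  where
    n0  = sum sizes
    -- A position survives only if its size strictly decreased.
    vs' = zipWith (\v s -> if v <= s then -1 else s) vs sizes
update (Entry [] _) _ _ = Nothing
```

Feeding this function the successive argument sizes [3, 4], [3, 6] and [3, 8] for a two-argument head gives values (7, \(\infty \), \(\infty \)), then (-1, 3, 6), and then Fail, since neither position decreases in the third step.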
The computable function (tsat) for constraint satisfiability, defined in Fig. 4, uses judgements of the form \(\Theta ,\Phi \vdash ^\mathtt{tsat }P \leadsto \mathbb S \), with constraint-head-value function \(\Phi \) as additional parameter.

Fig. 4 Terminating constraint set satisfiability

The set of satisfying substitutions for constraint set \(P\) with respect to the set of constraint axioms \(\Theta \) is given by \(\mathbb S \), such that \(\Theta ,\Phi _0 \vdash ^\mathtt{tsat }P \leadsto \mathbb S \) holds, where \(\Phi _0(\pi _0) = (I_{0},\emptyset )\) for each constraint head \(\pi _0\,=\,C\,\tau _1\,...\,\tau _n\) in \(\Theta \) and \(I_0\) is a tuple formed by \(n + 1\) occurrences of a large enough integer constant, represented by \(\infty \).

Consider the following.

Example 3

Consider computing satisfiability of \(\pi {\,=\,}Eq \mathtt{{[[}}\mathtt{I}\mathtt{{]]}}\) in \(\Theta = \{ Eq \mathtt{I},\, \forall a.\, Eq\, a \Rightarrow Eq \mathtt{{[}}a\mathtt{{]}} \}\), letting \(\pi _0 = Eq \mathtt{{[}}a\mathtt{{]}}\); we have:
$$\begin{aligned} \frac{ \begin{array}{l} \Delta _0 = { sats}(\pi ,\Theta ) = \{ \bigl ( S |_\emptyset , \{ Eq\mathtt{{[}}\mathtt{I}\mathtt{{]}} \}, \pi _0\bigr ) \}\\ S = [a_1\mapsto \mathtt{{[}}\mathtt{I}\mathtt{{]}}]\\ \mathbb S _0 = \{ S_1\circ { id}\mid \, S_1 \in \mathbb S _1,\,\,\, \Theta ,\Phi _1 \vdash ^\mathtt{tsat }Eq\mathtt{{[}}\mathtt{I}\mathtt{{]}} \leadsto \mathbb S _1\} \end{array}}{\Theta ,\Phi _0 \vdash ^\mathtt{tsat }\pi \leadsto \mathbb S _0} \end{aligned}$$
where \(\Phi _1 = \Phi _0[\pi _0,\pi ]\), \(\Phi _1(\pi _0).I = (\eta (\pi ) = 3, \infty )\), \(S\pi =\pi \) and \(a_1\) is a fresh type variable; then:
$$\begin{aligned} \frac{ \begin{array}{l} \Delta _1 = { sats}( Eq\mathtt{{[}}\mathtt{I}\mathtt{{]}},\Theta ) = \{\bigl (S^{\prime }|_\emptyset , \{ Eq \,\mathtt{I}\}, \pi _0\bigr )\}\\ S^{\prime } = [a_2\mapsto \mathtt{I}]\\ \mathbb S _1 = \{ S_2\circ { id}\mid \, S_2 \in \mathbb S _2,\,\,\, \Theta ,\Phi _2 \vdash ^\mathtt{tsat }Eq\,\mathtt{I}\leadsto \mathbb S _2\} \end{array}}{\Theta ,\Phi _1 \vdash ^\mathtt{tsat }Eq \mathtt{{[}}\mathtt{I}\mathtt{{]}} \leadsto \mathbb S _1} \end{aligned}$$
where \(\Phi _2 = \Phi _1[\pi _0, Eq\mathtt{{[}}\mathtt{I}\mathtt{{]}}]\) and \(\eta (Eq \mathtt{{[}}\mathtt{I}\mathtt{{]}}) = 2\) is less than \(\Phi _1(\pi _0).I.v_0 = 3\); then:
$$\begin{aligned} \frac{\begin{array}{l} \Delta _2 = { sats}(Eq\,\mathtt{I},\Theta ) = \{ \bigl ( { id}, \emptyset , Eq\,\mathtt{I}\bigr ) \}\\ \mathbb S _2 = \{ S_3\circ { id}\mid \, S_3 \in \mathbb S _3,\,\,\, \Theta ,\Phi _3 \vdash ^\mathtt{tsat }\emptyset \leadsto \mathbb S _3 = \{ { id}\}\} \end{array}}{\Theta ,\Phi _2 \vdash ^\mathtt{tsat }Eq \,\mathtt{I}\leadsto \mathbb S _2} \end{aligned}$$
where \(\Phi _3 = \Phi _2[Eq\,\mathtt{I}, Eq \,\mathtt{I}]\), \(\mathbb S _3 = \{ { id}\}\) by (\(\mathtt{{SEmpty}}_1\)).
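The values of \(\eta \) quoted in this example can be reproduced with a small Haskell sketch. It assumes, as the worked values in the examples suggest, that \(\eta \) counts the occurrences of type variables and type constructors in the arguments of a constraint. The names `Type`, `Constraint`, `etaT` and `eta` are hypothetical and are not taken from the authors' implementation.

```haskell
-- Simple types: type variables and applications of type constructors.
data Type = TVar String | TCon String [Type]
  deriving (Eq, Show)

-- A constraint is a class name applied to type arguments.
data Constraint = Constraint String [Type]
  deriving (Eq, Show)

-- Assumed measure on types: one unit per variable or constructor occurrence.
etaT :: Type -> Int
etaT (TVar _)    = 1
etaT (TCon _ ts) = 1 + sum (map etaT ts)

-- Assumed measure on constraints: sum over the arguments
-- (the class name itself is not counted).
eta :: Constraint -> Int
eta (Constraint _ ts) = sum (map etaT ts)

-- The type constant I and the list type constructor used in Example 3.
int :: Type
int = TCon "I" []

list :: Type -> Type
list t = TCon "[]" [t]
```

Under this assumption, `eta (Constraint "Eq" [list (list int)])` evaluates to 3 and `eta (Constraint "Eq" [list int])` evaluates to 2, which agrees with the values used in the derivation above.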

Example 4

Consider again Example 2: we want to obtain the set of satisfying substitutions for constraint \(\pi =C\,a\,(T\,a)\), given \(\Theta = \{ \forall a,b.\,C\, a\, b \Rightarrow C\,(T^2\, a)\, b \}\) (computation with input \(\pi \) by the function in Fig. 3 does not terminate). We have, where \(\pi _0 = C\,(T^2\, a)\,b\):
$$\begin{aligned} \displaystyle \frac{ \begin{array}{l} \Delta _0 = { sats}(\pi ,\Theta ) = \{ \bigl ( S\,|_{\{a\}}, \{ \pi _1 \}, \pi _0 \bigr ) \}\\ S = [a{\mapsto } T^2\,a_1, b_1\mapsto T^3\,a_1]\\ \pi _1 = C\,a_1\,(T^3\,a_1)\\ \mathbb S _0 {\,=\,} \{ S_1\circ [a{\mapsto } T^2\,a_1] \mid S_1 \in \mathbb S _1,\, \Theta ,\Phi _1 \vdash ^\mathtt{tsat }\pi _1 \leadsto \mathbb S _1\} \end{array}}{\Theta ,\Phi _0 \vdash ^\mathtt{tsat }\pi \leadsto \mathbb S _0} \end{aligned}$$
where \(\Phi _1 = \Phi _0 [\pi _0, S\pi ]\), \(\eta (S\pi ) \!=\! \eta (C\,(T^2\,a_1)\,(T^3\,a_1))\! = \!7 < \Phi _0(\pi _0).I.v_0 = \infty \); then:
$$\begin{aligned} \displaystyle \frac{ \begin{array}{l} \Delta _1 = { sats}(\pi _1,\Theta ) = \{ \bigl ( S^{\prime }\,|_{\{ a_1\}}, \, \{ \pi _2 \}, \pi _0 \bigr ) \}\\ S^{\prime } = [a_1\mapsto T^2\,a_2, b_2\mapsto T^3\,a_1 = T^5 a_2]\\ \pi _2 = C\,a_2\,(T^5\,a_2)\\ \mathbb S _1 {\,=\,} \{ S_2{\circ } [a_1{\mapsto } T^2\,a_2] \mid S_2 \in \mathbb S _2,\, \Theta ,\Phi _2 \vdash ^\mathtt{tsat }\pi _2 \leadsto \mathbb S _2\} \end{array}}{\Theta ,\Phi _1 \vdash ^\mathtt{tsat }\pi _1 \leadsto \mathbb S _1} \end{aligned}$$
where \(\Phi _2 = \Phi _1 [\pi _0, S^{\prime }\pi _1]\), \(S^{\prime }\pi _1 = C\,(T^2\,a_2)\,(T^5\,a_2)\) and, since \(\eta (S^{\prime }\pi _1) = 9 > \Phi _1(\pi _0).I.v_0 = 7\), we have that \(\Phi _2(\pi _0).I = (-1,\eta (T^2\,a_2)=3,\eta (T^5\,a_2)=6)\); then:
$$\begin{aligned} \displaystyle \frac{ \begin{array}{l} \Delta _2 = { sats}(\pi _2,\Theta ) = \{ \bigl ( S^{\prime \prime }\,|_{\{ a_2\}}, \, \{ \pi _3 \}, \pi _0 \bigr ) \}\\ S^{\prime \prime } = [a_2\mapsto T^2\,a_3, b_3\mapsto T^5\,a_2 = T^7 a_3]\\ \pi _3 = C\,a_3\,(T^7\,a_3)\\ \mathbb S _2 {\,=\,} \{ S_3{\circ } [a_2{\mapsto } T^2\,a_3] \mid S_3 \in \mathbb S _3,\, \Theta ,\Phi _3 \vdash ^\mathtt{tsat }\pi _3 \leadsto \mathbb S _3\} \end{array}}{\Theta ,\Phi _2 \vdash ^\mathtt{tsat }\pi _2 \leadsto \mathbb S _2} \end{aligned}$$
where \(\Phi _3 = \Phi _2 [\pi _0, S^{\prime \prime }\pi _2] = \textit{Fail}\), because \(\eta (S^{\prime \prime }\pi _2) = \eta (C\,(T^2\,a_3)\,(T^7\,a_3)) = 11 > \Phi _2(\pi _0).I.v_0 = 9\) and there is no \(i\) such that \(\Phi _3(\pi _0).I.v_i \ne -1\), meaning that no parameter of \(S^{\prime \prime }\pi _2\) has a decreasing \(\eta \) value.
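The way the tuple \(\Phi (\pi _0).I\) evolves in this example can be sketched in Haskell. The update rule below is only a reconstruction from the worked examples, not the formal definition of \(\Phi [\pi _0,\pi ]\) given in Sect. 4.1, which remains authoritative. It also omits the \(\Pi \) component, which matters when \(\eta (\pi )\) equals \(v_0\) (as in Example 6). The names `Bound`, `Bounds`, `initial`, `update` and `runUpdates` are hypothetical.

```haskell
import Control.Monad (foldM)

-- A bound is a finite number or infinity; the derived Ord instance
-- places every finite value below Inf.
data Bound = Fin Int | Inf
  deriving (Eq, Ord, Show)

-- The I component of an entry: (v0, [v1 .. vn]) for an n-parameter class.
type Bounds = (Bound, [Bound])

-- Assumed initial entry: every bound is infinite.
initial :: Int -> Bounds
initial n = (Inf, replicate n Inf)

-- One update, given the eta value of each argument of the new constraint.
-- Assumed rule: if the total decreases below v0, record it; otherwise set
-- v0 to -1 and require that some argument decreases below its own bound.
update :: [Int] -> Bounds -> Maybe Bounds
update etas (v0, vs)
  | Fin total < v0        = Just (Fin total, vs)
  | any (/= Fin (-1)) vs' = Just (Fin (-1), vs')
  | otherwise             = Nothing   -- corresponds to Fail
  where
    total      = sum etas
    vs'        = zipWith shrink etas vs
    shrink k v = if Fin k < v then Fin k else Fin (-1)

-- Apply a sequence of updates, stopping at the first failure.
runUpdates :: Int -> [[Int]] -> Maybe Bounds
runUpdates n = foldM (flip update) (initial n)
```

With the argument values of the first two steps of this example, `runUpdates 2 [[3,4],[3,6]]` gives `Just (Fin (-1),[Fin 3,Fin 6])`, the tuple \((-1,3,6)\) stated above. A further update whose argument values are not below 3 and 6 respectively, such as `[3,8]`, gives `Nothing`, which corresponds to Fail.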

The following example shows a satisfiable constraint whose satisfiability test involves constraints \(\pi ^{\prime }\) that unify with a constraint head \(\pi _0\) and for which \(\eta (\pi ^{\prime })\) is greater than the upper bound associated with \(\pi _0\).

Example 5

Consider satisfiability of \(\pi =C\,\mathtt{I}\,(T^3\,\mathtt{I})\) in \(\Theta = \{ C\,(T\,a)\,\mathtt{I}, \forall a,b.\,C\,(T^2\, a)\,b \Rightarrow C\,a\,(T\,b)\}\). We have, where \(\pi _0 = C\,a\,(T\,b)\):
$$\begin{aligned} \displaystyle \frac{ \begin{array}{l} \Delta _0 = { sats}(\pi ,\Theta ) = \{ \bigl ( S\,|_\emptyset , \{ \pi _1 \}, \pi _0 \bigr ) \}\\ S = [a_1\mapsto \mathtt{I}, b_1\mapsto T^2\,\mathtt{I}]\\ \pi _1 = C\,(T^2\,\mathtt{I})\,(T^2\,\mathtt{I})\\ \mathbb S _0 = \{ S_1\circ { id}\mid S_1 \in \mathbb S _1,\,\,\, \Theta ,\Phi _1 \vdash ^\mathtt{tsat }\pi _1 \leadsto \mathbb S _1\} \end{array}}{\Theta ,\Phi _0 \vdash ^\mathtt{tsat }\pi \leadsto \mathbb S _0} \end{aligned}$$
where \(\Phi _1 = \Phi _0 [\pi _0, \pi ]\), \(\eta (\pi ) = 5 < \Phi _0(\pi _0).I.v_0 = \infty \), \(S\pi =\pi \); then:
$$\begin{aligned} \displaystyle \frac{ \begin{array}{l} \Delta _1 = { sats}(\pi _1,\Theta ) = \{ \bigl ( S^{\prime }\,|_\emptyset , \{\pi _2\}, \pi _0 \bigr ) \}\\ S^{\prime } = [a_2\mapsto T^2\,\mathtt{I}, b_2\mapsto T\,\mathtt{I}]\\ \pi _2 = C\,(T^4\,\mathtt{I})\,(T\,\mathtt{I})\\ \mathbb S _1 = \{ S_2\circ { id}\mid S_2 \in \mathbb S _2,\,\, \Theta ,\Phi _2 \vdash ^\mathtt{tsat }\pi _2 \leadsto \mathbb S _2\} \end{array}}{\Theta ,\Phi _1 \vdash ^\mathtt{tsat }\pi _1 \leadsto \mathbb S _1} \end{aligned}$$
where \(\Phi _2 = \Phi _1 [\pi _0, \pi _1]\), \(\Phi _1(\pi _0).I = (5,\infty ,\infty )\), \(S^{\prime }\pi _1 = \pi _1\). Since \(\eta (\pi _1) = 6 > 5 = \Phi _1(\pi _0).I.v_0\), we have that \(\Phi _2(\pi _0).I\) becomes equal to \((-1,3,3)\).

Then, consider that \(\pi _2=C\,\tau _1\,\tau _2\) where \(\tau _1 = T^4\,\mathtt{I}\) and \(\tau _2 = T\,\mathtt{I}\). Since \(\eta (\pi _2) > \Phi _2(\pi _0).I.v_0 = -1\), there must exist \(i\), \(1\le i \le 2\), such that \(\eta (\tau _i) < \Phi _2(\pi _0).I.v_i\); this condition is satisfied for \(i = 2\), and \(\Phi _2(\pi _0).I\) is updated to \((-1,-1,2)\). Satisfiability is then finally tested for \(\pi _3 = C\,(T^6\,\mathtt{I})\,\mathtt{I}\), which unifies with the head of the other axiom, \(C\,(T\,a)\,\mathtt{I}\), and this test returns \(\mathbb S _3 = \{ [a_3\mapsto T^5\,\mathtt{I}]|_\emptyset \} = \{ { id}\}\). Constraint \(\pi \) is thus satisfiable, with \(\mathbb S _0 = \{{ id}\}\).

The following example illustrates the use of a set of constraints as a component of the constraint-head-value function.

Example 6

Let \(\pi = C\,(T^2\,\mathtt{I})\,\mathtt{F}\), \(\pi _0 = C\,(T\,a)\,b\), \(\Theta = \{ C\,\mathtt{I}\,(T^2\,\mathtt{F}), \forall a,b.\,C\,a\,(T\,b) \Rightarrow C\,(T\,a)\,b\}\):
$$\begin{aligned} \displaystyle \frac{ \begin{array}{l} \Delta _0 = { sats}(\pi ,\Theta ) = \{ \bigl ( S\,|_\emptyset , \{ \pi _1 \}, \pi _0 \bigr ) \}\\ S = [a_1\mapsto (T\,\mathtt{I}), b_1 \mapsto \mathtt{F}],\,\, \pi _1 = C\,(T\,\mathtt{I})\,(T\,\mathtt{F})\\ \mathbb S _0 = \{ S_1\circ { id}\mid \, S_1 \in \mathbb S _1,\,\,\, \Theta ,\Phi _1 \vdash ^\mathtt{tsat }\pi _1 \leadsto \mathbb S _1\} \end{array}}{\Theta ,\Phi _0 \vdash ^\mathtt{tsat }\pi \leadsto \mathbb S _0} \end{aligned}$$
where \(\Phi _1 = \Phi _0[\pi _0,\pi ]\), \(S\pi =\pi \); then:
$$\begin{aligned} \displaystyle \frac{ \begin{array}{l} \Delta _1 = { sats}(\pi _1,\Theta ) = \{ \bigl ( S^{\prime }\,|_\emptyset , \{ \pi _2 \}, \pi _0 \bigr ) \}\\ S^{\prime } = [a_2\mapsto {\mathtt{I}}, b_2 \mapsto T\,\mathtt{F}], \,\, \pi _2 = C\,\mathtt{I}\, (T^2\,\mathtt{F})\\ \mathbb S _1 = \{ S_2\circ { id}\mid \, S_2 \in \mathbb S _2,\,\,\, \Theta ,\Phi _2 \vdash ^\mathtt{tsat }\pi _2 \leadsto \mathbb S _2\} \end{array}}{\Theta ,\Phi _1 \vdash ^\mathtt{tsat }\pi _1 \leadsto \mathbb S _1} \end{aligned}$$
where \(\Phi _2 = \Phi _1[\pi _0,\pi _1]\), \(\eta (\pi _1) = 4 = \Phi _1(\pi _0).I.v_0 = \eta (\pi )\), \(S^{\prime }\pi _1=\pi _1\) and \(\pi _1\) is not in \(\Phi _1(\pi _0).\Pi _1 = \emptyset \). We have that \(\mathbb S _2 = \{ { id}\}\), because \({ sats}(C\,\mathtt{I}\,(T^2\,\mathtt{F}),\Theta ) = \{ ({ id}, \emptyset , C\,\mathtt{I}\,(T^2\,\mathtt{F}))\}\), and \(\pi \) is then satisfiable.

Since satisfiability of type class constraints is in general undecidable [6], there exist instances of this problem for which our algorithm incorrectly reports unsatisfiability. An example that exhibits this incorrect behavior is shown below; it is constructed by encoding a solvable Post correspondence problem (PCP) instance by means of constraint set satisfiability, using G. Smith’s scheme [6]. For all examples mentioned in the literature [15, 17] and numerous tests that include those used by GHC involving pertinent GHC extensions, the algorithm works as expected, without the need for any compilation flag.

Example 7

This example uses a PCP instance taken from [9]. A PCP instance can be defined as composed of pairs of strings, each pair having a top and a bottom string, where the goal is to select a sequence of pairs such that the two strings obtained by concatenating top and bottom strings in such pairs are identical. The example uses three pairs of strings: \(p_1 = (\text{100}, \text{1})\) (that is, pair 1 has string 100 as the top string and 1 as the bottom string), \(p_2 = (\text{0}, \text{100})\) and \(p_3 = (\text{1},\text{00})\).

This instance has a solution: using numbers to represent corresponding pairs (i.e., 1 represents pair 1 and analogously for 2 and 3), the sequence of pairs 1311322 is a solution.
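The claim that 1311322 is a solution can be checked mechanically. The following Haskell sketch concatenates the top and bottom strings selected by a sequence of 1-based pair indices and compares the results; the names `Pair`, `pairs`, `concatPairs` and `isSolution` are hypothetical helpers introduced only for this check.

```haskell
-- A PCP pair: (top string, bottom string).
type Pair = (String, String)

-- The three pairs of the instance used in this example.
pairs :: [Pair]
pairs = [("100", "1"), ("0", "100"), ("1", "00")]

-- Concatenate the top strings and the bottom strings selected by a
-- sequence of 1-based indices (indices are assumed to be in range).
concatPairs :: [Pair] -> [Int] -> (String, String)
concatPairs ps is = (concatMap (fst . pick) is, concatMap (snd . pick) is)
  where
    pick i = ps !! (i - 1)

-- A non-empty sequence is a solution if both concatenations coincide.
isSolution :: [Pair] -> [Int] -> Bool
isSolution ps is = not (null is) && top == bottom
  where
    (top, bottom) = concatPairs ps is
```

For the sequence above, `concatPairs pairs [1,3,1,1,3,2,2]` yields the string 1001100100100 for both the top and the bottom, so `isSolution pairs [1,3,1,1,3,2,2]` is `True`.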

A satisfiability problem that has a solution if and only if the PCP instance has a solution can be constructed by adapting G. Smith’s scheme to Haskell’s notation. We consider for this a two-parameter class \(C\), and a constraint context such that \(\Theta = \Theta _1 \cup \Theta _2 \cup \Theta _3\), where \(\Theta _i\) is constructed from pair \(i\), for \(i=1,2,3\):
$$\begin{aligned} \begin{array}{l} \Theta _1 = \{ \begin{array}[t]{l} C\, (1 \rightarrow 0 \rightarrow 0) \, 1, \\ \forall a,b.\, C\, a\, b \Rightarrow C\, (1 \rightarrow 0 \rightarrow 0 \rightarrow a) \, (1 \rightarrow b)\, \} \end{array} \\ \Theta _2 = \{ \begin{array}[t]{l} C\, 0\, (1 \rightarrow 0 \rightarrow 0), \\ \forall a,b.\, C\, a\, b \Rightarrow C\, (0 \rightarrow a) \, (1 \rightarrow 0 \rightarrow 0 \rightarrow b)\, \} \end{array} \\ \Theta _3 = \{ \begin{array}[t]{l} C\, 1\, (0 \rightarrow 0), \\ \forall a,b.\, C\, a\, b \Rightarrow C\, (1 \rightarrow a)\, (0 \rightarrow 0 \rightarrow b)\, \} \end{array} \end{array} \end{aligned}$$
We have that constraint \(C\,a\,a\) is satisfiable, with a solution constructed from solution 1311322 of the PCP instance. Computation by our algorithm terminates, erroneously reporting unsatisfiability. The steps of the computation are omitted. The error occurs because a constraint \(\pi _2 = C\,a_2\,(1\rightarrow a_2)\) unifies with \(\pi _{01} = C\, (1 \rightarrow 0 \rightarrow 0 \rightarrow a) \, (1 \rightarrow b)\) and \(\eta (S\pi _2)\) is greater than \(\Phi (\pi _{01}).I.v_0\), where \(S = { mgu}(\pi _2,\pi _{01})\), and there is no \(i\in \{1,2\}\) such that \(\Phi (\pi _{01}).I.v_i \ne -1\) after the update, meaning that no parameter of \(S\pi _2\) has a decreasing \(\eta \) value.

To prove that the computation of the set of satisfying substitutions for any given constraint set \(P\) by the function defined in Fig. 4 always terminates, consider that an infinite recursion might only occur if an infinite number of constraints unified with the head \(\pi _0\) of one constraint axiom in \(\Theta \), since there exist finitely many constraint axioms in \(\Theta \). This is avoided because, for any new constraint \(\pi \) that unifies with \(\pi _0\), we have, by the definition of \(\Phi [\pi _0,\pi ]\), that \(\Phi (\pi _0)\) is updated to a value distinct from the previous ones (otherwise \(\Phi [\pi _0,\pi ]\) yields Fail and computation is stopped). The conclusion follows from the fact that \(\Phi (\pi _0)\) can have only finitely many distinct values, for any \(\pi _0\). This can be seen by considering that, for any \(\pi _0\) such that \(\Phi (\pi _0) = (I,\Pi )\), the insertion of a new constraint in \(\Pi \) decreases \(k-k^{\prime }\), where \(k\) is the finite number of all possible values that can be inserted in \(\Pi \) and \(k^{\prime }\) is the cardinality of \(\Pi \). Such a decrease then causes a decrease of \(\Phi \) (since there exist only finitely many constraint heads \(\pi _0\) in \(\Theta \)). Similarly, at each step there must exist some \(i\) such that \(I.v_i\) decreases, and this can happen only a finite number of times. We conclude that computation on any given input terminates.

The proposed termination criterion is related to the Paterson Condition used in the GHC compiler (see Sect. 2). The constraint value is based on item 2 of this condition, but, instead of using it as a syntactic restriction over constraint heads and contexts in instance declarations, we use it in the definition of a finitely decreasing chain over recursively dependent constraints.

In comparison to the use of a recursion depth limit, our approach has the advantage that type-correctness is not implementation dependent (a constraint is or is not satisfiable with respect to a given set of constraint axioms). The use of a recursion depth limit can make a constraint set satisfiable in one implementation and unsatisfiable in another that uses a lower limit. Incorrectly reporting unsatisfiability can occur in both cases, but is expected to be extremely rare with our approach. We are not aware of any practical example where this occurs.

The main disadvantages of our approach are that it is not syntactically possible to characterize such incorrect unsatisfiability cases and it is not very easy for programmers to understand how type class constraints are handled in such a case, if and when it occurs. However, we expect these cases not to occur in practice.

The presented algorithm has been verified to behave correctly, without the need for any compilation flag, on all examples found in the literature [15], on all GHC test cases involving the flags FlexibleInstances, FlexibleContexts and UndecidableInstances, and on Haskell libraries that use multi-parameter type classes, including the monad transformer library [1].

5 Constraint set simplification

The process of simplification of a constraint set, also called context reduction, consists of reducing each constraint \(\pi \) in this set to the context obtained by recursively reducing the context \(P\) of the matching instance for \(\pi \) in \(\Theta \), if such matching exists, until \(P=\emptyset \) or there exists no instance in \(\Theta \) that matches with \(\pi \). In the latter case \(\pi \) reduces to itself.

This recursive process may not terminate: as a simple example, consider reduction of constraint \(C\, a\) when \(\Theta = \{ \forall a.\, C\, a \Rightarrow C\, a\}\).
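Ignoring termination, the recursive reduction process described above can be sketched in Haskell as follows. This is a naive version with no termination criterion: it is meant only to illustrate instance matching followed by recursive reduction of the instantiated context, and it loops on axioms such as \(\forall a.\, C\, a \Rightarrow C\, a\). All names (`Type`, `Constraint`, `Axiom`, `matchT`, `matchAxiom`, `reduce`) are hypothetical, and type variables of the constraint being reduced are treated as constants, in the spirit of skolemization.

```haskell
import Control.Monad (foldM)
import Data.Maybe (fromMaybe, mapMaybe)

data Type = TVar String | TCon String [Type]
  deriving (Eq, Show)

data Constraint = Constraint String [Type]
  deriving (Eq, Show)

-- An axiom "forall as. P => pi0", stored as context P and head pi0.
data Axiom = Axiom [Constraint] Constraint

type Subst = [(String, Type)]

-- One-way matching: extend s so that applying it to the pattern gives
-- the target. Variables of the target are never bound.
matchT :: Subst -> (Type, Type) -> Maybe Subst
matchT s (TVar v, t) = case lookup v s of
  Nothing -> Just ((v, t) : s)
  Just t' -> if t == t' then Just s else Nothing
matchT s (TCon c ps, TCon d ts)
  | c == d && length ps == length ts = foldM matchT s (zip ps ts)
matchT _ _ = Nothing

apply :: Subst -> Type -> Type
apply s (TVar v)    = fromMaybe (TVar v) (lookup v s)
apply s (TCon c ts) = TCon c (map (apply s) ts)

-- Instantiated context of the first axiom whose head matches, if any.
matchAxiom :: [Axiom] -> Constraint -> Maybe [Constraint]
matchAxiom axioms (Constraint cls ts) =
  case mapMaybe try axioms of
    (p : _) -> Just p
    []      -> Nothing
  where
    try (Axiom ctx (Constraint cls' ps))
      | cls == cls' && length ps == length ts = do
          s <- foldM matchT [] (zip ps ts)
          Just [Constraint c (map (apply s) as) | Constraint c as <- ctx]
      | otherwise = Nothing

-- Naive context reduction: a constraint with no matching instance
-- reduces to itself; otherwise the instantiated context is reduced.
reduce :: [Axiom] -> Constraint -> [Constraint]
reduce axioms c = case matchAxiom axioms c of
  Nothing  -> [c]
  Just ctx -> concatMap (reduce axioms) ctx
```

For instance, with axioms representing \(Eq\,\mathtt{I}\) and \(\forall a.\, Eq\, a \Rightarrow Eq \mathtt{{[}}a\mathtt{{]}}\), this sketch reduces \(Eq \mathtt{{[[}}\mathtt{I}\mathtt{{]]}}\) to the empty set and \(Eq \mathtt{{[[}}b\mathtt{{]]}}\) to \(\{Eq\, b\}\).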

This section presents a computable function for constraint set simplification, where computation is guaranteed to terminate by using the same criterion used in Sect. 4.1.

Constraint set simplification is essentially based on instance matching. We use function \({ matches}(\pi ,\Theta )\), defined below, in order to capture the relevant information of matching constraint axioms in \(\Theta \) with a given constraint \(\pi \). Function \({ matches}\) is defined by using function \({ sats}\) (Sect. 4), through skolemization of type variables that occur in the given constraint argument (Skolem variables are non-unifiable variables, that is, constants):
$$\begin{aligned} { matches}(\pi ,\Theta )&= \Big \{ (S\,P, \pi ^{\prime }) \mid \Delta ={ sats}\bigl ([\overline{\alpha } \mapsto \overline{K}]\pi , \Theta \bigr ),\\&\qquad \quad (S, S\,P, \pi ^{\prime }) \in \Delta , \,\overline{\alpha } = { tv}(\pi ), \\&\qquad \quad \overline{K} \text{ are } \text{ fresh } \text{ Skolem } \text{ variables } \Big \} \end{aligned}$$
Function \({ matches}(\pi ,\Theta )\) returns either a singleton or an empty set (see footnote 2).
Constraint set simplification uses a function defined in Fig. 5 by means of judgements of the form \(\Theta , \Phi \vdash ^\mathtt{simp }P \leadsto Q\). This means that reduction of constraint set \(P\) under constraint axioms \(\Theta \) either gives constraint set \(Q\) as a result or fails. Failure is caused by the criterion used for ensuring termination, explained in Sect. 4.1. Using this function, context reduction is defined as follows, where \(\Phi _0\) is as defined in Sect. 4.1:
$$\begin{aligned} \frac{{\text{ for } i = 1, \ldots , n, \, Q_i = \left\{ \begin{array}{ll} \pi _i &{} \text{ if } \Theta , \Phi _0 \vdash ^\mathtt{simp }\pi _i \leadsto { fail} \\ Q_i^{\prime } &{} \text{ if } \Theta , \Phi _0 \vdash ^\mathtt{simp }\pi _i \leadsto Q_i^{\prime } \end{array} \right. }}{\Theta \vdash ^\mathtt{simp _0}\{ \pi _1, \ldots \pi _n \} \leadsto Q_1, \ldots , Q_n} {\mathtt{R}_0} \end{aligned}$$
Fig. 5 Constraint set simplification

The rules of Fig. 5 are analogous to the ones in Fig. 4, but now termination enforced by the termination criterion is reported as a failure, which must be propagated backwards along the recursive calls of the computation. Thus, reduction of a constraint \(\pi \) is now defined by two rules, (\(\mathtt{{RInst}}_1\)) and (\(\mathtt{{RInst}}_2\)) and, analogously, two different rules are used for specifying reduction of a non-singleton set of constraints.

Rule (REmpty) specifies that an empty set of constraints reduces to itself. Rule (RStop) specifies that a constraint \(\pi \) cannot be reduced if there is no instance in \(\Theta \) that matches with \(\pi \). Rule (RFail) enforces termination, expressing that reduction cannot be performed since updating of \(\Phi \) fails.

The process of constraint set simplification is illustrated by the following example.

Example 8

Let \(\Theta = \{\forall a.\, C\,(T\, a) \Rightarrow C\,a,\,D\,\mathtt{I}\}\) and \(P=\{D\,\mathtt{I},\,C\,a\}\). According to rule (\(\mathtt{R}_0\)), reduction of \(P\) amounts to independently reducing constraints \(D\,\mathtt{I}\) and \(C\,a\).

Reduction of \(D\,\mathtt{I}\) is defined by rule (\(\mathtt{RInst}_1\)):
$$\begin{aligned} \frac{{\begin{array}{l} \{ (\emptyset , D\,\mathtt{I}) \} = { matches}(D\,\mathtt{I},\Theta ) \\ \Theta ,\Phi _0[D\,\mathtt{I}, D\mathtt{I}] \vdash ^\mathtt{simp }\emptyset \leadsto \emptyset \end{array} }}{{\Theta ,\Phi _0 \vdash ^\mathtt{simp }D\,\mathtt{I}\leadsto \emptyset }} \end{aligned}$$
Reduction of \(\pi =\pi _0=C\,a\) results in failure, as shown below:
$$\begin{aligned} \frac{{\begin{array}{l} \{ (C\,(T\,a_1), \pi _0) \} = { matches}(\pi ,\Theta ) \\ \Theta ,\Phi _1 \vdash ^\mathtt{simp }(C\,(T\,a_1)) \leadsto { fail} \end{array} }}{\Theta ,\Phi _0 \vdash ^\mathtt{simp }\pi \leadsto { fail } } \end{aligned}$$
where \(\Phi _1= \Phi _0[\pi ,\pi _0]\), \(\Phi _1(\pi _0).I=(\eta (\pi ) = 1,\infty )\). We have that:
$$\begin{aligned} \frac{{\begin{array}{l} \{ (C\,(T^2\,a_2), \pi _0) \} = { matches}(C\,(T\,a_1),\Theta ) \\ \Theta ,\Phi _2 \vdash ^\mathtt{simp }(C\,(T^2\,a_2)) \leadsto { fail} \end{array} }}{\Theta ,\Phi _1 \vdash ^\mathtt{simp }(C\,(T\,a_1)) \leadsto { fail}} \end{aligned}$$
where \(\Phi _2= \Phi _1[C\,(T\,a_1),\pi _0] = \textit{ fail}\) because \(\eta (C\,(T\,a_1)) \not < \Phi _1(\pi _0).I.v_1 = 1\).

By rule (\(\mathtt{R}_0\)), we have that \(\Theta \vdash ^\mathtt{simp _0}\{ D\,\mathtt{I},\,C\,a\} \leadsto \{ C\,a \}\), meaning that \(D\,\mathtt{I}\) can be removed and \(C\,a\) cannot be further reduced.
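The way rule (\(\mathtt{R}_0\)) combines the results of independent reductions can be sketched in Haskell. The sketch assumes that reduction of a single constraint is given as a function returning `Nothing` on failure and `Just` the reduced set otherwise; `simp0` and `example8` are hypothetical names, and `example8` simply hard-codes the two outcomes derived in this example instead of implementing the rules of Fig. 5.

```haskell
import Data.Maybe (fromMaybe)

-- Rule (R0), sketched: each constraint is reduced independently, and a
-- constraint whose reduction fails (Nothing) is kept unchanged.
simp0 :: (c -> Maybe [c]) -> [c] -> [c]
simp0 simp = concatMap (\c -> fromMaybe [c] (simp c))

-- Stand-in for the single-constraint reduction on the input of Example 8:
-- D I reduces to the empty set, and every other constraint (here C a)
-- is treated as a failed reduction.
example8 :: String -> Maybe [String]
example8 "D I" = Just []
example8 _     = Nothing
```

With these definitions, `simp0 example8 ["D I", "C a"]` evaluates to `["C a"]`, mirroring the result \(\{ C\,a \}\) obtained above.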

The following theorem states the correctness of the constraint simplification function defined in Fig. 5.

Theorem 3

[Correctness of \(\vdash ^\mathtt{simp }\)] If \(\Theta ,\,\Phi \vdash ^\mathtt{simp }P \leadsto Q\) holds, then \(\Theta , Q \Vdash P\) is provable and \(Q\) cannot be further simplified, i.e.,  \(\Theta ,\,\Phi \vdash ^\mathtt{simp }Q \leadsto Q\).

Proof

Induction over \(\Theta ,\,\Phi \vdash ^\mathtt{simp }P \leadsto Q\).

6 Conclusion

This paper presents a termination criterion and terminating algorithms for constraint simplification and improvement, based on the use of a value that always decreases on each recursive step in these algorithms. The termination criterion defined can be used in any form of constraint simplification and improvement algorithm during type inference.

The use of this criterion eliminates the need for imposing syntactic conditions on Haskell type class and instance declarations and the need for using a recursion stack depth limit in order to guarantee termination of type inference in the presence of multi-parameter type classes, in case these syntactic conditions are chosen by programmers not to be enforced.

Since type class constraint satisfiability is in general undecidable, there exist instances of this problem for which the algorithm presented in this paper incorrectly reports unsatisfiability. However, practical examples where this occurs are expected to be very rare. The algorithms have been implemented and used in a prototype front-end for Haskell, available at http://github.com/rodrigogribeiro/mptc. For all examples mentioned in the literature, Haskell libraries that use multi-parameter type classes and tests used by the Haskell GHC compiler, involving all pertinent GHC extensions, the algorithm works as expected without the need for any compilation flag.

In comparison to the use of a recursion depth limit, our approach has the advantage that type-correctness is not implementation dependent (a constraint is or is not satisfiable with respect to a given set of constraint axioms). The use of a recursion depth limit can make a constraint set satisfiable in one implementation and unsatisfiable in another that uses a lower limit. Incorrectly reporting unsatisfiability can occur in both cases, but is expected to be extremely rare with our approach. We are not aware of any practical example where this occurs.

The main disadvantages of our approach are that it is not syntactically possible to characterize such incorrect unsatisfiability cases and it is not very easy for programmers to understand how type class constraints are handled in such a case, if and when it occurs.

Footnotes
1. See, for example, [2] for the general theory of unification and algorithms for computing a most general unifier for a set of term equalities.

2. We do not consider overlapping instances [20], since the subject is unrelated to termination of constraint set satisfiability and simplification. Supporting overlapping instances would need a modification of function \({ matches}\) so as to select a single instance if there exist overlapping matching instances.

Declarations

Acknowledgments

We would like to thank the anonymous reviewers for their careful work, which has been very useful to improve the paper.

Authors’ Affiliations

(1) Instituto de Ciências Exatas, Departamento de Ciência da Computação, Universidade Federal de Minas Gerais
(2) Instituto de Ciências Exatas e Biológicas, Departamento de Computação, Universidade Federal de Ouro Preto

References

  1. Gill A (2006) MTL–The Monad Transformer Library. http://hackage.haskell.org/package/mtl
  2. Baader F, Snyder W (2001) Unification theory. In: Robinson J, Voronkov A (eds) Handbook of Automated Reasoning, Elsevier Science Publishers, vol. 1, pp 447–533
  3. Camarão C, Figueiredo L, Vasconcellos C (2004) Constraint-set Satisfiability for Overloading. In: Proc. of the 6th ACM SIGPLAN International Conf. on Principles and Practice of Declarative Programming (PPDP’04), pp 67–77
  4. Camarão C, Ribeiro R, Figueiredo L, Vasconcellos C (2009) A Solution to Haskell’s Multi-Parameter Type Class Dilemma. In: Proc. of the 13th Brazilian Symposium on Programming Languages (SBLP’2009), pp 5–18. http://www.dcc.ufmg.br/camarao/CT/solution-to-mptc-dilemma.pdf
  5. Hall C, Hammond K, Jones SP, Wadler P (1996) Type Classes in Haskell. ACM Trans Program Lang Syst 18(2):109–138
  6. Smith G (1991) Polymorphic type inference for languages with overloading and subtyping. Ph.D. thesis, Cornell Univ.
  7. Jones M, Diatchki I (2008) Language and Program Design for Functional Dependencies. In: ACM SIGPLAN Haskell Workshop, pp 87–98
  8. Jones SP et al. (2003) The Haskell 98 Language and Libraries: The Revised Report. J Func Prog 13(1):0–255. http://www.haskell.org/definition/
  9. Zhao L (2002) Solving and Creating Difficult Instances of Post’s Correspondence Problem. Department of Computer Science, University of Alberta, Master’s thesis
  10. Chakravarty M, Keller G, Jones SP (2005) Associated type synonyms. In: Proc. of the 10th ACM SIGPLAN International Conf. on Functional Programming (ICFP’05), pp 241–253
  11. Chakravarty M, Keller G, Jones SP, Marlow S (2005) Associated types with class. In: Proc. of the ACM Symp. on Principles of Prog. Languages (POPL’05), pp 1–13
  12. Jones M (1994) Qualified Types. Cambridge University Press, Cambridge
  13. Jones M (1995) Simplifying and Improving Qualified Types. In: Proc. of the ACM Conf. on Functional Prog. and Comp. Architecture (FPCA’95), pp 160–169
  14. Jones M (2000) Type Classes with Functional Dependencies. In: Proc. of the European Symp. on Programming (ESOP’2000). LNCS 1782
  15. Sulzmann M, Duck G, Jones SP, Stuckey P (2007) Understanding functional dependencies via constraint handling rules. J Funct Program 17(1):83–129
  16. Milner R (1978) A theory of type polymorphism in programming. J Comput Syst Sci 17:348–375
  17. Stuckey P, Sulzmann M (2005) A Theory of Overloading. ACM Trans Prog Lang Syst (TOPLAS) 27(6):1216–1269
  18. Wadler P, Blott S (1989) How to make ad-hoc polymorphism less ad-hoc. In: Proc. of the 16th ACM Symp. on Principles of Prog. Lang. (POPL’89), pp 60–76. ACM Press, New York
  19. Jones SP et al (1998) GHC–The Glasgow Haskell Compiler. http://www.haskell.org/ghc/
  20. Jones SP et al (2011) GHC–The Glasgow Haskell Compiler 7.0.4 User’s Manual. http://www.haskell.org/ghc/

Copyright

© The Brazilian Computer Society 2013