
# Ambiguity and context-dependent overloading

- Rodrigo Ribeiro^{1} (Email author)
- Carlos Camarão^{1}

**19**:103

https://doi.org/10.1007/s13173-013-0103-0

© The Brazilian Computer Society 2013

**Received:** 1 October 2012 **Accepted:** 7 February 2013 **Published:** 15 March 2013

## Abstract

This paper discusses ambiguity in the context of languages that support context-dependent overloading, such as Haskell. A type system for a Haskell-like programming language that supports context-dependent overloading and follows the Hindley-Milner approach of providing context-free type instantiation allows distinct derivations of the same type for ambiguous expressions. Such expressions are usually rejected by the type inference algorithm, which is thus not complete with respect to the type system. Also, Haskell’s open world approach considers a definition of ambiguity that does not conform to the existence of two or more distinct type system derivations for the same type. The article presents an alternative approach, where the standard definition of ambiguity is followed. A type system is presented that allows only context-dependent type instantiation, enabling only one type to be derivable for each expression in a given typing context: the type of an expression can be instantiated only if required by the program context where the expression occurs. We define a notion of greatest instance type for each occurrence of an expression, which is used in the definition of a standard dictionary-passing semantics for core Haskell based on type system derivations, for which coherence is trivial. Type soundness is obtained as a result of disallowing all ambiguous expressions and all expressions involving unsatisfiability in the use of overloaded names. Following the standard definition of ambiguity, satisfiability is tested (i.e., “the world is closed”) if and only if overloading is (or should have been) resolved, that is, if and only if there exist unreachable variables in the constraints on types of expressions. Nowadays, satisfiability is tested in Haskell, in the presence of multi-parameter type classes, only upon the presence of functional dependencies or an alternative mechanism that specifies conditions for closing the world, which may happen whether or not there exist unreachable type variables in constraints. In the approach presented here, the satisfiability trigger condition is given automatically, by the existence of unreachable variables in constraints, and does not need to be specified by programmers using an extra mechanism.

## Keywords

- Ambiguity
- Type systems
- Semantics

## 1 Introduction

Parametric polymorphism allows instantiation of quantified variables \(\alpha ,\) in quantified types \(\forall \alpha .\,\sigma ,\) to *all* types \([\alpha :=\tau ]\sigma ,\) that is, every type generated from \(\sigma \) by replacing free occurrences of type variable \(\alpha \) in \(\sigma \) by an arbitrary type \(\tau \) (the notion of free and bound variables is well known; see, e.g., [1, 2]). Type \(\tau \) is restricted in ML and Haskell to be any simple (unquantified) type, characterizing an important restriction of the so-called ML-style or Let-polymorphism. *Constrained polymorphism* restricts instantiation of quantified type variables to *some* simple types, instead of allowing it for all of them. The set of types to which a type variable can be instantiated depends on the types of definitions of overloaded names (or symbols) that exist in the relevant context.

Constrained polymorphism is supported in programming languages like Haskell by *context-dependent overloading*, which is a form of overloading in which the decision of which definition of an overloaded name is used depends on the context where this name is used. In other words, in any expression \(f\, e,\) the decision of which \(f\) is called or which \(e\) is applied depends not only on the types of \(f\) and \(e,\) but also on the context in which \(f\,e\) is used.

In such systems, ambiguity becomes a concern. The existence of ambiguous expressions prevents a coherent semantics from being defined by induction on type system derivations, where coherence establishes a single well-defined meaning for each expression. That is, a *coherent* semantics is such that, for any well-typed expression \(e\):

“if \(\Delta \) and \(\Delta ^{\prime }\) are derivations of typing formulas \(\Gamma \vdash e:\sigma \) and \(\Gamma ^{\prime }\vdash e:\sigma ,\) respectively, and \(\Gamma \) and \(\Gamma ^{\prime }\) give the same type to every \(x\) free in \(e,\) then

where the meanings are defined using \(\Delta \) and \(\Delta ^{\prime },\) respectively.” [1]

An expression \(e\) of type \(\sigma \) for which there exist two distinct syntax-driven derivations (\(\Delta \) and \(\Delta ^{\prime },\) for which distinct semantic values might be assigned to \(e\)) is called ambiguous, in a typing context that gives the same type to every \(x\) free in \(e\) as \(\Gamma \) and \(\Gamma ^{\prime }.\) The restriction to syntax-driven derivations avoids differences that are neither related to the syntax of terms nor to the used typing information.

A type system for a Haskell-like programming language that supports context-dependent overloading and follows the Hindley-Milner approach of providing context-free type instantiation allows distinct derivations of the same type for some expressions, which are then ambiguous. Such expressions are usually rejected by the type inference algorithm, which is then not complete with respect to the type system. This article addresses this issue by considering an alternative approach that disallows context-independent type instantiation for type systems that support context-dependent overloading.

Ambiguity in the presence of context-dependent overloading is discussed, in the traditional way of allowing context-independent type instantiation and by characterizing ambiguity as a syntactic property of constrained types, by Jones [3] and by Stuckey and Sulzmann [4]. The unfortunate lack of completeness of type inference algorithms and incoherence of semantics definitions are reported by Vytiniotis et al. [5]. Faxén [7] expresses wishes for a deterministic way to outlaw ambiguity in the type system, at the same time recognizing the need for the description to remain more abstract than that obtained directly from the type inference algorithm. In our view, the abstract view is provided by the fact that the type system is defined in terms of relations, not functions, and the type inference algorithm can be obtained by transforming these relations into functions.

This article presents an alternative approach to type inference in the presence of Haskell-style constrained type classes. The key feature of the type system is that it allows only context-dependent instantiation, so an expression, if typeable, has a unique type derivation. We can therefore define the semantics over these type derivations, thereby ensuring coherence. The declarativeness of the specification of the type system is, in our view, equivalent to it being given by a relation (between typing contexts, expressions and polymorphic constrained types), not as a function. In order to transform it into a type inference algorithm, it suffices to transform all relations used in its definition into functions.

The fact that the type system allows only context-dependent type instantiation eliminates the problem of the lack of principal type caused by user-specified type signatures, reported by Faxén [6, Sect. 3].

The article also presents an alternative approach for dealing with ambiguity in the context of Haskell’s open world approach. In Haskell, an expression can be considered ambiguous even when there do not exist two or more distinct derivations of the same type for this expression. Ambiguity is considered in Haskell as a syntactic condition on type expressions, conflicting with the standard semantically-related definition given above. This occurs because Haskell uses an open world approach to overloading, according to which new definitions of overloaded names might be inserted without altering the typability of expressions. This paper presents an alternative approach where there is no conflict with the standard semantic definition of ambiguity, built upon the distinction between ambiguity and resolved overloading. In this approach, the possibility of inserting new definitions (i.e., openness of the world) is restricted to cases when overloading is not resolved. When overloading is resolved, existing definitions of overloaded names are then considered, in order to check ambiguity.

With the purpose of clarifying these issues, namely that ambiguity is distinct from resolved overloading, and that ambiguity should be considered if and only if (or when and only when) overloading is resolved, we present and discuss examples in the next section which consider ambiguity in the presence of multi-parameter type classes, where ambiguity becomes more relevant. In Sect. 3 we define a mini-language called core Haskell that supports context-dependent overloading and multi-parameter type classes, and define a type system that prevents ambiguous expressions from being well-typed. In Sect. 4 we define a semantics by induction on core Haskell’s type system derivations. Section 5 concludes.

## 2 Ambiguity and overloading resolution

### *Example 1*

In Haskell+MPTC, expression \(f\,o\) is not ambiguous, because overloadings of \(f\) and \(o\) have not been resolved.

The main purpose of this example is to show that we can have, if context-independent type instantiation is allowed, two derivations of the same type (\(Float\)) with distinct semantics, for an expression that is not ambiguous: one derivation for \(f\) of type \(Int \rightarrow Float,\) and another for \(f\) of type \(Float \rightarrow Float.\)

This example also illustrates that this occurs despite the fact that \(f\,o\) has type \((F\,a\, b, O\, a)\Rightarrow b,\) where type variable \(a\) occurs in the constraints but not in the simple type (\(b\)), considering that \(f\) is a member of type class \(F\) with two parameters \(a, b,\) having (implicitly quantifiable) type \(F\, a\, b\Rightarrow a\rightarrow b,\) and \(o\) is a member of type class \(O\) with (implicitly quantifiable) type \(O\, a\Rightarrow a.\)
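The class and instance declarations that Example 1 relies on are not reproduced in this text. The sketch below gives declarations consistent with the types mentioned in the example; the instance bodies and the values chosen for `o` are assumptions made only for illustration. Since GHC follows the open world approach, annotations on `o` are used here to select each of the two derivations of type `Float` explicitly.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
module Main where

class F a b where
  f :: a -> b

class O a where
  o :: a

-- Instance bodies below are hypothetical; only the instance types
-- follow the discussion in Example 1.
instance F Int Int where
  f = (+ 1)

instance F Int Float where
  f = fromIntegral

instance F Float Float where
  f = (* 2)

instance O Int where
  o = 1

instance O Float where
  o = 2.5

-- First derivation of type Float: f :: Int -> Float, o :: Int.
viaInt :: Float
viaInt = f (o :: Int)

-- Second derivation of type Float: f :: Float -> Float, o :: Float.
viaFloat :: Float
viaFloat = f (o :: Float)

main :: IO ()
main = do
  print viaInt    -- 1.0
  print viaFloat  -- 5.0
```

The two definitions have the same type but distinct values, which is the situation that makes `f o` ambiguous when type `Float` is required.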

In Haskell, a syntactic condition on type expressions that characterizes overloading whose resolution cannot be further deferred (i.e., must have occurred), which we call the *overloading resolution condition*, also characterizes “type ambiguity”. It is a syntactic condition that conflicts with the standard definition of ambiguity, which is based on the existence of distinct type system derivations of the same type for an expression.

The overloading resolution condition used in Haskell (Haskell 98 or Haskell 2010), which supports only single parameter type classes, has been changed in Haskell implementations that support multi-parameter type classes. In the case of single parameter type classes, the overloading resolution condition for constrained type \(P\Rightarrow \tau \) is simply \(tv(P)\not \subseteq tv(\tau )\) (i.e., there is a type variable that occurs in \(P\) but not in \(\tau \)). In the case of multi-parameter type classes, the condition considers so-called reachable type variables. A type variable occurring in \(P\) is reachable, from a set of type variables \(tv(\tau )\) in a constrained type \(P\Rightarrow \tau ,\) if it occurs in \(\tau \) or if it occurs in a constraint in \(P\) where another reachable type variable occurs (if a type variable is not reachable, it is, of course, unreachable). In the example above, type variable \(a\) in \((F\,a\,b,O\, a)\Rightarrow b\) occurs in the set of constraints (\(\{ F\,a\,b, O\, a\}\)) but not in the simple type (\(b\)). This does not imply that overloading must have been resolved, because type variable \(a\) is reachable, since it occurs in constraint \(F\, a\, b,\) where another reachable type variable (\(b\)) occurs. This idea, used nowadays in Haskell implementations that support multi-parameter type classes, first appeared in [8], as far as we know.

If used in a program context that requires \(f\,o\) to be of type \(Int\) (in an expression such as, for example, \(f\,o + (1::Int)\)), overloadings of \(f\) and \(o\) in \(f\,o\) are resolved, with \(f\) and \(o\) having types \(Int \rightarrow Int\) and \(Int,\) respectively.

If used in a program context that requires \(f\,o\) to be of type \(Float,\) overloadings of \(f\) and \(o\) in \(f\,o\) cannot be resolved, and then, in the context of this example, we have ambiguity. If used in a program context that requires \(f\,o\) to be of a type \(\tau \) distinct from \(Int\) and \(Float,\) overloadings of \(f\) and \(o\) in \(f\,o\) cannot be resolved either, but we have then, in the context of this example, unsatisfiability, since we cannot have a derivation of such type \(\tau \) for \(f\,o\) in this context.

### *Example 2*

Let \(e_0\) be the expression *show* . *read* (where “.” denotes function composition, so \(e_0\) can also be written as \(\lambda x \rightarrow show\,(read\,x)\)). In Haskell, this expression is considered ambiguous, irrespective of the context in which it occurs (see, e.g., [3, 5]).

When used in a context with two or more instance definitions of classes \(Show\) and \(Read\) that give functions \(show\) and \(read\) types, say, \(show:Int\rightarrow String,\) \(show:Bool\rightarrow String,\) \(read:String\rightarrow Int\) and \(read: String\rightarrow Bool,\) expression \(e_0\) is ambiguous: there exist two distinct derivations of type \(String\rightarrow String\) for \(e_0\) (one using \(read: String\rightarrow Int\) and \(show: Int\rightarrow String,\) the other using \(read: String\rightarrow Bool\) and \(show: Bool\rightarrow String\)), and each would give it a distinct meaning.

However, if \(e_0\) is used in a context with a single instance for both *show* and *read*, say \(show: Int\rightarrow String\) and \(read: String\rightarrow Int,\) then there is no ambiguity, since there exists only one derivation for \(e_0\) of type \(String\rightarrow String.\) If there are no definitions of *show* or no definitions of *read* in the typing context, again there is an error, but of unsatisfiability, not ambiguity.
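The two derivations of type `String -> String` discussed in Example 2 can be made explicit in Haskell by annotating `read`. The sketch below uses the standard `Show` and `Read` instances for `Int` and `Bool`; the names `viaInt` and `viaBool` are introduced here only for illustration.

```haskell
module Main where

-- Derivation using read :: String -> Int and show :: Int -> String.
viaInt :: String -> String
viaInt = show . (read :: String -> Int)

-- Derivation using read :: String -> Bool and show :: Bool -> String.
viaBool :: String -> String
viaBool = show . (read :: String -> Bool)

main :: IO ()
main = do
  putStrLn (viaInt "042")    -- prints 42: the leading zero is not preserved
  putStrLn (viaBool "True")  -- prints True
  -- viaBool "042" would raise a "no parse" exception at run time, so
  -- the two functions of type String -> String have distinct meanings.
```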

### *Example 3*

Consider expression \(\mathtt{[] == [] }\), where \(\mathtt{(==) }\) is overloaded and \(\mathtt{[] }\) denotes an empty list.

The expression is ambiguous (according to the standard definition) if and only if there exist two or more distinct type system derivations, each giving \(\mathtt{(==) }\) a distinct type, say \(\mathtt{[ }T_1\mathtt{] } \rightarrow \mathtt{[ }T_1\mathtt{] }\rightarrow Bool\) and \(\mathtt{[ }T_2\mathtt{] } \rightarrow \mathtt{[ }T_2\mathtt{] } \rightarrow Bool,\) that may thus assign distinct meanings to \(\mathtt{([] == []) :: }\)*Bool* in the relevant context. We could have instances of \(\mathtt{(==) }\) for types \(\mathtt{[ }T_1\mathtt{] }\) and \(\mathtt{[ }T_2\mathtt{] }\) for which \(\mathtt{([] == []) }\) is assigned (unexpectedly) semantic value *False* if \(\mathtt{[] }\) has type \(\mathtt{[ }T_1\mathtt{] },\) and *True* if \(\mathtt{[] }\) has type \(\mathtt{[ }T_2\mathtt{] }.\) In Haskell this is not possible without overlapping instances or without hiding the Haskell prelude definition of \(\mathtt{(==) }\) for lists.

### *Example 4*

Let \(e_1\) be the expression \(fst({ True},o),\) where \(o\) is overloaded (or is an expression with a constrained type). This expression has type *Bool*. Overloading of \(o\) is not resolved, but it need not be for \(e_1\) to be well-typed (and evaluated, as equal to *True*).

Whether \(e_1\) is considered ambiguous in GHC [9] and its interactive interpreter counterpart GHCi, the most widely used implementations of Haskell, depends on which \(o\) is used and on whether the compiler, GHC, or an interactive session of the interpreter, GHCi, is used. In GHC and in non-interactive sessions of GHCi, \(e_1\) is not well-typed. In an interactive session (i.e., if it is typed at the GHCi prompt), if \(o\) has only constraints on some particular type classes (namely *Eq*, *Ord*, *Num* and *Show*), \(e_1\) has type *Bool*.

## 3 Core language

\(\overline{x}\) denotes the sequence \(x_1,\ldots ,x_n,\) where \(n\ge 0.\) When used in the context of a set, it denotes the corresponding set of elements in the sequence (\(\{ x_1,\ldots ,x_n\}\)).

We assume for simplicity that overloaded definitions are predefined, and form a global overloading environment (cf., e.g., [3, 10, 11]). The global overloading environment is always a fixed set of closed constraints, being an unchanged part of typing contexts. We write \(\Theta _\Gamma \) to mean a fixed, global overloading environment that is assumed to be a part of typing context \(\Gamma .\)

A substitution, denoted by meta-variable \(S,\) possibly primed or subscripted, is a function from type variables to simple type expressions. The identity substitution is denoted by \({ id}.\) \(S\sigma \) represents the capture-free operation of substituting \(S(\alpha )\) for each free occurrence of type variable \(\alpha \) in \(\sigma .\) \(S\theta \) and the application of \(S\) to sets of types and constraints are defined analogously. Symbol \(\circ \) denotes function composition, and \(dom(S)=\{\alpha \mid S(\alpha )\not =\alpha \}.\)

\(S[\,\overline{\alpha } \mapsto \overline{\tau }]\) denotes updating of \(S,\) that is, the substitution \(S^{\prime }\) such that \(S^{\prime }(\beta ) = \tau _i\) if \(\beta =\alpha _i,\) for \(i=1,\ldots ,n,\) otherwise \(S(\beta ).\) We use this function updating notation also for functions other than substitutions. Also, \([\overline{\alpha } \mapsto \overline{\tau }] = id[\overline{\alpha } \mapsto \overline{\tau }].\)

The restriction \(S|_V\) of \(S\) to \(V\) denotes the substitution \(S^{\prime }\) such that \(S^{\prime }(\alpha ) = S(\alpha )\) if \(\alpha \in V,\) otherwise \(\alpha .\)

A substitution \(S\) is said to be more general than a substitution \(S^{\prime },\) written \(S \le S^{\prime },\) if there is a substitution \(S_1\) such that \(S^{\prime } = S_1\circ S.\)
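The operations on substitutions introduced above can be sketched in Haskell as follows. This is only an illustrative model: the representation of simple types (`Type`), the use of finite maps, and all function names are assumptions of the sketch, not definitions from the paper.

```haskell
module Main where

import qualified Data.Map as Map
import qualified Data.Set as Set

-- Simple types: type variables and constructor applications.
data Type = TVar String | TCon String [Type]
  deriving (Eq, Show)

-- A substitution is stored as a finite map; variables absent from the
-- map are mapped to themselves.
type Subst = Map.Map String Type

idSubst :: Subst
idSubst = Map.empty

-- S t: replace every type variable a in t by S(a).
apply :: Subst -> Type -> Type
apply s t@(TVar a)  = Map.findWithDefault t a s
apply s (TCon c ts) = TCon c (map (apply s) ts)

-- dom(S) = { a | S(a) /= a }
dom :: Subst -> Set.Set String
dom s = Set.fromList [a | (a, t) <- Map.toList s, t /= TVar a]

-- (s2 `compose` s1) behaves as s2 . s1: s1 is applied first.
compose :: Subst -> Subst -> Subst
compose s2 s1 = Map.map (apply s2) s1 `Map.union` s2

-- Updating: S[a1 |-> t1, ..., an |-> tn].
update :: Subst -> [(String, Type)] -> Subst
update s bs = Map.fromList bs `Map.union` s

-- Restriction of S to a set of type variables V.
restrict :: Subst -> Set.Set String -> Subst
restrict s v = Map.filterWithKey (\a _ -> a `Set.member` v) s

main :: IO ()
main = do
  let int = TCon "Int" []
      s1  = update idSubst [("a", TVar "b")]
      s2  = update idSubst [("b", int)]
      s   = s2 `compose` s1
  -- Both a and b are mapped to Int by the composition.
  print (apply s (TCon "->" [TVar "a", TVar "b"]))
  print (Set.toList (dom (restrict s (Set.fromList ["a"]))))
```

In this model, `s1` is more general than `s`, since `s` is obtained by composing `s2` with `s1`.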

### *Example 5*

As an example of a program context requiring type instantiation, which occurs as a result of function application, consider the occurrence of \(x\) in expression \(e = f\,x\) and typing context \(\Gamma = \{ f: Int \rightarrow Int, x:\alpha \}\); we can derive \(\Gamma \vdash f\, x: (Int, S),\) where \(S = [\alpha \mapsto Int].\) From \(S\Gamma = \{ f: Int \rightarrow Int, x:Int \},\) we can derive \(S\,\Gamma \vdash e:(Int,id).\)

### **Theorem 1**

If \(\Gamma \vdash e:(\phi ,S)\) holds then \(S\Gamma \vdash e:(\phi ,S_0)\) holds, where \(S_0\le S.\)

Furthermore, for all program contexts \(C[e]\) in which \(e\) occurs and all typing contexts \(\Gamma ^{\prime }\) such that \(\Gamma \le \Gamma ^{\prime }\) and \(\Gamma ^{\prime } \vdash C[e]:(\phi ^{\prime },S^{\prime })\) is derivable, for some \(\phi ^{\prime }, S^{\prime },\) we have that \(S\le S^{\prime }.\)

For each expression \(e,\) there is a unique type \(\phi \) derivable for \(e\) in a typing context \(\Gamma .\) However, expression \(e\) can have a set of instance-types, in program contexts that require instantiation of \(\phi \) in \(\Gamma \) (in fact, in all typing contexts \(\Gamma ^{\prime }\) such that \(\Gamma \le \Gamma ^{\prime },\) cf. Theorem 1). Consider the following example, where \(\mathtt{B }\) and \(\mathtt{C }\) are abbreviations of *Bool* and *Char*, respectively.

### *Example 6*

Let \((\mathtt{(==) }:\forall a.\,Eq\, a\Rightarrow a\rightarrow a\rightarrow \mathtt{B })\in \Gamma ,\) \(\{{Eq}\,\mathtt{B }, Eq\, \mathtt{C }\} \subseteq \Theta _\Gamma \) and \(e = (\mathtt{(==) }\,{ True},\ \mathtt{(==) }\,\mathtt{'*'}).\) Then \(\Gamma \vdash e : ((\mathtt{B } \rightarrow \mathtt{B },\,\mathtt{C } \rightarrow \mathtt{B }) ,S)\) is derivable, where \(S = [a\mapsto \mathtt{B }, b\mapsto \mathtt{C }]\) and \(a,b\) are fresh type variables. Instance-types of \(\mathtt{(==) }\) in program contexts \(\mathtt{(==) }\,{ True}\) and \(\mathtt{(==) }\,\mathtt{'*'}\) are respectively \(\mathtt{B } \rightarrow \mathtt{B } \rightarrow \mathtt{B }\) and \(\mathtt{C } \rightarrow \mathtt{C } \rightarrow \mathtt{B }.\)

Instance-types are formally defined as follows.

### **Definition 1**

Given expression \(e\) and typing context \(\Gamma ,\) we have that \(S^{\prime }\phi \) is an *instance-type for*\(e\)*in*\(\Gamma \) if \(\Gamma \vdash e:(\phi ,S)\) and \(\Gamma ^{\prime } \vdash C[e]:(\phi ^{\prime },S^{\prime })\) hold, where \(\Gamma ^{\prime }\le \Gamma .\)

Furthermore, \(S^{\prime }\phi \) is a greatest (most specific) instance-type for an *occurrence* of \(e\) in \(\Gamma \) if \(S^{\prime }\phi \) is an instance-type for \(e\) in \(\Gamma \) and there is no instance-type \(S_1\phi \) distinct from \(S^{\prime }\phi \) for \(e\) in \(\Gamma \) such that \(S_1\le S.\)

Distinct occurrences of an expression can have distinct greatest instance-types. For example, the instance-types given in Example 6 are greatest instance-types for the corresponding occurrence of \(\mathtt{(==) }\).

\(mgu\) is the most general unifier relation [12–14]: \(mgu(\mathcal T,S)\) is defined to hold between a set of pairs of simple types or a set of constraints \(\mathcal T\) and a substitution \(S\) if the following hold: (i) \(S \tau = S\tau ^{\prime }\) for every \((\tau ,\tau ^{\prime })\in \mathcal T\) (analogously, \(S\pi = S\pi ^{\prime }\) for every \((\pi ,\pi ^{\prime })\in \mathcal T\)), and (ii) if \(S^{\prime }\) is a unifier of \(\mathcal T,\) then \(S\le S^{\prime },\) that is, \(S\) is more general than \(S^{\prime }.\)

When the parameter of \(mgu\) is a singleton set, following common practice it is written simply as an equality; e.g., \(mgu(\pi = \pi ^{\prime },S)\) is written instead of using a set notation like this: \(mgu\left( \{ (\pi ,\pi ^{\prime })\},S\right) .\)
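A functional counterpart of the \(mgu\) relation on pairs of simple types can be sketched as below. This is the textbook first-order unification algorithm with an occurs check, written under assumptions of this sketch: the `Type` representation, the finite-map substitutions, and the function names are not taken from the paper.

```haskell
module Main where

import qualified Data.Map as Map

data Type = TVar String | TCon String [Type]
  deriving (Eq, Show)

type Subst = Map.Map String Type

apply :: Subst -> Type -> Type
apply s t@(TVar a)  = Map.findWithDefault t a s
apply s (TCon c ts) = TCon c (map (apply s) ts)

-- (s2 `compose` s1) behaves as s2 . s1.
compose :: Subst -> Subst -> Subst
compose s2 s1 = Map.map (apply s2) s1 `Map.union` s2

occurs :: String -> Type -> Bool
occurs a (TVar b)    = a == b
occurs a (TCon _ ts) = any (occurs a) ts

-- Most general unifier of a list of pairs of simple types, if one exists.
mgu :: [(Type, Type)] -> Maybe Subst
mgu [] = Just Map.empty
mgu ((t1, t2) : rest) = case (t1, t2) of
  (TVar a, TVar b) | a == b -> mgu rest
  (TVar a, t) -> bind a t
  (t, TVar a) -> bind a t
  (TCon c ts, TCon d us)
    | c == d && length ts == length us -> mgu (zip ts us ++ rest)
    | otherwise -> Nothing
  where
    -- Bind variable a to t and continue with the remaining pairs.
    bind a t
      | occurs a t = Nothing
      | otherwise = do
          let s1 = Map.singleton a t
          s2 <- mgu [(apply s1 l, apply s1 r) | (l, r) <- rest]
          return (s2 `compose` s1)

main :: IO ()
main = do
  let fun x y = TCon "->" [x, y]
      int     = TCon "Int" []
      float   = TCon "Float" []
  -- Unifies a -> Int with Float -> b, binding a to Float and b to Int.
  print (mgu [(fun (TVar "a") int, fun float (TVar "b"))])
  -- The occurs check rejects a = a -> Int, so the result is Nothing.
  print (mgu [(TVar "a", fun (TVar "a") int)])
```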

\(gen(\sigma ,\phi ,V)\) holds if \(\sigma = \forall \,\overline{\alpha }.\,\phi ,\) where \(\overline{\alpha } = tv(\phi ) - V.\)
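As a small worked instance of this definition (an illustration with a hypothetical constrained type, not an example from the paper), take \(\phi = Eq\,a\Rightarrow a\rightarrow b\) and \(V=\{b\}\):

\[
\overline{\alpha } = tv(\phi ) - V = \{a,b\} - \{b\} = \{a\}, \qquad \text{so } gen(\forall a.\,Eq\,a\Rightarrow a\rightarrow b,\ \phi ,\ \{b\}) \text{ holds.}
\]

Type variable \(b\) is not quantified because it belongs to \(V.\)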

A type variable \(\alpha \) occurring in a constraint set \(P\) is *reachable* with respect to a set of type variables \(V\) if \(\alpha \in V,\) or if \(\alpha \in tv(\pi )\) for some \(\pi \in P\) and there exists \(\beta \in tv(\pi )\) such that \(\beta \) is reachable (otherwise it is an unreachable type variable). Reachability is considered always with respect to \(V= tv(\tau )\) for a constraint set \(P\) that occurs in a constrained type \(P\Rightarrow \tau .\) For example, type variables \(a,b\) are reachable and \(c\) is unreachable in constraint set \(\{ C\, a\, b, D\,c\}\) of constrained type \(\{ C\, a\, b, D\,c\}\Rightarrow a.\) Reachability is defined in Fig. 5.

Given any set of type variables \(V,\) the constraints of a constraint set \(P\) can be partitioned into two disjoint subsets \(P|_{V}^{*}\) and \(P-P|_{V}^{*},\) the first containing constraints with at least one reachable type variable and the second constraints with only unreachable type variables.

\(P \oplus _V Q\) denotes the constraint set obtained by adding from \(Q\) only constraints with type variables reachable from \(V,\) i.e., \(P \oplus _V Q = P \cup Q|_{V}^{*}\) [8, 15]. This takes into account that, in an application of a function with type, say \(\tau _1 \rightarrow \tau ,\) to an expression with type \(P\Rightarrow \tau _1,\) it is not always adequate to include in the constraint set of the result all constraints from \(P.\) This occurs because constraints in \(P\) may refer to disregarded, non-selected parts of the argument. Consider for example expression \(fst({ True},o)\) (cf. Sect. 2), where \(o\) has any type with non-empty constraints. The type of this expression should be *Bool*, that is, constraints on the type of \(o\) should not be part of the set of constraints on the type of \(fst({ True},o).\)
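Reachability, the partition of a constraint set, and the \(\oplus _V\) operation can be sketched as a fixed-point computation. In the sketch below, a constraint is abstracted to a class name and the list of type variables occurring in it; this representation and all function names are assumptions made for illustration.

```haskell
module Main where

import Data.List (partition, union)
import qualified Data.Set as Set

-- A constraint, abstracted to its class name and its type variables.
data Constraint = Constraint String [String]
  deriving (Eq, Show)

tv :: Constraint -> Set.Set String
tv (Constraint _ vs) = Set.fromList vs

-- Type variables reachable from V through the constraints in P
-- (least fixed point; the result includes V itself).
reachableVars :: Set.Set String -> [Constraint] -> Set.Set String
reachableVars v p
  | v' == v   = v
  | otherwise = reachableVars v' p
  where
    v' = Set.unions (v : [tv c | c <- p, not (Set.null (tv c `Set.intersection` v))])

-- Partition P into the constraints with at least one reachable type
-- variable and the remaining ones.
partitionReachable :: Set.Set String -> [Constraint] -> ([Constraint], [Constraint])
partitionReachable v p = partition hasReachable p
  where
    r = reachableVars v p
    hasReachable c = not (Set.null (tv c `Set.intersection` r))

-- P (+)_V Q: add from Q only the constraints reachable from V.
oplus :: Set.Set String -> [Constraint] -> [Constraint] -> [Constraint]
oplus v p q = p `union` fst (partitionReachable v q)

main :: IO ()
main = do
  -- {C a b, D c} => a : a and b are reachable, c is not.
  let p = [Constraint "C" ["a", "b"], Constraint "D" ["c"]]
  print (Set.toList (reachableVars (Set.fromList ["a"]) p))
  print (partitionReachable (Set.fromList ["a"]) p)
  -- fst (True, o) with o :: O a => a : the result type Bool has no type
  -- variables, so the constraint O a is not added.
  print (oplus Set.empty [] [Constraint "O" ["a"]])
```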

Relation \(>\!\!\!>_{\Theta }\) is a simplification relation on constraints, defined as a composition of improvement and context reduction, defined respectively in Sects. 3.3 and 3.4. Firstly, the more basic relations of entailment and satisfiability are defined, in Sects. 3.1 and 3.2, respectively, which are used in the definitions of improvement and context reduction.

The side-condition of rule \(\mathtt {APP} \) expresses that \(Q_u\) should be empty. An empty set of constraints \(Q_u\) is obtained after checking satisfiability on the set of constraints \(P_u,\) if \(P_u\) has unreachable variables, and after removing these constraints with unreachable variables, by context reduction, if there exists a single satisfying solution for such constraints (cf. the definition of \(>\!\!\!>_{\Theta }\) in Fig. 6).

The article proposes to treat ambiguity by following a standard definition of ambiguity, which consists in testing satisfiability (i.e., “closing the world”) if overloading is resolved (or should have been resolved), that is, if there exist unreachable variables in the constraints. Nowadays, satisfiability is tested in Haskell, in the presence of multi-parameter type classes, only upon the presence of functional dependencies (or a similar mechanism), which closes the world whether or not there exist unreachable type variables. Our treatment of ambiguity thus restricts the cases where satisfiability is tested if a mechanism such as functional dependencies is used, and also makes it possible to avoid the use of such a mechanism altogether. In the latter case, the satisfiability trigger condition becomes the existence of unreachable variables, which may then be instantiated if there exists a single satisfying substitution.

The type system uses relations (\(mgu, gen, >\!\!\!>_{\Theta }\)). The facts that it is syntax-directed and type instantiation occurs only if required by a program context allow a sound and complete type inference algorithm to be obtained by transforming these relations into computable functions.

The fact that the type system does not allow context-free type instantiation and allows the derivation of a single type for an expression in a given typing context makes it look closer to a type inference algorithm. Context-dependent notions of instance-types and most specific instance-type for each occurrence of an expression, in a given typing context, are introduced and used in the paper, instead of the standard context-independent notion of principal type.

Typability of function application \(f\,e\) in this type system considers \(f\,e\) to be well-typed, where \(f\) has type \(\tau _1\) and \(e\) has type \(\tau _2,\) if there exist types \(\tau \rightarrow \tau ^{\prime }\) and \(\tau \) that are respective subtypes of \(\tau _1\) and \(\tau _2,\) where subtyping is simply a matching relation, as defined in Fig. 4.

### 3.1 Entailment

### 3.2 Satisfiability

### *Example 7*

Equality of constraint sets is considered modulo type variable renaming. That is, constraint sets \(P,Q\) are also equal if there exists a renaming substitution \(S\) that can be applied to \(P\) to make \(S\,P\) and \(Q\) equal. \(S\) is a renaming substitution if for all \(\alpha \in { dom}(S)\) we have that \(S(\alpha )=\beta ,\) for some type variable \(\beta \not \in { dom}(S).\)

If \(S P \in \lfloor P \rfloor _\Theta \) then \(S\) is called a satisfying substitution for \(P.\)

Constraint set satisfiability is in general an undecidable problem [18]. It is restricted and redefined here by using a constraint-head-value finite function, in order to obtain decidability, as described below.

Constraint set satisfiability and simplification both use the same termination criterion, which is based on a measure of the sizes of types in type constraints, given by the constraint-head-value function. The sequence of constraints that unify with a constraint axiom in recursive calls of the function that checks satisfiability or simplification of a type constraint is such that either the sizes of types of each constraint in this sequence are decreasing or there exists at least one type parameter position with decreasing size.

Constraint set satisfiability is defined so that we can obtain a sound and complete type inference algorithm, by just transforming the relations defined in the type system into functions.

1. \(\nu (S\pi ^{\prime })\) is less than \(\nu (S_1\pi _1)\) or, if \(\nu (S \pi ^{\prime })=\nu (S_1 \pi _1),\) then \(S\pi ^{\prime } \not = \pi ^{\prime \prime },\) for any \(\pi ^{\prime \prime }\) that has the same constraint value as \(\pi ^{\prime }\) and unification with \(\pi _0\) is required for satisfiability of \(\pi \) to hold, or

2. \(S\pi \) is such that there is a type argument position \(0 \le i \le n\) such that the number of type variables and constructors, in this argument position, of constraints that unify with \(\pi _0\) is always decreasing.

The following examples illustrate the definition of constraint set satisfiability as defined in Fig. 8. Let \(\Phi (\pi ).I\) and \(\Phi (\pi ).\Pi \) denote the first and second components of \(\Phi (\pi ),\) respectively.

### *Example 8*

The following illustrates a case of satisfiability involving a constraint \(\pi ^{\prime }\) that unifies with a constraint head \(\pi _0\) such that \(\nu (\pi ^{\prime })\) is greater than the upper bound associated to \(\pi _0,\) which is the first component of \(\Phi (\pi _0).I.\)

### *Example 9*

Again, \(\pi _2\) unifies with \(\pi _0,\) with unifying substitution \(S^{\prime } = [a_3\mapsto T^4\,\mathtt{I}, b_2\mapsto \mathtt{I}] ,\) and updating \(\Phi _3 = \Phi _2[\pi _0,\pi _2]\) gives \(\Phi _3(\pi _0).I = (-1,-1,2).\) Satisfiability is then finally tested for \(\pi _3 = C\,(T^6\,\mathtt{I}) \mathtt{I},\) that unifies with \(C\,(T\,a)\,\mathtt{I},\) returning \(\mathbb S _3 = \{ [a_3\mapsto T^5\,\mathtt{I}]|_\emptyset \} = \{ { id}\}.\) Constraint \(\pi \) is thus satisfiable, with \(\mathbb S _0 = \{{ id}\}.\)

The following example illustrates a case where the information kept in the second component of \(\Phi (\pi _0)\) is relevant.

### *Example 10*

Since satisfiability of type class constraints is in general undecidable [18], there exist satisfiable instances which are considered to be unsatisfiable according to the definition of Fig. 8. Examples can be constructed by encoding instances of solvable Post correspondence problems by means of constraint set satisfiability, using Smith’s scheme [18].

To prove that satisfiability as defined in Fig. 8 is decidable, consider that there exist finitely many constraints in \(\Theta ,\) and that, for any constraint \(\pi \) that unifies with \(\pi _0,\) we have, by the definition of \(\Phi [\pi _0,\pi ],\) that \(\Phi (\pi _0)\) is updated so as to include a new value in its second component (otherwise \(\Phi [\pi _0,\pi ] = Fail\) and satisfiability yields \(\emptyset \) as the set of satisfying solutions for the original constraint). The conclusion follows from the fact that \(\Phi (\pi _0)\) can have only finitely many distinct values, for any \(\pi _0.\)

### 3.3 Improvement

Improvement is a satisfiability preserving relation: improvement of constraint set \(P\) is the process of finding a least general substitution \(S\) such that \(S\,P\) preserves the set of satisfiable instances of \(P\) [3].

In this paper, improvement is used to remove unreachable type variables for resolving overloading, when overloading resolution cannot be further deferred, and for detecting ambiguity or unsatisfiability, if unreachable type variables cannot be removed (that is, overloading resolution is not possible). For any constrained type \(P\Rightarrow \tau ,\) improvement is tested only upon the presence of unreachable type variables, that is, if \(P_u = P - P|_{{tv(\tau )}}^{*} \not = \emptyset .\)

This is a consequence of the side-condition (\(Q_u = \emptyset \)) in rule (\(\mathtt {APP} \)), Fig. 3.

If the set \(\mathbb S \) of satisfiable instances of \(P_u\) has more than one element, we have ambiguity; if \(\mathbb S \) is empty, we have unsatisfiability; otherwise, if \(\mathbb S \) is a singleton \(\{S\},\) then \(P\) is improved to \(S\,P,\) which is then reduced to a set of constraints without unreachable type variables (that is, the set of constraints in \(S\,P_u\) can be removed, since overloading is resolved).
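The three-way case analysis above can be sketched in Haskell as follows. This is only an illustration: the `Subst` representation, the `Outcome` type and the function `classify` are hypothetical names introduced here, and the input list is assumed to contain no duplicate substitutions.

```haskell
-- Hypothetical representation: a substitution maps type variable names
-- to type names (an assumption made only for this sketch).
type Subst = [(String, String)]

-- Result of testing the set of satisfying substitutions of P_u.
data Outcome
  = Unsatisfiable        -- the set is empty
  | Improved Subst       -- the set is a singleton {S}: P is improved to S P
  | Ambiguous [Subst]    -- the set has more than one element
  deriving (Eq, Show)

-- Classify a finite, duplicate-free set of satisfying substitutions.
classify :: [Subst] -> Outcome
classify []  = Unsatisfiable
classify [s] = Improved s
classify ss  = Ambiguous ss
```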

### 3.4 Context reduction

Informally speaking, context reduction is a process that reduces a constraint \(\pi \) into \(Q\) if there is a *matching instance* for \(\pi \) in \(\Theta ,\) that is, there exists \((\forall \,\overline{\alpha }.\,P\Rightarrow \pi ^{\prime })\in \Theta \) such that \(S\pi ^{\prime }= \pi ,\) for some \(S,\) and \(S\,P\) reduces to \(Q.\) If there is no matching instance for \(\pi \) or no reduction of \(S\,P\) is possible, then \(\pi \) reduces to itself. Note that constraint sets can be reduced into larger constraint sets.

As an example of a context reduction, consider an instance declaration that introduces \(\forall a.\,{ Eq} \, a \Rightarrow Eq\,\mathtt{[ }a\mathtt{] }\) in \(\Theta \); then \(Eq\,\mathtt{[ }a\mathtt{] }\) is reduced to *Eq*\(\,a.\)
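A minimal sketch of this reduction step is given below, for single-parameter constraints only. It is not the definition of Fig. 10: the data types and the names `match`, `reduce` and `eqListAxiom` are assumptions made for illustration, and the sketch omits the constraint-head-value function \(\Phi \), so it does not by itself guarantee termination for arbitrary instance declarations.

```haskell
import Control.Monad (foldM)
import Data.Maybe (fromMaybe)

data Type = TVar String | TCon String [Type] deriving (Eq, Show)

-- A single-parameter constraint: class name and type argument.
data Constraint = Constraint String Type deriving (Eq, Show)

-- An instance axiom: context => head.
data Axiom = Axiom [Constraint] Constraint

type Subst = [(String, Type)]

apply :: Subst -> Type -> Type
apply s (TVar v)    = fromMaybe (TVar v) (lookup v s)
apply s (TCon c ts) = TCon c (map (apply s) ts)

-- One-way matching: extend s so that applying it to the pattern gives the target.
match :: Subst -> (Type, Type) -> Maybe Subst
match s (TVar v, t) = case lookup v s of
  Nothing -> Just ((v, t) : s)
  Just t' -> if t' == t then Just s else Nothing
match s (TCon c ps, TCon d ts)
  | c == d && length ps == length ts = foldM match s (zip ps ts)
match _ _ = Nothing

-- Reduce a constraint using the first matching instance; a constraint
-- with no matching instance reduces to itself.
reduce :: [Axiom] -> Constraint -> [Constraint]
reduce theta c@(Constraint cls t) =
  case [ (ctx, s) | Axiom ctx (Constraint cls' p) <- theta
                  , cls' == cls
                  , Just s <- [match [] (p, t)] ] of
    (ctx, s) : _ -> concatMap (reduce theta . substC s) ctx
    []           -> [c]
  where substC s (Constraint k ty) = Constraint k (apply s ty)

-- forall a. Eq a => Eq [a], with the list constructor written as TCon "[]".
eqListAxiom :: Axiom
eqListAxiom =
  Axiom [Constraint "Eq" (TVar "a")] (Constraint "Eq" (TCon "[]" [TVar "a"]))
```

With this axiom, `reduce [eqListAxiom]` applied to the constraint for `Eq [[b]]` goes through `Eq [b]` and stops at `Eq b`, for which no instance matches.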

Context reduction can also occur due to the presence of superclasses in class declarations, but in this paper we consider only the case of instance declarations, which is the more complex process. The treatment of reducing constraints due to the existence of superclasses is standard; see, e.g., [3, 7, 10].

The third parameter of \({ matches}\) is either empty or a singleton set, since overlapping instances [19] are not considered.

Context reduction, defined in Fig. 10, uses rules of the form \(P \vdash _\mathtt{red }^{\Theta ,\Phi } Q;\Phi ^{\prime },\) meaning that either \(P\) reduces to \(Q\) under the set of closed constraints \(\Theta \) and least constraint value function \(\Phi ,\) causing \(\Phi \) to be updated to \(\Phi ^{\prime },\) or \(P \vdash _\mathtt{red }^{\Theta ,{ Fail}} P;{ Fail}.\)

Failure implies that a constraint set is updated to itself.

*sats* to guarantee that context reduction is a decidable relation.

An empty constraint set reduces to itself (\(\mathtt {RED} _0\)). Rule (\(\mathtt {CONJ} \)) specifies that constraint set simplification works, unlike constraint set satisfiability, by performing a union of the result of simplifying each constraint in the constraint set, separately.

To see that a rule similar to (\(\mathtt {CONJ} \)) cannot be used in the case of constraint set satisfiability, consider a simple example: satisfiability of \(P = \{C\,a, D\, a\}\) in \(\Theta = \{C\,{ Int},C\, { Bool},D\, { Int},D\, { Char}\}.\) Computing satisfiability of \(P\) yields a single substitution, where \(a\) maps to \({ Int},\) not the union of the results of computing satisfiability for \(C\,a\) and \(D\,a\) separately.
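The difference can be checked on this example with a small sketch. The encoding of \(\Theta \) as pairs of class and type names, and the function names `satsOne`, `satsBoth` and `satsUnion`, are assumptions made here for illustration; they cover only ground, single-parameter instances.

```haskell
import Data.List (intersect, union)

-- Ground instances as (class name, type name) pairs, mirroring Theta above.
theta :: [(String, String)]
theta = [("C", "Int"), ("C", "Bool"), ("D", "Int"), ("D", "Char")]

-- Types t such that the single constraint `cls t` holds in theta.
satsOne :: String -> [String]
satsOne cls = [t | (c, t) <- theta, c == cls]

-- Joint satisfiability of {c a, d a}: a must satisfy both constraints.
satsBoth :: String -> String -> [String]
satsBoth c d = satsOne c `intersect` satsOne d

-- What a (CONJ)-like rule would compute: the union of the separate results.
satsUnion :: String -> String -> [String]
satsUnion c d = satsOne c `union` satsOne d
```

Here `satsBoth "C" "D"` is `["Int"]`, whereas `satsUnion "C" "D"` is `["Int","Bool","Char"]`, which includes types that do not satisfy both constraints.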

Rule (\(\mathtt {INST} \)) specifies that if there exists a constraint axiom \(\forall \,\overline{\alpha }.\,P \Rightarrow C\,\overline{\tau },\) such that \(C\,\overline{\tau }\) matches with an input constraint \(\pi ,\) then \(\pi \) reduces to any constraint set \(Q\) that \(P\) reduces to.

Rules (\(\mathtt{{STOP} }_0\)) and (\(\mathtt{{STOP} }\)) deal with failure due to updating of the constraint-head-value function.

## 4 Semantics

A type class declaration defines overloaded names, also called class *members*, with corresponding types, and an instance declaration gives a value for each class member, referred to as a *member value* (sometimes also referred to in the literature as a “member function”).

The semantics of core Haskell, given in Fig. 11, is based on the application of (so-called) *dictionaries* to overloaded names, as in standard core Haskell semantics [3, 10, 20]. A *dictionary* is a tuple that corresponds to an instance declaration, and contains values that correspond to the definitions given in the instance declaration for each class member. A dictionary of a class also contains a pointer to a dictionary of each of its superclasses, but the treatment of superclasses is standard and is omitted in this paper (see, e.g., [3, 7, 10]).

Figure 11 defines the semantics of core Haskell by induction on type system rules, with greatest instance-types of variables explicitly annotated, that is, typing formulas for variables have the form \(\Gamma \vdash x::\phi \) where \(\phi \) is the greatest instance-type of this occurrence of \(x\) in typing context \(\Gamma \) (cf. Definition 1). The translation of the types of expressions is also defined in Fig. 11.

For each class declaration \(\mathtt{class }\,P\,\Rightarrow \overline{\alpha }\,\mathtt{where }\,\overline{x}::\overline{\tau },\) a sequence of selection functions is generated, one for each overloaded name in \(\overline{x}.\) The selection function corresponding to \(x_i\) simply selects the \(i\)-th component of the tuple parameter (..., \(x_i ,\)...) (if \(n=1,\) selection is done by the identity function).

For example, class *Eq* generates a pair of selection functions \(\mathtt{(==) }\) and \(\mathtt{(/=) }\), defined as equal to *fst* and *snd*. Module scope visibility rules of these generated names are not considered in this paper. See also Example 11 below.
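As a rough sketch, dictionaries of class *Eq* can be modelled as pairs, with the two selection functions being *fst* and *snd*. The names `EqDict`, `eq` and `neq` are hypothetical stand-ins (chosen to avoid clashing with the Prelude operators), and the definitions of `dictEqChar` and `primEqChar` are assumptions, with the latter standing in for a primitive equality on characters.

```haskell
-- A dictionary for class Eq at type t: the pair (equality, inequality).
type EqDict t = (t -> t -> Bool, t -> t -> Bool)

-- Selection functions generated for class Eq (stand-ins for (==) and (/=)).
eq :: EqDict t -> t -> t -> Bool
eq = fst

neq :: EqDict t -> t -> t -> Bool
neq = snd

-- Assumed stand-in for a primitive equality function on characters.
primEqChar :: Char -> Char -> Bool
primEqChar x y = fromEnum x == fromEnum y

-- Hypothetical dictionary for the instance of Eq at type Char.
dictEqChar :: EqDict Char
dictEqChar = (primEqChar, \x y -> not (primEqChar x y))
```

An occurrence of the overloaded name at type *Char* is then translated to the application `eq dictEqChar`, which selects `primEqChar`.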

Let \(\overline{P}\) denote the sequence of the constraints in \(P\) in a standard, say lexicographical, order.

Each instance declaration \(\mathtt{instance }\,P\Rightarrow \pi \,\mathtt{where }\,\overline{x} = \overline{e}\) of a class \(C\) generates a dictionary \(d_\pi .\) Each component in \(d_\pi \) is a function that takes one dictionary for each constraint in the (possibly empty) sequence \(\overline{P}\) and yields the translation of \(e_i,\) the value bound to \(x_i\) in the instance declaration. The instance declaration makes the values \(\eta (S\pi )\) and \(\eta (x_i,S\tau _i)\) equal to \(d_\pi ,\) for all substitutions \(S,\) where \(\tau _i\) is the simple type in the type of \(x_i.\)
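For instance, a dictionary for an instance of *Eq* on lists, with context *Eq*\(\,a,\) can be sketched as below. Note the simplifying assumption: here the dictionary as a whole, rather than each of its components, takes the dictionary for the context constraint, which departs slightly from the componentwise description above. The names `EqDict`, `dictEqInt` and `dictEqList` are hypothetical.

```haskell
-- A dictionary for class Eq at type t: the pair (equality, inequality).
type EqDict t = (t -> t -> Bool, t -> t -> Bool)

-- Hypothetical dictionary for the instance of Eq at Int (empty context).
dictEqInt :: EqDict Int
dictEqInt = ((==), (/=))

-- Hypothetical dictionary for "instance Eq a => Eq [a]": it receives one
-- dictionary for the single constraint Eq a in the instance context.
dictEqList :: EqDict a -> EqDict [a]
dictEqList d = (eqL, \xs ys -> not (eqL xs ys))
  where
    eqL []       []       = True
    eqL (x : xs) (y : ys) = fst d x y && eqL xs ys
    eqL _        _        = False
```

A dictionary for equality on lists of lists of integers is then obtained by composition, as `dictEqList (dictEqList dictEqInt)`.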

Let \(\eta \dagger (P \mapsto \overline{v})\) be equal to \(\eta [\pi _1 \mapsto v_1, \ldots , \pi _n \mapsto v_n],\) where \(P = \{ \pi _1,\ldots , \pi _n\}.\)

\({ vSeq}(\overline{P})\) denotes a sequence of fresh variables \(v_i,\) one for each \(\pi _i\) in the sequence \(\overline{P}.\)

###
**Theorem 2**

###
*Proof*

Since \(\Gamma \) and \(\Gamma ^{\prime }\) give the same type to every \(x\) free in \(e\) and the type system rules are syntax-directed, \(\Delta \) and \(\Delta ^{\prime }\) are the same.\(\square \)

Consider the following Haskell program extract:

###
*Example 11*

The translation of the first occurrence of *teq* in line \(\mathtt{(1) }\) above is equal to \(teq\,d_{TEqL}\, v_1\,v_2,\) where *teq*’s translation is the identity function, \(\mathbf{teq}_{L}\) is a function that receives the two dictionary arguments \(v_1\) and \(v_2\) passed to \(teqww\) and yields the translation of function *teq* for lists defined above. The translation is given with respect to environment \(\eta _0\dagger (P\mapsto \overline{v}),\) where \(P = \{Show a, TEq a\},\,\overline{v}\) is the sequence \(v_1\,v_2,\) and \(\eta _0\) is such that \(\eta _0(teq,\tau ) = d_{TEqL},\) where \(\tau = \mathtt{[[ }a\mathtt{]] }\!\rightarrow \mathtt{[[ }a\mathtt{]] } \!\rightarrow \!(Bool,String),\) and \(d_{TEqL}\) is a dictionary with just one component \(\mathbf{teq}_L.\)

*teq* in line \(\mathtt{(1) }\) above is equal to:

Such use of dictionaries and the ensuing selection of member values at run-time can be avoided by passing values that correspond to overloaded names that are in fact used. A common case is that of a list equality function, that can receive an equality function for list elements, instead of a dictionary containing also an unused inequality function. Passing a dictionary to select at run-time the used equality function is unnecessary and inefficient. Full laziness and common subexpression elimination are techniques used to avoid repeated construction of dictionaries at run-time [3, 7]. This and related implementation issues are however outside of the scope of this paper and are left for further work (see also [21, 22]).
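The contrast between the two calling conventions for list equality can be sketched as follows. Both functions and the `EqDict` type are hypothetical illustrations, not the translation defined in this paper.

```haskell
-- A dictionary for class Eq at type t: the pair (equality, inequality).
type EqDict t = (t -> t -> Bool, t -> t -> Bool)

-- Dictionary-passing version: the equality member is selected from the
-- dictionary at run-time, and the inequality member is carried but unused.
eqListD :: EqDict a -> [a] -> [a] -> Bool
eqListD d (x : xs) (y : ys) = fst d x y && eqListD d xs ys
eqListD _ []       []       = True
eqListD _ _        _        = False

-- Member-passing version: only the equality function actually used is passed.
eqListF :: (a -> a -> Bool) -> [a] -> [a] -> Bool
eqListF eqA (x : xs) (y : ys) = eqA x y && eqListF eqA xs ys
eqListF _   []       []       = True
eqListF _   _        _        = False
```

Both versions compute the same result; the second avoids constructing the pair and selecting from it.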

*eqStar* given by:

*Eq* \(a\,\Rightarrow a\,\rightarrow \,a\,\rightarrow \) *Bool* (we have not written a simpler expression because we want to contrast the semantics of \(\mathtt{(==) }\) with those of expressions \(\mathtt{(==) } x\) and \(\mathtt{(==) } x\,y\)); the translation of *eqStar* is given by:

*dictEqChar* (as well as *eq dictEqChar*) returns a primitive equality function for characters, say *primEqChar*. Expression \(\mathtt{(==) }\) is itself a function that takes a dictionary of type \(t\) and returns the equality function from that dictionary, of type \(t\rightarrow t \rightarrow Bool.\) The translation of each occurrence of \(\mathtt{(==) }\) passes a pertinent dictionary value to \(\mathtt{(==) }\) so that the type obtained is the expected type for an equality function on values of type \(t.\) Both expressions \(\mathtt{(==) } x\) and \(\mathtt{(==) } x\, y\) have also constrained types, but a dictionary is passed only in the case of \(\mathtt{(==) }\). The semantics of an expression with a constrained type where the set of constraints is non-empty only considers this set of constraints if the expression is an overloaded variable; otherwise constraints are disregarded in the semantics. Furthermore, since each occurrence of an overloaded variable has a translation that is the application of pertinent dictionary values to that variable, translation of *types* with constraints are never input or output values of the translation function (see Fig. 11).

Type soundness follows directly from the fact that if \([\![ \Gamma \vdash e:P\Rightarrow \tau ]\!]\eta = \mathbf e : \tau \) holds or, if \(e\) is a variable, if \([\![ \Gamma \vdash x::P\Rightarrow \tau ]\!]\eta = \mathbf e : \tau \) holds, then \(\Gamma ^{\prime } \vdash \mathbf e : \tau \) is derivable, where \(\Gamma ^{\prime }\) is appropriately defined so as to remove overloading-related data from \(\Gamma .\) This can be done by creating dictionaries and selection functions as described above, and inserting corresponding type assumptions in \(\Gamma ^{\prime }.\)

Type soundness is obtained as a result of disallowing all ambiguous expressions and all expressions involving unsatisfiability in the use of overloaded names. For example, letting \(e_0 \equiv (\lambda x \rightarrow show\,(read\,x)),\) we would not have a derivation of \(\Gamma ^{\prime } \vdash e_0:String \rightarrow String\) corresponding to \(\Gamma \vdash e_0:{ String}\rightarrow { String}\) if \(\Gamma ^{\prime } \vdash d_{Readt}: { String}\rightarrow t\) is not derivable, which would happen if \(t\) is a fresh type variable or \(t\) can be more than one simple type. In other words, \(\Gamma ^{\prime } \vdash d_{Readt}: { String}\rightarrow t\) is derivable if and only if \(t\) is a unique simple type.
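The following sketch illustrates why the choice of \(t\) matters for \(e_0\). It uses ordinary Haskell as accepted by GHC rather than the type system of this paper; the names `e0Int` and `e0Double` are hypothetical. The unannotated expression is left as a comment, since GHC, under its open world approach, rejects it as ambiguous.

```haskell
-- e0 = \x -> show (read x)
-- Rejected by GHC: the type variable t in the constraints (Read t, Show t)
-- does not occur in the type String -> String of e0.

-- Fixing t by an annotation resolves overloading; different choices of t
-- give functions with different behaviour.
e0Int :: String -> String
e0Int x = show (read x :: Int)

e0Double :: String -> String
e0Double x = show (read x :: Double)
```

For the same input `"1"`, `e0Int` returns `"1"` whereas `e0Double` returns `"1.0"`, so no single meaning can be given to \(e_0\) unless \(t\) is determined.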

## 5 Conclusion

This paper discusses ambiguity in the context of languages that support context-dependent overloading, such as Haskell.

A type system is presented that does not follow the Hindley-Milner approach of providing context-free type instantiation, as usually done in type systems for such languages. As a consequence, ambiguous expressions can be considered to be not well-typed, in conformance with type inference algorithms.

The type system does not allow context-free type instantiation and allows only a single type to be derived for an expression, in a given typing context, making it closer to, and easier to convert into, a type inference algorithm. There is no notion of principal type (and thus no notion of “principal translation” of a term), in a given typing context. Related notions of instance type and most specific instance type for each occurrence of an expression, dependent on program contexts, are instead defined and used in the paper.

A semantics is defined by induction on the type system rules, for which coherence is trivial. Type soundness is obtained as a result of disallowing all ambiguous expressions and all expressions involving unsatisfiability in the use of overloaded names.

A standard definition of ambiguity is followed in the support for context-dependent overloading, where satisfiability is tested (i.e., “the world is closed”) if and only if overloading is resolved (or should have been resolved), that is, if and only if there exist unreachable variables in the constraints on types of expressions. Nowadays satisfiability is tested in Haskell, in the presence of multi-parameter type classes, only upon the presence of functional dependencies or an alternative mechanism that specifies conditions for closing the world, and that may happen whether or not there exist unreachable type variables in constraints. In the approach presented here, the satisfiability trigger condition is instead given automatically, by the existence of unreachable variables in constraints, and does not need to be specified by programmers using an extra mechanism.

## References

1. Mitchell J (1996) Foundations of Programming Languages. MIT Press, Cambridge
2. Pierce B (2002) Types and Programming Languages. MIT Press, Cambridge
3. Jones M (1994) Qualified Types: Theory and Practice. Ph.D. thesis, Distinguished Dissertations in Computer Science. Cambridge Univ. Press, Cambridge
4. Stuckey P, Sulzmann M (2005) A Theory of Overloading. ACM Trans Prog Lang Syst (TOPLAS) 27(6):1216–1269
5. Vytiniotis D, Jones S, Schrijvers T, Sulzmann M (2011) OutsideIn(X): modular type inference with local assumptions. J Funct Program 21(4–5):333–412
6. Faxén K (2003) Haskell and Principal Types. In: Proceedings 2003 ACM SIGPLAN Haskell Workshop, pp 88–97
7. Faxén K (2002) A static semantics for Haskell. J Funct Program 12:295–357
8. Camarão C, Figueiredo L (1999) Type Inference for Overloading without Restrictions, Declarations or Annotations. In: Proceeding 4th Fuji International Symp. on Functional and Logic Programming (FLOPS’99), LNCS 1722, Springer-Verlag, New York, pp 37–52
9. Jones SP, et al (2012) GHC: the Glasgow Haskell compiler. http://www.haskell.org/ghc/
10. Hall C, Hammond K, Jones S, Wadler P (1996) Type classes in Haskell. ACM Trans Program Lang Syst 18(2):109–138
11. Chakravarty M, Keller G, Jones S, Marlow S (2005) Associated types with class. In: Proceeding ACM Symp. on Principles of Programming Languages (POPL’05), ACM Press, Florida, pp 1–13
12. Robinson J (1965) A machine-oriented logic based on the resolution principle. J ACM 12:32–41
13. Eder E (1985) Properties of substitutions and unification. J Sym Comput 1:31–46
14. Palamidessi C (1990) Algebraic properties of idempotent substitutions. Lect Notes Comput Sci 443:386–399
15. Camarão C, Ribeiro R, Figueiredo L, Vasconcellos C (2009) A Solution to Haskell’s Multi-Parameter Type Class Dilemma. In: Proceeding 13th Brazilian Symp. on Programming Languages (SBLP’2009), pp 5–18. http://www.dcc.ufmg.br/camarao/CT/solution-to-mptc-dilemma.pdf
16. Chakravarty M, Keller G, Jones S (2005) Associated type synonyms. In: Proceeding 10th ACM SIGPLAN International Conference on Functional Programming, pp 241–253
17. Jones M (1995) Simplifying and improving qualified types. In: Proceeding ACM Conf. on Functional Programming and Computer Architecture (FPCA’95), Cambridge University Press, Cambridge, pp 160–169
18. Smith G (1991) Polymorphic type inference for languages with overloading and subtyping. Ph.D. thesis, Cornell University, NY
19. Jones SP, et al (2011) GHC: The Glasgow Haskell Compiler 7.0.4 User’s Manual. http://www.haskell.org/ghc/
20. Wadler P, Blott S (1989) How to make ad-hoc polymorphism less ad hoc. In: Proceeding 16th ACM Symp. on Principles of Programming Language (POPL’89), ACM Press, Florida, pp 60–76
21. Peterson J, Jones M (1993) Implementing type classes. In: Proceeding ACM Conference on Programming Language Design and Implementation, SIGPLAN Notices 28(6), Florida, pp 227–236
22. Jones M (1994) Dictionary-free Overloading by Partial Evaluation. In: ACM SIGPLAN Workshop on Partial Evaluation and Semantics-Based Program Manipulation