5.4 Type Checking and Inference

In this section we give an overview of type checking and inference. We begin by giving rules for subtyping. We use <= as the subtype relation and < for the strict subtype relation - we gave some rules for builtin types earlier in this chapter.

Note that in the code type rules some of the relations between types flip around. Perhaps an intuition that might help in understanding why this is the case is to think of the subtype relation as a “usability” relation - i.e. if T1 <= T2 then a term of type T1 can be used anywhere we are expecting a term of type T2.

So for example consider the following two function declarations

fun f1(int) -> int
fun f2(num) -> nat

The function f2 can be used in any situation where we might use f1 and so
(fun(num) -> nat) <= (fun(int) -> int)

Below we list other subtyping rules where m<= is the moded subtype relation.


list(T1) <= list(T2) iff T1 <= T2
set(T1) <= set(T2) iff T1 <= T2

atom_naming(CodeT1) <= atom_naming(CodeT2) iff CodeT1 <= CodeT2
term_naming(CodeT1) <= term_naming(CodeT2) iff CodeT1 <= CodeT2

(T1, ..., Tn) <= (T1’, ..., Tn’) iff Ti <=  Ti’ for each i: 1..n

For an enumeration or range type the subtype relation is subset
Each enumeration and range type is a subtype of the appropriate
        atomic type (e.g. nat, string, atom)

T1 <= T1 || T2
T2 <= T1 || T2

For an enumerated constructor type
CT(T1, ..., Tn) <= CT(T1’, ..., Tn’) iff
            (T1, ..., Tn) <= (T1’, ..., Tn’)
                
(fun(T1, ..., Tn) -> T) <= (fun(T1’, ..., Tn’) -> T’) iff
                     (T1’, ..., Tn’) <= (T1, ..., Tn) and T <= T’
(tel(T1, ..., Tn) <= tel(T1’, ..., Tn’) iff
                     (T1’, ..., Tn’) <= (T1, ..., Tn)
(rel(MT1, ..., MTn) <= rel(MT1’, ..., MTn’) iff
                     MTi m<= MTi’ for each i: 1..n
(act(MT1, ..., MTn) <= act(MT1’, ..., MTn’) iff
                     MTi m<= MTi’ for each i: 1..n

For moded types we have the following rules. For simplicity we only consider “simple” moded types - we ignore moded types like !list(?int).

M(T1) m<= !(T2) if T2 <= T1
M1(T1) m<= M2(T2) if T1 = T2 and  not (M1 = '??' and M2 = '?')

To explain why these rules are so restrictive we consider the following example relations.

rel r1(!int), r2(?int), r3(?num), r4(??int)

Can a call to r1 be used in a situation where a call to r2 is expected? Since a call to r2 can take a variable as its argument while r1 cannot, the the answer is no.

Can a call to r3 be used in a situation where a call to r1 is expected? Since r3 can take a ground int as its argument then the answer is yes.

Can a call to r4 be used in a situation where a call to r3 is expected? Since r4 is not guranteed to ground its argument while r3 is then the answer is no.

5.4.1 Type Checking Examples

We now give some examples to show type checking and inference in action. These examples are very simple and so do not do justice to the complexities involved.

Consider

rel r1(!int, ?int), r2(!int, ?nat), r3(!int, ?int)
r1(X, Z) <= r2(X, Y) & r3(Y, Z) 

For this type checking example we use the notation X:!int which we call a variable type binding to mean that X is constrained to have moded type !int. The type checker maintains a list of variable type bindings (VTB) as it processes the rule.

First the rule head is considered and since we have

rel r1(!int, ?int)

then the type checker generates an inital VTB:

[X:!int, Z:?int]

Since Z needs to be ground by the end of the rule body we also produce the VTB

[Z:!int]

for checking when the body processing is finished.

We now consider the first call in the body relative to the initial VTB.

First we require X to be of moded type !int, which is a constraint in the VTB. Since Y is not mentioned in the VTB then this does not need to be checked. We then generate the updated VTB:

[X:!int, Z:?int, Y:!nat]

Note that the moded type for Y has changed from ?nat to !nat because we know that the call on r2 will ground Y.

Now we check the call r3(Y, Z). Again we compare the required moded types of the variables with the corresponding moded types in the VTB. Since nat <= int then the requirements are again satisfied. The unpdate VTB is:

[X:!int, Z:!int, Y:!nat]

This completes the body checking and we just need to check that Z has been ground by the body calls - and it has since Z:!int is in the VTB.

Now consider the same situation but with the calls in the body reversed:

r1(X, Z) <= r3(X, Y) & r2(Y, Z) 

The type checker proceeds in the same way as above and the VTB before the final call is

[X:!int, Z:?int, Y:!int]

When checking the final call we require Y:?nat and have the constraint Y:?int. Since it is not the case that ?int m<= ?nat then we get a type error.

Superficially, it looks like this rule should be OK. Afterall, if we call the rule with the second argument being a variable then the rule will bind that vaiable to something of type nat (which is also of type int as required).

The problem is we can call this rule with the second argument ground to a negative integer. This will violate the type requirements for r2.

This example partly explains why the constraints on moded types are quite strong.

If the programmer wanted to write a type correct variant of the rule that had essentially the same semantics then there is a simple solution:

r1(X, Z) <= r3(X, Y) & r2(Y, W) & Z = W

In this rule, because we call r2 with a new variable, we avoid the type checking error above. The VTB before the unification is

[X:!int, Z:?int, Y:!int, W:!nat]

The unification is now comparing something of moded type ?int with something of moded type !nat and because these are type compatible then the type checker is happy with this rule.

The following example provides another illustration of why the constraints on moded types is so strong. It is tempting to think we could, for example, ease the restrictions on ? moded types in the rule head by allowing the type to increase in the body as long as we use a run time type check to “bring the type back down” before we get to the end of the rule as in the following.

rel p(?nat, ?nat), q(?int), r(?nat)

p(X, Y) <= q(X) & r(Y) & type(X, !nat)

If we run the type checker on this (with the more relaxed view on ? moded types) then, just before the type check we get a VTB of

[X:!int, Y:!nat]

and after the type check we get

[X:!nat, Y:!nat]

Now when we check this against the saved VTB for the ? moded X in the head we see that not only is X ground but it also has the correct type. This means that the type checker will be satisfied that this rule is type correct.

The problem is that we are allowed to make the call p(A, A), where A is a variable, on this relation. When we make the call, the rule will first call q(A) which might instantiate A to a negative integer and now the call r(A) will be type incorrect.

The problem is that a “hidden unification” has happened - the unification happens in the caller - not in the rule itself and so it would be virtually impossible for the type checker to deal with this case.

In a similar way to the earlier example, the programmer can write the required code while avoiding the tight constraints on moded types as follows.

p(X, Y) <= q(Z) & r(Y) & type(Z, !nat) & X = Z

Note that the type checker assumes each call in the body succeeds. If, for example, the runtime type checks fails in a particular use then the call fails and so there are no implications for the types of the variables in the head.

These examples point to a strategy when dealing with ? moded types in the head - occurrences of that variable in the body can be replaced by a new variable with an added unification at the end of the body, possibly preceded by a type check, to link the head variable with its replacement in the body.

5.4.2 Type Inference Example

When the type checker processes arguments with polymorphic types then typically the type checker switches into type inference mode in order to determine types for later checking.

If we consider the body of a rule where the call

append(X, Y, Z)

is made in the context of the VTB

[X:!list(int), Y:!list(nat)]

given the relevant moded type

append(!list(T), !list(T), ?list(T))

then it will infer from the first two argments that int <= T and nat <= T which can be satisfied with T = int (the unique minimal type satisfying the requirements). The VTB after the call then becomes

[X:!list(int), Y:!list(nat), Z:!list(int)]

On This Site