This post shows how to get around the limitations of the `destruct` tactic when doing case analysis on dependent types, without resorting to `dependent destruction`, which relies on additional axioms.

New users of Coq, especially those who have worked in Agda or another dependently typed programming language, sometimes wonder why programs written in Coq don't take (more) advantage of the dependent type system to enforce invariants. It turns out it's more of a pain to use dependent types in your Coq programs than it is in Agda. This is partially due to foundational reasons (Agda uses the K axiom to make pattern matching easier), but it's also due to purely superficial annoyances on the Coq side.

Doug summed up his opinion of dependently typed programming in Coq like this:

Coq has this really powerful type system, but... don't use it.

On the other hand, it's hard to resist using a feature that's there. If you do use dependent types in your programs, you may have run into trouble with pattern matching or reasoning by case analysis over these types. There are ways to get around these problems, but the most common ones require assuming extra axioms.

The other day in the freenode IRC channel for Coq, user `osa1` asked about defining and reasoning about the cardinality of types. Similar to how it might be done in math, he defined a type to have cardinality `n` if there was a bijection between that type and `Fin n`, the type with exactly `n` elements. `osa1` was working through a (seemingly) simple example and got stuck. He needed to do case analysis on a term of type `Fin 2` but couldn't get it to work. In the process of discussing this example, I came up with a kind of design pattern for implementing case analysis on dependent types without assuming any axioms.

In this post we'll walk through a proof that the booleans have cardinality 2. I know, pretty impressive, right? First we'll prove it using the `dependent destruction` tactic, which uses an axiom that is a strict extension of Coq's underlying theory. Then, we'll show how to do it "by hand" in a way that doesn't rely on any extra assumptions. Finally, I'll give another quick example and a hand-wavy description of my design pattern for dependent case analysis without axioms.

This post is generated from a Coq source file via a hacky python script and `pandoc`. You can view the original source if you want to step through the proofs in Coq. The source also contains a link to the python script. Finally, there's also a version of the source with all the blog text and typesetting commands removed, for easy stepping.

## Cardinalities of sets

As discussed above, we're going to define cardinalities in terms of the type `Fin n`, which we define as follows.

``````Inductive Fin : nat -> Set :=
F1 : forall n : nat, Fin (S n)
| FS : forall n : nat, Fin n -> Fin (S n).
``````

`Fin n` is a set with `n` elements, namely: `F1`, `FS F1`, ..., `FS (FS (... F1))`. 1

We now formalize the statement "A has cardinality n" a la `osa1`.

``````Definition cardinality (n : nat) (A : Type) : Prop :=
exists (f : A -> Fin n) (g : Fin n -> A),
(forall x, g (f x) = x) /\
(forall y, f (g y) = y).
``````

The definition takes the two directions of the bijection and proofs that the two are inverses.

As a simple use of this definition, let's prove that the booleans have cardinality 2. We're going to need functions `bool -> Fin 2` and `Fin 2 -> bool`. The first direction is easy.

``````Definition bool_to_Fin_2 (x : bool) : Fin 2 :=
if x then FS _ (F1 _) else F1 _.
``````

The other direction is almost as easy.

``````Definition Fin_2_to_bool (y : Fin 2) : bool :=
match y with
| F1 _ => false
| FS _ (F1 _) => true
| _ => false (* bogus! *)
end.
``````

This may look like a strange definition. Since `y` has type `Fin 2`, there are only two possibilities, either `y` is `F1` or `FS F1`, right? Well, yes, but no. It is true (and provable in Coq) that this is the case, but Coq's pattern matching will not give this fact to you automatically. It is possible to convince Coq that the last case is impossible, but for now we take a simpler approach. We just supply a bogus value in the impossible case. We know that this case can never happen, so the value we put there doesn't matter as long as it type checks. In fact, during a proof by case analysis `y`, we will be able to show that the third case is impossible.

Here's the main theorem. (Pro tip: you can hover over a tactic to see the state of the goal before and after the tactic runs. Use this to step through the proofs in your browser as they are discussed.)

``````Theorem bool_cardinality_2 :
cardinality 2 bool.
Proof.
unfold cardinality.
exists bool_to_Fin_2.
exists Fin_2_to_bool.
``````

Here we just unfold the definition of cardinality and provide the two directions of the bijection. Next, we're asked to prove that they are in fact inverses.

``````  split; intros.
``````

In the first direction, everything is easy (just like in defining `bool_to_Fin_2`).

``````  - destruct x; auto.
``````

In the second direction, things are more difficult.

``````  - (* hover here to see subgoal *)
``````

Looking at the goal, we want to do case analysis on `y`, but a simple `destruct` won't work.

``````  destruct y. (* hover to see error *)
``````

This error is rather inscrutable, but the gist is that Coq's heuristic for doing case analysis has failed because the dependency in the types is too complicated (note: pretty much any dependency is too much dependency, recall Doug's Dad's wisdom).

When Coq performs a case analysis, it first abstracts over all indices. You may have seen this manifest as a loss of information when using `destruct` on predicates (try destructing `even 3` for example: it just deletes the hypothesis!), or when doing induction on a predicate with concrete indices (try proving `forall n, even (2*n+1) -> False` by induction on the hypothesis (not the `nat`) -- you'll be stuck!). Coq essentially forgets the concrete values of the indices. When trying to induct on such a hypothesis, one solution is to replace each concrete index with a new variable together with a constraint that forces the variable to be equal to the correct concrete value. `destruct` does something similar: when given a term of some inductive type with concrete index values, it first replaces the concrete values with new variables. It doesn't add the equality constraints (but `inversion` does). The error here is about abstracting out the indices. You can't just go replacing concrete values with arbitrary variables and hope that things still type check. It's just a heuristic.

Getting back to the point, the error stems from the fact that `destruct` replaces 2 with a new variable, say `n`. At which point `y` has type `Fin n`. But we pass `y` into the function `Fin_2_to_bool`, which requires an argument of type `Fin 2`. Thus the goal is ill typed, just like Coq said.

So what should we do? One option is to not use dependent types. Another (as suggested by CPDT, for example) is to use the `dependent destruction` tactic, which comes from the `Program` module.

``````    Require Import Program.
dependent destruction y.

+ reflexivity.
+ (* hover here to see subgoal *)
``````

This does exactly what we want! It gives us two goals, in the first case, `y` has been substituted out for `F1`, while in the second it has become `FS y` (confusingly reusing the variable name `y`). The first case goes through easily by computation. In the second case, we just want to case analyze `y` again.

``````      dependent destruction y.
* reflexivity.
* (* hover here to see subgoal *)
``````

More progress! We easily deal with another case (where the argument to `Fin_2_to_bool` is `FS F1`). But we are given one final case. This corresponds to the third case of the match when we defined `Fin_2_to_bool`. `reflexivity` won't work, since `bool_to_Fin_2` never returns `FS (FS F1)`. Instead, we can finish by showing that this case is actually impossible. We have an inhabitant of `Fin 0` in the context. But the set with 0 elements is uninhabited. `inversion` derives the necessary contradiction

``````        inversion y.
Qed.
``````

This works just fine. However, `dependent destruction` is a little too good to be true. Let's ask Coq to print the assumptions (ie, axioms) used in the proof of `bool_cardinality_2`.

``````Print Assumptions bool_cardinality_2. (* hover to see assumptions *)
``````

We see that our proof relies on the axiom `JMeq_eq`. This states that heterogeneous equality implies the usual propositional equality.

Heterogeneous equality (aka John Major equality, hence the name `JMeq` in Coq) lets one state equalities between terms of different types. This sounds strange, but it turns out to be very convenient, especially when you have a lot of dependent types floating around. It eliminates a lot of uses of `eq_rect` and generally makes things nice. Elimination of heterogeneous equalities is implemented in Coq using `JMeq_eq`.

How did we manage to use heterogeneous equality in our proof? It turns out that the implementation of `dependent destruction` uses it internally. Let's take another crack at proving `bool_cardinality_2` and see if we can avoid this axiom. We'll just look at the part of the proof that required `dependent destruction`.2

``````Lemma bool_to_Fin_2__Fin_2_to_bool__id :
forall y,
bool_to_Fin_2 (Fin_2_to_bool y) = y.
Proof.
intros.
``````

Here we are back at the fateful spot where we want to do case analysis on `y`. Remember that `destruct` errors out and `inversion` doesn't do what we want. Both of these tactics correspond to pattern matches in Gallina, and the trick we're going to use is to write our own matches (and more importantly, match annotations) to do the case analysis and carry the information we need.

Update: Note that the following development is slightly more complicated than necessary, but is a reasonable first cut. See the next post for a cleaner proof of the `fin_case` function below.

As a first attempt, we can try to do a simple match on `y`. (If you haven't seen `refine` before, it's a handy tactic that lets you "sketch" a proof term, using underscores to leave holes that will be filled in later. Below, the underscores on the left of the match arrow (`=>`) are "don't care" patterns, while underscores to the right of the arrow are holes for `refine`.)

``````  refine (match y with
| F1 _ => _
| FS _ _ => _
end).
``````

This gives us two subgoals, which is the number we expect. But they aren't the subgoals we expect. They're a mess! This is an example of Coq's `match` annotation inference being a little too smart (or not smart enough?) for its own good. To clean things up and see what's going on, let's explicitly give a `return` annotation.

``````  refine (match y as y' return bool_to_Fin_2 (Fin_2_to_bool y) = y with
| F1 _ => _
| FS _ _ => _
end).
``````

All I did was give a new name (`y'`) to the discriminee (the thing being matched on) and then copy-paste the type of the goal into the `return` slot. `y` and `y'` are secretly two names for the same thing, but Coq will automatically refine the type of `y'` based on what case of the match it's in. This sounds like a good thing, and it can be, but in this case Coq will refine the type from `Fin 2` to `Fin n` where `n` is a new variable. Thus, if we replace `y` with `y'` in the `return` annotation, it won't type check, similar to the problem with `destruct` that was described above. So in this case we just stick with the unrefined `y`.

Anyway, this generates two reasonable looking subgoals. The first corresponds to the case where `y = F1`, and the second to `y = FS F1`. However, we don't have enough information to proceed. We'd like Coq to generate equalities describing what `y` must be in each case. This is common when writing your own matches. To have Coq generate this equality, we use something called the "convoy pattern". The convoy pattern works by adding things to the `return` annotation. Coq automatically refines the type given in the `return` annotation based on what case of the match you're in. This will let us prove our equality.

Remember that Coq will automatically change `y'` to correspond to the current case of the match, so to state that `y = F1` in the first case and `y = FS` in the second case, we just say `y = y'`. Let's give it a try.

``````  refine (match y as y' (* hover to see error *)
return y = y' ->
bool_to_Fin_2 (Fin_2_to_bool y) = y with
| F1 _ => _
| FS _ _ => _
end _).
``````

This would work if the type of `y` wasn't dependent. As it is, though, `y` has type `Fin 2` while `y'` has type `Fin n` after refinement introduces a new variable `n`. Since these two types aren't equal, we can't even state that the terms are equal.

This is exactly where `dependent destruction` would use heterogeneous equality. Because heterogeneous equality does not require the types to be equal in order to state an equality between terms.

Instead, we'll further extend the `return` annotation to generate an equality `n = 2` and then explicitly transport one of the terms across this equality at the type level (`Fin n = Fin 2`) to state the desired term-level equality (`y' = y`).

``````  refine (match y as y' in Fin n
return forall pf : n = 2,
eq_rect n Fin y' 2 pf = y ->
bool_to_Fin_2 (Fin_2_to_bool y) = y with
| F1 _ => _
| FS _ _ => _
end eq_refl eq_refl).

``````

That's an epic `match` annotation. First, we add the clause `in Fin n` to bind a name for the index to `Fin`. Then, in the `return`, we take a proof that `n = 2` and a proof that `y'` (which has type `Fin n`) transported across the equality `n = 2` is equal to `y`. Finally, we return the same goal as before.

The `return` type of the `match` is a function of two arguments (proofs of the two equalities). The first of which asks us to prove `n = 2`. Coq automatically refines this to `2 = 2` since the discriminee (`y`) has type `Fin 2`. So we can give `eq_refl` for the first proof. The second proof is only a bit more complicated: once we've given `eq_refl` for the first proof, the type simplifies to `y = y`, so we can again give `eq_refl`. This leaves us to fill in the two branches of the match.

With the new equalities in context, we now have enough information to finish the first case.

``````  - intros.
inversion pf.
subst.
rewrite <- Eqdep_dec.eq_rect_eq_dec by apply eq_nat_dec.
reflexivity.
``````

The main trick here is to use decidable equality of the natural numbers to get rid of the `eq_rect`. This is the insight that makes the whole technique work. Decidable equality on a type means, in the language of HoTT, that the type is an h-Set. Thus there are provably no non-refl paths, and so the transports can be eliminated. The rest of the proof is easy now that we have all the necessary information in our context.

The second case starts out just the same.

``````  - intros.
inversion pf.
subst.
rewrite <- Eqdep_dec.eq_rect_eq_dec by apply eq_nat_dec.
simpl.
``````

We're left wanting to do another case analysis, this time on `f`. The same technique will work again, but instead of copy-pasting, let's take a step back and implement a general solution.

## A case analysis tactic for `Fin`

Our goal is to use the technique we just saw to define an analogue to `dependent destruction`, but without relying on any axioms and specialized to `Fin`. We'll first prove a lemma that feels kind of like an induction principle, except it doesn't give an induction hypothesis. A destruction principle, perhaps.

``````Lemma fin_case :
forall n (P : Fin (S n) -> Type),
(P (F1 _)) ->
(forall x, P (FS _ x)) ->
(forall x, P x).
``````

The lemma takes a natural number `n` and predicate `P` on `Fin (S n)` and gives sufficient conditions to prove that `P` holds for all `x : Fin (S n)`.3 In particular, it takes a proof that `P` holds on `F1` and that, forall `x : Fin n`, `P` holds on `FS x`. Intuitively, these are indeed sufficient. Let's prove it.

``````Proof.
intros.
refine (match x as x' in Fin n'
return forall pf : n' = S n,
eq_rect n' Fin x' (S n) pf = x ->
P x with
| F1 _ => _
| FS _ _ => _
end eq_refl eq_refl).
- intros.
inversion pf.
subst.
rewrite <- Eqdep_dec.eq_rect_eq_dec by apply eq_nat_dec.
auto.
- intros.
inversion pf.
subst.
rewrite <- Eqdep_dec.eq_rect_eq_dec by apply eq_nat_dec.
auto.
Qed.
``````

We use the same flavor of `match` annotation as before and are again left with three cases. In each case we simplify by getting rid of the transport using decidable equality as before. The first two cases follow from the two hypotheses of the lemma, and the proof of the last equality is trivial.

Let's now use our lemma to complete and cleanup the proof of the eloquently named `bool_to_Fin_2__Fin_2_to_bool__id`.

``````Lemma bool_to_Fin_2__Fin_2_to_bool__id :
forall y,
bool_to_Fin_2 (Fin_2_to_bool y) = y.
Proof.
intros.
``````

At this point we want to destruct `y` using our new `fin_case` lemma. If we try that directly, we don't get quite what we hoped for.

``````  apply fin_case.
``````

The problem is that the conclusion of `fin_case` is the very general `P x`. We're asking Coq to come up with a predicate `P : Fin (S n) -> Type` and a term `x : Fin (S n)` such that the goal is `P x`. Coq decides to take the second occurrence of `y` as the argument of the predicate. Thus `fin_case` only destructs that occurrence.

To get the desired result, we need to guide Coq towards choosing the right predicate that destructs both occurrences of `y`. One way to achieve this is using a `with (P := fun x => ...)` clause to apply. Another way that involves less typing is the `pattern` tactic, which "beta expands" the goal with respect to some term, thus factoring the goal into `P` applied to `x`. Once the goal has been rewritten in this way, Coq chooses the correct predicate for `fin_case`.

``````  pattern y.
apply fin_case.
``````

This time the case analysis works as expected. We are left with two cases, the first of which is trivial.

``````  - reflexivity.
``````

In the second case, we do another case analysis to consider `FS F1`.

``````  - intros.
pattern x.
apply fin_case.
+ reflexivity.
``````

Finally, we are given the absurd case with an inhabit of `Fin 0`.

``````    + intros.
inversion x0.
Qed.
``````

As a last cleanup pass, we can wrap up the `pattern ...; apply fin_case` sequence in a custom tactic. The final proof is identical to the `dependent destruction` proof, and it 100% axiom-free!

``````Ltac fin_dep_destruct v :=
pattern v; apply fin_case; clear v; intros.

Lemma bool_to_Fin_2__Fin_2_to_bool__id' :
forall y,
bool_to_Fin_2 (Fin_2_to_bool y) = y.
Proof.
intros.
fin_dep_destruct y.
- reflexivity.
- fin_dep_destruct x.
+ reflexivity.
+ inversion x.
Qed.

Print Assumptions bool_to_Fin_2__Fin_2_to_bool__id'.
``````

A few final notes. First, this tactic will not properly destruct a term that appears in hypotheses as well. The fix would be to `revert` those hypotheses before using `pattern`. This could even be achieved with a relatively low level of pain using additional Ltac hacking, but I won't go into that here. Exercise for the reader! Second, note that `fin_case` is similar to the lemma `caseS` in the standard library (the only difference being where `n` is quantified).4 There's even a recursive (inductive) version, called `rectS` that essentially implements a dependent form of induction. I actually developed this trick independently, but it's not at all surprising that such a thing already existed. What is surprising to me is that no readers of the first version of this post pointed out `caseS`!

## Another Example and a Design Pattern for Dependent Case Analysis without Axioms

We've seen how to implement dependent case analysis for `Fin`, but what about in general? As far as I can tell, the same basic trick works: prove a "destruction principle" for your type, which takes a hypothesis proving the goal on each constructor of the type. The tricky bit is giving a type to the predicate `P`. In the case of `Fin`, both constructors return elements of `Fin (S n)`. Importantly, the indices from both cases unify. This allowed us to give `P` type `Fin (S n) -> Type`, which simplified things. When the indices from different constructors don't unify, the statement of the case lemma is more complicated, and requires explicit transports to make it typecheck.

I've worked this out in detail for an example from CPDT: the constant folding example from the ProgLang Chapter using the first order representation of syntax. In this situation, we have a types `type` and `term` that represent the types and expressions in the object language. The `member` family encodes dependent deBruijn indices for variable names. The `cfold` transformation (not shown below) implements constant folding, which we want to prove sound. The proof of its soundness requires case analysis on some `term`s, for which CPDT uses a variant of `dependent destruction`. Using the techniques from this post, we can eliminate the reliance on axioms.

Here is the relevant code in a standalone snippet. See the book for more details and the actual proof of soundness. I won't explain it in detail, but hopefully `term_case` will help people looking to implement their own axiom-less dependent case analysis lemmas. (I also left out the cute hover contexts because the proof isn't that interesting).

``````Set Implicit Arguments.
Inductive type : Type :=  Nat : type | Func : type -> type -> type.

Require Import List.
Import ListNotations.

Inductive member (A : Type) (elm : A) : list A -> Type :=
HFirst : forall ls : list A, member elm (elm :: ls)
| HNext : forall (x : A) (ls : list A),
member elm ls -> member elm (x :: ls).

Inductive term : list type -> type -> Type :=
Var : forall (G : list type) (t : type), member t G -> term G t
| Const : forall G : list type, nat -> term G Nat
| Plus : forall G : list type, term G Nat -> term G Nat -> term G Nat
| Abs : forall (G : list type) (dom ran : type),
term (dom :: G) ran -> term G (Func dom ran)
| App : forall (G : list type) (dom ran : type),
term G (Func dom ran) -> term G dom -> term G ran
| Let : forall (G : list type) (t1 t2 : type),
term G t1 -> term (t1 :: G) t2 -> term G t2.

Lemma ty_eq_dec :
forall x y : type, {x = y} + {x <> y}.
Proof.
decide equality.
Qed.
Lemma term_case :
forall G t (P : term G t -> Type),
(forall p, P (Var p)) ->
(forall n (pf : t = Nat),
P (eq_rect Nat _ (Const G n) t (eq_sym pf))) ->
(forall e1 e2 (pf : t = Nat),
P (eq_rect Nat _ (Plus e1 e2) t (eq_sym pf))) ->
(forall dom ran (pf : t = Func dom ran) b,
P (eq_rect (Func dom ran) _ (Abs b) _ (eq_sym pf))) ->
(forall dom e1 e2, P (App (dom:=dom) e1 e2)) ->
(forall ty1 e1 e2, P (Let (t1:=ty1) e1 e2)) ->
forall e, P e.
Proof.
intros.
destruct e eqn:?; auto.
- specialize (X0 n eq_refl).
auto.
- specialize (X1 t0_1 t0_2 eq_refl).
auto.
- specialize (X2 dom ran eq_refl t0).
auto.
Qed.
``````

The `term_case` lemma follows the same pattern as `fin_case`: one hypothesis for each constructor. It is complicated by the fact that `term` has two indices instead of just 1 (but the use of `G` is simple enough that it doesn't require special handling) and the fact that the constructors do not return indices that unify (the `type` index is different in each case). Thus to state the hypotheses, we have to use extra transports (ie, `eq_rect`s). The proof of the lemma is easy after using simplification based on decidable equality. We also had to prove that equality on `type` is decidable. Again, that is the key to this whole approach: if your indices have a type with decidable equality, you don't need `JMeq` to get dependent case analysis.

At this point we could define a custom tactic to call `pattern` and `term_case`, just as we did with `fin_case`, but I'll leave those out because they are virtually identical. The proof of `cfold_sound` (which you can find in the book) now goes through by replacing `dependent destruction` with applications of `term_case`.

Last updated: January 13, 2016

1. There's another standard definition of `Fin` as a Coq `Fixpoint` from `nat` to `Type`. This `Fixpoint` encoding is often easier to work with because Coq can perform type-level computation. In fact, this encoding would make the example in this post straightforward to define and would not require `dependent destruction` or any other trickery. But that's not the point! The point is that one does not always have such an encoding to work with, so it's useful to have techniques that work with the inductive definitions. This is just a simple example.

2. There is some debate in the proof assistant community about whether axioms like `JMeq` should be used in verification efforts. According to CPDT, `JMeq` has been proven (on paper) to be a consistent extension of Coq's type theory. I tend to come down on the side of using whatever axioms you want if it helps you be productive. Clearly there are trade-offs here. And in any case, it's nice to know that you can avoid this axiom in the common case (of index types with decidable equality) with a little extra work.

3. Because both constructors of the `Fin` type return terms of type `Fin (S n)`, it simplifies the statement of the lemma to give `P` type `Fin (S n) -> Type` instead of the more natural (?) `Fin n -> Type`. For more complex inductive type families where the index is not constant across all constructors, this simplification isn't possible, and a different technique is needed. We'll see that in the example at the end of the post.

Finally, note that this simplification can make it harder to actually use `fin_case`. Before you apply it, you need to know that your element of `Fin` is at index `S n` for some `n`. This is always the case for `Fin` (since `Fin 0` is uninhabited), but you might expect `fin_case` (or the helper tactic defined below) to do it for you. Extending the tactic to do so is a good exercise!

4. The proof term for `caseS` is interesting (read: way better than mine). It uses a similarly complicated `match` annotation, but one that avoids the need for decidable equality or `eq_rect`. I haven't fully explored how that technique would generalize to non-`Fin` types, but I expect the results would be cleaner.