Basic AC, part 2: Choice in Algebra

The outline I (apparently) wrote on the previous post in this series says this post should talk about the Axiom of Choice in algebra, particularly how it affects vector spaces and groups. Before I talk about vector spaces and groups, though, I should (since I boldly subtitled this post “Choice in algebra”) mention some other places AC appears in algebra:

AC implies that every field has an algebraic closure;
AC implies that every nonzero ring* has a maximal ideal;
AC implies that every subgroup of a free group is free (the Nielsen–Schreier theorem).

Moreover, each of these cannot be proved in ZF alone; that is,

there is a model of ZF in which some field has no algebraic closure;
there is a model of ZF in which some ring has no maximal ideal;**
there is a model of ZF in which some free group has a non-freely-generated subgroup.

*In this post, ‘ring’ = ‘ring with identity’.
**I haven’t found a proof for this, actually, but Wikipedia claims the statement ‘every nonzero ring has a maximal ideal’ is actually equivalent to AC. My guess would be that the boolean ring of subsets of an amorphous set wouldn’t have a maximal ideal. I’ll look into it.

In this post I’m going to prove the following:

AC is equivalent to the assertion ‘Every set can be endowed with a group structure’;
the assertion ‘Every vector space has a basis’ is equivalent to AC; and
assuming AC, bases in a vector space must have the same cardinality.

The first result I shamelessly pull from Ashutosh’s answer to this MO question. Apparently the result is due to Hajnal and Kertész.¹ I’ll provide some more detail, though. Hopefully you’ll find that helpful.

Theorem 1. In ZF, the following are equivalent:

AC;
For every set $X$ , there is a binary operation $\star$ such that $(X,\star)$ forms a group.

Proof. First suppose that AC holds, and let $X$ be a set. If $X$ is finite, we can give $X$ the cyclic group structure, so suppose $X$ is infinite. Then there is a bijection from $X$ to the set $X^{<\omega}$ of finite subsets of $X$ . (In the presence of AC, this is an elementary exercise in cardinal arithmetic.) The set $X^{<\omega}$ becomes a group when endowed with the symmetric difference operation: $A \,\Delta\, B = (A \smallsetminus B) \cup (B\smallsetminus A)$ . If you don’t believe this, you should check it yourself. It’s a worthwhile exercise. Pulling this operation back along the bijection $X \to X^{<\omega}$ gives a group operation on $X$ .

The converse isn’t so easy. Suppose (2) holds. The first step is to obtain an ordinal $h(X)$ (the least such is called the Hartogs number of $X$ ) such that there is no injection from $h(X)$ into $X$ . The set of well-orderings of subsets of $X$ corresponds to a set of ordinals (namely the order-types of such well-orderings), which, being only a set, is bounded in the class of ordinals. So there are all kinds of ordinals greater than each order-type of a well-ordering of a subset of $X$ ; therefore there must be a least such ordinal $h(X)$ .

Now let $\star$ be a group operation on the disjoint union $X \sqcup h(X)$ . For every $x\in X$ there is an element $\alpha\in h(X)$ such that $x\star \alpha\in h(X)$ ; indeed, the map $h(X) \to X \sqcup h(X)$ given by $\alpha\mapsto x\star \alpha$ is an injection, so its image cannot be contained in $X$ .

Using this fact, define a function $f\colon X \to (h(X))^2$ in the following way. Equip $(h(X))^2$ with the lexicographic order (a well-ordering) $\le$ . Now for $x\in X$ let $f(x)$ be the $\le$ -minimal pair $(\alpha,\beta)$ in $(h(X))^2$ such that $x\star \alpha = \beta$ . Then $f$ is an injection, as is clear from the fact that $\star$ is a group operation. In summary, we have created a bijection from $X$ to a subset (namely $fX$ ) of the well-ordered set $(h(X))^2$ , which induces a well-ordering on $X$ .

Under the assumption (2) we have proved in ZF that every set $X$ can be well-ordered. From this we deduce AC. $\square$

Hotovy wonders whether, in the presence of AC, you can always “put a ring on it” (his words). That is, he wonders whether every set can be made into a ring. The answer is yes: take the abelian group constructed above and, as he suggests, equip it with the trivial multiplication (or equip it with the multiplication given by $x\cdot y = x \cap y$ ). This ring doesn’t have an identity, though. I don’t know whether it is possible (even with full AC) to endow any old set with a ring structure (commutative?) that gives the ring an identity, and I don’t know whether such an assertion is equivalent to AC.

The proof in ZFC that every vector space has a basis is a standard Zorn’s-Lemma argument. Since Andy has already done a fine job of outlining that argument here (hopefully he won’t mind the link), I won’t repeat it. Instead, I’ll provide a lovely proof (due to Andreas Blass) of the converse: in ZF, the assertion that every vector space has a basis implies AC.

First, some technicalities: We will need to consider a(n ostensibly) weakened version of AC here. Call the following assertion the ‘Axiom of Multiple Choice’ (AMC):

For every set $X$ not containing the empty set $\emptyset$ , there is a function that assigns to each element $a$ of $X$ a finite nonempty subset of $a$ .

Clearly AC implies AMC. It is also the case that in ZF, AMC implies AC. See Chapter 9 of Jech’s book The Axiom of Choice for details. We’re going to prove:

Theorem 2. In ZF, the assertion that every vector space has a basis implies AMC.

and deduce from this theorem that ‘Every vector space has a basis’ implies AC.

The basic idea of the proof is that we’ll have to construct a vector space out of a given set $X$ and force the vector space structure to spit out a bunch of finite nonempty subsets of members of $X$ . Blass’s great insight here, in my opinion, was that it’s important to make both the vector space and the ground field to work for you. Here’s his proof:

Proof of Theorem 2. Let $U = \{ u_i : i\in I\}$ be a set of nonempty (pairwise) disjoint sets indexed by a set $I$ . Our goal is to find a family $\{f_i : i\in I\}$ of finite subsets $f_i$ of the $u_i$ . Let $k$ be any field, and put $X = \bigcup U$ . Obtain the field $k(X)$ of rational functions over $X$ by adjoining to $k$ all elements of $X$ as indeterminates and passing to the fraction field. So $k(X)$ is the field whose elements are $k$ -rational functions in the ‘variables’ $x\in X$ . For every $i\in I$ define the $i$ –degree of a monomial in $k(X)$ to be the sum of the exponents of members of $u_i$ in that monomial. We say a rational function $q\in k(X)$ is $i$ –homogeneous of degree $d$ if $q$ is a quotient $\frac pr$ such that every monomial in the numerator $p$ has the same $i$ -degree $n$ while every monomial in the denominator $r$ has the same $i$ -degree $n-d$ . (So the ‘of degree $d$ ‘ means each monomial in the numerator has $i$ -degree $d$ more than each monomial in the denominator.)

It isn’t difficult to see that those rational functions in $k(X)$ which are $i$ -homogeneous of degree $0$ for every $i\in I$ form a subfield $F$ of $k(X)$ . (If you’re following along in Blass’s original paper or in Rubin & Rubin, notice that my $F$ is their $K$ .) Thus $k(X)$ is a vector space over $F$ ; let $V$ be the subspace of (the $F$ -vector space) $k(X)$ ~~generated~~ spanned (whatever people say in lin. alg.) by the set $X = \bigcup u_i$ .

(OK, the setup is done. Recap: we have a containment of fields $F \subset k(X)$ , and thus the bigger field $k(X)$ (rational functions) is a vector space over the smaller field $F$ ( $i$ -homogeneous functions of degree 0). We have let $V$ be the subspace of $k(X)$ spanned by the set $X= \bigcup u_i$ .)

By assumption the $F$ -vector space $V$ has a basis $B$ . For a moment, fix a coordinate $i\in I$ . For every element $x\in u_i$ we can express $x$ as an $F$ -linear combination of basic elements. That is, there is a finite subset $B(x)$ of the basis $B$ and nonzero coefficients $a_b(x)\in F$ , one for each $b\in B(x)$ , such that

$x = \displaystyle\sum_{b\in B(x)} a_b(x)\cdot b. \qquad\qquad(\star)$

For another element $y\in u_i$ , we have a similar equation:

$y = \displaystyle\sum_{b\in B(y)} a_b(y)\cdot b.$

But we can multiply the equation $(\star)$ by $y/x$ to get an alternative expression of $y$ as an $F$ -linear combination of basic vectors:

$y = \displaystyle\sum_{b\in B(x)} (y/x) a_b(x)\cdot b.$

Before you do anything else, notice that, because $x$ and $y$ both belong to the set $u_i$ , the product $(y/x)a_b(x)$ is $i$ -homogeneous of degree $0$ , and so each coefficient $(y/x)a_b(x)$ is indeed a member of the ground field $F$ . But now we have two expressions of $y$ as an $F$ -linear combination of elements of the basis $B$ , so those two expressions must be the same. That is, we must have $B(x) = B(y)$ and $a_b(y)/y = a_b(x)/x$ for every $b\in B(x) = B(y)$ . In particular, the finite subset $B(x)$ of $B$ and the elements $a_b(x)/x$ of $k(X)$ depend only on the coordinate $i\in I$ , not on the particular element $x_i$ .

We can therefore assign (uniquely, without any choice) to each coordinate $i\in I$ its finite subset $B_i = B(x)$ of $B$ and its elements $\alpha_{bi} (= a_b(x)/x)$ of $k(X)$ . Observe that, for each $b\in B_i$ , $\alpha_{bi}$ is $i$ -homogeneous of degree $-1$ (because of the $x$ in the denominator of $a_b(x)/x$ ) and $j$ -homogeneous of degree $0$ for $j\ne i$ (by the definition of the field $F$ and the fact that each $a_b(x)$ belongs to $F$ ). Therefore, when $\alpha_{bi}$ is written as a quotient of polynomials in reduced form, some indeterminates from $u_i$ must occur in the denominator. Define $f_i$ to be the set of members of $u_i$ that appear in the denominator of $\alpha_{bi}$ for some $b\in B_i$ . Then $f_i$ is a finite nonempty subset of $u_i$ , which is what we wanted. $\hfill \square$

After one proves via Zorn’s Lemma that every vector space has a basis, it’s natural to ask whether infinite-dimensional vector spaces enjoy ‘uniqueness of dimension’ as finite-dimensional ones do. Luckily, in the presence of AC, linear algebra’s reputation for being pathology-free is not tarnished here, as I show below. I direct your critical eye to the proof of this theorem on Wikipedia, which I dislike for several reasons. Zeroth, it’s not very enlightening, as a good proof would be. First, it uses so-called ‘contradiction’ (really contraposition in this case, I suppose) unnecessarily and without benefit to the reader. (That is an awful habit.) Second (a failure of the presentation rather than the proof), it fails to describe why there is a bijection from the first basis to the chosen set of elements in the second which span the first, when this is the crux of the argument! Finally, it uses full Choice when much less is needed, and using less makes for a better, more illuminating proof.

In preparing for this blog post, I set out to prove this result by Zorn’s-Lemma-ing the poset of partial span-preserving bijections $f\colon A_1\to A_2$ , where $A_1$ is a subset of the first basis and $A_2$ is a subset of the second. This seems to work; try it for yourself if you want. But, like Wikipedia’s proof, it uses full choice (ZL) without good reason. Luckily, I stumbled on a wonderful proof (possibly also due to Andreas Blass, whom early polls show to be the favourite for MVP of this post, but maybe a result from Rubin & Rubin³ (though I can’t find it there…)) that is much, much cuter than the others and uses only the Compactness Theorem (independent of ZF but equivalent to the Prime Ideal Theorem, which is weaker than full AC).

Theorem 3. Suppose $V$ is a vector space (over any field) and $X$ and $Y$ are both bases for $V.$ Then there is a bijection $X \to Y$ .

Proof. It suffices, by the Schröder–Bernstein Theorem, to find an injection $f\colon X\to Y$ . For every element $x\in X$ there is a unique minimal finite subset $A(x)$ of $Y$ such that $x$ belongs to the span of $A(x)$ . (The set $A(x)$ is unique since $Y$ is independent.) We are going to insist that $f(x)\in A(x)$ for every $x$ . This seems to make the problem more difficult, but it (doesn’t and) allows us to reformulate the problem in terms of the satisfiability of a set of propositional formulas, which we define as follows: Let $p(x,y)$ mean $f(x)=y$ . Consider the following formulas:

$y\ne y' \to \neg(p(x,y) \wedge p(x,y'))$ (‘ $f$ is a function’);
$x \ne x' \to \neg(p(x,y) \wedge p(x',y))$ (‘ $f$ is an injection’);
for each $x\in X$ , the disjunction $\displaystyle \bigvee_{y\in A(x)} p(x,y)$ (‘ $f$ is defined at every $x\in X$ and $f(x)\in A(x)$ ‘).

To say that these formulas are simultaneously satisfiable is to say that there is an injection $f\colon X\to Y$ such that $f(x) \in A(x)$ for every $x\in X$ . By the Compactness Theorem, we need only prove that every finite subset of these formulas is satisfiable. That is, we need to show that for each finite subset $X'$ of $X$ , there is an injection from $X'$ to $Y$ sending every $x\in X'$ to an element of $A(x)$ .

Let $X'$ be a finite subset of $X$ , and let $Z$ be a subset of $X'$ . The set $A(Z)$ (hopefully it’s clear what that means) is a(n independent) subset of $Y$ that spans $Z$ , an independent set. By standard finite-dimensional linear algebra we can’t have $|A(Z)|< |Z|$ , for this would mean that we could express $n=|Z|$ linearly independent vectors as the linear combination of fewer than $n$ vectors. That is, it must be that $|Z| \ge |A(Z)|$ for every subset $Z$ of $X'$ . By Hall’s Matching Theorem, there is an injection $X'\to Y$ sending every $x$ to an element of $A(x)$ . The proof is complete, by the Compactness Theorem. =^] $\square$

I like these linear-algebra–AC proofs. Each uses the sort-of ‘local’ finiteness of a vector space and the rigidness linear independence provides to transfer information between vector space structure and sets. Pretty cool, eh?

¹Hajnal, A., Kertész, A. Some new algebraic equivalents of the axiom of choice, Publ. Math. Debrecen 19 (1972), 339–340. (MR0329895)

²Blass, Andreas. Existence of bases implies the axiom of choice. Axiomatic set theory (Boulder, Colo., 1983), 31–33, Contemp. Math., 31, Amer. Math. Soc., Providence, RI, 1984. (MR0763890)

³Rubin, H., Rubin, J. E. Equivalents of the axiom of choice. II. Studies in Logic and the Foundations of Mathematics, 116. North-Holland Publishing Co., Amsterdam, 1985. (MR0798475)

11 Responses to Basic AC, part 2: Choice in Algebra

Z Norwood says:

11/06/2011 at 15:14

Does anyone know how to get the stupid single quotes ‘——‘ to turn around the right way?

- JCummings says:
  
  11/08/2011 at 01:44
  
  I don’t know. But good post! I liked the end of the proof of Theorem 3. Both the pun and the use of Hall! :-)
  
  - Z Norwood says:
    
    11/08/2011 at 02:29
    
    Thanks!
    What pun?
    I thought you’d like the use of Hall. I like that proof a lot.
    
    - JCummings says:
      
      11/08/2011 at 02:37
      
      “The proof is complete, by the Compactness Theorem. =^]” Although I suppose you didn’t mention that the proof was totally bounded…
      
      Didn’t you mean that?
    - Z Norwood says:
      
      11/08/2011 at 02:39
      
      I didn’t! But it’s a decent pun. Good catch!
Nicki says:

11/09/2011 at 14:41

Zach, this is really cool. Also, today we talked about how to make a set into a ring. We talked briefly about 5 ways to do so (although I’m sure there are more ways). If I find some time this weekend I’ll have to ask you some more about them.

- JCummings says:
  
  11/09/2011 at 16:47
  
  Nicki! You should write a post on those 5 ways, I’d love to see them!
  
Pingback: Difficult undergrad proofs | whateversuitsyourboat
Pingback: Global choice and algebraic closures « Andy Soffer
Pingback: Digression: Hilbert Spaces – Part 1 | whateversuitsyourboat
Pingback: AC, part III: Choice in context | whateversuitsyourboat