A Comprehensive Solution to the Paradoxes
by Andrew Boucher
v1.00 Last updated: 1 Sept 2000
Please send your comments to abo AT andrewboucher.com

A solution to the paradoxes has two sides: the philosophical and the technical. The paradoxes are, first and foremost, a philosophical problem. A philosophical solution must pinpoint the exact step where the reasoning that leads to contradiction is fallacious, and then explain why it is so.

A technical solution is an axiomatization which avoids Russell's paradox or its equivalents. If one can pinpoint where the reasoning leading to paradox is fallacious, then one should know how to avoid it. Evidently, the philosophical solution must motivate and justify the technical.

(Why "equivalent"? For instance, if one uses predicates rather than sets in one's mathematics, then strictly speaking it is not Russell's Paradox which is of interest but the equivalent paradox, expressed in the language of predicates, which requires a technical solution.)

A comprehensive treatment should link the philosophical solution to the Liar with that to Russell's or its equivalent, and thereby tie together the philosophical solution to the Liar with the technical solution.

ENDLESS REFERENCE

Beginning not with a paradox but with a look at certain referring concepts will provide some useful ideas for what follows.

Referring concepts refer either successfully or unsuccessfully. "George Washington" refers successfully, to the man George Washington, while "the King of France" refers unsuccessfully (it has no referent), because there is no King of France. Unlike "the tallest man in the world" and "the King of France," some referring concepts have their referent determined by the reference of other referring concepts, e.g. A1 by A2 in the following:

[A1] "the referent of the referring concept A2"
[A2] "George Washington"
The referent of A1 is George Washington, because he is the referent of A2. Indeed, one has to know who the referent of A2 is before one can know who the referent of A1 is, and that is why we say A1's referent is determined by A2's. For instance, if 'A2' had labeled "Thomas Jefferson," then A1 would have referred to Thomas Jefferson. And if 'A2' had labeled "the King of France," then A1 would have referred unsuccessfully, since, just as "the King of France" supposes that there is a King of France, A1 supposes that A2 has a referent.

Generally, the construct "the referent of the referring concept A" will have its reference determined by A's referent.

The referent of a referring concept may be determined by the reference of several referring concepts:

[A3] "the referent of the next referring concept"
"the referent of the next referring concept"
"the referent of the next referring concept"
[A4] "George Washington."
A3 refers successfully, again to George Washington. Had 'A4' labeled "the King of France," it would not have. Also, it would not have had the sequence continued endlessly, as in
[A5] "the referent of the next referring concept"
"the referent of the next referring concept"
"the referent of the next referring concept"
"the referent of the next referring concept"
etc.
A5 does not refer successfully, because one never can find out who or what the referent of the ever next referring concept is.

Endless reference is a sub-class of unsuccessful reference. A sub-class of endless reference--and its most spectacular form--is viciousness, exemplified by

[A6] "the referent of this referring concept"
or
[A7] "the referent of the referring concept A7".
A6 is perhaps not the best example of viciousness, since it is debatable whether it is sensible to use "this referring concept." In any case the question is not crucial, since we are able to refer to concepts in other ways. For instance, the sub-referring concept of A7 refers to A7 by means of a label. And so A7 is clearly sensible, since after all had 'A7' labeled some other concept, "the referent of the referring concept A7" could well have referred successfully. But as it stands A7 is vicious, since its referent is determined by the referent it is supposed to have; and so it cannot refer at all. Metaphorically, it is saying that it points to whatever it shall point to; and so we have not been told what it points to.

It should be emphasized that the problem is not in self-reference. First, referring concepts may refer to themselves, e.g.

[A8] "the referring concept A8".
Secondly, A7 does not refer to itself. Rather, it is a sub-concept of A7 which refers to A7. A7 itself tries to use whatever its referent is, to determine what its referent will be, which is vicious, and so it does not refer at all. To state this in a general fashion, a referring concept is vicious in a particular context when its reference is determined in whole or in part by means of the reference it is supposed to have in that context. And these referring concepts cannot refer successfully, since there can only be the ability to say what the referent would be if we knew what the referent were, and never what the referent is.

A7 gives an example where reference is determined "in whole." An example where reference is determined "in part" is:

[A9] "the referents of the referring concept A9 and the men in this room,"
as it might appear in the statement, "The referents of the referring concept A9 and the men in this room are all bald." A9 does not refer successfully, because again we cannot know to whom it refers without knowing already to whom it refers.

As the reader may already understand, the examples of viciousness constructed here have nothing to do with the existence of presumed totalities. It may be recalled that the original Vicious Circle Principle, enunciated by Poincare and taken up by Russell and Whitehead in Principia Mathematica, was stated in terms of totalities, and even to this day viciousness is sometimes wrongly considered only in this framework. But viciousness is a general phenomenon which results from self-presumption--whether or not totalities are involved.

Because of this false concentration on totalities, sometimes heavy weather has been made of the question of whether or not a totality is vicious. It is rather easy to draw this distinction in the more general setting of referring concepts, so let us do so. A referring concept whose reference is determined by a group to which the referent belongs, is not vicious so long as which group this is, is not determined by the referent. For instance "the man with the highest field goal percentage on the Celtics" is not vicious, since one may first find the Celtics, and then find the man with the highest field goal percentage on that team. Viciousness is present only when the group is determined by means of the referent, e.g.

[A10] "the man with the highest field goal percentage on the team to which the referent of the referring concept A10 belongs."
In order to find the referent here, one must first know the team of which he is a member. But to find out the team, one must know the referent. And this is a vicious circle. Both the referring concept, and the presumed totality, are vicious.

Viciousness need not involve a sub-concept referring to the whole concept. Consider:

[A11] "the referent of the next referring concept"
[A12] "the referent of the previous referring concept."
The reference of A11 is determined by whatever the referent of A12 turns out to be. But the referent of A12 is determined by whatever the referent of A11 turns out to be. So neither refers successfully. If there had been instead
[A13] "the referent of the next referring concept"
"George Washington"
"the referent of the previous referring concept,"
all three referring concepts would have referred to George Washington.

Of course all this might be very interesting, but there is as yet no paradox. There is no contradiction in a referring concept not having its reference determined. It does not refer successfully, and so to speak that is that. Paradox must wait until the next section.

THE PARADOXES OF TRUTH

The Liar's Paradox is actually more or less a family of paradoxes, all resembling

[B1] "Not B1 is true".
The paradox is that, if B1 is not true, then the statement appears true; and if true, then not true.

Clearly the viciousness of B1 is striking. But it is not viciousness, but the wider category of endlessness which produces paradox. Consider:

[B2.1] "The next statement is not true"
[B2.2] "The next statement is not true"
[B2.3] "The next statement is not true"
etc.
By traditional reasoning, it can be inferred that B2.i is true if and only if B2.i+1 is not true if and only if B2.i+2 is true, for all i. In other words, there is an endless sequence of statements with alternating truth values. While not contradictory, nonetheless it is paradoxical in the sense of something counter-intuitive, because there apparently is no definite truth-value to bestow on any particular statement in the sequence. There seems no reason why B2.1 is true and B2.2 not true, or vice versa, since B2.2 seems to be in exactly the same position as B2.1, namely at the head of an endless sequence of statements "The next statement is not true". So, while the vicious cases are certainly interesting and always spectacular, there are non-vicious paradoxical cases.
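
The lack of a definite truth-value can be checked on a finite window of the sequence. The sketch below, in Python, is our own illustration and assumes the traditional reasoning just described: each statement is true exactly when the next is not. The function names are hypothetical, and the last statement in the window is left unconstrained because its successor lies outside the window.

```python
from itertools import product


def respects_traditional_reasoning(values):
    """values[i] is the truth value assigned to the (i+1)-th statement.

    Each statement says the next one is not true, so the traditional
    reasoning requires values[i] == (not values[i + 1]).
    """
    return all(values[i] == (not values[i + 1])
               for i in range(len(values) - 1))


def consistent_assignments(n):
    """All assignments to the first n statements that pass the check."""
    return [v for v in product([True, False], repeat=n)
            if respects_traditional_reasoning(v)]


solutions = consistent_assignments(6)
```

For a window of six statements exactly two assignments survive, one beginning with true and one beginning with not true, and nothing in the constraints prefers either.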

Identifying endlessness as a or even the problem, of course, is not a solution. It simply maps out where to find paradoxes. Providing a solution, again entails pinpointing, in the reasoning which produces a contradiction, the exact step that is fallacious.

A common proposed solution is the creation of some new truth value, say X. So, instead of two, there are three truth-values: true, false, and X. One then classifies "This statement is false" as X. Since X is disjoint from truth, "This statement is false" is not true. But because falsehood is no longer co-extensive with non-truth, we cannot conclude that "This statement is false" is false, and so that it is true. Thus there is no paradox. But all this is rather dishonest, because "This statement is false" is not the Liar. The Liar (often called the "strengthened" Liar out of politeness) is "This statement is not true" or, as we have written it to ensure that it is clear we are speaking of the full negation, "Not this statement is true." So, the introduction of a new truth value, and the division of non-truth into different categories, in and of itself, is of no use. It should be done only if one uses it to illuminate non-truthfulness in some special way, so that one can talk more cogently about it and the Liar, which again is "Not this statement is true." In brief, the fallacy in the Liar cannot be the assumption that there are only two truth values, because, even after dividing the category of non-truth into two (or more) subsets, there are still only two truth-values: truth and non-truth.

In order to pinpoint the fallacious reasoning, let us analyse the Liar argument in some detail:

Step 1. Suppose B1 is true.
Step 2. "Not B1 is true" and B1 are the same thing, so by substitution (of equals for equals), "Not B1 is true" is true.
Step 3. Using the rule "If "S" is true, then S", one concludes that not B1 is true.
Step 4. Steps 1 and 3 are contradictory, so by reductio ad absurdum, the supposition is not so, i.e. not B1 is true.
Step 5. From "If S, then "S" is true", it can be inferred that "Not B1 is true" is true.
Step 6. "Not B1 is true" and B1 are the same thing, so by substitution (of equals for equals), B1 is true.
Step 7. But this is a contradiction.

The logic in this argument appears impeccable. Substitution of equals for equals, even if not considered a part of logic, still is largely indisputable. So, to avoid the conclusion, one must reject one of the two rules used, either "If "S" is true, then S", or "If S, then "S" is true." Remark that both are immediate consequences of what we will call Tarski's Truth Principle, or TTP for short, which is a schema which says that

S if and only if "S" is true.
(An instance of TTP is, for example, that
Snow is white if and only if "Snow is white" is true.)

One of the two rules, "If "S" is true, then S," appears clearly sound. By process of elimination, the fallacy in the Liar argument must lie in the other direction. That is,

If S, then "S" is true
is an invalid rule, and there may be examples where S but not "S" is true. For instance, B1 is a counter-example: even though B1 is not true (S), "B1 is not true" is not true ("S" is not true). The B2.i are other counterexamples. And there are still others. Rejection of this half of TTP is the solution to the Liar. Step 5 is fallacious, because it depends on the half of TTP which is fallacious.

Indeed, one can even generalize the denial of TTP. Not only is it not correct, there cannot be any concept "plok" such that the schema

S if and only if "S" is plok
holds universally.

If rejection of TTP solves the Liar, there nonetheless remains the unresolved problem of why TTP is not universally so. Yet, we are clearly--and I say this only a bit facetiously--in a better strategic position. Before, an out-and-out contradiction stared us in the face, surely the most serious situation for any philosopher interested in consistency. Before, others felt a certain temptation to tamper with the laws of logic themselves. Now, all that is wanting is an explanation of what sometimes goes wrong with a certain principle, an important one clearly, but just a principle nonetheless. Since a priori, things do sometimes go wrong with principles, even the best and the most widely believed, we are clearly on stronger ground.

Nonetheless, fundamentally the problem is the same, to offer a convincing explanation. This cannot be immediate, since it seems that TTP follows directly from our meaning of 'true.' Indeed, TTP's cogency has convinced many to use it to classify every statement in natural language as either true or not true, and thereby resolve the Liar's Paradox. One proceeds as anyone with some mathematical training would, by recursion. First, one starts with the statements (of level 0) not involving assertions of truth, such as "Snow is white" and "Snow is black," and sorts these into the true and the not-true. Then one sorts those statements one level up, which assert truth or non-truth of a statement of level 0, such as " "Snow is white" is not true." These are the statements of level 1. Then one considers those statements one level higher, which predicate truth or non-truth of statements of level 1. And so on. Alas, the rub is with the "and so on," and efforts to explicate the concept of truth in this way have not proved spectacularly successful, the problem being that language does not easily resolve into a hierarchy, since some statements--such as the Law of the Excluded Middle--speak of all statements. Given that TTP was supposed to produce a complete categorization, its failure to do so suggests perhaps that something is amiss with TTP.

In symbolic logic it is well known that a sufficiently strong system cannot contain a predicate T whereby

T(*S*) if and only if S, for all closed statements S,
where *S* is the Gödel number of S. All we are saying is that what holds for formal languages also holds for natural languages. The interpretation, however, is slightly different. The result on formal languages is sometimes quoted as demonstrating that a truth predicate cannot exist in a formal language. Instead, rather than showing whether a truth predicate can or cannot exist, it simply proves one principle which the truth predicate, which does exist in natural language and which could exist in a formal language, does not have. So, from another corner, this time symbolic logic, there is a suggestion that something is amiss with TTP.

But clearly in an issue of this weight, the reader cannot simply be satisfied with suggestions. He wants convincing arguments. He must understand. So, still, there needs to be a cogent explanation why the concept of truth does not universally give rise to TTP.

Now 'true' is a word in natural language, so it should not be too surprising that several, similar, overlapping meanings have come to live under the same tent. We therefore set ourselves a two-fold objective: to clarify two of these meanings, and then, by applying each one consistently, to explain why TTP does not hold.

Firstly, "x is true" is often asserted, in order not to state or to repeat the statement referred to by x, which can be very long and complicated. Call this the abbreviational use of 'true'. It may seem that TTP immediately follows in all cases. For, if R abbreviates S, then surely R if and only if S. However, this supposes that in all cases the abbreviation has been correctly introduced. After all, an abbreviation requires a hierarchy, where ultimately the abbreviational term does not appear. If this does not happen, then the term is not well-defined.

For instance, if someone asserts that he will use 'S is plok' to abbreviate 'Not S is plok', then he is breaking the rules of good abbreviation. If it were ignored, the result would be an immediate contradiction, of the same nature as the Liar's Paradox. Indeed, an introduction of 'plok' by this definition, would not give it any meaning. So one could never affirm plok of anything. On the contrary, one would have to deny plok of everything. That is to say: Not S is plok, for any S. One cannot then assert "S is plok," because this move works only in the case of a legitimate abbreviation, which does not obtain.

Apply this abbreviational logic to "true". "S is true" is a good abbreviation when S does not contain "true," or contains a legitimate abbreviational use of "true". For instance, ""Snow is white" is true" is a good abbreviation of "Snow is white," so in this context "true" is used legitimately. However, the abbreviation in ""This statement is true" is true" is not correct. So one must assert "Not "This statement is true" is true."

Similarly, "true" in B1 is not a good abbreviation. Therefore, not B1 is true. Along the same lines, the B2.i (for all i) are not true. One cannot assert that

"Not B1 is true" is true,
because "true" is again an example of an illegitimate abbreviation. So:
Not "Not B1 is true" is true.
Since we have just said that
Not B1 is true,
we have both that
Not "S" is true
and
S.
That is, TTP falls down. But there is no incoherence, since TTP must hold only when the rules of abbreviation have been respected, and here they have not been.
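
The abbreviational reading can be put as a small procedure: to decide whether "x is true" may be asserted, keep unpacking the abbreviation until a statement without "true" is reached, and refuse if that never happens. The Python sketch below is our own and makes simplifying assumptions: statements are stored under labels in a finite table, each one either has a given value and contains no "true", or says of one labelled statement that it is true or not true. The labels other than B1 are hypothetical, and the endless B2.i cannot be stored in such a table.

```python
# Each entry is ("base", holds) for a statement without "true",
# or ("true", label) / ("not_true", label) for "label is (not) true".
statements = {
    "snow": ("base", True),        # "Snow is white"
    "S1": ("true", "snow"),        # ""Snow is white" is true"
    "S2": ("not_true", "S1"),      # "Not S1 is true"
    "T": ("true", "T"),            # "This statement is true"
    "B1": ("not_true", "B1"),      # "Not B1 is true"
}


def unpack(table, label):
    """Eliminate "true" step by step.

    Returns True or False when a statement without "true" is reached,
    and None when the abbreviation can never be eliminated.
    """
    negations = 0
    seen = set()
    while True:
        if label in seen:
            return None                  # "true" never becomes redundant
        seen.add(label)
        kind, payload = table[label]
        if kind == "base":
            return payload if negations % 2 == 0 else not payload
        if kind == "not_true":
            negations += 1
        label = payload


def is_true(table, label):
    """"label is true" is assertable only for a legitimate abbreviation."""
    return unpack(table, label) is True


# S: not B1 is true. "S" is the statement B1 itself.
s_holds = not is_true(statements, "B1")
quoted_s_is_true = is_true(statements, "B1")
```

Here s_holds comes out True while quoted_s_is_true comes out False, which is the failure of TTP described above; for S1 and S2, where the abbreviation is legitimate, the two directions agree.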

So uses of "true" can sometimes be instances of a bad abbreviation. Indeed, whether or not they are, can depend on contingent factors, which strikes many as bizarre and therefore an obstacle to this sort of explanation. For instance, if B1 had labeled "Snow is white," then "true" in "Not B1 is true" would have been an instance of a legitimate abbreviation, rather than the illegitimate instance it is. Of course, it is unusual (bizarre, if you will) for contingent factors to determine the legitimacy of abbreviations; but then it is usually mathematics which employs abbreviations, and mathematics is rarely if ever contingent. However, abbreviations other than "true" can have the same contingent nature. For instance, suppose we defined "plok" to mean whatever was on the back of a piece of paper. Whether "white" or "not plok" were written, a contingent matter, would determine the legitimacy of the abbreviation.

It is even possible to construct an example whereby whether a word is a legitimate abbreviation or not depends on its context. Define a "plok" to be what the first noun preceding it in a statement is, or if there is no such noun, what the first noun after it is, or otherwise, define it to be a chair. Thus, in "The computer is a plok," the first noun before "plok" is "computer," so it means "computer". Similarly, in "The plok is a computer," there is no noun before "plok", so it obtains its meaning from the first noun after it, which is "computer," so plok means "computer" again. But consider: "The plok is a plok." Now "plok" is a noun. So the first "plok" must mean what the second does, and the second must mean what the first does, and so neither receives a meaning. Perhaps this seems too much like a game; but that is exactly the nature of the abbreviational "true".

Cases of "All statements ... are true" and "All statements ... are not true" behave in the same manner. If all the statements are such that "true" can be made redundant, then applying "true" of "All statements ... are true" and "All statements ... are not true" will be an instance of a correct abbreviation, otherwise not. In particular, "All statements are true or not true" is not an instance where "true" can always be made redundant, so "All statements are true or not true" is not true. Still, of course, all statements are true or not true. And so the Law of the Excluded Middle is itself a counter-example to TTP.

Secondly, we have a meaning of 'true' which coincides with the Correspondence Theory of Truth, which says roughly that a statement is true because it makes an assertion about the world, which corresponds with the way the world really is. To add a bit of terminology, an assertion about the world is called a fact. For instance, "Snow is white" asserts the fact of snow being white. "Snow is white and Benjamin smiled" asserts the fact (one fact, according to the terminology we are using) of snow being white and Benjamin having smiled.

A statement is true when it asserts a fact, and this fact holds; or more concisely, a statement is true when the fact it asserts holds. Remark there are therefore two ways a statement may be non-true: the fact it asserts does not hold; or it does not assert a fact at all. In the first case the statement is said to be false; in the second that it is empty. So here we are admittedly introducing what others might call a third "truth-value." But again, in and of itself, this does nothing. It is only useful and justified if it will help to explain.

To understand when a statement may be empty, let us first see when it is not. There is a fact asserted by

[B4] "Thomas is playing with Paul,"
namely the fact that Thomas is playing with Paul. So B4 is not empty.

The fact asserted by

[B5] "The fact asserted by B4 holds."
is the fact that the fact asserted by B4 holds, i.e. the fact that the fact that Thomas is playing with Paul holds. So B5 is also not empty.

What is important to notice, is that the fact asserted by a statement containing "the fact asserted by S", depends on what fact S asserts. If B4 were to have labelled a different statement, then the fact asserted by B5 would have been different.

But this allows the construction of both endless and vicious examples involving facts. For instance, consider the endless

[B5a] "The fact asserted by the next statement holds"
"The fact asserted by the next statement holds"
etc.
The fact that each member of the sequence would assert depends on what fact the next statement asserts. Since this is never determined, none of the statements assert a fact. They are all empty.

Now consider,

[B6] "The fact asserted by B6 holds."
The fact that B6 asserts would depend on what fact B6 asserts. This is vicious, and so B6 does not assert a fact. It is also empty.

By what we said 'true' means (in its correspondence sense), "the fact asserted by p holds" is equivalent to "p is true." So, by the same reasoning used in categorizing B6,

[B7] "B7 is true"
is empty.

Similarly,

[B8] "The fact asserted by B8 does not hold"
and
[B9] "B9 is false"
are also both empty.

Finally,

[B10] "The fact asserted by B10 does not hold, or
B10 does not assert a fact"
is empty; it does not assert a fact. After all, the fact that it (B10) would assert is determined in part by the fact it asserts; and so there is no fact.

To repeat, B10 does not assert a fact. And so

[B11] the fact asserted by B10 does not hold, or
B10 does not assert a fact.
However, we cannot conclude that the fact asserted by B10 holds, because there is no fact. Again, B10 does not assert a fact.

What is happening is that one of the disjuncts of B10 does assert a fact, and indeed a fact which holds--namely that B10 does not assert a fact. But the entire statement does not assert a fact, because what fact it would assert depends on what facts both the first and second disjunct assert, and the fact that the first disjunct would assert depends on what fact is asserted by the entire statement. And so this is vicious. Thus even though the statement is a disjunction and the second disjunct is true, the entire statement is not true, because a statement asserting a fact disjuncted with an empty statement produces not a statement asserting a fact, but an empty statement. Emptiness is, in brief, contagious.

To write it another way,

[B12] "B12 is false or empty"
is empty, and so
[B12a] B12 is false or empty.
However, we cannot conclude that B12 is true, because B12 is empty, and only statements which assert a fact can be true. Indeed,
[B13] B12 is not true.

Thus, we have both S (B12a) and "S" is not true (B13).

Similarly, B1 is not true. Since B1 is just "B1 is not true", and since B1 is empty and so not true, we have both that S and that "S" is not true.

It may seem that we are not merely disputing TTP, but some fundamental laws of logic. To see that we are not, consider:

[B14] "B14 is not true or Helene has black hair".
Since Helene has black hair,
B14 is not true or Helene has black hair.
But any fact asserted by B14 would be determined in part by itself, so B14 does not assert a fact. I.e. it is empty and so not true. Logic still holds, because all that logic demands is, given that Helene has black hair, that B14 is not true or Helene has black hair. ("B" implies "A or B".) And this is so: B14 is not true or Helene has black hair. Where we disagree is the truth-table rule that if one disjunct is true, then the disjunction is true. (For, we have the counter-example of "B" true, but "A or B" not true.) The truth-table rule does not take into account the empty or paradoxical cases.

Of course, the truth-table rule would follow from the logical rule--"B" implies "A or B"--if TTP were universally so. But TTP is not, and so the truth-table rule cannot be inferred.

We could devise new truth-table rules, like:

P    Q    P or Q
T    T    T
T    F    T
F    T    T
F    F    F
T    E    E
F    E    E
E    T    E
E    F    E
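
The rows above amount to the rule that emptiness is contagious and that otherwise the disjunction behaves classically. A minimal Python sketch of that rule follows; the names are our own, and the case where both disjuncts are empty, which the table does not list, is assumed here to be empty as well.

```python
from enum import Enum


class Value(Enum):
    T = "true"
    F = "false"
    E = "empty"


def disjunction(p, q):
    """Value of "P or Q" when emptiness is contagious."""
    if p is Value.E or q is Value.E:
        return Value.E       # includes the unlisted E, E case (assumption)
    if p is Value.T or q is Value.T:
        return Value.T
    return Value.F
```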

But, it is better to resist this temptation. In the absence of TTP, a truth-table rule is not what is important. That is, it is not important that

"B14 is not true or Helene has black hair" is empty
because "B14 is not true" is empty. What is important is that
B14 is not true or Helene has black hair
because Helene has black hair.

Again, all statements are true or not true, but "All statements are true or not true" is not true.

Suffice it to say that no matter how one tries, one cannot define or find a meaning of 'true' so that TTP holds, if only because, as we have already pointed out, it is impossible to do so!

THE PARADOXES OF PREDICATION

A predicate is on the level of language, a property on the level of the world. For example, the concept "is red" is a predicate and is a part of our language, while things in the world have the property of redness. Because there seems to be a different relationship between referring concept and referent than between predicate and property, we say that the predicate "is red" gives (rather than refers to) the property of redness. While we will say a thing has a property, we say a thing satisfies a predicate.

Now consider the predicate

[C1] "Does not have oneself",
which predicates of properties. Reason into a paradox in the following way. Let P be the property given by C1. The property shortness has P, because shortness is not short (only material objects are short). On the other hand, the property of being a property does not have P. Does P have P? If it does, then it does not have itself; and if it does not, then it has itself; hence a paradox.

However, one has made an assumption in this argument, namely that C1 gives a property at all, or more generally that every one-place predicate gives a property. For if C1 does not give a property, then there is no P, and so no P to have or not to have P. In brief, by reductio ad absurdum, we must conclude that C1 does not give any property. And so the paradox disappears.

But, if not all one-place predicates give (unique) properties, then which do? Should we not give some sort of criteria to help us determine which do and which do not? Certainly this would be nice, but there is no reason to suppose that it can be done. Quite possibly there are no rules, based on our language, which determine whether a particular predicate gives or does not give a property. After all, there are no such rules which determine whether a thing exists ("unicorn" obeys our language rules just as well as "giraffe"). Indeed, it would be surprising if, by a mere fact of language, we could be assured that certain properties exist.

Still, while the paradox is admittedly stymied by saying that C1 does not give a property, this is not a complete explanation. After all, there certainly is a one-place predicate, and any assertion about properties can be reformulated in terms of predicates. For the person who wants to understand "predicate" when he hears 'property' (and "satisfies" when he hears 'has'), C1 certainly does give a "property" (itself!). So, if we are truly to provide a solution, we must also solve the cousin paradox which uses only the concepts of predicate and satisfaction.

Like referring concepts and facts, whether a thing satisfies a predicate may be determined by whether it satisfies another predicate. For instance, consider

[C2] "satisfies the predicate C3"
[C3] "is red"
Red things satisfy C2; but if C3 had been a different predicate, then they might not have. As with referring concepts, we can create endless chains, and vicious predicates, such as:
[C4] "satisfies the predicate C4"
C4 is vicious, since to determine whether a thing satisfies it, we must know whether the thing satisfies it. And thus we must conclude that nothing ever satisfies it.

Now consider:

[C5] "Does not satisfy the predicate C5".
If a thing does not satisfy the predicate C5, then it seems to satisfy it; and if it does satisfy it, then it seems not to. We have another paradox.

Hopefully this reminds the reader of the Liar's Paradox, and indeed our solution will follow the same lines. Call the schema

x satisfies the predicate "P" if and only if Px
Tarski's Satisfaction Principle, or TSP for short. For instance,
snow satisfies the predicate "is white" if and only if snow is white.
As with TTP, we assert that TSP cannot be universally correct, that C5 is a counter-example, and that there cannot be a concept "plok" such that
x ploks the predicate "P" if and only if Px.

We would explain this as we did with 'truth'. For instance, if we consider "satisfies" as an abbreviation, it is well-defined precisely in situations where it can be made redundant. In

"satisfies the predicate C5",
'satisfies' cannot be eliminated, so the predicate is vicious and nothing satisfies it. In particular,
C5 does not satisfy the predicate C5.
Because we do not universally accept TSP, we cannot infer that C5 satisfies C5, and so there is no contradiction.

Now consider

[C6] "does not satisfy oneself".
Clearly, C6 is vicious when applied to itself, i.e. C6 is vicious in "C6 satisfies C6". So C6 does not satisfy C6. On the other hand, "is not a predicate" satisfies C6. So viciousness may depend on the argument to which the predicate applies.
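
That viciousness depends on the argument can be seen by evaluating satisfaction step by step and noticing when a question is asked whose answer is needed to answer itself. The Python sketch below is our own illustration: predicates are modelled as objects carrying a rule, and the names Predicate, Vicious and satisfies, as well as the set of red things, are hypothetical. A vicious question is answered in the negative, following the conclusion above that nothing satisfies a predicate where it is vicious.

```python
class Vicious(Exception):
    """Raised when satisfaction would have to be determined by itself."""


class Predicate:
    def __init__(self, name, rule):
        self.name = name
        self.rule = rule      # rule(x, sat) -> bool, sat(p, y) asks further


def satisfies(pred, x, _active=None):
    """Does x satisfy pred? Vicious cases count as not satisfying."""
    top_level = _active is None
    active = set() if top_level else _active
    key = (id(pred), id(x))
    if key in active:
        raise Vicious(pred.name)           # the same question, again
    active.add(key)
    try:
        return pred.rule(x, lambda p, y: satisfies(p, y, active))
    except Vicious:
        if top_level:
            return False                   # nothing satisfies it here
        raise
    finally:
        active.discard(key)


RED_THINGS = {"tomato"}                    # illustrative assumption

C3 = Predicate("is red", lambda x, sat: x in RED_THINGS)
C2 = Predicate("satisfies the predicate C3", lambda x, sat: sat(C3, x))
C4 = Predicate("satisfies the predicate C4", lambda x, sat: sat(C4, x))
C6 = Predicate("does not satisfy oneself", lambda x, sat: not sat(x, x))
NOT_A_PREDICATE = Predicate(
    "is not a predicate", lambda x, sat: not isinstance(x, Predicate))
```

On this model a tomato satisfies C2, nothing satisfies C4, C6 does not satisfy C6, and "is not a predicate" does satisfy C6, since it is a predicate and so does not satisfy itself.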

Formally, the assertion

[C7] (x)(Sat(P,x) <=> Px),
where Sat(P,x) means "x satisfies P", is fallacious for certain P, for instance C5 or ~Sat(x,x).

Remark that here we are essentially using a logic which does not distinguish between predicate and non-predicate thing variables. That is,

(x)(Sat(P,x) <=> ~Sat(x,x))
is contradictory, because by substituting a big letter for a little, 'P' for 'x', one can infer that
Sat(P,P) <=> ~Sat(P,P).
This is normal, because there is no reason to exclude predicates from being themselves things (as is done in "normal" second-order logic). After all, one can refer to the predicate "is red" just as well as one can refer to snow.

Now rewrite 'Sat(P,x)' as 'P[x]'. Then C7 becomes

[C8] (x)(P[x] <=> Px).
Next introduce, for every predicate P, a term in the language '{x | Px}', which is to be read "the predicate of those x such that Px." Then C8 becomes
[C9] (x)({x | Px}[x] <=> Px).
Finally, instead of writing Px, let phi be any one-place predicate. Then C9 becomes
[C10] (x)({x | phi}[x] <=> phi).

C10 is just a manner of writing TSP, and so it is fallacious. For instance, phi as '~x[x]' hopefully reminds the reader of Russell's Paradox and leads to a contradiction.
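
The contradiction can be spelled out in a single step. In the sketch below, R is merely a label, introduced here for convenience, for the term {x | ~x[x]}; the instantiation is permitted because, as remarked above, predicates are themselves things.

```latex
\begin{align*}
&\forall x\,\bigl(\{x \mid \lnot x[x]\}[x] \leftrightarrow \lnot x[x]\bigr)
  && \text{C10 with } \varphi := \lnot x[x] \\
&R[R] \leftrightarrow \lnot R[R]
  && \text{instantiating } x := R,\ \text{where } R = \{x \mid \lnot x[x]\}
\end{align*}
```

The second line asserts that R satisfies itself exactly when it does not, which is the predicate form of Russell's Paradox.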

In order for C10 to hold, we must restrict phi to certain predicates, where all uses of "satisfies" can always be made redundant. A simple rule depending only on syntax is preferred, even if it is too restrictive, i.e. it not only forbids all phi for which C10 does not hold, but also some for which it does.

Now "satisfies" cannot in general be made redundant if there is a variable in the predicate place, on the left-hand side of "[". So the rule is: only constant symbols are allowed in this place.

For instance, the (universal) predicate {x | x = x} is such that

(x)({x | x = x}[x] <=> x = x).
On the other hand, not
(x)({x | ~x[x]}[x] <=> ~x[x]),
since the variable 'x' appears in the predicate place. (There is still, of course, a predicate {x | ~x[x]}. It is only that it fails TSP and, so to speak, is not a "well-behaved" predicate.)
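
Because the rule depends only on syntax, it can be checked mechanically. The following is a minimal sketch, assuming a made-up representation of formulas as nested tuples (the tags "var", "const", "app", "eq", "not", "or", "and" and the function name allowed_by_rule are all hypothetical). It assumes that terms are only variables or constants, and it checks nothing beyond the rule as stated: no variable in the predicate place.

```python
def allowed_by_rule(phi):
    """True if every predicate place in phi (the term before '[') holds a constant."""
    tag = phi[0]
    if tag == "app":                 # ("app", pred_term, arg_term) stands for pred[arg]
        pred_term = phi[1]
        return pred_term[0] == "const"
    if tag == "eq":                  # ("eq", t1, t2) stands for t1 = t2
        return True
    if tag == "not":                 # ("not", f) stands for ~f
        return allowed_by_rule(phi[1])
    if tag in ("or", "and"):         # binary connectives
        return allowed_by_rule(phi[1]) and allowed_by_rule(phi[2])
    raise ValueError(f"unknown formula tag: {tag!r}")


x = ("var", "x")
C5 = ("const", "C5")

print(allowed_by_rule(("eq", x, x)))             # True:  x = x
print(allowed_by_rule(("not", ("app", x, x))))   # False: ~x[x]
print(allowed_by_rule(("not", ("app", C5, x))))  # True:  ~C5[x]
```

The last example passes only because the check is purely syntactic: any constant in the predicate place is accepted, whatever predicate the constant happens to name.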

Remark that

(x)( {x | x[x]}[x] <=> x[x] )
even though, by our rule, we would not be able to assert it. For, the left- and right-hand sides will be vicious for the same arguments, and for such y neither {x | x[x]}[y] nor y[y], i.e. the "<=>" holds. As we said, our rule, in order to be simple, is too restrictive.

Of course, even certain constants in the predicate place cannot be made redundant, e.g. C5. For instance, not

(x)( {x | ~C5[x]}[x] <=> ~C5[x] ),
since after all {x | ~C5[x]} = C5. Also, not
(x)({x | ~C11[x] v x = x}[x] <=> ~C11[x] v x = x),
where C11 = {x | ~C11[x] v x = x}, since the left-hand-side is vicious, but the right-hand-side always holds. Of course, names for C5 or C11 do not exist in a mathematical system, but they are potentially there, since unbounded free variables in the predicate place (for instance, to define the union of two predicates) are needed, and the free variables are meant to represent any predicate, including those which are vicious.

In order to resolve this difficulty, one needs to distinguish between big and little letters. Big letters only represent "well-behaved" predicates, while little letters represent all things. Big letters may be substituted for little letters, since well-behaved predicates are things. However, little letters may not be substituted for big letters, since not all things are well-behaved predicates. This is different, then, from the absolute distinction made by "normal" second-order logic and is better grounded philosophically.

Indeed, if big letters were construed as all predicates (or all things), then the following axiom of arithmetic would no longer hold:

(n)(m)(P)(a) ( Mn,P & ~Pa & Mm,{x : Px v x = a} => sn,m ),
where "sn,m" means that m is a successor to n and "Mn,P" means that n numbers P. For if P and {x : Px v x = a} are vicious so that nothing satisfies them, then M0,P, ~P0, and M0,{x : Px v x = 0}. But obviously, 0 is not its own successor. So keeping Comprehension simple is not the only reason for big and little letters--they are also needed to ensure correct axioms for arithmetic.

RUSSELL'S PARADOX

A famous distinction by Ramsey divides the paradoxes into two kinds, the logical and the semantic. The former consist of contradictions which involve mathematical and logical notions, and so are supposed to indicate a fault in our mathematics and logic; the latter do not. Without making too much of it, the distinction is useful in the following way. In the semantic paradoxes we are not able to deny the existence of an object. The predicates "is true" and "satisfies" exist; we have them in language. We may deny certain principles concerning them which are supposed to hold, but nothing more. With the logical paradoxes, however, there is the assertion of a real-world object. For instance, for the Paradox of Properties to succeed, we must assert the existence of a property, something which is not part of language but "out there". And so it allows for a much quicker resolution, in the sense that we may simply deny the object's existence, as we in fact did.

Russell's Paradox is logical, since sets are supposed to be real-world things. When a set w is said to contain all sets x which do not belong to themselves, i.e.

x belongs to w <=> x does not belong to x,
then normally the assertion is that w exists independently of language. Denying w's existence therefore resolves the Paradox. It does not, yet, explain it. The mathematician, who uses sets, expects an explanation of when a set does exist (and when it does not).

Now the technical solution of the last section can be transposed to sets, by saying that a set exists if, in the defining predicate, a variable does not appear on the right-hand side of the "belongs to" sign. This would be an error, because it is not possible to translate the philosophical justification at the same time. For predicates, as has been already noted, are linguistic, and the rejection of TSP does not and cannot provide a justification why things-in-the-world--and so sets--do not exist.

But this idea can be expanded to advantage. Just as TSP has nothing to do with sets, neither should predicates. That is, whether or not there is a predicate should have nothing to do with whether a set exists, because sets are in-the-world. Of course we admit that there do exist sets, which predicates may characterize. But it is not the predicates which create them, and so it is wrong-headed to suppose that every, or almost every, arbitrary predicate defines a set. Just because "the set of the empty set" is grammatical, does not mean there is such a set. Indeed, "the set of all sets" is grammatical, but does not exist according to ZF. Given that sets are in the real world, independent of language, it would indeed be extraordinary if a mere fact of language could create them. To believe that we mentally "lasso" objects [in the words of Kripke reported by George Boolos in "The Iterative Conception of Set"] and so form a new object--which exists in the world--is not credible. At the least, it is quite an exceptional claim, and not one to be easily or quickly accepted.

Mind you, there are sets, in the sense that some things do consist of several things, such as an army, a club, a heap of sand, or the Solar System. A man belongs to an army or a club; a speck of sand belongs to a heap; the Earth belongs to the Solar System. But these natural sets are not those used in modern set theory, where a set is characterized by its elements. According to the Axiom of Extensionality, two sets are the same if and only if they have the same elements. But natural sets may have the same elements and yet still be unequal. For instance, two different clubs might have precisely the same members, but be distinguished by their constitutions, their meeting places, or their histories. Indeed, two sets might have the same members but be entirely different sorts of sets, for example when all the soldiers in an army belong to a club, to which no one else belongs.

Moreover, predicates do not create such sets, which exist in and of themselves.

In brief, the resolution of the set-theoretic paradoxes is simply denying the existence of sets as Cantor, Frege, and modern mathematics would have them. Of course I cannot prove that there is not the power set of the power set of the natural numbers, and a mathematician can still suppose such sets as ZF describes exist. It is (probably) consistent. The existence of unicorns is also consistent. That does not mean that there are unicorns. Maybe there are even angels dancing on the head of a pin. I don't know. Still, ZF populates the world with so many entities that it is scarcely believable.

The mathematician who is neither a formalist nor interested in making large ontological assumptions should use predicates, and adhere to the technical solution sketched above. A little reflection will reveal that, as it stands, he will not be in Cantor's paradise; for one thing, Cantor's theorem does not go through. Admittedly this never goes down well, and the mathematician will complain that he cannot "do" mathematics in a system so barren and of so little power. But of course he can. If there is any reason for a particular theorem or theory to hold, then it will be possible to prove it, by finding axioms which coincide with the reason. Only these axioms will be mathematical rather than logical in nature. But it is well known that logicism is dead, which forces the corollary that other axioms can and should be permitted in mathematics.

Most of all, the mathematician must forsake the idea that he must define structures rather than assume them (be it the natural numbers, the real numbers, or whatever else). Fortunately there are other good reasons, which may for instance be found in another posting ("A Foundation of Elementary Arithmetic"), to reject this define-at-any-price attitude.

In brief, mathematicians have lacked the will, not the capability. Admittedly, understanding the paradoxes is only a first step. But at least, it is a step forward.