/ Naive Validity, Internalization, and Substructural Approaches to Paradox

In this paper I investigate whether certain substructural theories are able to dodge paradox while at the same time containing what might be viewed as a naive validity predicate. To this end I introduce the requirement of internalization, roughly, that an adequate theory of validity should prove that its own metarules are validity-preserving. The main point of the paper is that substructural theories fail this requirement in various ways.

## 1. Introduction

It is fairly common to make a distinction between two views about what semantic theories should be like. One view—the so-called orthodox view—is that theories that purport to explain how the semantic concepts of a certain language work should be couched in a richer language containing expressions that stand for those semantic concepts. Usually, this is cashed out in terms of the distinction between an object language and a metalanguage. Alfred Tarski famously suggested that in order to explain how truth behaves in a certain language we need to (and should) ascend to an essentially richer metalanguage.[1] Moreover, he claimed that:

the condition of the ‘essential richness’ of the metalanguage proves to be, not only necessary, but also sufficient for the construction of a satisfactory definition of truth. (Tarski 1944: 352)

A relatively more recent view—occasionally called the naive view of semantic concepts—endorses the idea that a semantic theory should be couched in a language with enough resources to express its own semantic concepts. According to this view there is no need to look for a richer metalanguage or to make an artificial distinction between different language levels. The formal languages of our semantic theories should mimic ordinary languages, at least to the extent that in them we can talk about the semantic concepts that apply to the expressions of that same language. Following this line of thought, Kripke (1975: 715) suggested that,

On the basis of the fact that the goal of a universal language seems elusive, some have concluded that truth-gap approaches, or any approaches that attempt to come closer to natural language than does the orthodox approach, are fruitless. I hope that the fertility of the present approach, and its agreement with intuitions about natural language in a large number of instances, cast doubt upon such negative attitudes.

The idea seems attractive, but what does naiveté amount to? We say, for instance, that truth is a naive concept if for each sentence A, we can replace A by A is true and vice versa in every non-intensional context.[2][3] If our goal is to provide a formal theory of truth satisfying this property, the situation is more or less well-known. The main obstacle for the naive approach comes from the truth-theoretic paradoxes. It is possible to develop non-trivial naive truth theories but the underlying logic of these theories has to give up some classically valid principles. For instance, paraconsistent theories categorize paradoxical sentences as true contradictions but avoid triviality by rejecting the rule of explosion, a rule that allows us to infer any sentence from a contradiction. Paracomplete theories reject the law of excluded middle and thus endorse the idea that paradoxical sentences should be rejected together with their negations. There are numerous variations of paracomplete and paraconsistent theories in the literature, and it would take too much space to mention them all here. But we can say the following: the main idea with all these theories is to alter the behaviour of negation (and perhaps the conditional too[4]).

Although these theories usually contain a naive truth predicate—and, possibly other naive semantic predicates too—it is far more difficult to see how they could accommodate a naive validity predicate. By a naive validity predicate or a naive theory of validity I mean, roughly, a theory that is capable of truthfully talking about its own notion of validity. Unfortunately, in order to give an adequate characterization of validity most nonclassical theories (unwillingly or not) postulate a gap between the object language and the metalanguage.[5] The reason is that just as in the case of naive truth theories, naive theories of validity are risky: there are paradoxes affecting the notion of validity that need to be addressed (see Beall & Murzi 2013; I will present one such paradox below). One interesting aspect of these paradoxes is that they affect typical paracomplete and paraconsistent theories. These theories are defenseless against them because there is no need to use logical expressions to construct paradoxical sentences of this sort. Fortunately, there are other kinds of nonclassical theories that can accommodate a strong validity predicate without collapsing. These theories accomplish this by rejecting one of the structural properties usually attributed to the classical consequence relation.

However, although there is a wide consensus on what counts as a naive truth predicate, I think that the situation for validity is somewhat different. I will take a validity predicate for a theory $$\mathcal{S}$$ to be naive only if the following condition holds:

1. For every $$\mathcal{S}$$-valid argument from Γ to φ, $$\mathcal{S}$$ proves the statement expressing that the argument from Γ to φ is valid.

But isn’t this condition too strong? After all, in many contexts, it is a fact of life that some truths fail to be provably valid. I will discuss this issue near the end of Section 2; for now, I will assume as a working hypothesis that there is a notion of validity for which this requirement is perfectly reasonable. Moreover, what I will argue for below is that although this condition is necessary for the corresponding validity predicate to be naive, it is far from being sufficient. In particular, I will be maximally demanding in requiring that a naive validity predicate for a theory

$$\mathcal{S}$$ should capture one additional feature of $$\mathcal{S}$$. This feature has to do with the correct metarules[6] of $$\mathcal{S}$$:

2. S should prove internalized[7] versions of all its correct metarules.

This is especially important for substructural theories, which are generally characterized in terms of a collection of metarules.[8]

What I will do in the rest of the paper is to investigate whether certain substructural theories can contain what might be viewed as a naive concept of validity in the sense above. My main point will be that these theories fail the internalization requirement in various ways. Near the end of the paper I will briefly make one suggestion to avoid this difficulty.

The paper is structured as follows. The next section presents a paradox affecting the notion of validity and shows what structural principles are involved in it. In Section 3 I show how some substructural theories of validity fail to capture their own concept of validity. This occurs either because they fail to internalize some correct metarules or because they internalize some incorrect metarules. Section 4 provides two arguments that motivate the idea that internalization is an important requirement. Finally, section 5 contains some concluding remarks, points to some work that remains to be done, and hints at a potential solution to the difficulty I am considering.

## 2. Nonclassical Theories and the Validity Paradox

As I pointed out earlier, the familiar paracomplete and paraconsistent theories of truth are not adequately equipped to introduce a naive validity predicate. This has to do with the so-called validity paradox or v-Curry paradox. By now it is familiar how to generate a paradox of this sort. I will employ a language that contains names for its own sentences and I will assume self-referential expressions are available in the language. As usual, I use $$\langle\rangle$$ as a name forming device and I take $$\langle\phi\rangle$$ to be the name of the sentence φ. The general framework I will employ relies on sequents. A sequent is an ordered pair of multisets of formulas, represented as $$\Gamma\Rightarrow\Delta$$.[9]

A naive two-place validity predicate Val(x, y) for a theory[10] $$\mathcal{S}$$ is normally taken to be a predicate that represents accurately whatever the theory $$\mathcal{S}$$ declares as valid. One way to obtain such a predicate is by means of the following pair of metarules:

Roughly, L$$Val^{*}$$ says that Val is detachable: φ and $$Val(\langle\phi\rangle, \langle\psi\rangle)$$ imply ψ. R$$Val^{*}$$ asserts that if φ implies ψ, then the claim that the argument with premise φ and conclusion ψ is valid is provable. For reasons that will become clear in the next section, these two metarules fall short of characterizing a naive validity predicate. But they are sufficient to cause trouble.

Most theories couched in a language capable of expressing self-referential sentences and satisfying L$$Val^{*}$$ and R$$Val^{*}$$ are trivial. In fact, all we need besides these two metarules are the following two structural metarules[11]:

[12]

In such theories, we can construct a sentence saying of itself that it implies something absurd. Let be an absurd sentence and let π be the sentence Val($$\langle\pi\rangle,\langle\bot\rangle$$), i.e., a sentence saying of itself that it implies . Then we can reason in the following way:

What makes this paradox different from, say, the Curry paradox or the liar paradox, is that—at least in the version I have presented—no use is made of operational metarules. This means that operational approaches are defenseless against it and so our options to revise logic to avoid it are reduced.

More precisely, to accommodate a validity predicate satisfying the metarules above we have to modify the basic structural properties normally attributed to our consequence relations. Some proposals of this kind embrace a nontransitive consequence relation: Ripley (2015b; 2012; 2013), Weir (2013), Cobreros, Egré, Ripley, and van Rooij (2013; 2012), and Tennant (2015), just to name a few; and others support a noncontractive consequence relation, for example, Mares and Paoli (2014), Zardini (2011; 2014), Shapiro (2015), Weber (2014), and Caret and Weber (2015).[13]

Cut implements the idea that the consequence relation is transitive, so rejecting this metarule amounts to endorsing a nontransitive logic. In such a logic there are sentences φ, ψ, and χ such that ψ follows from φ, χ follows from ψ, but χ does not follow from φ. Some of the authors that subscribe to this approach suggest that its fundamental virtue (at least in one of its versions) is that every classically valid inference is preserved, even in the presence of paradoxical sentences. The reason why this is so is that Cut is admissible in many presentations of classical logic. This means that no derivation of a classically valid inference requires the use of Cut. When the truth predicate (or the validity predicate) is added, this is no longer so. However, all that is lost are those metarules that essentially involve paradoxical sentences. For all the others, Cut is dispensable.

Noncontractive theories exploit the fact that in some contexts it is important to make a distinction between two occurrences of a sentence and just one occurrence of it. Some of them—like Mares and Paoli (2014)—use the idea that sentences should be understood as pieces of information and that information can be accumulated in a way that causes a failure of Contraction. Others—like Zardini (2011)—use the idea that certain sentences express unstable states of affairs that are incompatible with Contraction. Still others—like Caret and Weber (2015)—claim that Contraction fails for certain sentences that express more than one proposition simultaneously.

It is not my goal here to develop these theories in detail or to explain how they provide a philosophically well-motivated account of the failure of Cut or Contraction. What is important for my purposes is whether they can accommodate a validity predicate like the one described above. However, before turning to this issue, I will say a few words on the v-Curry paradox and on the plausibility of the validity metarules I have presented.

The first thing I should note is that it is a controversial matter whether v-Curry is a genuine paradox. On the one hand, it has been pointed out by Cook (2014) and Ketland (2012) that if Val is understood as purely logical validity, then it can be represented within any sufficiently strong arithmetical theory, which means that there is no paradox for Val. In a nutshell, R$$Val^{*}$$ will fail if Val means logically valid because we should not claim that the argument from φ to ψ is logically valid in virtue of the fact that there is a proof of the sequent φ ⇒ ψ in some non-logical theory. On the other hand, if Val is understood as arithmetical provability and we assume that our background theory extends (say, Robinson’s 1950) arithmetic and is consistent, then it is L$$Val^{*}$$ that fails, due to Gödel’s second incompleteness theorem. Hence, in this context validity is to be understood more broadly, as whatever consequence relation is being characterized by the theory under consideration, and so Val is taken to be simply an object language rendering of the concept that stands for. For a more detailed discussion of this issue, see Murzi and Shapiro (2015).

But even if validity is understood in this way, it is not obvious that both L$$Val^{*}$$ and R$$Val^{*}$$ should hold (a fortiori there might be similar doubts for the strengthened versions of these metarules that I will consider in the next section). This is a worry that has been raised by Field (2016)[14] and that, as far as I know, has not been addressed in the literature. For that reason, in what remains of this section I will roughly indicate why I do not take his objections to fundamentally undercut the interest of the concept of validity as I am understanding it.

Regarding L$$Val^{*}$$ Field claims that the first part of the v-Curry paradox—which only uses L$$Val^{*}$$ plus LW—leads to π ⇒ ⊥. But, if Val is simply a way of expressing in the object language, this is nothing more than π. Therefore, we know that π is true and that π ⇒ ⊥, that is, we know that π is true and that it implies , and this seems highly implausible. So, perhaps, L$$Val^{*}$$ should be rejected.[15]

But this argument against L$$Val^{*}$$ does not seem to affect the substructural theories we are going to consider later on. The noncontractive theorist rejects LW and so will not obtain π ⇒ ⊥, but only π, Val($$\langle\pi\rangle,\langle\bot\rangle$$) ⇒ ⊥, which is unproblematic. As for the nontransitive theorist, while it is true that she will accept both that π is true and that π ⇒ ⊥, this is precisely the sort of thing that might be expected in this approach. Triviality is avoided by rejecting the claim that valid arguments are truth-preserving, and although it is puzzling at first glance to say that π is true and that it implies , it does not follow that is true, since implications might fail to preserve truth, due to the failure of transitivity.[16]

Moving on to R$$Val^{*}$$, the situation seems to be much more subtle. Let $$\mathcal{S}$$ be a certain theory; $$\mathcal{S}$$ need not be a purely logical theory, it might contain, for example, a truth predicate with its corresponding rules and other non-logical pieces of vocabulary. The details are not important for what follows. To say that the predicate Val is simply an object language rendering of $$\mathcal{S}$$’s consequence relation only justifies, according to Field, the following principle: if Val($$\langle\phi\rangle, \langle\psi\rangle$$), then it is true that Val($$\langle\phi\rangle, \langle\psi\rangle$$), which, as the reader might have noticed, is nothing more than an instance of the T-schema. However, R$$Val^{*}$$ seems to endorse something stronger, namely that if $$Val(\langle\phi\rangle, \langle\psi\rangle)$$, then $$Val(\langle\top\rangle, \langle Val(\langle\phi\rangle, \langle\psi\rangle)\rangle)$$. The problem then is that there might be true claims of the form Val($$\langle\phi\rangle, \langle\psi\rangle$$) that are not valid according to $$\mathcal{S}$$. While Val($$\langle\phi\rangle, \langle\psi\rangle$$) implies that Val($$\langle\phi\rangle, \langle\psi\rangle$$) is true, something more needs to be said in order to justify that it is also $$\mathcal{S}$$-valid. So, perhaps, R$$Val^{*}$$is too strong.

However, it is far from obvious how this could happen. As long as we are understanding Val as an object language predicate that mirrors $$\mathcal{S}$$’s consequence relation, it is not entirely clear to me how a sentence of the form Val($$\langle\phi\rangle, \langle\psi\rangle$$) could be true while failing to be $$\mathcal{S}$$-valid. After all, the paradigmatic cases of true but invalid sentences are things like ‘snow is white’ and ‘grass is green’. The truth of these sentences depends partly on the way things are in the world. But for sentences of the form Val($$\langle\phi\rangle, \langle\psi\rangle$$), the situation is different. Under our current reading of Val, to find out whether a certain sentence is valid according to $$\mathcal{S}$$ we need to make sure that the truth of the sentence is not something that goes beyond what is given by the metarules of $$\mathcal{S}$$, that is, we need to determine whether the truth of the sentence can be established purely by the metarules of $$\mathcal{S}$$. But, whatever φ and ψ are, the truth of Val($$\langle\phi\rangle, \langle\psi\rangle$$) can only depend on there being a proof in $$\mathcal{S}$$ of φ ⇒ ψ, which in turn only depends on whether the metarules of $$\mathcal{S}$$ are enough to deliver this sequent. Therefore, it seems correct to say that for sentences of the form Val($$\langle\phi\rangle, \langle\psi\rangle$$) we are justified in believing that they are $$\mathcal{S}$$-valid, if true.

One final worry is that there might be other types of sentences that are true but $$\mathcal{S}$$-invalid. Isn’t the Gödel sentence—in our case, $$\mathcal{S}$$’s Gödel sentence[17] —precisely the sort of formula that is frequently regarded as true (assuming some mild form of realism) but invalid? Maybe so, but clearly Gödel’s sentence cannot be used to argue against R$$Val^{*}$$. For something to be a counterexample to R$$Val^{*}$$ it would need to be valid according to $$\mathcal{S}$$ —which is not the case for $$\mathcal{S}$$’s Gödel’s sentence—but its validity would have to depend on something other than the metarules of $$\mathcal{S}$$, at least given the way in which I am understanding validity. But I see no way for this to occur.

In sum, I think that if we understand the predicate Val as nothing more and nothing less than our metalinguistic , there is a case to be made in favor of the view that both L$$Val^{*}$$ and R$$Val^{*}$$ hold. In fact, as I will show in the next section, if condition 2 above is to be satisfied, we need metarules that are even stronger than L$$Val^{*}$$ and R$$Val^{*}$$[18].

## 3. Substructural Logics and Internalization

In this section I will present sequent calculi for various propositional substructural theories. My goal is to consider whether these theories are capable of internalizing—in a sense to be made precise shortly—their metarules using a validity predicate together with the rest of their logical machinery. It seems convenient to start with a sequent calculus for classical propositional logic.[19] We will obtain substructural logics by removing structural metarules from it.

Definition (Classical propositional logic SC) Let Γ, ∆, Π, and Σ be (finite) multisets[20] of formulas, let φ and ψ be formulas. The system SC is given by the following initial sequents and metarules:

The system SC provides a redundant presentation of classical logic. The reason for having two conjunctions (the so-called additive $$\sqcap$$ and multiplicative ) and two disjunctions (the additive $$\sqcup$$ and the multiplicative ) is that later on I will consider substructural logics in which this difference is important. Without Contraction or Weakening the additive junctions and the multiplicative junctions do not satisfy the same principles.

Another thing worth remarking is that the metarules for negation and conjunction (or negation and disjunction) are sufficient to express other logical constants. The usual interdefinability results between them hold, so we do not need to treat all expressions as primitives. For example, I have omitted the metarules for the material conditional.[21]

Since we want our concept of validity to be stronger than the one in the previous section, the validity metarules need to be adapted accordingly. The presentation of the metarules for Val[22] partially follows Zardini (2014):

I call the resulting system $$\mathcal{S}_{C}^{V}$$ (I use the ‘V’ symbol to acknowledge that the validity metarules are present).[24] It seems clear that in this system we can express and prove facts about validity. But more importantly, the presence of a validity predicate in the object language makes it possible to express and prove facts about metarules as well.[25]

The general issue that I want to address is if certain theories that can be obtained from $$\mathcal{S}_{C}^{V}$$ by removing structural metarules are capable of proving internalized versions of all their valid rules and correct metarules. As for the rules, it is clear that every provable sequent $$\Gamma\Rightarrow\Delta$$ will be provably valid in whatever theory we choose, since by RVal we can always obtain $$\Rightarrow Val(\langle\Gamma\rangle, \langle\Delta\rangle)$$ from $$\Gamma\Rightarrow\Delta$$.

What about the metarules? Are these theories capable of internalizing their own metarules? The answer will depend on the specific properties of the theory under consideration. To explain why, I will be a bit more precise now.

Definition We say that a theory S internalizes a metarule $$\mathcal{R}$$ of the form

if $$\mathcal{S}$$ proves every instance of

The idea behind this definition should be clear enough. If a certain metarule $$\mathcal{R}$$ holds according to a theory $$\mathcal{S}$$, then $$\mathcal{S}$$ internalizes $$\mathcal{R}$$ if it is provable in $$\mathcal{S}$$ that the premise-sequents of $$\mathcal{R}$$ (expressed as validity claims) imply its conclusion-sequent (also expressed as a validity claim). Or, a bit more roughly, if a certain metarule $$\mathcal{R}$$ holds according to a theory $$\mathcal{S}$$, then $$\mathcal{S}$$ should prove that this metarule is validity-preserving.[26]

Before we turn to the application of this definition, there are a few potential sources of ambiguity in what it means to ‘internalize validity’ in a substructural setting that we should address at this point. The first is that the metarules for Val seem to have a multiplicative form. So isn’t there some other validity predicate that can be characterized by means of additive metarules? Indeed, we can introduce the following metarules instead of (or in addition to) RVal and LVal:

Although this is technically possible, from a conceptual standpoint I wouldn’t know what to make of the distinction between additive and multiplicative vocabulary for validity. Does it mean that there are two different notions of validity, one additive and one multiplicative, in the same way as there are two conjunctions and two disjunctions? If so, we would have more than one notion of internalization, depending on whether we use additive or multiplicative validity. I leave these matters here, not only because they seem to unnecessarily complicate things a lot, but also because the validity metarules are always presented multiplicatively in the literature. And in fact, this is so for a good reason. Some validities are not provably valid if we employ the additive RVal′.[27]

A second issue has to do with the fact that the form of internalization I am requiring is a weak one. As it stands, internalization only asks for the theory $$\mathcal{S}$$ to have a derivation of (every instance of) $$Val(\langle\Gamma_{1}\rangle, \langle\Delta_{1}\rangle),...,Val(\langle\Gamma_{n}\rangle, \langle\Delta_{n}\rangle)\Rightarrow Val(\langle\Gamma\rangle, \langle\Delta\rangle)$$ if the corresponding metarule holds. But perhaps what is wanted is an object language sentence expressing that metarules are validity-preserving. That is, a derivation of $$\Rightarrow Val(\langle\Gamma_{1}\rangle, \langle\Delta_{1}\rangle) \wedge...\wedge Val(\langle\Gamma_{n}\rangle, \langle\Delta_{n}\rangle)\rightarrow Val(\langle\Gamma\rangle, \langle\Delta\rangle)$$, for some suitable conjunction $$\wedge$$ and some suitable conditional $$\rightarrow$$.

I think this is unproblematic. In the theories I will be considering, there are ways of expressing claims like these using the connectives available in the language. In particular, the deduction theorem holds for the multiplicative connectives, so if there is a derivation of $$Val(\langle\Gamma_{1}\rangle, \langle\Delta_{1}\rangle),..., Val(\langle\Gamma_{n}\rangle, \langle\Delta_{n}\rangle)\Rightarrow Val(\langle\Gamma\rangle, \langle\Delta\rangle)$$ there will also be a derivation of $$\Rightarrow Val(\langle\Gamma_{1}\rangle, \langle\Delta_{1}\rangle) \otimes...\otimes Val(\langle\Gamma_{n}\rangle, \langle\Delta_{n}\rangle)\multimap Val(\langle\Gamma\rangle, \langle\Delta\rangle)$$

A third and final issue I will mention is that, due to the fact that in substructural logics there is a splitting of the notion of consequence into an internal and an external component (see for instance Mares & Paoli 2014; Avron 1988; Barrio, Rosenblatt, & Tajer 2015), perhaps there should be a corresponding distinction in the definition of internalization. More precisely, since there are two notions of consequence, there should be two validity predicates, one to capture each notion. After all, it is one thing to declare a rule valid and another to say that a metarule holds.[28]

Although this suggestion is interesting and might be worth pursuing, we can do without the two predicates. Notice that a metarule is normally understood as a conditional claim to the effect that if certain arguments are valid, then another argument is valid as well. The notion of validity involved appears to be the usual one and, as far as I see, it is fairly plausible to assume that the conditional used to state what the metarule expresses can be captured via the validity predicate (or one of the conditionals) of the corresponding theory. So, in other words, the concept of external validity can be defined in terms of the concept of internal validity in the following way. A metarule with premise $$\Gamma\Rightarrow\Delta$$ and conclusion $$\Pi\Rightarrow\Sigma$$ holds if and only if the sentence $$Val(\langle Val(\langle\Gamma\rangle, \langle\Delta\rangle)\rangle, \langle Val(\langle\Pi\rangle, \langle\Sigma\rangle)\rangle)$$ has a derivation. And, naturally, this can be generalized to metarules with any (finite) number of premises.[29]

Now that I have considered these potential sources of equivocation, let us return to the official characterization of internalization. Our theories should be able to internalize all (and only) those metarules that hold according to their own standards of validity.[30] For instance, a theory where R$$\sqcap$$ holds but L$$\sqcap$$ fails should prove $$Val(\langle\phi\rangle, \langle\psi\rangle), Val(\langle\phi\rangle, \langle\chi\rangle)\Rightarrow Val(\langle\phi\rangle, \langle\psi\sqcap\chi\rangle)$$ but it should not prove $$Val(\langle\phi\rangle, \langle\chi\rangle)\Rightarrow Val(\langle\phi\sqcap\psi\rangle, \langle\chi\rangle)$$ or $$Val(\langle\psi\rangle, \langle\chi\rangle)\Rightarrow Val(\langle\phi\sqcap\psi\rangle, \langle\chi\rangle)$$.

As I remarked at the outset, the sort of theories I am interested in are substructural. Although there are many logics of this kind, I will focus on the simplest ones. If we remove both the Contraction and the Weakening metarules from SC, we obtain what is known as linear logic, which I will call SL. If we remove just the Contraction metarules, we obtain affine logic, which I will call $$\mathcal{S}_{A}^{V}$$.[31] These two logics have a number of nice properties. For my purposes, the one property that matters is Cut admissibility. In both of them, if a sequent has a derivation where Cut is used, it has a derivation where Cut is not used. For proofs of these facts see Paoli (2002). Moreover, it can be proved that Cut is admissible in linear and affine logic extended with LVal and RVal, i.e., in $$\mathcal{S}_{L}^{V}$$ and $$\mathcal{S}_{A}^{V}$$. See Zardini (2014) and Zardini (2011).

Before presenting the results of this section, let me add some terminology. I will say that a metarule is basic in a theory $$\mathcal{S}$$ if it is explicitly mentioned in the definition of $$\mathcal{S}$$, I will say that a metarule is derivable in $$\mathcal{S}$$ if there is a derivation of the conclusion-sequent of the metarule from the premise-sequent(s) of the metarule in $$\mathcal{S}$$ and I will say that a metarule is admissible in $$\mathcal{S}$$ if its conclusion-sequent is provable whenever its premise-sequent(s) is (are) provable. Clearly, all basic metarules are derivable and all derivable metarules are admissible, but the converses might fail.

Now we have all we need to state our first result.

Proposition 3.1 Linear logic $$\mathcal{S}_{L}^{V}$$ cannot internalize some of its basic metarules.

Proof sketch Let me just illustrate why this is so. The metarules of Weakening are crucial to internalize some of the metarules for the additive connectives. As an example, consider the following derivation which attempts to show that (a simplified version of) $$R\sqcap$$ can be internalized:

Of course, the reader might be skeptical as to whether there is no other way of internalizng $$R\sqcap$$ in linear logic. But there is no other way. Consider the following instance of $$R\sqcap$$: $$Val(\langle p\rangle, \langle q\rangle), Val(\langle p\rangle, \langle r\rangle)\Rightarrow Val(\langle p\rangle, \langle q\sqcap r\rangle)$$. In the system in question Cut is admissible, so we can do a root-first proof search. Hence, for our target sequent there will be finitely many options and—as the reader can check for herself—none of them provides a derivation of it.                                                                                                                                                         █

The linear logician can escape this incompleteness result in two different ways. First, she might say that additive metarules should be internalized additively (that is, with $$\sqcap$$) and that multiplicative metarules should be internalized multiplicatively (that is, with or the comma).[32] More precisely, the modified definition would require that if a certain metarule $$\mathcal{R}$$ is additive, then the theory $$\mathcal{S}$$ should prove every instance of $$Val(\langle\Gamma_{1}\rangle, \langle\Delta_{1}\rangle) \sqcap...\sqcap Val(\langle\Gamma_{n}\rangle, \langle\Delta_{n}\rangle)\Rightarrow Val(\langle\Gamma\rangle, \langle\Delta\rangle)$$. And in case the metarule $$\mathcal{R}$$ is multiplicative, the theory $$\mathcal{S}$$ should prove every instance of $$Val(\langle\Gamma_{1}\rangle, \langle\Delta_{1}\rangle) \otimes...\otimes Val(\langle\Gamma_{n}\rangle, \langle\Delta_{n}\rangle)\Rightarrow Val(\langle\Gamma\rangle, \langle\Delta\rangle)$$. In this way one might hope that it would be possible to internalize every nonstructural metarule without employing Weakening or Contraction. And in fact this is indeed possible if we use the metarule 5 (of Footnote 26) instead of the weaker RVal. This would solve the problem I raised for linear logic, because although we cannot multiplicatively internalize R$$\sqcap$$, we can additively internalize it without applying Weakening, as the following derivation shows:

Now, although I do not have much to complain about this approach, there are a few little wrinkles that remain. First, the strategy does not work for RVal. If only RVal is available, the derivation of Val($$\langle\phi\rangle, \langle\psi\rangle)\sqcap Val(\langle\phi\rangle, \langle\chi\rangle)\Rightarrow Val(\langle\phi\rangle, \langle\psi\sqcap\chi\rangle$$) can only be carried out if we are allowed to apply Contraction. So we must use the metarule 5 instead. But, as I have already pointed out, with LVal and 5 Cut admissibility is gone.

The second issue is that the alternative definition of internalization involves specific connectives—the additive conjunction and the multiplicative conjunction—while our official definition is connective-free. And it is a bit odd to force the definition to depend on the occurrence of a particular piece of vocabulary (other than Val of course) in the language. One potential problem with this is that the definition cannot be applied to theories formulated in a language not containing that connective. For example, the theory in Zardini (2011) only contains multiplicative connectives, and dually, the theory advocated by Shapiro (2015) only contains additive junctions. For this reason, I prefer the official definition.[33]

The third issue I want to mention is of a more conceptual nature. Having two ways of internalizing metarules amounts to admitting that there are two ways of bunching premises of metarules together. And that seems at least odd, given that when these metarules are presented, the premise-sequents can only be bunched together in one way. Perhaps this idea can be developed if we employ a nonclassical metatheory where we have two explicitly different ways of bunching premises of metarules together. Although pursuing this option is beyond the scope of the paper, I will briefly address this possibility in the final section.

The second way the linear logician has of escaping the incompleteness result is by admitting Weakening for formulas of the form $$Val(\langle\Phi\rangle, \langle\Psi\rangle)$$:[34]

But this idea is problematic for at least two reasons. First, if we are allowed to weaken in this way, the notion of consequence will no longer be relevant in any interesting sense, which is, I take it, one of the main motivations to embrace linear logic. Even though ‘snow is white’ relevantly implies ‘snow is white’, it doesn’t seem to be the case that ‘snow is white’ together with ‘ ‘either grass is green or it is not’ is valid’ relevantly imply ‘snow is white’.

Secondly, if we admit Weakening for formulas of the form $$Val(\langle\Phi\rangle, \langle\Psi\rangle)$$, it seems to be congenial to the spirit of the linear logician's account to also admit Contraction for this type of formulas. That is, assuming something along the lines of the resource interpretation of linear logic, a plausible condition to impose is that a sentence can be ‘weakened on’ if and only if it can be ‘contracted on’. However, although it is unproblematic to admit instances of Weakening of this kind, the same cannot be said of Contraction. If validity claims contract, paradoxes show their ugly face once more.

Before turning to $$\mathcal{S}_{A}^{V}$$, I will introduce one more substructural logic. This is the logic that can be obtained from $$\mathcal{S}_{C}^{V}$$ by eliminating Cut. This gives a nontransitive logic which sometimes goes by the name of ST but that I will call $$\mathcal{S}_{NT}$$ following Ripley (2015b). Both $$\mathcal{S}^{V}_{A}$$ and $$\mathcal{S}^{V}_{NT}$$ (i.e., $$\mathcal{S}_{NT}$$ plus a validity predicate obeying LVal and RVal) enjoy the following property:

Proposition 3.2 (See Barrio et al. 2016 and also Ripley 2017) $$\mathcal{S}^{V}_{A}$$ and $$\mathcal{S}^{V}_{NT}$$ internalize their derivable metarules. That is, if $$\Pi\Rightarrow\Sigma$$ follows from $$\Gamma_{1}\Rightarrow\Delta_{1}$$...$$\Gamma_{n}\Rightarrow\Delta_{n}$$, then $$Val(\langle\Gamma_{1}\rangle, \langle\Delta_{1}\rangle),..., Val(\langle\Gamma_{n}\rangle, \langle\Delta_{n}\rangle)\Rightarrow Val(\langle\Pi\rangle, \langle\Sigma\rangle)$$ has a derivation.

Proof First we assume that the following metarule $$\mathcal{R}$$ is derivable:

That is, there is a derivation of the conclusion-sequent $$\Pi\Rightarrow\Sigma$$ from the premise-sequents $$\Gamma_{1}\Rightarrow\Delta_{1}$$... $$\Gamma_{n}\Rightarrow\Delta_{n}$$ in $$\mathcal{S}_{A}^{V}$$ ($$\mathcal{S}^{V}_{NT}$$). Take that derivation and add to each node the sentences $$Val(\langle\Gamma_{1}\rangle, \langle\Delta_{1}\rangle),..., Val(\langle\Gamma_{n}\rangle, \langle\Delta_{n}\rangle)$$ on the left of the arrow. The resulting object is a derivation in $$\mathcal{S}_{A}^{V}$$ ($$\mathcal{S}^{V}_{NT}$$) of the sequent

from the $$\sqcap$$ sequents

The crucial property I am relying on is that adding validity claims on the left does not make any of the steps in the derivation incorrect. This means that the following metarule $$\mathcal{R}'$$ is also derivable:

Now we reason as follows. For each γi Γi and δii we start with γi ⇒ γi and δi ⇒ δi. Then we proceed by applying LVal to obtain for each i:

Val($$\langle\Gamma_{i}\rangle, \langle\Delta_{i}\rangle$$), $$\Gamma_{i}\Rightarrow\Delta_{i}$$.

From this we obtain again for each i

Val($$\langle\Gamma_{1}\rangle, \langle\Delta_{1}\rangle),..., Val(\langle\Gamma_{n}\rangle, \langle\Delta_{n}\rangle), \Gamma_{i}\Rightarrow\Delta_{i}$$

by $$\sqcap$$-1 applications of Weakening. In virtue of $$\mathcal{R}'$$ we infer

Val($$\langle\Gamma_{1}\rangle, \langle\Delta_{1}\rangle),..., Val(\langle\Gamma_{n}\rangle, \langle\Delta_{n}\rangle), \Pi\Rightarrow\Sigma$$.

Then by RVal we get

Val($$\langle\Gamma_{1}\rangle, \langle\Delta_{1}\rangle),..., Val(\langle\Gamma_{n}\rangle, \langle\Delta_{n}\rangle)\Rightarrow Val(\langle\Pi\rangle, \langle\Sigma\rangle$$).[35]                                                                                            █

This is very nice indeed. Our validity metarules LVal and RVal are strong enough to internalize every derivable metarule. However, this is not the whole story. For $$\mathcal{S}^{V}_{NT}$$ we also have the following negative result.

Proposition 3.3 (Barrio et al. 2016) $$\mathcal{S}^{V}_{NT}$$ internalizes the metarule of Cut.

Proof sketch Note that we do not need Cut to internalize Cut, as the following derivation shows:

To make the derivation more readable, I have only shown how to internalize a version of Cut with sequents that have one premise and one conclusion, but the proof can be generalized easily.                                                                                                                                                                             █

This kind of overinternalization is surely problematic. In $$\mathcal{S}^{V}_{NT}$$ Cut is not a correct metarule. There are certain sentences, like π, for which Cut has to fail. For if we were allowed to cut on π, $$\mathcal{S}^{V}_{NT}$$ would be trivial.[36]

What about $$\mathcal{S}_{A}^{V}$$? Proposition 3.2 shows that $$\mathcal{S}^{V}_{A}$$ internalizes all its derivable metarules. At least in that sense it is better off than $$\mathcal{S}^{V}_{NT}$$. The reason is that Contraction is not needed to internalize non-structural metarules. Also, $$\mathcal{S}^{V}_{A}$$ dodges the problem we just raised for $$\mathcal{S}^{V}_{NT}$$, for unlike the metarule of Cut—whose internalization does not require an application of Cut—the metarules of Contraction are such that their internalization does require an application of Contraction.

However, there is a difficulty affecting $$\mathcal{S}^{V}_{NT}$$. It surfaces when we try to internalize the metarules of $$\mathcal{S}^{V}_{A}$$ that are admissible. Although I suspect that, in general, admissible metarules can be internalized in $$\mathcal{S}^{V}_{NT}$$ (although in order to show this a different kind of proof would be required), the situation is different for $$\mathcal{S}^{V}_{A}$$.

Proposition 3.4 Affine logic $$\mathcal{S}^{V}_{A}$$ cannot internalize some of its admissible metarules.

Proof sketch As an example, take

This metarule is certainly admissible. If there is a derivation of ⇒ φ, we can use that sequent as many times as we like, so there will be a derivation of ⇒ φ ⊗ φ as well. But there is no derivation of $$Val(\langle\top\rangle, \langle\phi\rangle)\Rightarrow Val(\langle\top\rangle, \langle\phi\otimes\phi\rangle)$$ in $$\mathcal{S}^{V}_{A}$$, for the derivation of that sequent requires an application of Contraction.[37]                                                                                                                                                              █

What to make of these results? It appears to be that while some substructural theories undergenerate, other substructural theories overgenerate. Or, in a more familiar terminology, certain theories are incomplete as far as internalization goes, while other theories are unsound. I use this potentially misleading terminology because there should be a certain familiarity for those acquainted with the debate between paracomplete and paraconsistent theories of truth. In that case, advocates of the paracomplete approach tend to settle for theories that fail to assert some truths, while those who favor a paraconsistent approach usually accept theories that assert some untruths. For substructural theories, this dichotomy reemerges at the level of metarules. While noncontractive theories fail to internalize some correct metarules, nontransitive theories internalize some incorrect metarules. This is, I think, one plausible way to characterize the situation.

## 4. The Importance of Internalization

At this point the reader might wonder why we should care about metarules and internalization. In order the see the importance of the internalization requirement discussed in the previous section, it will be convenient to consider first a similar requirement that is sometimes mentioned in the literature on theories of truth. As Priest (2006: Chapter 3) suggests, prima facie naive truth theories should prove that their rules are truth-preserving. Moreover, if we consider a language containing a truth predicate satisfying the usual truth rules there seems to be a fairly compelling argument to the effect that an inference is valid if and only if it is logically necessary that it preserves truth. Field calls it “the validity argument”. In his words, it goes like this:

'Only if' direction: Suppose A1,...,An $$\vDash$$ B. Then by (T-elim), True ($$\langle A_{1}\rangle$$,..., True($$\langle A_{n}\rangle$$)$$\vDash$$ B; and by (T-introd), True($$\langle A_{1}\rangle$$),..., True($$\langle A_{n}\rangle$$)$$\vDash$$ True($$\langle B\rangle$$). By $$\wedge$$-Elim, True($$\langle A_{1}\rangle$$)$$\wedge$$...$$\wedge$$ True($$\langle A_{n}\rangle$$)$$\vDash$$ True($$\langle B\rangle$$). So by $$\rightarrow$$-Introd,

That is, the claim that if the premises $$A_{1},..., A_{n}$$ are true, so is the conclusion, is valid, i.e. holds of logical necessity.

'If' direction: Suppose $$\vDash$$ True($$\langle A_{1}\rangle$$)$$\wedge$$...$$\wedge$$ True($$\langle A_{n}\rangle$$)$$\rightarrow$$ True($$\langle B\rangle$$). By Modus ponens, True($$\langle A_{1}\rangle$$)$$\wedge$$...$$\wedge$$ True($$\langle A_{n}\rangle$$)$$\vDash$$ True($$\langle B\rangle$$). So by $$\wedge$$-Introd, True($$\langle A_{1}\rangle$$),..., True($$\langle A_{n}\rangle$$)$$\vDash$$ True($$\langle B\rangle$$). So by T-Introd, $$A_{1}$$,..., $$A_{n}\vDash$$ True($$\langle B\rangle$$); and by T-Elim, $$A_{1}$$,..., $$A_{n}\vDash$$ B.

(Field, 2008:284)

But, as Field rightly notices, the problem with this argument is that it relies on rules (T-Introd, T-Elim, $$\rightarrow$$-Introd, and $$\rightarrow$$-Elim) that together lead to the Curry paradox. Field’s own paracomplete theory blocks the ‘only if’ direction of the argument by rejecting the $$\rightarrow$$-Introd rule. As a consequence, his theory must refrain from declaring that all its rules are unrestrictedly truth-preserving, and this does seem to be a cost.[38]

The same applies to other nonsubstructural accounts of the self-referential paradoxes. In particular, most classical theories of truth cannot accept either direction of the argument, since they will reject that the T-rules hold unrestrictedly, while other nonclassical (paraconsistent) theories will fail to validate either the ‘only if’-direction, if they reject $$\rightarrow$$-Introd, or the ‘if’-direction, if they reject Modus ponens.

The only theories I know of that can accept the validity argument are those based on substructural logics (with one notable exception being the classical theory of truth Friedman-Sheard). This has been seen as a crucial advantage by some of the supporters of this approach (see Murzi & Shapiro 2015 for a substructuralist defense of the validity argument). In these theories Curry’s paradox—and, in fact, all truth-theoretic paradoxes—are avoided at the structural level, by rejecting metarules like Contraction and Cut.

However, as I pointed out in the previous section, going substructural brings about a sort of splitting of the notion of validity. More specifically, in substructural theories the valid rules do not coincide with the correct metarules. So in this context it is important to bear in mind that some of the action goes on not at the level of rules but at the level of metarules. Therefore, as I see it, in a substructural setting the requirement that valid rules be truth-preserving should be complemented with the requirement that correct metarules be validity-preserving. This is where the notion of internalization becomes relevant. In order to show that its own correct metarules are validity-preserving, substructural theories need to satisfy the requirement of internalization. And conversely, in order to show that only validity-preserving metarules are acceptable, substructural theories need to make sure that only correct metarules can be internalized. This is why the notion of internalization plays such a crucial role.

A second (albeit more indirect) reason for caring about internalization has to do with the possibility of using the validity predicate to provide a theory of meaning for (at least some of) the expressions of the language in the language. In order to see what I mean by this, let me start by considering theories of truth once more. One of the perks of having a naive theory of truth for a certain language is that we can state the truth-conditions of every sentence of the language in the language. For instance, we can give the truth-condition of a negation $$\neg\phi$$ using the biconditional $$True\langle\neg\phi\rangle\leftrightarrow \neg True\langle\phi\rangle$$, and we can give the truth-condition of a conjunction $$\phi\wedge\psi$$ using the biconditional $$True\langle\phi\wedge\psi\rangle\leftrightarrow True\langle\phi\rangle\wedge True\langle\psi\rangle$$. This can be viewed as a vindication of the Davidsonian project that seeks to capture meaning via the concept of truth and that understands that the basic structure of an appropriate theory of meaning is delivered by a formal theory of truth.

If instead of adopting a truth-conditional semantics we endorse an inferentialist account of meaning, our focus should no longer be on truth but on validity. For the inferentialist, the meaning of an expression is given by the inference rules governing its use.[39] Of course, there are various forms of inferentialism; here I will assume that meaning is given, at least partially, by the metarules of a sequent calculus. For example, the meaning of (the additive) conjunction is given by the metarules L$$\sqcap$$ and R$$\sqcap$$, and they can be understood (in the metatheory) as follows:

(L$$\sqcap$$) If the argument from φ to χ is valid, then the argument from φ $$\sqcap$$ ψ to χ is valid. And if the argument from ψ to χ is valid, then the argument from φ $$\sqcap$$ ψ to χ is valid.

(R$$\sqcap$$) If the argument from χ to φ is valid and the argument from χ to ψ is valid, then the argument from χ to φ $$\sqcap$$ ψ is valid.[40]

If the validity predicate satisfies the internalization requirement, then these facts can be proved inside our theory. In particular, we would have $$Val(\langle\phi\rangle, \langle\chi\rangle)\Rightarrow Val(\langle\phi\sqcap\psi\rangle, \langle\chi\rangle)$$ and $$Val(\langle\psi\rangle, \langle\chi\rangle)\Rightarrow Val(\langle\phi\sqcap\psi\rangle, \langle\chi\rangle)$$, which together internalize L$$\sqcap$$, and $$Val(\langle\chi\rangle, \langle\phi\rangle), Val(\langle\chi\rangle, \langle\psi\rangle)\Rightarrow Val(\langle\chi\rangle, \langle\phi\sqcap\psi\rangle)$$, which internlizes R$$\sqcap$$.

Interestingly, since the meaning of an expression is given in terms of a set of metarules, what is required from an account of meaning is for it to state the ∂conditions of use for the expressions of the language in the language, as we did for conjunction above. So the validity predicate plays a crucial role for an inferentialist, a role that is very similar to that played by the truth-predicate for those that believe that meaning is given in terms of truth-conditions. What we should notice is that validity can only play such a role if the internalization requirement is satisfied.

Of course, we should not put too much weight on this as a reason to care about internalization unless we already subscribe to some form of inferentialism about meaning. But since some of the supporters of the substructural approach are indeed inferentialists (for instance Ripley 2015a and Carrara & Murzi 2015), this might be relevant to them.

## 5. Concluding Remarks and Future Work

In this paper I have considered whether substructural theories can accommodate a naive validity predicate. I have suggested that crucial to this project is the theory’s capacity to internalize its own metarules, and I have shown how various substructural theories fail to satisfy this requirement in different ways.

There are a few issues that I have been sidestepping. First, I only considered a quantifier-free language. But there is a lot to say about the internalization of the metarules for the quantifiers. Given that in some noncontractive theories there is an ambiguity between an additive reading of conjunction and disjunction and a multiplicative reading, we might wonder if the same happens with the quantifiers. After all, it is fairly common to interpret the universal quantifier as a generalized conjunction and the existential quantifier as a generalized disjunction. The first difficulty that we come across in this respect is that there is no consensus on how to treat the multiplicative quantifiers.[41] So some first-order substructural logics are already problematic independently of the issue of the internalization of the quantifiers.

Another related issue is that I have only examined how to internalize metarule schemata. But, ultimately, it might be desirable to investigate whether we can have internalizations in the form of quantified sentences. For instance, we can say that the rule R¬ is internalized as a sentence if the following is provable (we will need the aid of a truth predicate Tr(x), a predicate Form(x) that applies to x if and only if x is a formula, and suitable connectives $$\wedge$$, , and $$\rightarrow$$):

I leave the careful consideration of this more ambitious project for another time.

The third issue that I want to mention at least briefly has to do with the possibility of endorsing a nonclassical metatheory. This is related to the negative results reached in Section 3. The incompleteness result for $$\mathcal{S}^{V}_{A}$$ can be seen as the product of a mismatch between the actual behaviour of Val in $$\mathcal{S}^{V}_{A}$$ and its expected behaviour given how it was defined in the metatheory, which obeys classical logic. The admissible metarule with premise ⇒ φ and conclusion ⇒ φ ⊗ φ holds because to have one derivation of ⇒ φ is the same as to have two, three, or ten (identical) derivations of it. Since the metatheory is classical, once we have a derivation of a sequent, that’s it. We can use it as many times as we like. However, this fact cannot be captured in $$\mathcal{S}^{V}_{A}$$ by the sequent Val($$\langle\top\rangle, \langle\phi\rangle)\Rightarrow Val(\langle\top\rangle, \langle\phi\otimes\phi\rangle$$) because Val is indeed a nonclassical notion. Val is sensitive to multiple occurrences of a formula in a way that is not. This means that Val and stand for different notions.

As I remarked in Footnote 37, the noncontractive theorist has the option of rejecting Contraction in the metatheory as well. In this way the metarule used in the proof of Proposition 3.4 is no longer admissible. Just as two occurrences of the sentence Val($$\langle\top\rangle, \langle\phi\rangle$$) imply certain things which are not implied by one occurrence of it, one copy of ⇒ φ is not enough to infer ⇒ φ ⊗ φ. For that two copies of ⇒ φ are needed. So, if we think of collections of sequents as multisets that cannot be contracted on, then the lack of adjustment between Val and seems to vanish.

We saw that a similar difficulty affects $$\mathcal{S}^{V}_{NT}$$. The mismatch is produced once again because Val behaves nonclassically while $$\Rightarrow$$ was defined using a classical metatheory. The difference manifests itself in the form of an unsoundness result. More precisely, the definition of validity for $$\mathcal{S}^{V}_{NT}$$ was given in a classical metatheory, which implies that an argument is either valid or invalid, but not both. Similarly, a metarule either holds or doesn’t, but not both. In particular, Cut has counterexamples and hence it doesn’t hold (which rules out its holding). There are sentences φ, ψ, and χ such that φ ⇒ ψ, ψ ⇒ χ, but φ ⇒ χ fails to hold. However, we can show in the theory that Val(φ, ψ), Val(ψ, χ) ⇒ Val(φ, ψ) has a derivation. Once more, this means that Val and stand for different notions.

Can the nontransitivist resist this objection by endorsing a nonclassical metatheory? Maybe so, but things are not as straightforward as they were for the noncontractivist. In $$\mathcal{S}^{V}_{NT}$$ we can prove both ⇒ π and ⇒ ¬π. So a natural suggestion would be to avoid the mismatch between Val and by allowing the argument from π to to be not just valid, but invalid as well (thus adopting a dialetheic metatheory). Unfortunately, this will not do on its own. The problem we are considering affect $$\mathcal{S}^{V}_{NT}$$'s metarules. In particular, there is a proof of the following sequent in $$\mathcal{S}^{V}_{NT}$$: $$Val(\langle\top\rangle, \langle\pi\rangle), Val(\langle\pi\rangle, \langle\bot\rangle) \Rightarrow Val(\langle\top\rangle, \langle\bot\rangle)$$. Intuitively, this is tantamount to saying that that particular instance of Cut holds. But we can also prove the following three facts: $$\Rightarrow Val(\langle\top\rangle, \langle\pi\rangle)$$, $$\Rightarrow Val(\langle\pi\rangle, \langle\bot\rangle)$$, and $$\Rightarrow\neg Val(\langle\top\rangle, \langle\bot\rangle)$$. Intuitively, they together express (or at least imply) that the same instance of Cut does not hold. Here the nontransitivist could try to dodge that bullet by saying that Cut both holds and fails, taking advantage of her nonclassical metatheory. The problem is that if Cut holds we can infer that T ⇒ ⊥ and hence—by the usual T- and -drop metarules—the empty sequent . By Weakening, every sequent would be provable, and that kind of trivialism is surely unacceptable.[42]

This is certainly not the end of the story. The nontransitivist might reply that the sequent $$Val(\langle\top\rangle, \langle\pi\rangle), Val(\langle\pi\rangle, \langle\bot\rangle) \Rightarrow Val(\langle\top\rangle, \langle\bot\rangle)$$ does not express what we intuitively think it does. Or she might use—as we saw earlier with the v-Curry paradox—the distinction between strict and tolerant assertions to suggest that when we assert that Cut holds, we only do it tolerantly. Yet another route would be to reject the validity metarules. For example, if validity is understood in terms of a collection of constraints on our speech acts of assertion and denial, as in Ripley (2015a), it is at best debatable whether LVal and RVal (or even L$$Val^{*}$$ and R$$Val^{*}$$) hold.[43] This is not the place to settle this problem, but what seems to be clear is that more needs to be said about it.

What is the upshot of all this? We should not conclude from the previous discussion that the substructural approaches to paradox that I have considered face insurmountable difficulties. In fact, the substructural theorist has at least two alternatives to deal with the issues I have raised. Either she embraces a still shaky and understudied nonclassical metatheory or she stays in the safer grounds of her well-known battle-tested classical metatheory. Although I believe that, ultimately, she should embrace a nonclassical metatheory,[44] all I will say here is the following. If she chooses to do so, she will need to carefully explain how this metatheory works. This might involve putting forward a nonclassical set theory—as in Weber (2010)—or developing a novel proof-procedure where matters of provability might not be determinate—as in Rosenblatt (2016). If, instead, she chooses the second option, she has to provide a satisfactory philosophical explanation of the mismatch between Val and the concept of validity characterized by her classical metatheory.

## Acknowledgments

I owe thanks to two anonymous referees of this journal (and two of another) for their excellent comments and suggestions. I am also indebted to Eduardo Barrio, Natalia Buacar, Pablo Cobreros, Bruno Da Ré, Paul Egré, Andreas Fjellstad, Federico Pailos, Francesco Paoli, Lorenzo Rossi, Dave Ripley, Robert van Rooij, Damián Szmuc, Diego Tajer, Paula Teijeiro, Luca Tranchini, Elia Zardini, the Buenos Aires Logic Group, and the participants of the 2016 Navarra Workshop on Logical Consequence. This research has been partially supported by a postdoctoral fellowship of the National Scientific and Technical Research Council, Argentina.

## Notes

1. By a ‘language’ (or a ‘metalanguage’) Tarski understands what we nowadays would more or less call a theory. In particular, a ‘language’ (or a ‘metalanguage’) in this sense can contain axioms and rules.

2. Here I have equated naiveté for truth with what is usually known as the property of transparency or intersubstitutivity. Sometimes it is also required that the truth predicate satisfy some version of the T-schema. To avoid complications I will mostly ignore this subtlety.

3. If the context is intensional, this kind of substitution might fail. For instance, consider a doxastic operator for belief. An agent can believe that 2 + 2 = 4 without believing that ‘dos más dos es igual a cuatro’ is true, because she might not know Spanish. There might be other contexts where transparency fails too, but I will not consider this possibility here. See for instance Zardini (2015b).

4. Priest (2006) and Field (2008) provide theories (paraconsistent and paracomplete, respectively) where an extra conditional is introduced, while Beall (2011) defends a paraconsistent theory with no extra conditional.

5. This sort of criticism has found recent spokesmen in Zardini (2014; 2013), Weber (2014) and Caret and Weber (2015).

6. It is more or less standard to make a distinction between laws (e.g., the law of Excluded Middle), rules (e.g., the rule of Explosion), and metarules (e.g., Conditional Proof or Reasoning by Cases). See for instance Zardini (2011: Footnote 1). I understand a law as asserting that all sentences of a certain form are true. I understand a rule as asserting that all inferences or arguments of a certain form are valid. And I understand a metarule as asserting that all metainferences of a certain form are correct. For lack of a better word, I will talk about metainferences—and derivatively, of metarules—as being ‘correct’ or ‘incorrect’. I will also talk about rules being ‘valid’ or ‘invalid’, although strictly speaking it is inferences that are so.

These distinctions are not meant to be exhaustive or exclusive. They are not exhaustive because there might be perfectly legitimate and intelligible principles which are neither laws, nor rules nor metarules. And they are not exclusive because we can understand a law as a 0-premise rule and, similarly, we can understand a rule as a 0-premise metarule.

7. I have not properly explained yet what internalizing is but, roughly, the idea is that an adequate theory of validity should prove that its own metarules are validity-preserving; I will provide a more rigorous definition in Section 3.

8. In fact, I believe that there is a third condition that we should impose on a naive concept of validity but that I will not consider in this paper. This has to do with the invalid arguments:

3. For every $$\mathcal{S}$$-invalid argument from Γ to φ, S disproves the statement expressing that the argument from Γ to φ is valid.

A discussion of this issue can be found in Rosenblatt (2016).

9. Multisets are basically sets that allow repetitions. That is, a multiset is just like a set except for the fact that it is sensitive to the different occurrences of a member. For example, the multiset with members a, a, and b is not the same as the multiset with members a and b.

10. The issue of how to understand theories in a nonclassical setting is controversial. As Restall (2013) points out, it would be inaccurate to identify a theory with a set of sentences closed under some consequence relation. It will do for my purposes to understand a theory as a set of pairs of multisets of formulas closed under certain metarules (or, more simply, as a collection of sequents closed under certain metarules).

11. We also need the (0-premise) metarule of Reflexivity, according to which φ ⇒ φ. And it is not entirely implausible to develop an account of the paradoxes in which the culprit is Reflexivity, rather than Cut or Contraction. The reader can take a look at French (2016) and Fjellstad (2016) for some hints.

12. I am using sequents with multiple conclusions in the formulation of these metarules because this is a crucial feature of the sequent calculi I will present later on. Nevertheless, the paradox I will discuss in no way depends on this.

13. It is not my intention to claim that there is a sharp distinction between operational and substructural approaches to paradox. In fact, as Shapiro (2016) points out, there might be reasons for doubting this. Here I take no stand on this matter.

14. Actually, Field has several doubts regarding the plausibility of L$$Val^{*}$$ and R$$Val^{*}$$, and, more generally, the significance of the v-Curry paradox:

I am skeptical that there is anything new to be learned from the v-Curry. More fully: the notion of validity can be understood in distinct ways; on some ways of understanding it, the only morals of the v-Curry are ones we should have learned long ago from Gödel, whereas on other ways of understanding it, any way of handling the paradoxes of truth and satisfaction handles the v-Curry automatically. (2016: 2)

For reasons of space, here I can only discuss his main arguments against L$$Val^{*}$$ and R$$Val^{*}$$ under the reading of Val as an object language rendering of . I will ignore other ways of understanding Val, such as those that take Val to be characterized in terms of a class of models definable in the language of set theory or those that take Val to be necessary preservation of truth.

15. Field does not consider this argument as a potential objection to nontransitive responses to the v-Curry paradox. But Shapiro (2013) suggests that it is. He says that "given the intended interpretation of the predicate Val, Stage 1 of our derivation suffices for paradox" (2013: 104), where stage 1 is basically the first part of the v-Curry paradox. His point is that we can show that π implies and that π is true without using Cut.

16. Here it might also be helpful to use the distinction between strict and tolerant assertions from Ripley (2013). That π is true (i.e., Val($$\langle\pi\rangle,\langle\bot\rangle$$)) can only be asserted tolerantly, which means that it does not rule out the strict denial of . Although at first it might seem strange to apply the strict-tolerant distinction to metatheoretic claims about validity, π is precisely the sort of sentence that can be expected to hold only tolerantly. Thanks to Dave Ripley for discussing this matter and to an anonymous referee for raising some concerns.

17. Assuming for the sake of the argument that $$\mathcal{S}$$ has the appropriate resources to express its own Gödel sentence.

18. Thanks to an anonymous referee for prompting me to address these worries.

19. I will not consider the quantifiers for now. This is not just a matter of simplicity. The quantifiers present complications and I will come back to them in the final section.

20. Obviously, we need to use collections that are sensitive to the occurrences of formulas because we do not want to take for granted that Contraction holds. Because multisets are insensitive to order, the structural rules of Exchange and Associativity will be assumed.

21. As with conjunction and disjunction, there is an additive context-sharing version of its left metarule and a multiplicative non-context-sharing version:

Also, there are two ways of presenting the right metarule for the conditional. The one on the left is additive and the one on the right is multiplicative:

22. I will use Val as if it were a single predicate with variable arity. For any positive integers m and n, Val can work as an n + m-place predicate, where n stands for the number of premises and m for the number of conclusions. To simplify the notation, I will sometimes write $$Val(\langle\Phi\rangle, \langle\Psi\rangle)$$ to represent the claim that the argument with premises $$\phi_{1},...,\phi_{n}$$ for each $$\phi_{i}\in\Phi$$, and with conclusions $$\psi_{1},...,\psi_{m}$$ for each $$\psi_{i}\in\Psi$$, is valid. More precisely, $$Val(\langle\Phi\rangle, \langle\Psi\rangle)$$ is an abbreviation of $$Val(\langle\phi_{1}\rangle,....,\langle\phi_{n}\rangle, \langle\psi_{1}\rangle,....,\langle\psi_{m}\rangle)$$ where the first $$\sqcap$$ places are occupied by the (names of the) sentences that are the premises of the argument and the remaining places are occupied by the (names of the) sentences which are the conclusions of the argument. A more rigorous presentation would take Val to be not a single predicate but infinitely many predicates, one for each type of argument. For example, $$Val_{1}^{2}(x, y, z)$$ would be a validity predicate for arguments with two premises and one conclusion and $$Val_{3}^{1}(x, y, z, w)$$ would cover arguments with one premise and three conclusions. For each of these predicates we would have a corresponding LVal metarule and a corresponding RVal metarule. However, I will ignore this complication because it would make the notation extremely cumbersome. See Zardini (2014: 364–365) for the details.

23. The reason for only allowing side formulas of the form $$Val(\langle\Gamma_{i}\rangle, \langle\Delta_{i}\rangle)$$ is simple. Given the metarule of Weakening, a completely unrestricted version of this metarule is enough to prove unwanted sequents such as $$p \Rightarrow Val(\langle q\rangle, \langle p\rangle)$$, which intuitively states that the truth of the sentence p implies that the argument from any atomic sentence whatsoever to p is valid.

24. The reader familiar with sequent calculus presentations of modal logics might have noticed that Val is very similar to $$\mathcal{S}$$4’s strict-implication, with the only relevant difference being that the latter is binary while Val need not be.

25. I should note that whenever it is necessary, I will assume that the system contains a truth constant and a falsity constant with their corresponding metarules. This becomes important if we want to talk about the validity of sentences in this setting. If I need to say that a certain sentence φ is valid I will use the expression $$Val(\langle\top\rangle, \langle\phi\rangle)$$.

26. As we saw in section 2, using RVal is not the only way to introduce a validity predicate. There is also R$$Val^{*}$$. However, R$$Val^{*}$$ is too weak for my purposes. Since it can only be applied in the absence of side formulas, it is not sufficient to internalize metarules, which is one of the desiderata I am imposing on the naive concept of validity. Another alternative is to use the following $$\mathcal{S}$$5-ish metarule:

The restriction here can be explained as follows. We say that a formula is V-logical if it is of the form $$Val(\langle\Phi\rangle, \langle\Psi\rangle)$$ or it is formed by such formulas using the usual logical operations (of course, this can be defined recursively). And, naturally, ΓV (∆V) is V-logical if it only contains V-logical formulas. The definition is from Zardini (2014). With this metarule Val is similar to $$\mathcal{S}$$5’s strict implication. The problem with it is that in the resulting system Cut is no longer admissible (assuming we also have LVal). This is an interesting and complicated matter. A detailed analysis of it can be found in Barrio, Rosenblatt, and Tajer (2016).

27. For instance, although $$p \Rightarrow p\sqcup q$$ has a proof, $$\Rightarrow Val(\langle p\rangle, \langle p\sqcup q\rangle)$$ does not, because of the restriction in RVal’. I should have written $$\Rightarrow Val(\langle p\rangle, \langle p$$ $$d{\sqcup}$$ $$q\rangle)$$, to make clear that $$d{\sqcup}$$ $$q\rangle)$$ is not the (additive) conjunction but stands for a function that given the codes of a pair of formulas outputs the code of its (additive) conjunction. However, for reasons of readability, I will omit underdots throughout the paper.

28. Actually, I am deviating a bit from the literature in taking external validity to be a relation between arguments. Usually, the idea of external validity is restricted to sentences. That is, the argument from φ to ψ is externally valid if and only if there is a derivation of ψ whenever there is a derivation of φ. This deviation is insignificant for the point I am presently trying to make.

29. For a similar treatment of the distinction between internal and external validity see Priest and Wansing (2015: Section 7).

30. Concerning the metarules, there is a similar sentiment present in Zardini (2014); when talking about his naive theory of validity N he says “Thus, N is not only a theory of validity consisting in a set of categorical claims about what is valid (and what is not so)—given the presence of an object-language validity predicate, N is also ipso facto a far more general theory of validity containing noncategorical claims about validity” (2014: 374–375).

31. I will ignore the relevant logic that results by removing just the Weakening metarules from SC. There are two reasons for this. First, all the results below that apply to linear logic will apply to this relevant logic as well. Second, logics without Weakening but with Contraction are usually not considered in the substructural literature on self-referential paradoxes.

32. Dave Ripley suggested this interesting idea to me.

33. It might be replied that the problem I have pointed out is simply an artifact of the way linear logic’s comma corresponds to multiplicative conjunction, whereas there is no punctuation mark corresponding to additive conjunction. Fair enough. Indeed, in the presence of two structural connectives (a comma and a semicolon, corresponding to and $$\sqcap$$, respectively)—as in dual-bunching logics—it appears to be quite natural to have two forms of internalization. However, although dual-structure presentations of substructural logics are hardly unusual—as can be seen, e.g., in Restall (2000)—this is not the way in which advocates of substructural accounts of the semantic paradoxes tend to present their theories—Murzi and Shapiro (2015) is the only exception I know of—so I will leave this matter for another time.

34. Thanks to Francesco Paoli here.

35. Notice that the proof above only requires Weakening for formulas of the form $$Val(\langle\Phi\rangle, \langle\Psi\rangle)$$. This means that there is a similar result available for the linear logician if she admits Weakening for validity claims.

36. A more detailed analysis of this result can be found in Barrio et al. (2016). I should note that advocates of the nontransitive approach might simply bite the bullet and say that validity should be defined from a nonclassical metatheory, using perhaps a nonclassical set theory, like the one in Ripley (2015c). This sort of response is quite interesting, and even though I have no space here to give it the discussion it deserves, I will briefly comment on it in the final section.

37. At this point, the natural thought for the noncontractive theorist is to embrace the failure of contraction in the metatheory as well. In particular, in order to avoid this difficulty it seems necessary to distinguish between having a derivation of ⇒ φ and having two (identical) derivations of it. I will not explore this possibility here in detail, but some hints can be found in Weber (2016), Zardini (2014), Badia, Girard, and Weber (2015) and in the final section of this paper.

38. Having said that, I should note that Field provides a conceptual explanation of the failure of $$\rightarrow$$-Introd in terms of the distinction between conditional assertion and assertion of a conditional (Field 2008: 288). He has also developed an account of validity as something different from truth-preservation in Field (2015).

39. I will only focus on logical expressions here, so I will remain neutral towards the possibility of extending this sort of inferentialist account to other kinds of expressions. I will also ignore tonk-related worries concerning the appropriateness of meaning-conferring rules.

40. To simplify matters I have used versions of these metarules where the sequents involved contain one premise and one conclusion, but nothing important hangs on this.

41. For a discussion of this see Zardini (2015a), Paoli (2005) and Da Ré and Rosenblatt (2016).

42. I am very greatful to an anonymous referee for bringing this problem to my attention.

43. Ripley (2017) seems to endorse this last idea.

44. An example of this is Zardini (2013), who actually argues that nonclassical theorists who are such because of the v-Curry paradox must adopt a substructural metatheory.