Betting, Risk, and the Law of Likelihood

Abstract

When exactly does evidence E favor one hypothesis H1 over another hypothesis H2? I formulate an answer based on the expected utilities of bets on H1 and H2 and the variances therein (i.e., the risk inherent in those bets). This answer turns out to conflict with the Law of Likelihood and to agree with a solution based on Bayesian Confirmation Theory as proposed by Fitelson, but with a novel confirmation measure (with a clear operational meaning) underlying it. The account of favoring proposed here clarifies, I argue, the mechanism and intuition behind several recently proposed counterexamples to the Law of Likelihood.

1. Introduction

Recent discussions about the relative merits of Likelihoodism and Bayesian Confirmation Theory have zoomed in on a small number of central arguments about how evidential favoring ought to be explicated, i.e., how one ought to answer this

Favoring Question: When exactly does evidence E favor one hypothesis H1 over another hypothesis H2?

This is a question of both relevance and complexity. Whether evidence favors suspect X being guilty over X being innocent, or whether evidence favors medicine A curing a disease over medicine B curing it (without too many side effects), may both be of vital importance. While cosmological observations have unambiguously favored general relativity over Newton’s theory of gravity, the question whether evidence favors string theory over loop quantum gravity as providing a quantum theory of gravity is much murkier. Not only are the latter two theories far more complicated than the classical theories, and not only are both still under construction, the evidence so far has not been of the canonical empirical type, but rather is theoretical and conceptual (for discussions, see (Dawid 2013) and (de Haro, Dieks, ’t Hooft, and Verlinde 2013)).

Perhaps surprisingly, even on the elementary level of toy problems involving hypotheses about, say, cards drawn from well-shuffled decks, there is no consensus on what constitutes the[1] correct answer to the Favoring Question. One strong contender in the race towards the appropriate answer is the Law of Likelihood, which states

(LL) Evidence E favors H1 over H2 iff Pr(E|H1) > Pr(E|H2).

Here it is assumed that the hypotheses under consideration are statistical and assign well-defined probabilities to the evidence E occurring, given background knowledge K.[2] Hacking (1965) is an early proponent of this Law (he gave the Law its name in chapter V of his book) and in more recent times Sober (2008) has argued in favor of it. Sober’s book also provides a good illustration of both the complexity and the importance of the Favoring Question.

Fitelson has formulated an alternative answer based on Bayesian Confirmation Theory (Fitelson 2007, 2011, 2013). Suppose we have a measure of incremental (to be contrasted with absolute) confirmation \({\cal M}\)(H ; E), which quantifies the degree to which evidence E increases our confidence in hypothesis H, such that

E confirms H iff \({\cal M}\)(H ; E) > 0 iff Pr(H|E) > Pr(H),

E undermines H iff \({\cal M}\)(H ; E) < 0 iff Pr(H|E) < Pr(H).

The Favoring Question is then answered, on Fitelson’s Bayesian Confirmation Theory account, as follows:

(BCT) Evidence E favors H1 over H2 iff \({\cal M}\)(H1; E) > \({\cal M}\)(H2; E).

The purpose here is to examine recent arguments given in favor of the Law of Likelihood as well as (purported) counterexamples to this law (Fitelson 2007, 2011, 2013; Chandler 2013; Gandenberger 2014a, 2014b, 2014c, 2014d). The main new tool used is a quantitative measure of the risk involved in bets on hypotheses H1 and H2. I introduce that tool and how to use it in the context of the Favoring Question in section 2. This leads quite naturally to the answer (BCT) but in terms of a novel Bayesian confirmation measure, denoted by \({\cal M}\)u.

In section 3 I consider how (BCT) based on \({\cal M}\)u treats several (purported) counterexamples to the Law of Likelihood. While my account of favoring agrees that all of these examples are genuine counterexamples, it may not be intuitively clear on what exactly this agreement rests. In section 3.2 I rewrite the condition (BCT) based on measure \({\cal M}\)u in terms of two quantities, L and C, that both do have an intuitive meaning, and show how these two quantities can be exploited to explicate the intuition behind the counterexamples. In particular, while a Likelihoodist considers only L to be relevant, I claim that L and C are equally important. In section 3.4 I consider a principle that Chandler (2013) has proposed (dubbed “behaviour under conditionalisation”) in order to help answer the Favoring Question. I argue that his principle is acceptable only to someone who already accepts the Law of Likelihood, but not to someone who accepts (BCT) and \({\cal M}\)u. As such, it is not an independent argument either for or against the Law of Likelihood.

In the Appendix I show that the condition (BCT) when used in conjunction with several other known Bayesian confirmation measures m can be rewritten in terms of similar quantities Lm and Cm (different numbers for different measures). This serves mainly to show that the intuition I develop here is not all that sensitive to the choice of Bayesian confirmation measure.

2. Betting and Favoring

One strategy for attempting to explicate the concept of favoring is by means of betting procedures. This strategy may be feared to be biased towards Bayesian methods, as the latter are often developed or justified through considerations of bets (Dutch book arguments in particular). However, the fact that Chandler (2013) uses precisely such a strategy (amongst others; see Section 3.4 for a different principle he proposed to settle the Favoring Question) to argue for the Law of Likelihood may alleviate this fear. Let me consider his proposal, where I do change notation to make explicit one particular crucial (implicit) assumption of his.

2.1. Betting on Hypotheses

Consider bets on the truth of some hypothesis H of the following straightforward type. We deposit an amount l and if we are right and H is true, we win back an amount w > l (so that our net win is w − l > 0). If we are wrong and H is false, we forfeit our deposit. If we have assigned a probability Pr(H) to H being true, the expected utility of this bet is

〈u〉 = w Pr(H) − l.

Now consider two such bets on two different hypotheses H1 and H2, with win and loss amounts w1, w2, l1, and l2, respectively. Suppose the expected utilities are the same for these two bets, i.e.,

〈u1〉 := w1 Pr(H1) − l1 = w2 Pr(H2) − l2 =: 〈u2〉.

With Chandler, let us now suppose we obtain some evidence E that makes us modify our probabilities, but the bets are kept the same (perhaps the bookie is not aware of the existence of E). We may now attempt to explicate favoring like so:

(*) Evidence E favors H1 over H2 iff E makes the expected utility of the bet on H1 larger than that of the bet on H2, where the expected utilities were the same before E was taken into account.

This could and would be a valid attempt, were it not for the fact that the conditions of the two bets have not yet been fully determined by the condition of equal expected utilities. We need, in fact, one more condition to make the resulting favoring relation unique.[3] Chandler (2013) assumed that l1 = l2, without providing a justification. This assumption indeed settles the Favoring Question and it yields the Law of Likelihood, as can be easily verified. But why not stipulate something like w1 = w2 or w1 − l1 = w2 − l2 instead? The latter stipulation would lead to a different favoring relation, namely:[4]

(G) Evidence E favors H1 over H2 iff Pr(E|¬H2) > Pr(E|¬H1),

which involves the “catchall” likelihoods in terms of the negations of the hypotheses H1 and H2. The former stipulation would yield yet another explication of favoring:

(Ca) Evidence E favors H1 over H2 iff

Pr(H1|E) − Pr(H1) > Pr(H2|E) − Pr(H2),

where the expert recognizes the appearance of Carnap’s difference measure (Carnap 1962).
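That the three stipulations really come apart can be checked numerically. The following is a minimal sketch in Python; the function names `second_bet` and `favored`, the rule labels, and the starting bet (w1 = 52, l1 = 1) are assumptions made purely for illustration, and the probabilities are borrowed from the card example of section 3.1. The sketch does not enforce w > l.

```python
from fractions import Fraction as F

def second_bet(w1, l1, p1, p2, rule):
    """Return (w2, l2) for a bet on H2 with the same expected utility as the
    bet (w1, l1) on H1, plus one extra stipulation selected by `rule`."""
    u = w1 * p1 - l1                      # common expected utility before E
    if rule == "equal_loss":              # stipulation l1 = l2
        l2 = l1
        w2 = (u + l2) / p2
    elif rule == "equal_win":             # stipulation w1 = w2
        w2 = w1
        l2 = w2 * p2 - u
    elif rule == "equal_net":             # stipulation w1 - l1 = w2 - l2
        net = w1 - l1
        w2 = (net - u) / (1 - p2)
        l2 = w2 - net
    else:
        raise ValueError(rule)
    return w2, l2

def favored(w1, l1, p1, post1, p2, post2, rule):
    """Which hypothesis gains more expected utility once E is learned?"""
    w2, _ = second_bet(w1, l1, p1, p2, rule)
    gain1 = w1 * (post1 - p1)             # the deposits cancel in the change
    gain2 = w2 * (post2 - p2)
    if gain1 == gain2:
        return "neither"
    return "H1" if gain1 > gain2 else "H2"

# Illustrative priors and posteriors (those of the card example in section 3.1)
p1, post1 = F(1, 52), F(1, 13)
p2, post2 = F(1, 4), F(12, 13)
verdicts = {rule: favored(F(52), F(1), p1, post1, p2, post2, rule)
            for rule in ("equal_loss", "equal_win", "equal_net")}
```

With these numbers the stipulation l1 = l2 sides with H1, in line with the Law of Likelihood, whereas the other two stipulations side with H2.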

There are arguments to be found for all three of these stipulations. For example, suppose we are willing to spend only a small amount of money X and will bet only once. Then we may insist on choosing l1 = l2 = X. Or suppose we desperately need a large amount of money Y to pay off our debt and our only chance to acquire such a large amount on short notice is by placing a bet (and winning!). In that case we may insist either that w1 = w2 = Y or that w1 − l1 = w2 − l2 = Y, depending on how much money we want to be left with after having paid off our debt.

These arguments make sense if we consider just a single bet, but not so much when considering multiple bets: we could really set only average values and perhaps an upper bound on l1 and l2 in that case, or average values and lower bounds on w1 (or w1 − l1) and w2 (or w2 − l2).

Since we are forced to make many decisions in our lives, all of which could be formulated in terms of bets on competing hypotheses, it pays to consider in more detail multiple bets.

2.2. Risk or the Variance in Utility

An important quantity that characterizes the structure of a given bet is how much risk is involved. There is risk because there is uncertainty about the outcome of the bet (unless either Pr(H) = 1 or Pr(H) = 0), and thereby uncertainty about the amount of utility we will actually gain. A standard measure of risk is given by the variance in utility, defined as

$$\sigma^2_u:=\langle u^2\rangle-\langle u\rangle^2.$$

For the type of bets considered here we find

$$\sigma^2_u=w^2 (\Pr(H)[1-\Pr(H)]).$$

We see two contributions to the variance. The second factor is the uncertainty about the outcome (maximal when Pr(H) = 1/2, minimal when either Pr(H) = 0 or Pr(H) = 1), independent of the structure of the bet; the factor \(w^2\) sets the overall size of the variance, independent of the probabilities we assign.
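The expression for the variance follows from the two possible outcomes of the bet: a net gain of w − l with probability Pr(H) and a net loss of l with probability 1 − Pr(H). As a short worked derivation, writing p for Pr(H) (an abbreviation used only here):

$$\begin{aligned} \langle u^2\rangle &= p\,(w-l)^2+(1-p)\,l^2 = p\,w^2-2p\,wl+l^2,\\ \langle u\rangle^2 &= (wp-l)^2 = p^2w^2-2p\,wl+l^2,\\ \sigma^2_u &= \langle u^2\rangle-\langle u\rangle^2 = w^2\,p\,(1-p). \end{aligned}$$

The deposit l drops out of the difference, which is why only w appears in the variance.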

Variance plays a well-defined role when considering large numbers of bets. The actual average utility (averaged over all bets) will approach the expected average value, and the extent of the statistical fluctuations around that value is determined by the standard deviation (the square root of the variance, σu).

Returning now to the attempt (*) to explicate favoring, it makes sense to make the comparison between the two bets on H1 and H2 (before the evidence comes in) as fair as possible, by making the performances of the two bets in the long run as similar as possible. Hence the proposal to set not just 〈u1〉 = 〈u2〉 but also σu1 = σu2.[5] Let us, therefore, explicate the favoring relation as follows:

(**) Evidence E favors H1 over H2 iff E makes the expected utility of the bet on H1 larger than that of the bet on H2, where the expected utilities and the variances therein were equal before E was taken into account.

We can translate this explication into the equivalent but more explicit form

(***) Evidence E favors H1 over H2 iff \({\cal M}\)u(H1; E) > \({\cal M}\)u(H2; E),

with a novel (as far as I am aware) Bayesian confirmation measure defined as

$${\cal M_u}(H;E):=\frac{\Pr(H|E)-\Pr(H)}{\sqrt{\Pr(H)(1-\Pr(H))}}.$$

It indeed satisfies the standard desiderata for incremental measures of confirmation: for confirming evidence (i.e., Pr(H|E) > Pr(H)) we have \({\cal M}\)u > 0, for disconfirming evidence \({\cal M}\)u < 0. I note two pleasing properties of this measure here: First, it satisfies the right type of symmetry, Hypothesis Symmetry (as defined and approved by (Eells & Fitelson 2002) and as approved also by (Kemeny & Oppenheim 1957)):

\({\cal M}\)u(¬H ; E) = −\({\cal M}\)u(H ; E).

Second, it can be rewritten as the change in expected utility, ∆u, in units of the standard deviation σu for any bet of the structure considered here, irrespective of the values of the numbers w and l:

$${\cal M_u}(H;E)=\frac{w(\Pr(H|E)-\Pr(H))}{w\sqrt{\Pr(H)(1-\Pr(H))}}=\frac{\Delta_u}{\sigma_u}.$$

This can be seen as the operational definition of \({\cal M}\)u: it tells us how much more we expect to make on a given bet, thanks to our new evidence, in units of the risk σu.
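The step from (**) to (***) can be illustrated with a small numerical sketch. The helper names `m_u`, `matched_bet`, and `gain`, as well as all probabilities and amounts below, are hypothetical and chosen only for illustration. The second bet is constructed so that its expected utility and standard deviation match those of the first bet before the evidence comes in; the bet-based verdict and the measure-based verdict should then agree.

```python
from math import sqrt, isclose

def m_u(prior, posterior):
    """M_u(H; E) = (Pr(H|E) - Pr(H)) / sqrt(Pr(H) (1 - Pr(H)))."""
    return (posterior - prior) / sqrt(prior * (1 - prior))

def matched_bet(w1, l1, p1, p2):
    """Return (w2, l2) for a bet on H2 whose expected utility and standard
    deviation equal those of the bet (w1, l1) on H1 before the evidence."""
    w2 = w1 * sqrt(p1 * (1 - p1)) / sqrt(p2 * (1 - p2))   # equal sigma_u
    l2 = w2 * p2 - (w1 * p1 - l1)                          # equal <u>
    return w2, l2

def gain(w, prior, posterior):
    """Change in expected utility when Pr(H) is updated to Pr(H|E)."""
    return w * (posterior - prior)

# Hypothetical probabilities and amounts, for illustration only
p1, post1 = 0.30, 0.45
p2, post2 = 0.10, 0.22
w1, l1 = 10.0, 2.0
w2, l2 = matched_bet(w1, l1, p1, p2)

bet_verdict = gain(w1, p1, post1) > gain(w2, p2, post2)    # explication (**)
measure_verdict = m_u(p1, post1) > m_u(p2, post2)          # explication (***)
```

In this sketch the deposit never enters the gain, and the win amount cancels between the gain and the standard deviation, which is the sense in which the measure does not depend on w and l.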

3. Counterexamples to the Law of Likelihood

I discuss here several purported counterexamples to the Law of Likelihood, with the focus on one particular such example (subsection 3.1), which I find convincing and which seems the least objectionable to defenders of the Law (Gandenberger 2014a, 2014b, 2014c). In subsection 3.2 I propose an explication of the intuition behind this example, which will also indicate when one’s intuition may be strong and when weak. Subsequently I will test whether that explanation applies to other counterexamples as well. (In the Appendix I will show that this explication of intuition is not specific to the new measure \({\cal M}\)u proposed here, but works for Good’s log-likelihood ratio, Carnap’s difference measure, and Nozick’s measure just as well. This does not imply I endorse any of these three confirmation measures.)

3.1. Titelbaum’s Counterexample

The example of this subsection is due to Titelbaum. As far as I know, it has been discussed exclusively in blog posts (see Gandenberger 2014a, 2014b, 2014c). Stripped of its background story (which concerns details of the card game of Hearts), the situation is as follows. The evidence E is this: You catch a glimpse of a card, dealt from a well-shuffled standard deck of 52 cards (with four suits of 13 cards each), and all you can detect is its suit: Hearts. The two hypotheses under consideration are

H1 : the card is the Two of Hearts.

H2: the card is either the Queen of Spades, or else a Hearts card higher than the Two.

All relevant probabilities can be straightforwardly calculated in this case,

$$ \Pr(H_1)=\frac{1}{52};    \Pr(H_1|E)=\frac{1}{13};\\ \Pr(H_2)=\frac{1}{4};     \Pr(H_2|E)=\frac{12}{13}. $$

The Law of Likelihood tells us that the evidence favors hypothesis H1, since

$$\Pr(E|H_1)=1>\Pr(E|H_2)=\frac{12}{13}.$$

According to a poll on Gandenberger’s blog (Gandenberger 2014c), a majority of respondents reported the opposite intuition, namely that the evidence favors H2. What I wish to do is first calculate what the measure \({\cal M}\)u, through its use in (***), says about this case, and then try to understand its verdict. The idea is to try to pinpoint the intuition behind Titelbaum’s example, and subsequently to test this intuition on other (purported) counterexamples to the Law of Likelihood (section 3.3).
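The probabilities quoted in this subsection can be reproduced by simple counting. The sketch below enumerates a standard deck in Python; the helper `pr` and the card encoding are illustrative choices, and the ace is assumed to rank high, so that every Heart other than the Two counts as "higher than the Two".

```python
from fractions import Fraction
from itertools import product

RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
DECK = list(product(RANKS, SUITS))        # 52 equiprobable cards

def pr(event, given=lambda card: True):
    """Pr(event | given) by counting cards in a uniformly shuffled deck."""
    pool = [card for card in DECK if given(card)]
    return Fraction(sum(1 for card in pool if event(card)), len(pool))

def E(card):   # evidence: the glimpsed card is a Heart
    return card[1] == "hearts"

def H1(card):  # the card is the Two of Hearts
    return card == ("2", "hearts")

def H2(card):  # Queen of Spades, or a Heart other than the Two (ace high)
    return card == ("Q", "spades") or (card[1] == "hearts" and card[0] != "2")

table = {
    "Pr(H1)": pr(H1), "Pr(H1|E)": pr(H1, E), "Pr(E|H1)": pr(E, H1),
    "Pr(H2)": pr(H2), "Pr(H2|E)": pr(H2, E), "Pr(E|H2)": pr(E, H2),
}
```

The resulting table contains the priors, posteriors, and likelihoods used in the text, as exact fractions.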

3.2. Proposed Mechanism behind Counterexamples

According to the Law of Likelihood there is a single condition that must be satisfied in order to yield an affirmative answer to the Favoring Question. We may call this condition the likelihood condition (L),

(L) Pr(E|H1) > Pr(E|H2).

There may be additional conditions besides (L) that play a role in affirmatively answering the Favoring Question. One candidate is the “catchall condition” (C),

(C) Pr(E|¬H2) > Pr(E|¬H1),

which is phrased in terms of the negations of hypotheses H1 and H2. Fitelson (2007, 2011, 2013) has argued that it is these two conditions that determine the favoring relation.

The premiss that conditions (L) and (C) are sufficient for answering the Favoring Question entails the Weak Law of Likelihood (Joyce 2008)[6]

(WLL) If both Pr(E|¬H2) > Pr(E|¬H1) and Pr(E|H1) > Pr(E|H2) then evidence E favors H1 over H2.

An advantage to accepting this premiss is that one does not need the full prior probability distribution over hypotheses in order to answer the favoring question: catchall likelihoods are reducible to priors but not vice versa, as pointed out by Fitelson (2007).

If one accepts (WLL) but not (LL), then the two conditions (L) and (C) together do not provide an a priori unambiguous affirmative answer to the Favoring Question whenever one condition is fulfilled and the other is violated. In such an ambiguous case one has to determine how much weight to assign to each condition in order to arrive at a conclusive answer.

I will show here that the measure \({\cal M}\)u can be seen to assign, in a specific sense, equal weights to these two conditions. For convenience, let us focus on confirming evidence only. That is, assume that E (incrementally) confirms both H1 and H2.[7] In that case, we may quantify the degree to which condition (L) is satisfied (or violated) by a number L, such that for confirming evidence (L) is satisfied iff L > 1 (and violated iff L < 1), with[8]

$$L:=\frac{\Pr(H_1|E)-\Pr(H_1)}{\Pr(H_2|E)-\Pr(H_2)}\times\frac{\Pr(H_2)}{\Pr(H_1)}.$$

Analogously, we introduce a number C which quantifies to what degree condition (C) is satisfied (or violated) by

$$C:=\frac{\Pr(H_1|E)-\Pr(H_1)}{\Pr(H_2|E)-\Pr(H_2)}\times\frac{1-\Pr(H_2)}{1-\Pr(H_1)},$$

such that for confirming evidence (C) is satisfied iff C > 1 (and violated iff C < 1). The point of quantifying the degrees of satisfaction or violation in this specific, and perhaps somewhat peculiar, way is that we can write for confirming evidence \({\cal M}\)u(H1 ; E)/\({\cal M}\)u(H2; E) = \(\sqrt{LC}\). This relation is important: it shows that (***) can be expanded to

(****) Confirming evidence E favors H1 over H2 iff \({\cal M}\)u(H1; E) > \({\cal M}\)u(H2; E) iff LC > 1.

This in turn shows that in a well-defined sense equal weights are assigned to the two conditions (L) and (C). There are, moreover, no other conditions than (L) and (C) relevant to the question of favoring, in full agreement with (Fitelson 2007, 2011, 2013). If both (L) and (C) are fulfilled, then \({\cal M}\)u (through its application in (***) or (****)) will automatically favor H1 over H2 and thus will agree with the Weak Law of Likelihood. More importantly, it provides an unambiguous answer even when one condition is fulfilled and the other violated.
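The relation between \({\cal M}\)u and the numbers L and C can be checked directly from the definitions. As a sketch, for confirming evidence (so that both differences below are positive):

$$\frac{{\cal M_u}(H_1;E)}{{\cal M_u}(H_2;E)}=\frac{\Pr(H_1|E)-\Pr(H_1)}{\Pr(H_2|E)-\Pr(H_2)}\sqrt{\frac{\Pr(H_2)[1-\Pr(H_2)]}{\Pr(H_1)[1-\Pr(H_1)]}},$$

$$LC=\left(\frac{\Pr(H_1|E)-\Pr(H_1)}{\Pr(H_2|E)-\Pr(H_2)}\right)^2\frac{\Pr(H_2)[1-\Pr(H_2)]}{\Pr(H_1)[1-\Pr(H_1)]}.$$

The first expression is positive and its square equals the second, so the ratio of the two measures is \(\sqrt{LC}\), and it exceeds 1 exactly when LC > 1.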

Now let us see what the numbers tell us about Titelbaum’s counterexample. Short calculations show that

$$ {\cal M_u}(H_1;E) = \frac{3}{\sqrt{51}}\,\approx 0.42, \\ {\cal M_u}(H_2;E) = \frac{35}{13\sqrt{3}}\,\approx 1.55, $$

so that H2 is confirmed to a larger degree than is H1 (thus agreeing with Titelbaum that his example is a genuine counterexample). It is not so straightforward, though, to get an intuition for these particular numbers, and so we also calculate

$$L=\frac{39}{35}\approx 1.11,$$

which gives the degree to which condition (L) is fulfilled (and its fulfillment is, according to the Law of Likelihood, the one and only reason to state that the evidence favors H1 over H2), and the number C:

$$C=\frac{39}{35\cdot 17}\approx 0.066,$$

which shows that (C) is violated. More precisely, although condition (L) is satisfied, the degree to which (C) is violated is greater than the degree to which (L) is fulfilled. That is why (****) conflicts with (LL) in this case. We can also see why one’s intuition may be strong in this case: L is only a little larger than 1, whereas C is much smaller than 1.
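The numbers for Titelbaum's example can be recomputed with exact fractions. In the sketch below the function names `m_u` and `l_and_c` are illustrative; the definitions are those given in the text, and the inputs are the priors and posteriors of section 3.1.

```python
from fractions import Fraction as F
from math import sqrt, isclose

def m_u(prior, posterior):
    """M_u(H; E), returned as a float because of the square root."""
    return float(posterior - prior) / sqrt(float(prior * (1 - prior)))

def l_and_c(p1, post1, p2, post2):
    """Return (L, C) as defined in the text; meaningful when E confirms both."""
    ratio = (post1 - p1) / (post2 - p2)
    return ratio * p2 / p1, ratio * (1 - p2) / (1 - p1)

p1, post1 = F(1, 52), F(1, 13)      # H1: the Two of Hearts
p2, post2 = F(1, 4), F(12, 13)      # H2: Queen of Spades or a higher Heart
L, C = l_and_c(p1, post1, p2, post2)
mu1, mu2 = m_u(p1, post1), m_u(p2, post2)
favors_h1 = L * C > 1               # condition (****)
```

Here L exceeds 1 only slightly while C falls far below 1, so the product LC is below 1 and (****) does not favor H1.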

3.3. Two Other Counterexamples

3.3.1. Fitelson’s Counterexample

One counterexample by Fitelson (2007) may be dismissed on the grounds that it involves hypotheses that are not mutually exclusive (Gandenberger 2014d). My aim here is merely to check whether the intuition behind that example (whether or not it is accepted as a potential counterexample) can be explained by the numbers L and C .

The example is this: We draw a card from a well-shuffled standard deck. The evidence E is: the card is a spade. The two hypotheses are:

H1: the card is the ace of spades.

H2: the card is black.

The Law of Likelihood states that the evidence favors H1 because Pr(E|H1) = 1 > 1/2 = Pr(E|H2). Fitelson’s point, however, is that the evidence implies the truth of H2 (indeed, the card being a spade implies it is black), but not (at all) the truth of H1. And so the evidence really favors H2, so says Fitelson, contrary to what the Law of Likelihood states.

If we calculate the numbers L and C for this case we obtain (note that the evidence is confirming) L = 3 and C = 1/17. Thus, (****) agrees that the evidence here confirms H2 more than it confirms H1, and that this is so because (C) is violated to a larger degree than (L) is satisfied.

3.3.2. Leeds’s Counterexample

Here is an example taken from (Leeds 2004). I quote (leaving intact Leeds’s notation):

Suppose our background information is that a certain deck of cards has been drawn at random from a collection of decks of which half are all clubs and half a 4:1 mix of hearts and diamonds. Let E be the statement that the deck in fact is composed of hearts and diamonds; let H be the hypothesis that on a certain random draw a heart was drawn, H’ that the same draw produced a diamond. Then we would feel no hesitation in saying that E supports H over H’; p(E|H) and p(E|H’), however, are, depending on one’s view about what sort of probability p is supposed to be, either both 1 or both ill-defined. (Leeds 2004: 1)

I take the point of view that there is no particular difficulty in assigning probabilities here: Pr(H) = 2/5, Pr(H') = 1/10, Pr(H|E) = 4/5 and Pr(H'|E) = 1/5. I now identify H:=H2 and H’:=H1 . This allows us to calculate the number L, and we find L = 1. Hence the Law of Likelihood states that the evidence is neutral between H and H’ (which, of course, also follows directly from Leeds’s observation that both likelihoods equal 1 on the view of probability adopted here). We may also calculate C and we find C = 1/6 < 1. Condition (C) is, therefore, violated and with (L) being neutral the evidence, according to (****), unambiguously favors H2 = H, in agreement with the intuition expressed by Leeds.

And so both counterexamples follow the same pattern as does Titelbaum’s counterexample: it is the (relatively large) violation of condition (C) that underlies the intuition that there is a contradiction with the Law of Likelihood, which relies exclusively on condition (L) being satisfied.
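The values of L and C quoted for these two examples can be checked in the same way. The following sketch uses an illustrative helper `l_and_c` implementing the definitions of section 3.2, with the probabilities stated in the text.

```python
from fractions import Fraction as F

def l_and_c(p1, post1, p2, post2):
    """Return (L, C) from priors and posteriors of H1 and H2 (confirming evidence)."""
    ratio = (post1 - p1) / (post2 - p2)
    return ratio * p2 / p1, ratio * (1 - p2) / (1 - p1)

# Fitelson: H1 = ace of spades, H2 = the card is black, E = the card is a spade
fitelson_L, fitelson_C = l_and_c(F(1, 52), F(1, 13), F(1, 2), F(1))

# Leeds: H1 = H' (a diamond was drawn), H2 = H (a heart was drawn),
# E = the deck is composed of hearts and diamonds
leeds_L, leeds_C = l_and_c(F(1, 10), F(1, 5), F(2, 5), F(4, 5))

# Condition (****): H1 is favored iff LC > 1
fitelson_favors_h1 = fitelson_L * fitelson_C > 1
leeds_favors_h1 = leeds_L * leeds_C > 1
```

In both cases the product LC comes out below 1, so (****) sides with H2.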

3.4. Another Favoring Principle?

Chandler (2013) proposed a principle to be used in settling the Favoring Question, and that principle yields the Law of Likelihood. His proposal is that evidence E should count as favoring H1 over H2 iff that same evidence would favor H1 were we to learn that, in fact, either H1 or H2 is true.

We may treat the fact that either H1 or H2 is true as a different piece of evidence, \(\tilde{E}\), to be added to our background knowledge. If, before we learn \(\tilde{E}\), we have assigned probabilities Pr(H1) and Pr(H2) to the (mutually exclusive) hypotheses, which we assume to add up to less than unity, then after learning \(\tilde{E}\) we update

$$ \Pr(H_1|\tilde{E}) = \frac{\Pr(H_1)}{\Pr(H_1)+\Pr(H_2)},\\ \Pr(H_2|\tilde{E}) = \frac{\Pr(H_2)}{\Pr(H_1)+\Pr(H_2)}. $$

Obviously, \(\tilde{E}\) always constitutes confirming evidence. We may evaluate the two numbers \(\tilde{L}\) and \(\tilde{C}\) that quantify to what degrees the likelihood condition and the catchall condition are satisfied or violated. We find for evidence \(\tilde{E}\)

$$ \tilde{L} = 1,\\ \tilde{C} = \frac{\Pr(H_1)-\Pr(H_1)\Pr(H_2)}{\Pr(H_2)-\Pr(H_1)\Pr(H_2)}. $$

If one adheres to the Law of Likelihood, then the evidence \(\tilde{E}\) favors neither H1 nor H2, since \(\tilde{L}\) = 1. In that case Chandler’s principle is, I think, acceptable since we merely add to our background knowledge a fact that is neutral between H1 and H2.

On the other hand, according to the account of favoring (****), the evidence \(\tilde{E}\) favors the hypothesis whose initial probability is the larger, because \(\tilde{C}\) is not equal to 1 in such a case. There is no particular reason to accept Chandler’s principle in that case, since we are asked to change our background knowledge in a way that is not neutral between the two hypotheses.
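The values of \(\tilde{L}\) and \(\tilde{C}\) can be illustrated with a small sketch. The priors below are hypothetical (any pair of mutually exclusive hypotheses whose probabilities sum to less than 1 would do), and the helper names are illustrative.

```python
from fractions import Fraction as F

def condition_on_disjunction(p1, p2):
    """Posteriors of mutually exclusive H1, H2 after learning 'H1 or H2'."""
    total = p1 + p2
    return p1 / total, p2 / total

def l_and_c(p1, post1, p2, post2):
    """Return (L, C) as defined in section 3.2 (confirming evidence)."""
    ratio = (post1 - p1) / (post2 - p2)
    return ratio * p2 / p1, ratio * (1 - p2) / (1 - p1)

p1, p2 = F(1, 52), F(1, 4)          # hypothetical priors, H1 and H2 exclusive
post1, post2 = condition_on_disjunction(p1, p2)
L_tilde, C_tilde = l_and_c(p1, post1, p2, post2)

# Closed form for C-tilde quoted in the text
C_closed_form = (p1 - p1 * p2) / (p2 - p1 * p2)
```

Because the prior of H2 is the larger one here, \(\tilde{C}\) comes out below 1 while \(\tilde{L}\) equals 1, so on the account (****) this evidence favors H2.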

The roles of L and C in relation to Chandler’s principle become clearer by noting that after conditionalizing on \(\tilde{E}\), for any new evidence E that will come in, the two conditions (L) and (C) will coincide. This is so because H1 has become equivalent to ¬H2, thanks to \(\tilde{E}\). So, if one regards C as providing important information (in fact, equally important as L), by accepting Chandler’s principle one would have to admit it actually cannot be important.

All this does not show that the principle is wrong. Instead, whereas Chandler meant his principle to be used for arguing in favor of the Law of Likelihood, it now transpires that it is not an independent argument (neither pro nor con). For one who accepts the Law of Likelihood it is a valid principle, but for one who insists that condition (C) must be taken into account to settle the Favoring Question, Chandler’s principle is invalid.

4. Summary

My proposal is to explicate evidential favoring as follows: evidence E favors hypothesis H1 over H2 if and only if the expected utility of a given bet on the truth of H1 becomes, thanks to the evidence, larger than that of a similar given bet on H2, where the bets had the same expected utility and the same risk before the evidence came in. Here by “risk” I mean the variance (or, equivalently, the standard deviation) in the utility, and the expected utility is determined by probabilities we assigned to the hypotheses being true.

It turns out that this explication is equivalent to this statement: Evidence E favors H1 over H2 if and only if

\({\cal M}\)u(H1; E) > \({\cal M}\)u(H2; E),

where \({\cal M}\)u(H ; E) is a (novel) Bayesian confirmation measure that assigns a number to how much confidence we gained (if \({\cal M}\)u > 0) or lost (if \({\cal M}\)u < 0) in hypothesis H due to the evidence. The measure has a well-defined operational meaning and may be written as

$${\cal M_u}(H;E)=\frac{\Delta_u}{\sigma_u},$$

with ∆u the change in expected utility of our bet on H and σu the standard deviation in the utility. This ratio is independent of the amounts we potentially can win or lose on the bet. One particular way of rewriting the condition \({\cal M}\)u(H1; E) > \({\cal M}\)u(H2; E) locates quite precisely, or so I argued, the mechanism behind recently proposed counterexamples to the Law of Likelihood. Namely, by quantifying the degrees to which the likelihood condition Pr(E|H1) > Pr(E|H2) and the catchall condition Pr(E|¬H2) > Pr(E|¬H1) are satisfied (or violated) for confirming evidence by two numbers L and C, respectively, being larger than 1 (or smaller than 1), we can write

Confirming evidence E favors H1 over H2 iff LC > 1.

This shows that the explication of favoring offered here assigns, in a specific sense, equal weights to the likelihood condition and the catchall condition. Recently proposed counterexamples to the Law of Likelihood can all be understood by noting that the likelihood condition is satisfied to a (relatively) small degree, whereas the catchall condition is violated to a (much) larger degree. Given the explication advocated here, all such examples are genuine counterexamples to the Law of Likelihood.

Appendix

There are other ways to quantify the degrees to which conditions (L) and (C) are either satisfied or violated. Natural ways of achieving this involve either taking the ratio of left- and right-hand sides (followed by taking the logarithm) or by taking differences. I consider these two possibilities in turn.

Taking Ratios

We may rewrite condition (L) by first taking the ratio of the left- and right-hand sides and then taking the logarithm. This manipulation shows that (L) is equivalent to the condition that Lr be positive, with

$$L_r:=\ln \frac{\Pr(H_1|E)\Pr(H_2)}{\Pr(H_1)\Pr(H_2|E)}.$$

Similarly, condition (C) can be rephrased as Cr > 0, with

$$C_r:=\ln \frac{[1-\Pr(H_1)][1-\Pr(H_2|E)]}{[1-\Pr(H_1|E)][1-\Pr(H_2)]}.$$

Now the log likelihood ratio l (Good 1960), defined as

$$l(H;E)=\ln \frac{\Pr(H|E)(1-\Pr(H))}{\Pr(H)(1-\Pr(H|E))},$$

emerges naturally here via the relation

l(H1 ; E) − l(H2 ; E) = Lr + Cr.

Taking Differences

Likelihood functions are considered equivalent when they are the same up to a proportionality constant (and in that case the likelihood functions are said to be “proportional”). That is, if for evidence E and evidence E* and for all hypotheses H in some given domain we have Pr(E|H) = k Pr(E*|H) for some constant k, then E and E* are considered as providing in essence the same evidence. By taking ratios of the left-hand and right-hand sides of (L) and (C) in the preceding subsection, this proportionality constant dropped out, and it led to a unique way of quantifying the degrees to which (L) and (C) are satisfied or violated. In contrast, when we are going to take differences in this subsection, there is still some freedom to include one or the other constant. This freedom will lead us to several different (ordinally inequivalent) Bayesian confirmation measures.[9]

The first measure we encounter is one proposed in (Nozick 1981): n(H ; E) = Pr(E|H) − Pr(E|¬H). By straightforwardly taking the differences between left- and right-hand sides of (L) and (C) we obtain

Ln := Pr(E|H1) − Pr(E|H2),

and

Cn := Pr(E|¬H2) − Pr(E|¬H1),

in terms of which we find

n(H1; E) − n(H2; E) = Ln + Cn.

Rather than taking the difference between left- and right-hand sides of (L) and (C) directly, we may first divide out a common factor of Pr(E) and then multiply both sides by the prior probabilities Pr(H1) and Pr(H2) (for (C), by the prior probabilities of their negations), and only then take differences. This leads to the following observation. We rewrite condition (L) as Ld > 0, with

Ld := Pr(H2) Pr(H1|E) − Pr(H1) Pr(H2|E).

Similarly, we rewrite condition (C) as Cd > 0, with

Cd := [1 − Pr(H1)][1 − Pr(H2|E)] − [1 − Pr(H2)][1 − Pr(H1|E)].

We see then that Carnap’s difference measure d(H ; E) = Pr(H|E) − Pr(H) arises in this context via

d(H1; E) − d(H2 ; E) = Ld + Cd.

Using (BCT)

When we insert any of the three confirmation measures, d, n, or l, into (BCT) we see that (L) and (C) play equal roles in answering the Favoring Question. Moreover, there are no other conditions than (L) and (C) needed to settle that question on these accounts. (These are exactly the same conclusions that followed for \({\cal M}\)u.)
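The three decompositions of this Appendix can be verified numerically. The sketch below does so for the probabilities of Titelbaum's example (section 3.1), for which Pr(E) = 1/4; the function names are illustrative, and the likelihoods are obtained from priors and posteriors via Bayes' theorem.

```python
from fractions import Fraction as F
from math import log, isclose

p1, post1 = F(1, 52), F(1, 13)      # H1: prior and posterior
p2, post2 = F(1, 4), F(12, 13)      # H2: prior and posterior
pE = F(1, 4)                        # Pr(E): the card is a Heart

def lik(prior, post):
    """Pr(E|H), from Bayes' theorem."""
    return post * pE / prior

def lik_neg(prior, post):
    """Pr(E|not-H), from Bayes' theorem applied to the negation."""
    return (1 - post) * pE / (1 - prior)

def d(prior, post):                 # Carnap's difference measure
    return post - prior

def n(prior, post):                 # Nozick's measure
    return lik(prior, post) - lik_neg(prior, post)

def l(prior, post):                 # Good's log likelihood ratio
    return log(post * (1 - prior) / (prior * (1 - post)))

Ld = p2 * post1 - p1 * post2
Cd = (1 - p1) * (1 - post2) - (1 - p2) * (1 - post1)
Ln = lik(p1, post1) - lik(p2, post2)
Cn = lik_neg(p2, post2) - lik_neg(p1, post1)
Lr = log(post1 * p2 / (p1 * post2))
Cr = log((1 - p1) * (1 - post2) / ((1 - post1) * (1 - p2)))
```

The difference-based identities hold exactly with fractions, while the logarithmic one is checked up to floating-point tolerance.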

The relation between the numbers Lm and Cm and the three confirmation measures m = n, l, d is of the simple form m(H1; E) − m(H2; E) = Lm + Cm. This may, at first sight, look quite different from the relation between the measure \({\cal M}\)u and the numbers L and C. It is, however, straightforward to turn the latter into a similar form: just rewrite (for confirming evidence, so that \({\cal M}\)u is positive)

$$\log({\cal M_u}(H_1;E))-\log({\cal M_u}(H_2;E))=\tfrac{1}{2}\log(L)+\tfrac{1}{2}\log(C).$$

For completeness I note, finally, that for the ratio measure r(H ; E) only Lr matters for settling the Favoring Question, and for Gaifman’s measure g(H ; E) only Cr does, when we use these measures in (BCT).

References

Notes

    1. There may not be a uniquely correct answer. The working assumption here is that there is one.

    2. The background knowledge K is assumed to stay fixed here (with a small exception in section 3.4) and explicit reference to K may be suppressed.

    3. Given a specific bet on H1, there are two numbers pertaining to the bet on H2, w2 and l2, left to be determined. Hence we need two conditions (or constraints) to determine these. The equality of expected utilities yields one such condition and we need one more.

    4. Both the Law of Likelihood and (G) can be reformulated so as to take the form of (BCT) by choosing appropriate confirmation measures. This is well known in the former case (one uses the ratio measure, r(H ; E) = Pr(H|E)/ Pr(H), or the logarithm thereof), and in the latter case one would use Gaifman’s measure: g(H ; E) = Pr(¬H) / Pr(¬H|E) (Gaifman 1985).

    5. Note that there is no desire here to minimize risk, or, for that matter, to maximize it, but just to equalize the two risks.

    6. If exactly one of the conditions’ inequalities is changed into an equality, the conclusion is still assumed to follow.

    7. If the evidence disconfirms both hypotheses, then (L) and (C) are fulfilled when L < 1 and C < 1, respectively. It is still true that \({\cal M}\)u (H1 ; E)/\({\cal M}\)u (H2 ; E) = \(\sqrt{LC}\). The case in which evidence confirms one hypothesis and disconfirms the other will be ignored here, as the Favoring Question is unambiguously and trivially settled in that case on any reasonable account.

    8. Since this may be an unusual way to restate the Likelihood condition, let me give a derivation: (L) says Pr(E|H1) > Pr(E|H2). As long as Pr(E)>0 and neither hypothesis is impossible, this is equivalent to Pr(H1|E) / Pr(H1) > Pr(H2|E) / Pr(H2). Subtracting 1 from each side gives Pr(H1|E) / Pr(H1) − 1 > Pr(H2|E) / Pr(H2) − 1. For confirming evidence both sides are positive quantities, and so we may divide both sides by the quantity on the right-hand side and keep the inequality sign. (For disconfirming evidence, we have to flip the sign when performing the same division.) Multiplying, finally, the numerator and the denominator of the resulting left-hand side by Pr(H1) Pr(H2) yields the condition L > 1 of the text. The inequality C > 1 is derived from condition (C) in an analogous way.

    9. Two confirmation measures are ordinally equivalent if and only if for all pairs of hypotheses H and H' and for all pairs of pieces of evidence E and E' they agree on which of the two numbers \({\cal M}\)(H ; E) and \({\cal M}\)(H' ; E') is the larger one.