If we assume that a false proposition is true, we can prove anything (ex falso quodlibet). Bertrand Russell, so the story goes, once mentioned this in class. A student raised his hand and challenged him: in that case, prove that 1=0 implies that you’re the Pope. Russell promptly obliged (see below). Bernoulli 1738, as discussed in an earlier post, contains two contradictory definitions of expected utility theory. This contradiction amounts to a false proposition, and that means any statement can be proved using expected utility theory. As an illustration, let’s prove that Bertrand Russell is the Pope.

Let’s start with Russell’s proof that 1=0 implies he’s the Pope. Russell said the following.

False Proposition:

(Eq.1) 1=0

Theorem 1:  I am the Pope.

Proof:  Add 1 to both sides of (Eq.1): then we have 2 = 1. The set containing just me and the Pope has 2 members. But 2 = 1, so it has only 1 member; therefore, I am the Pope.


This makes our work a lot easier. We only have to prove that 1=0, using expected utility theory, and then, by Theorem 1, we know that Bertrand Russell is the Pope.

Here’s the strategy. We identify the contradiction in Bernoulli 1738 and show that it implies 1=0. The contradiction will be of the following form: “v=1 and v=0.” Since v=v, this implies that 1=0, so that is what we’ll go for. We find a place where Bernoulli says v=1, and then another place where he says v=0. That’s all we need to do; the rest was done by Russell.

On p. 24 Bernoulli writes the following:


At least since Laplace 1814, this has been interpreted as follows: the value of an uncertain prospect is the expected change in utility induced by it (Bernoulli converts this utility change into an equivalent certain monetary change, but we won’t do that, to keep things simple). In symbols, if u is the utility function, v is the value of the prospect, and \langle \cdot \rangle the expectation operator, we have

(Eq. 2) v=\langle \Delta u \rangle

To really keep things simple, let’s work with a trivial gamble: our initial wealth is x, we have to pay a fee F, and we are guaranteed (probability 1) to receive a payout G. According to (Eq.2) the value of this trivial gamble is then simply

(Eq.3) v = u(x+G-F) -u(x)

If this number is positive, we should take the gamble; if it’s negative, we should stay away from it. To be specific, let’s use the logarithmic utility function proposed by Bernoulli, u(x)=\ln x, and the following parameters:

F=\$1

G=\$3.43 (\text{or, more precisely, } \$\frac{e+1+\sqrt{e^2+2e-3}}{2})

x= \$1.41 (\text{or, more precisely, } \$\frac{G-F}{e-1})

Evaluate (Eq.3) with these parameters, and you’ll find that v=1.
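
To see why, note that with u(x)=\ln x the difference of logarithms in (Eq.3) is the logarithm of a ratio, and x was chosen so that (G-F)/x = e-1. A short worked version, which holds for any fee F as long as x is defined this way:

```latex
\begin{aligned}
v &= \ln(x+G-F) - \ln x \\
  &= \ln\left(1 + \frac{G-F}{x}\right) \\
  &= \ln\bigl(1 + (e-1)\bigr) = \ln e = 1 .
\end{aligned}
```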

Later in the paper, on p.27, Bernoulli contradicts himself (referring to an equation on p.26, see this longer blog post for details).


Bernoulli accompanied this statement with a figure (original Latin version (1738), German version (1896), English version (1954) [update 2020-07-30: an earlier version of this post wrongly stated that the figure was not included in the original 1738 version. This has now been corrected.]). Written as an equation this gives us a different expression for the value v, namely

(Eq.4) v= u(x+G)-u(x) - [u(x)-u(x-F)].
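
With logarithmic utility, (Eq.4) collapses to a single logarithm. The sketch below takes F=\$1, the fee consistent with the stated values of x and G (treat it as an assumption), and shows where the precise expression for G comes from:

```latex
\begin{aligned}
v &= \ln\frac{x+G}{x} - \ln\frac{x}{x-F}
   = \ln\frac{(x+G)(x-F)}{x^2} , \\
v = 0 \;&\Longleftrightarrow\; (x+G)(x-F) = x^2
        \;\Longleftrightarrow\; x\,(G-F) = G F . \\
\text{With } x &= \frac{G-F}{e-1} \text{ and } F = 1: \quad
(G-1)^2 = (e-1)\,G
\;\Longleftrightarrow\; G^2 - (e+1)\,G + 1 = 0 , \\
G &= \frac{e+1+\sqrt{e^2+2e-3}}{2} \approx 3.43 .
\end{aligned}
```

The larger root is the relevant one: the smaller root gives G<F and therefore a negative wealth x.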

Evaluating this expression with our chosen parameters yields v=0. Since in both cases we have evaluated the same quantity — the value of the same prospect to the same person of the same wealth x, according to Bernoulli’s expected utility theory — we have shown that according to expected utility theory v=1 and v=0. Since that implies 1=0, using Russell’s Theorem 1, we have also shown the following:

Expected utility theory proves that Bertrand Russell is the Pope.
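
The two evaluations can also be checked numerically. The following is a minimal Python sketch, not part of Bernoulli's text; the function names `value_eq3` and `value_eq4` are hypothetical, and the fee F = $1 is an assumption consistent with the stated x ≈ $1.41 and G ≈ $3.43.

```python
import math

# Assumption: fee F = $1, consistent with the stated x and G.
F = 1.0
G = (math.e + 1 + math.sqrt(math.e**2 + 2 * math.e - 3)) / 2
x = (G - F) / (math.e - 1)


def value_eq3(u, x, G, F):
    """(Eq.3): utility change from the net wealth change G - F."""
    return u(x + G - F) - u(x)


def value_eq4(u, x, G, F):
    """(Eq.4): utility gain from G minus utility loss from F."""
    return (u(x + G) - u(x)) - (u(x) - u(x - F))


# Logarithmic utility, u(x) = ln x, as proposed by Bernoulli.
v3 = value_eq3(math.log, x, G, F)
v4 = value_eq4(math.log, x, G, F)

print(f"G = {G:.2f}, x = {x:.2f}")
print(f"Eq.3 gives v = {v3:.6f}")
print(f"Eq.4 gives v = {abs(v4):.6f} (in absolute value)")
```

Up to floating-point rounding, (Eq.3) returns 1 and (Eq.4) returns 0 for the same prospect. Passing a linear utility such as `lambda w: w` instead of `math.log` makes both functions return G - F, so the discrepancy only appears for nonlinear u.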


16 thoughts on “Economics 101: Bertrand Russell is the Pope”

  1. Cute but isn’t that a result of using a logarithmic utility function? What Bernoulli tried to say, I think, is that one person who could be on opposite sides of a wager should place the same utility on their potential gains as on their potential losses. That precludes a logarithmic utility function I think.



    1. Hey JP! Thanks for your comment. No, the argument is not specific to logarithmic utility. With a different function you’d just choose different gamble parameters and get the same result. Actually, there are many ways of doing this — the point is Bernoulli accidentally introduced two contradictory criteria for gamble evaluation.

      With linear utility the problem goes away (when gains and losses have equal utility-weight). But linear utility is the same as working with expected wealth — the setup that caused all the problems Bernoulli wanted to solve. So not just cute but fatal for any formalism built on this.


      1. | With a different function you’d just choose different gamble parameters and get the same result.
        Not if we choose u(x) = x.


      2. Yes (like I said in the second paragraph of my response to JP). Now let’s think one step further: utility theory was introduced to deal with the problems that arise when we optimize the expectation value of wealth x. The idea was to optimize instead the expectation value of some function of x, which is called the utility function u(x). But if that function is identical to x, meaning u(x)=x, then we’re not actually using expected utility theory at all. We’re back to square one. In other words, this problem with expected utility theory is indeed fatal unless you’re not using expected utility theory (no surprise there).


  2. Eq.3 looks familiar to me; but I don’t recall anyone using something like Eq.4 in the last 40 years. Where has it been used recently rather than Eq.3?

    Bernoulli might have confused Eq.4 with Eq.3; but I don’t think that mistake is made much anymore.


  3. David, thanks for your comment. The problem is that no one seems to know that Eq.4 exists in Bernoulli and is different from what everyone thinks he said. That creates the danger that whenever someone digs out Bernoulli’s original and reads the wrong bit (the bit where he’s actually being quantitative), he will come away with something that’s inconsistent with modern expected utility theory (which is conceptually wrong in other ways).

    It has had a catastrophic influence on the path that modern economic theory has taken. Here is how: Menger dug up Bernoulli and read the wrong bit. That led him to conclude (wrongly) that utility functions have to be bounded. This excludes logarithmic utility (and linear, and power-law…). In 1956 Kelly wrote his paper, which for the umpteenth time recognized that something is not quite right in how people had conceptualized randomness. No one has been quite as explicit as we have about this, but Kelly basically optimized time-average growth rates.

    Someone then noticed that that’s like optimizing logarithmic utility, which is unbounded and not allowed according to Menger’s flawed argument. Samuelson called Menger’s paper “a modern classic that stands above all criticism,” and other influential economists said similar things.

    But this ruled out optimizing growth rates over time — the most natural economic thing to do.

    At least following Kelly, economics should have turned a corner and started asking itself what it was doing with these bizarre utility functions and what their actual physical meaning was. But that didn’t happen for the reasons I just mentioned. So this error really kept economics on the wrong track.

    Our work is to re-do economics by switching tracks and interpreting randomness and probability theory in the right way, where a distinction is made between frequencies over time and frequencies within an ensemble.

    You can find earlier references than Kelly, by the way, that should have led to a better form of economics. Ito’s 1944 paper should have definitely done it. Also Whitworth’s 1870 book. The St Petersburg paradox should have put people on the right track in 1713. There are wonderful passages in Cardano’s 16th-century book “On games of dice” that show a clear understanding that the effect of time is different from that of an ensemble. You could even argue that Solomon had the right idea, though the further you go back in time, the less quantitative the statements become (probability theory is very young, starting in 1654).


    1. Like David said, Eq 3 is familiar but Eq 4 is not. It is correctly translated from the paper (the analysis on pg 27) so it does appear that DB made an error:

      Eq 3 correctly nets out profit and loss before applying the utility fn.
      Eq 4 does not, equating E[u(w0 + random profit)] = -u(w0 – certain fee)

      But doesn’t modern EUT follow Eq 3? If so, isn’t DB’s mistake relatively benign?


      1. It’s hard to tell how significant a role Bernoulli’s error played in the development of economics in the 20th and 21st centuries. The Menger error is fairly serious (see the message you replied to) and a direct consequence of Bernoulli. It held back a reasonable interpretation of utility theory, namely that it’s simply a weirdly phrased form of growth rate optimization. It has nothing to do with psychology; it’s just evolution 101: “since the thing that grows fastest starts to dominate in evolution, everything we observe today is quite good at maximizing growth.”

        I also think it’s significant that the enormous bubble of behavioral economics started after the Econometrica publication of Bernoulli. Approximately no one was reading Bernoulli’s original in the 19th century — because there were re-tellings of Bernoulli’s work by Laplace (1814) in French, and by Todhunter (1865) in English, neither of which made the mistakes Bernoulli made. Economics only fell into the self-referential circular psychological black hole after Bernoulli’s original erroneous work became widely available. Kahneman and Tversky’s work seems a direct consequence of the error in Bernoulli. It’s very easy, starting from Bernoulli, to take the wrong path into the behavioral story about a missing reference value and human irrationality.

        But I guess we’ll never know exactly who read the Bernoulli translation and what they did as a consequence of it. The most important thing is to make it widely known that Bernoulli’s paper has an error and you shouldn’t let that confuse you. It’s early work, and like most early work it’s flawed, the concepts weren’t clear to the author and gaps had to be filled in by later researchers.

        The actual story is not utility theory, nor is it prospect theory or similar (those are “wrong corrections”). The actual story is ergodicity: just compute what happens over time. In the important dynamical model of (noisy) exponential growth that means computing the exponential growth rate, which is exactly Eq.3 (where the time unit is set to 1 and not written explicitly).


    2. Did Kelly refer to Bernoulli? I know he mentioned the utility axioms of Morgenstern and von Neumann. If Bernoulli’s paper was translated and published in 1954 (Econometrica), it’s possible he wasn’t even familiar with Bernoulli. And, recall that Shannon’s 1948 paper was the linchpin.


  4. That’s very interesting. I’m familiar with Ito’s paper (it figures prominently in the study materials for the third Actuarial Exam); but I didn’t hear about it when I was studying Economics in graduate school.


  5. How is this true? For the second equation he clearly says “in a fair game” which is not the case of the bet chosen here, so there is no contradiction.


  6. In a parallel world, Bertrand Russell says, “I can do better, I will prove to you that I am God.”

