Not all academic fields have a clear starting point, a seminal paper that constitutes the foundation of the entire discipline. But economics does. The paper that defines modern formal economics was written by Daniel Bernoulli in 1738. It introduces expected utility theory. The main thrust of our work is, of course, to replace expected utility theory and instead work with time-average growth rates of wealth. I’ll mention how that works, but the focus of this post will be on something else. Bernoulli’s paper is not only conceptually misleading but also technically flawed in a sneaky way that keeps confusing everyone. Where Bernoulli determines the price to be paid for a risky prospect, he contradicts himself. I wouldn’t make such a fuss about this if the paper weren’t so absolutely crucial. This basis of economics contains an error that invalidates commonly held beliefs and puts tens of thousands of studies into a different light. I recently encouraged people, on Twitter, to read the paper and see for themselves. In this blog post I go through the relevant analysis step by step and address questions that came up in response to the tweet.


Bernoulli’s paper was re-published in 1954 in Econometrica (if you don’t like paywalls, here’s a free pdf). This is the standard translation from Latin, and page numbers below refer to it. I have not read the original Latin paper, but the error is spelled out in words, visually in a figure, and also in an equation, so this is not something lost in translation. If you have a copy of the original, please send it to me. [2018-02-18 addendum: a scan of the original is here.] The paper is so fundamental to economics that more than 200 years after its publication it was fished out of the proceedings of the Papers of the Imperial Academy of Sciences in Petersburg (Vol. V, 1738, pp. 175-192), translated into English, and published in the leading journal of the field.

What is expected utility theory?

Expected utility theory (EUT) is a form of decision theory. Imagine someone offers you a lottery ticket. Prior to Bernoulli, it was assumed that people roughly maximize the expected change in wealth resulting from such a gamble. That is, if initial wealth is x(t) and possible changes in wealth are \delta x then you’d maximize

(Eq.1)   \langle x(t+\delta t) - x(t) \rangle = \langle \delta x\rangle.

Here, \langle \cdot \rangle is the expectation operator. Some people like to denote it by \mathbb{E} but it’s the same thing. If \langle \delta x\rangle is positive, you’d buy the ticket; if it’s negative, you wouldn’t. But observations soon convinced everyone that this is not how people behave.

EUT was introduced as a refinement of this decision criterion. It says that you will choose the action that maximizes the expected change, not of your wealth but of your utility. The insight here is that people don’t evaluate gambles in isolation but with respect to a reference level (their initial wealth). An extra dollar is worth less to me if I’m rich than if I’m poor. Mathematically, a so-called utility function u(x) is introduced, where x represents wealth.

The most commonly used utility function is u(x)=\log x. This is motivated by assuming that the extra utility someone attaches to an extra dollar is inversely proportional to the wealth that that someone already has, p.25:


Later on Bernoulli writes this assumption as the differential equation du = 1/x dx, whose solution is the logarithm. Let’s write down what EUT wants us to maximize:

(Eq.2)   \langle \delta u \rangle = \langle \ln[x(t+\delta t)]-\ln[x(t)] \rangle.

What’s our conceptual critique?

By re-writing this object you will immediately see that it actually means something else, and that’s the essence of our conceptual critique. Logarithms turn division into subtraction, \ln(a/b)=\ln a -\ln b. So (Eq.2) can be re-written as

(Eq.3)   \langle \delta u \rangle = \left\langle\ln\left(\frac{x(t+\delta t)}{x(t)}\right) \right\rangle.

Now divide both sides by \delta t

(Eq.4)   \frac{1}{\delta t}\langle \delta u \rangle = \frac{1}{\delta t} \left\langle\ln\left(\frac{x(t+\delta t)}{x(t)}\right) \right\rangle.

What’s this? In any field other than economics, this object is called the expected exponential growth rate of wealth. No utility required, no psychology required. Digging just a little bit deeper, the whole story comes to light. Wealth dynamics are fundamentally multiplicative: we can invest wealth to generate more wealth. For such dynamics the exponential growth rate is an ergodic observable, which means its expectation value tells us what happens in the long run. Mystery solved! People just optimize what happens to their wealth over time. Crucially, additive wealth changes, \delta x, are not ergodic, which is why the expectation value in (Eq.1) does NOT tell us what happens over time.
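To make the difference between (Eq.1) and (Eq.4) concrete, here is a minimal Python sketch. The gamble is my own illustrative assumption, not something taken from Bernoulli: in every round, wealth is multiplied by either 1.5 or 0.6 with equal probability, and \delta t is one round.

```python
import math

# Hypothetical gamble (assumed for illustration only):
# each round, wealth is multiplied by 1.5 or by 0.6 with equal probability.
factors = [1.5, 0.6]
probs = [0.5, 0.5]
x = 100.0  # initial wealth in arbitrary units

# (Eq.1): expected additive change in wealth over one round.
expected_dx = sum(p * (f * x - x) for p, f in zip(probs, factors))

# (Eq.4) with delta t = 1 round: expected exponential growth rate of wealth.
expected_growth = sum(p * math.log(f) for p, f in zip(probs, factors))

print(f"expected change in wealth per round: {expected_dx:+.4f}")
print(f"expected exponential growth rate:    {expected_growth:+.4f}")
```

For these assumed numbers the expected change in wealth is positive while the expected exponential growth rate is negative, so the two criteria disagree about whether to play.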

Economics textbooks and papers miss this point. The problem is treated in an a-temporal space, as a so-called “one-shot game”. The mathematics is the mathematics of things happening in parallel universes, not of things happening over time. Where time is mentioned, it is usually wrongly assumed that \langle \delta x \rangle indicates what happens over time. Endless arguments ensue over whether \ln x is the correct psychological re-weighting people apply to wealth. In reality this is a question about dynamics, with psychology as a second-order effect: of course we can be deceived, confused, or stupid and not act in our best interest. It’s also far from easy to put a number on wealth x(t), or indeed on the different \delta x and their probabilities involved in real-world decisions.

The history of economics is a wonderful example of how a basic conceptual error prevents the detection of technical errors and inconsistencies. Without the right concepts it’s just not possible to ask the right questions, let alone find consistent answers.

How much would I pay for a lottery ticket?

Let’s put our conceptual troubles aside for a moment and use EUT to address precisely one of the questions Bernoulli asked. My initial wealth is x, I have to pay a ticket fee F, and I can win prizes G(n) with probabilities p(n), where 1\leq n \leq N is a random integer denoting which of the N possible prizes I win. We imagine the lottery to take a time \delta t that doesn’t depend on which prize is won.

EUT says: find the value F_{\text{max}} that makes the expected change in utility zero. If I pay more, the lottery will have a negative expected utility change, and EUT would advise me not to take part. Mathematically, here is the object we have to compute:

(Eq.5)    \langle\delta u \rangle= \sum_n p(n) u(x+G(n)-F) - u(x)

That’s it. Now vary F so that the expected change in utility becomes zero. The value of F where that happens is the maximum fee I should pay for the lottery.
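The recipe can be sketched in a few lines of Python. This is only an illustration under stated assumptions: logarithmic utility, non-negative prizes, and a made-up two-prize lottery. The function names expected_utility_change and max_fee are hypothetical helpers of mine, and the root of (Eq.5) is located by simple bisection.

```python
import math

def expected_utility_change(x, fee, prizes, probs):
    """(Eq.5) with u = ln: sum_n p(n) ln(x + G(n) - F) - ln(x)."""
    return sum(p * math.log(x + g - fee) for p, g in zip(probs, prizes)) - math.log(x)

def max_fee(x, prizes, probs, tol=1e-10):
    """Bisect on the fee F until (Eq.5) is zero.

    Assumes non-negative prizes, so (Eq.5) is >= 0 at F = 0, and uses that
    (Eq.5) decreases in F and diverges to -infinity as F -> x + min(prizes).
    """
    lo, hi = 0.0, x + min(prizes)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_utility_change(x, mid, prizes, probs) > 0:
            lo = mid  # still worth buying at this fee: root lies higher
        else:
            hi = mid  # not worth buying at this fee: root lies lower
    return 0.5 * (lo + hi)

# Made-up example: wealth 100, win nothing or 50 with equal probability.
x, prizes, probs = 100.0, [0.0, 50.0], [0.5, 0.5]
fee = max_fee(x, prizes, probs)
print(f"maximum fee according to EUT: {fee:.4f}")
```

For this particular example the condition reduces to (100-F)(150-F)=100^2, so the numerical result can be checked against the smaller root of that quadratic.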

Bernoulli’s error

Bernoulli makes an error when he talks about this fee. The key figure is on p.26, and I’ve scribbled in red a translation of Bernoulli’s notation into the symbols used in Peters and Gell-Mann (2016).


Let me talk you through the figure. The horizontal axis represents wealth x, and the vertical axis represents utility, so that the solid curve is the utility function u(x).

Let’s focus on the solid horizontal line first. The point B is the initial wealth x. The point p, to the left of B, represents the wealth x-F, that is, the initial wealth minus the ticket fee. The points C, D, E, F are wealth levels arrived at by adding to the initial wealth one of the possible prizes G(1), G(2), G(3), G(4).

You may already suspect the problem: let’s imagine outcome 1 occurs in the lottery, meaning we receive the prize G(1). In that case, our wealth will change to x+G(1)-F, and not x+G(1) (which is the position of point C). We have to subtract the fee! The points C, D, E, F marked by Bernoulli have no relevance to the problem (unless the fee is zero).

Now let’s walk along the utility curve. The point o marks the utility of wealth x-F, so that the dashed line po is the drop in utility associated with a drop in wealth from x to x-F. The points G, H, L, M (to the right of B along the utility curve) represent the utilities u(x+G(1)), u(x+G(2)), u(x+G(3)), u(x+G(4)) associated with the irrelevant wealth levels. The point O represents the expected utility assuming prizes are received with their respective probabilities but no fee is paid, p.26


In Peters and Gell-Mann (2016) we call this object \langle \delta u^+ \rangle := \sum_n p(n) u(x+G(n)) - u(x), for want of a better symbol.

Now comes Bernoulli’s error. He claims that the lottery is to be valued by comparing \langle \delta u^+ \rangle to the loss in utility one would suffer if one were to buy the lottery ticket for its fee but not receive any prize. The key passage is on p.26–27



This amounts to a decision criterion, different from (Eq.5) and inconsistent with EUT. We denote with the symbol \delta u^- what Bernoulli calls “the disutility to be suffered by losing” (Bp in his notation). Now we can write Bernoulli’s decision criterion in symbols. Bernoulli tells us to buy a ticket if the following quantity is positive, and not to buy it if it’s negative

(Eq.6)   \langle \delta u^+\rangle - \delta u^- = \underbrace{\langle u(x+G) \rangle - u(x)}_{ \langle \delta u^+\rangle} - \underbrace{[u(x) - u(x-F)]}_{\delta u^-}.

Error or different model?

In principle, one could now say that Bernoulli just had a different model of people’s behavior than does modern economics. That would be bad enough because modern economics claims that it has the same model as Bernoulli. It’s more likely that Bernoulli got confused specifically when he tried to find the maximum fee. Otherwise he seems consistent with EUT, for instance on p.24, where he describes the certain profit (“the value of the risk in question”) that corresponds to an uncertain profit



Let’s really kill this. Why is (Eq.6) not a good decision criterion?

  1. It is easy to construct an example where the maximum fee I should pay, F_{\text{max}}, is smaller than the smallest possible prize in the lottery, but the criterion still tells me to refrain from buying a ticket. That makes no sense: such a lottery has no downside risk. I’m guaranteed a positive net profit, the only uncertain element is how much better off I will be after the game. Intriguingly — have another look at the figure — the maximum fee calculated by Bernoulli is smaller than the smallest possible prize, assuming that one of the prizes will be won (the distance between B and p is smaller than the distance between B and C).
  2. Another problem is this: how much should I be willing to pay for one dollar, according to Bernoulli? What? … for one dollar? One dollar, of course! Yes, but not according to Bernoulli. Try it out: set the prize equal to the fee, G=F. Then \langle \delta u^+\rangle = u(x+F)-u(x) < u(x)-u(x-F) = \delta u^- for any non-zero number of dollars I buy at face value, due to the concavity of the utility function. Therefore, Bernoulli’s criterion tells me that no amount of money is worth that amount of money. Come on! That’s nonsense!
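Both objections can be checked numerically. The following Python sketch assumes logarithmic utility and uses made-up numbers; eut_criterion and bernoulli_criterion are hypothetical names for (Eq.5) and (Eq.6).

```python
import math

def eut_criterion(x, fee, prizes, probs, u=math.log):
    """(Eq.5): expected utility after prize and fee, minus initial utility."""
    return sum(p * u(x + g - fee) for p, g in zip(probs, prizes)) - u(x)

def bernoulli_criterion(x, fee, prizes, probs, u=math.log):
    """(Eq.6): expected utility gain without fee, minus disutility of the fee."""
    du_plus = sum(p * u(x + g) for p, g in zip(probs, prizes)) - u(x)
    du_minus = u(x) - u(x - fee)
    return du_plus - du_minus

# Objection 1 (assumed numbers): wealth 10, prizes 10 or 20, fee 9.9.
# The fee is below the smallest prize, so a net profit is guaranteed.
safe_eut = eut_criterion(10.0, 9.9, [10.0, 20.0], [0.5, 0.5])
safe_bernoulli = bernoulli_criterion(10.0, 9.9, [10.0, 20.0], [0.5, 0.5])

# Objection 2: buying one dollar for one dollar at wealth 100.
one_dollar_eut = eut_criterion(100.0, 1.0, [1.0], [1.0])
one_dollar_bernoulli = bernoulli_criterion(100.0, 1.0, [1.0], [1.0])

print("no-downside lottery:", safe_eut, safe_bernoulli)
print("one dollar for one dollar:", one_dollar_eut, one_dollar_bernoulli)
```

In both cases (Eq.6) comes out negative, advising against the purchase, while (Eq.5) is positive for the no-downside lottery and zero for the one-dollar trade.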

Bernoulli in practice

Later, on p.33, Bernoulli works through the specific case of the St. Petersburg lottery. This lottery is defined by p(n)=\left(\frac{1}{2}\right)^n and G(n)=2^{n-1}, with n any positive integer. But his method is so nonsensical that he does something really curious. He wants to evaluate his criterion (Eq.6) but realizes that that’s cumbersome. He then says that if wealth is very large and the utility function can be considered linear, his criterion can be approximated by (Eq.5), meaning actual EUT. So he uses actual EUT as an approximation:


Bernoulli’s notation is different from ours: his \alpha is initial wealth (we use x) and his x is the fee (we use F). Let’s put this back into our notation and show that he really is using criterion (Eq.5) and not his own criterion (Eq.6).

We have

(Eq.7)   \sqrt[2]{x+G(1)-F}\sqrt[4]{x+G(2)-F}\cdots=x

…take the logarithm

(Eq.8)   \frac{1}{2} \ln(x+G(1)-F) +\frac{1}{4} \ln(x+G(2)-F) + \cdots=\ln(x)


(Eq.9)   \sum_n \left(\frac{1}{2}\right)^n \ln(x+2^{n-1}-F)=\ln(x)

…simplify even more and subtract \ln(x)

(Eq.10)   \langle \ln(x+G-F)\rangle -\ln(x) = \langle \delta u \rangle = 0.

This sets the expected change in logarithmic utility to zero, to determine the maximum fee I should pay, as required by EUT. Bernoulli does not use his own criterion (Eq.6) but the generally accepted criterion (Eq.5).
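For readers who want to play with this, here is a rough Python sketch that solves (Eq.9) for F numerically. It rests on assumptions of mine: the infinite sum is cut off after 100 terms (the neglected tail is tiny because the probabilities halve with every term), the root is found by bisection, and the function names are hypothetical.

```python
import math

def st_petersburg_utility_change(x, fee, n_terms=100):
    """Left-hand side of (Eq.9) minus ln(x), truncated after n_terms terms."""
    total = 0.0
    for n in range(1, n_terms + 1):
        # p(n) = (1/2)^n and G(n) = 2^(n-1), as in the text.
        total += 0.5 ** n * math.log(x + 2.0 ** (n - 1) - fee)
    return total - math.log(x)

def st_petersburg_max_fee(x, tol=1e-10):
    """Bisect on the fee; it must stay below x + G(1) = x + 1 for ln to exist."""
    lo, hi = 0.0, x + 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if st_petersburg_utility_change(x, mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

fee_100 = st_petersburg_max_fee(100.0)
fee_1000 = st_petersburg_max_fee(1000.0)
print(f"maximum fee at wealth 100:  {fee_100:.4f}")
print(f"maximum fee at wealth 1000: {fee_1000:.4f}")
```

The maximum fee depends on initial wealth, which is exactly the reference-level dependence built into (Eq.5).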

Depending on which part of Bernoulli we read, and how carefully we read it, we will come away with a different impression of what utility means and what EUT is. Not surprisingly, the economics literature is littered with arguments and disagreements and invalid studies that seem to arise from a confused use of EUT.

Serious consequences

This is a blog post, so let me speculate about the sort of trouble Bernoulli has caused over the centuries. There are special conditions under which criteria (Eq.5) and (Eq.6) are identical. Assuming that they are identical in general (which is often — wrongly — done), one would implicitly assume such conditions. I’ll tell you about two cases with troubling consequences (there are more).

  1. The fee is zero, F=0. Check this for yourself: set F=0 in (Eq.5) and (Eq.6). They really are the same then. But that’s not a very interesting case: of course I should “buy” the ticket if it doesn’t cost anything. Why not? The hidden assumption of zero fee utterly confused Karl Menger, and he concluded wrongly that utility functions have to be bounded to treat St. Petersburg-like lotteries. Samuelson (1977) was so convinced by Menger’s incorrect argument that he wrote “Menger 1934 is a modern classic that stands above all criticism.” Menger’s study (in German and behind a paywall) is here. An English translation (careful, it has a few typos) is in this book:
    Menger (1967) The role of uncertainty in economics. English translation by W. Schoellkopf and W. G. Mellon. In: M. Shubik (ed) Essays in mathematical economics in honor of Oskar Morgenstern, Princeton University Press, Chap. 16, pp 211–231.

    I have discussed the problem here.

  2. The utility function is linear. Again, check for yourself and set u(x)=x in (Eq.5) and (Eq.6). Again, they are the same. Especially in the context of prospect theory one often finds statements criticizing EUT that are puzzling. Like this one from the 2002 Nobel Prize Ceremony Speech:

    “A key element in prospect theory is that individuals compare uncertain outcomes with a reference level which depends on the decision situation, instead of evaluating the outcome according to an absolute scale.”

    EUT already includes such a reference level: initial wealth. Only under linear utility does the reference level cancel out. A researcher who assumes that Bernoulli and EUT are identical is likely to assume implicitly that utility is linear, in which case the reference level cancels out and would have to be re-introduced. Kahneman refers directly to Bernoulli in his Nobel lecture, aware that something is not working in Bernoulli’s theory, but apparently unaware that Bernoulli’s theory is not the same as EUT.
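For those who prefer to see the two special cases written out rather than check them by hand, here is the short calculation. It only restates (Eq.5) and (Eq.6), using \langle u(x+G-F)\rangle as shorthand for \sum_n p(n) u(x+G(n)-F).

```latex
\begin{align*}
\text{(Eq.5)}:\quad & \langle u(x+G-F)\rangle - u(x) \\
\text{(Eq.6)}:\quad & \langle u(x+G)\rangle - 2u(x) + u(x-F) \\[4pt]
F=0:\quad & \text{(Eq.6)} = \langle u(x+G)\rangle - 2u(x) + u(x)
            = \langle u(x+G)\rangle - u(x) = \text{(Eq.5)} \\
u(x)=x:\quad & \text{(Eq.5)} = \langle G\rangle - F, \qquad
            \text{(Eq.6)} = x + \langle G\rangle - 2x + x - F = \langle G\rangle - F
\end{align*}
```

Outside these special conditions, for example with logarithmic utility and a non-zero fee, the two expressions generally differ.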

Let’s start considering time and ergodicity

Finally: a plea for treating the problem with the modern mathematical concepts we now have. By this, I mean: start worrying about time, and compute time-average growth rates. The “expected change of logarithmic utility” is nothing but the time-average growth rate of wealth, under multiplicative dynamics. You can learn more about this in our lecture notes. Once this concept has sunk in — that gambles are evaluated according to the growth rates they generate for those who engage in them — the type of confusion that surrounds EUT becomes almost impossible.
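A small simulation illustrates the claim. This is a sketch under assumptions of mine, not part of the original analysis: the gamble multiplies wealth by 1.5 or 0.6 with equal probability in each round, a single trajectory is followed for many rounds, and time_average_growth is a hypothetical helper name.

```python
import math
import random

def time_average_growth(factors, probs, rounds, seed=0):
    """Growth rate (1/T) ln(x(T)/x(0)) along one simulated trajectory."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    log_wealth = 0.0  # ln(x(t)/x(0)); summing logs avoids overflow/underflow
    for _ in range(rounds):
        factor = rng.choices(factors, weights=probs)[0]
        log_wealth += math.log(factor)
    return log_wealth / rounds

# Hypothetical multiplicative gamble (assumed for illustration only).
factors, probs = [1.5, 0.6], [0.5, 0.5]

# Expected change in logarithmic utility per round.
expected = sum(p * math.log(f) for p, f in zip(probs, factors))

# Time-average growth rate of one long trajectory.
observed = time_average_growth(factors, probs, rounds=100_000)

print(f"expected change in log utility: {expected:+.4f}")
print(f"time-average growth rate:       {observed:+.4f}")
```

With enough rounds the growth rate observed along the single trajectory settles close to the expected change in logarithmic utility, which is what ergodicity of the growth rate means in practice.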

EUT is the foundation of modern economics. Despite this, I have yet to find a practitioner who uses it. Of course I may be exposed to an unusual sample, but in my experience investors, bankers, risk managers, gamblers — no one uses EUT. Shouldn’t that give us pause? Economics is devoted to the quantitative evaluation of risky prospects, but the people who quantitatively evaluate risky prospects for a living make no use of the techniques it has developed.

The fundamental and fatal flaw is conceptual — parallel universes are used where there should be time and dynamics. Because of this flaw, there’s nothing to check against, the theory is not falsifiable because it depends on unobservable states of happiness or discomfort. Technical errors and inconsistencies can be argued away. It’s what Pauli called “not even wrong,” and the result of this murkiness is the coexistence of mutually exclusive, contradictory theories. Nothing is wrong and nothing is right. Different “schools of thought” have emerged. Let’s acknowledge that and ask what it means. This happens in science but it’s always a sign that a deep flaw has to be corrected, that the appropriate language has not been found yet.

We believe we now know that appropriate language. It’s the language of time and dynamics.

17 thoughts on “The trouble with Bernoulli 1738”

  1. Hi, here, I understand that it mainly uproots the foundations of microeconomics, and not macroeconomics; is there any implication for the latter?


    1. Fantastic! Thank you. I looked for the original paper years ago and someone has uploaded it now.

      Interestingly, no figure in this scan. I wonder if it’s present anywhere in the real original paper, maybe in an appendix.

      The figure is present in this earlier German translation

      The translator for Econometrica was Louise Sommer, and she acknowledges Karl Menger for his help. Menger was considered an expert on the problem, but he certainly was confused about expected utility theory. That’s evident from his 1934 “proof” that utility functions must be bounded, which is based on forgetting that a fee must be paid in a lottery, or that in general one can win or lose money by gambling (not only win).


  2. Btw: The translator from Latin into German (as can be seen from the link to the German version of Bernoulli’s text), Alfred Pringsheim, was not only a well-known Munich mathematician at the time; Thomas Mann also became his son-in-law when he married Katharina Pringsheim. Katia in turn was one of the first female students in Munich in math & natural sciences, although she never finished or earned a degree, due to consequent chores 😉

    Apart from that, thanks Ole for laying bare the difference between Bernoulli’s work and EUT, but also for translating it into the modern ergodicity econ lingo you developed.


  3. While currently re-reading the German and English editions, I stumbled upon this funny opening sentence in the preface by Ludwig Fick of the German edition:

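    “In den letzten Jahrzehnten ist eine der Hauptlehren der theoretischen Nationalökonomie, die Lehre vom Wert, wiederholt von verschiedenen Seiten einer gründlichen Revision unterzogen worden.” (“In recent decades, one of the main doctrines of theoretical economics, the theory of value, has repeatedly been subjected to thorough revision from various sides.”)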

    Roughly 120 years later, still revisions under way.


  4. The conclusion is absolutely bang on. One often hears that EUT is a “good enough” model for a wide variety of applications, but this is precisely the problem: EUT is basically devoid of empirical content, so it can be applied to almost all economic phenomena.

    Hallmarks of a good theory are that it garners more empirical support and more detail over time, resulting in more precise predictions and more widespread use. Something of the opposite has happened with EUT:
    1) The body of evidence that contradicts EUT grows every day.
    2) Many competing “flavors” of the theory exist. Authors often diverge wildly in their assumptions about the unobservable utility functions, i.e. their shapes (and thus risk attitudes), what exactly they depend on (e.g. consumption, wealth, reputation, social status, etc.), and (recently popular) the method by which future utility is discounted. Often these assumptions are in direct conflict with each other.
    3) As a result, the theory is not used in practice at all. By anyone. Even by economists who use the theory in their academic work. (One could argue that it is not, in fact, usable.)

    But if EUT is such a bad theory, why does it stick around? I think the reason is that it is absolutely perfect for justifying absolutely any economic mechanism. Lots of quantities follow known (boring) rules, but utility is magic: with a bit of creativity those quantities can be made completely irrelevant or extremely important, thus resulting in wildly different behaviors.


    1. It may be easier, in this market-obsessed age, to think in terms of returns. Obviously you cannot arithmetically add returns (or average them) across time. A hedge fund that yielded a 150% return in year 1 followed by a minus 100% return in year 2 obviously does not end up with a 50% overall return over the two years (or an average return of 25%/yr)!
      It results in the manager taking a nice six month leave at some island resort, so he can start his next, even larger fund, nice and fresh!
      As for academic Economics (Paul Samuelson variety) – I would suspect that it is a mentally debilitating experience (perhaps similar to studying orthodox Statistics – Sir Ronald Fisher variety). It ain’t Physics! But of course no one undertakes a study of Economics to “decipher the Truth” – typically it is like religious studies – not to be taken literally – merely to catapult oneself into the religious order, or in this case, take a place in the crony capitalist system (where loyalty is rewarded and lack of critical thinking is a virtue). As for the priests (the tenured professors in Economics) – having demonstrated their loyalty to the faith – they are well looked after by the High Priests (AKA the Federal Reserve). It all works out quite well – for them.


      1. Your idea to think in terms of returns leads directly to a very deep insight into probability theory. You’re proposing to compute what happens over time in a system, whereas the expectation values of wealth that were the motivating puzzle for utility theory compute what happens across an ensemble of parallel universes. So the confusion runs incredibly deep and is known as the ergodicity problem. It starts with the beginning of probability theory in the 1650s, and is not limited to economics.

        One myth we must dispel is that there’s nothing we can do about the problems of economic theory. I think everyone agrees that something is seriously wrong with the formalism, and the practice of economic theory. The practice has been criticized a lot — there are complaints about cronyism, closed-mindedness, and structures that keep new ideas out and hinder critique. But it’s hard to distinguish genuine grievances from normal frustration — we can’t make every applicant a professor at a university, not even every qualified applicant.

        Let’s approach the problem differently, namely just scientifically. Step one would be to look at the existing formalism, and with a cool head point out where the logical, conceptual, and methodological errors are. Logical ones are the best because they tend to be clearest. That’s why this post is important: there really is something wrong, even right in the foundations. Many of the problems further down the formalism are related to this original sin.

        Step two is to develop a logically consistent, conceptually sound, methodologically careful approach to economics. Many are saying that’s impossible, but even if there’s only a minuscule chance of it working, it’s worth letting some people try. So we’re trying, and surprisingly it’s looking very promising. Our approach starts exactly with an alternative to expected utility theory, and exactly with an answer to the ergodicity problem. Utility theory is the basis of economics, and we know it’s wrong conceptually and logically, so we have to correct it and see what happens.


  5. Economics is not like Physics, or Computer Science or Mathematics. See, theories in Physics have serious consequences – measurable consequences. Your rover reaches Mars or goes floating off into space etc.
    Theories in Economics? No one making the big economic decisions actually pays much attention to that stuff. It is not of consequence. Take the most recent Big Economic decisions made in 2008 – TARP etc. It was a back of the envelope – let’s ask Congress for $700 Bn – sounds like it is big enough, a Trillion sounds scary – they probably won’t go for it – so yeah let’s go for $700 Bn. Oh and we won’t say exactly how we’ll use it – we’ll play it by ear. That’s how Big economic decisions get made.
    Economic “theory” is basically an inconsequential, impotent subject, with third rate hypotheses that have repeatedly been shown to be wrong (but are mostly “not even wrong”) – but persist in the textbooks. It fails to explain or predict anything of consequence in the real world. The “Nobel” prize is not even a real Nobel prize.
    Actual Economic decision making is more about politics, geostrategy, the petro-dollar, currency manipulation, trade and military dominance.
    Real world economic decisions don’t rely on academic theories any more than real world political decisions rely on game theory.
    Economic theory is used more as a smoke screen to justify decisions (after the fact), using pseudo-technical-mathematical sounding jargon so regular folks don’t get all upset.
    If you want to understand actual economics the best bet is to study history and learn about how our financial plumbing (banking etc) is actually put together. Forget the theory stuff.
    So finding gaping holes/errors in this theory is actually a battle of wits against the unarmed. Not much sport in it.


  6. Ole:
    You are definitely on to something. I like your single minded pursuit of this over several years! ( nice to see someone as obsessive as I am!).
    For starters it might be good to start looking at a series of annual returns as Mu – (Sigma^2)/2, rather than .
    Ultimately, I think it is not worth it to point out flaws to the current Econ PhD types – only leads to frustration. And the subject is too mired in the muck to be rescued.
    Better to start a whole new thing, from scratch. Have a few axioms on how to evaluate risky bets, and what constitutes “rational” behavior (e.g. people want to “make money over several years”, given what they believe about how the system works), then use machine learning type tools to examine actual empirical data (which is sadly lacking in the orthodoxy). Real world economics is too complex to attempt simple closed-form solutions.


    1. Oops: that should read Mu – (Sigma^2)/2 rather than Mean(Mu). Here is something you could ask someone who uses “mean-variance” style risk models: given a bet with a 50-50 chance of +60% or -50% returns annually, so an average return of 5%, with the only alternative being cash yielding zero, how would you allocate $100 of capital?
      You would be shocked at the answer you get (I asked a senior risk manager at a leading WS firm this and he, without any sense of irony, replied “of course I would put it all into the risky alternative with 5% expected value – why are you asking?”).


    2. That’s exactly what we’re doing. You can find out all about it in the lecture notes, where we develop that whole new thing from scratch, precisely along the lines you’re suggesting.

      It’s not terribly interesting to find 300 year-old errors. It’s a lot more fun to correct them (or just not make them) and build something better instead. But when you build a logically and conceptually sound formalism you notice the places where the original went off the rails. It would be wrong not to point those out.


  7. Notice how the Ivy League Econ PhD types immediately resort to name calling, and off-the-cuff dismissal? You won’t get a serious substantive response. That’s because you are questioning a religious orthodoxy, whose adherents are in business mainly because they are fierce defenders of the faith.


  8. When the Federal Reserve has your back, there is no Absorbing Boundary! Capital is not constrained, it can be printed at will – for friends, and true-believers !


  9. I think that expected utility theory is just a formal theory about the representation of preferences under uncertainty, and it does not require the introduction of time (that is a more sophisticated issue). There are many axiomatic structures to justify the expected utility representation, and they have no flaws. The original paper by Bernoulli is not taken as a rigorous foundation for this representation by any contemporary economist; Bernoulli’s paper has historical relevance, not theoretical relevance. Anyway, the theory is not taken very seriously in mainstream economics, because the axioms that support it do not hold in reality, as Daniel Kahneman and Amos Tversky showed in the late 1970s and the 1980s. There exists a wide range of alternative models for risk preferences; many are regularly used in finance.

