The trouble with Bernoulli 1738

Not all academic fields have a clear starting point, a seminal paper that constitutes the foundation of the entire discipline. But economics does. The paper that defines modern formal economics was written by Daniel Bernoulli in 1738. It introduces expected utility theory. The main thrust of our work is, of course, to replace expected utility theory and instead work with time-average growth rates of wealth. I’ll mention how that works, but the focus of this post will be on something else. Bernoulli’s paper is not only conceptually misleading but also technically flawed in a sneaky way that keeps confusing everyone. Where Bernoulli determines the price to be paid for a risky prospect, he contradicts himself. I wouldn’t make such a fuss about this if the paper weren’t so absolutely crucial. This basis of economics contains an error that invalidates commonly held beliefs and puts tens of thousands of studies into a different light. I recently encouraged people on Twitter to read the paper and see for themselves. In this blog post I go through the relevant analysis step by step and address questions that came up in response to the tweet.


Bernoulli’s paper was re-published in 1954 in Econometrica (if you don’t like paywalls, here’s a free pdf). This is the standard translation from Latin, and page numbers below refer to it. I have not read the original Latin paper, but the error is spelled out in words, visually in a figure, and also in an equation, so this is not something lost in translation. If you have a copy of the original, please send it to me. [2018-02-18 addendum: a scan of the original is here.] The paper is so fundamental to economics that more than 200 years after its publication it was fished out of the proceedings of the Papers of the Imperial Academy of Sciences in Petersburg (Vol. V, 1738, pp. 175-192), translated into English, and published in the leading journal of the field.

What is expected utility theory?

Expected utility theory (EUT) is a form of decision theory. Imagine someone offers you a lottery ticket. Prior to Bernoulli, it was assumed that people roughly maximize the expected change in wealth resulting from such a gamble. That is, if initial wealth is x(t) and possible changes in wealth are \delta x then you’d maximize

(Eq.1)   \langle x(t+\delta t) - x(t) \rangle = \langle \delta x\rangle.

Here, \langle \cdot \rangle is the expectation operator. Some people like to denote it by \mathbb{E} but it’s the same thing. If \langle \delta x\rangle is positive, you’d buy the ticket; if it’s negative, you wouldn’t. But observations soon convinced everyone that this is not how people behave.

EUT was introduced as a refinement of this decision criterion. It says that you will choose the action that maximizes the expected change, not of your wealth but of your utility. The insight here is that people don’t evaluate gambles in isolation but with respect to a reference level (their initial wealth). An extra dollar is worth less to me if I’m rich than if I’m poor. Mathematically, a so-called utility function u(x) is introduced, where x represents wealth.

The most commonly used utility function is u(x)=\log x. This is motivated by assuming that the extra utility someone attaches to an extra dollar is inversely proportional to the wealth that person already has, p.25:


Later on Bernoulli writes this assumption as the differential equation du = 1/x dx, whose solution is the logarithm. Let’s write down what EUT wants us to maximize:

(Eq.2)   \langle \delta u \rangle = \langle \ln[x(t+\delta t)]-\ln[x(t)] \rangle.

What’s our conceptual critique?

By re-writing this object you will immediately see that it actually means something else, and that’s the essence of our conceptual critique. Logarithms turn division into subtraction, \ln(a/b)=\ln a -\ln b. So (Eq.2) can be re-written as

(Eq.3)   \langle \delta u \rangle = \left\langle\ln\left(\frac{x(t+\delta t)}{x(t)}\right) \right\rangle.

Now divide both sides by \delta t

(Eq.4)   \frac{1}{\delta t}\langle \delta u \rangle = \frac{1}{\delta t} \left\langle\ln\left(\frac{x(t+\delta t)}{x(t)}\right) \right\rangle.

What’s this? In any field other than economics, this object is called the expected exponential growth rate of wealth. No utility required, no psychology required. Digging just a little bit deeper, the whole story comes to light. Wealth dynamics are fundamentally multiplicative — we can invest wealth to generate more wealth. For such dynamics the exponential growth rate is an ergodic observable, which means its expectation value tells us what happens in the long run. Mystery solved! People just optimize what happens to their wealth over time. Crucially, additive wealth changes, \delta x are not ergodic, wherefore the expectation value in (Eq.1) does NOT tell us what happens over time.
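The distinction is easy to see in a simulation. Here is a minimal sketch (the multipliers 1.5 and 0.6 are made up for illustration): a 50/50 gamble whose expected change in wealth is positive every round, but whose time-average growth rate is negative.

```python
import math
import random

random.seed(0)

# A 50/50 gamble per round: wealth is multiplied by 1.5 or by 0.6.
UP, DOWN = 1.5, 0.6

# The expected multiplier is 1.05 > 1, so <delta x> is positive every round...
expected_multiplier = 0.5 * UP + 0.5 * DOWN
# ...but the time-average (ergodic) growth rate is negative:
time_avg_growth = 0.5 * math.log(UP) + 0.5 * math.log(DOWN)  # about -0.0527

# Simulate one long trajectory: despite the positive expectation,
# a typical trajectory decays because the ergodic observable is negative.
wealth = 1.0
for _ in range(10_000):
    wealth *= UP if random.random() < 0.5 else DOWN

print(expected_multiplier)  # 1.05
print(time_avg_growth)      # about -0.0527
print(wealth)               # astronomically small for a typical trajectory
```

The expectation value describes an average over infinitely many parallel trajectories; the simulation shows what any single trajectory actually does over time.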

Economics textbooks and papers miss this point. The problem is treated in an atemporal space, as a so-called “one-shot game”. The mathematics is the mathematics of things happening in parallel universes, not of things happening over time. Where time is mentioned, it is usually wrongly assumed that \langle \delta x \rangle indicates what happens over time. Endless arguments ensue over whether \ln x is the correct psychological re-weighting people apply to wealth. In reality this is a question about dynamics, with psychology as a second-order effect: of course we can be deceived, confused, or stupid and fail to act in our best interest. It’s also far from easy to put a number on wealth x(t), or indeed on the different \delta x and their probabilities involved in real-world decisions.

The history of economics is a wonderful example of how a basic conceptual error prevents the detection of technical errors and inconsistencies. Without the right concepts it’s just not possible to ask the right questions, let alone find consistent answers.

How much would I pay for a lottery ticket?

Let’s put our conceptual troubles aside for a moment and use EUT to address precisely one of the questions Bernoulli asked. My initial wealth is x, I have to pay a ticket fee F, and I can win prizes G(n) with probabilities p(n), where 1\leq n \leq N is a random integer denoting which of the N possible prizes I win. We imagine the lottery to take a time \delta t that doesn’t depend on which prize is won.

EUT says: find the value F_{\text{max}} that makes the expected change in utility zero. If I pay more, the lottery will have a negative expected utility change, and EUT would advise me not to take part. Mathematically, here is the object we have to compute:

(Eq.5)    \langle\delta u \rangle= \sum_n p(n) u(x+G(n)-F) - u(x)

That’s it. Now vary F so that the expected change in utility becomes zero. The value of F where that happens is the maximum fee I should pay for the lottery.
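As a sketch of how that computation might look in practice, here is (Eq.5) with logarithmic utility and a simple bisection for F_{\text{max}} (the two-prize lottery and all numbers are invented for illustration):

```python
import math

def expected_utility_change(x, fee, prizes, probs, u=math.log):
    """<delta u> as in (Eq.5): sum_n p(n) u(x + G(n) - F) - u(x)."""
    return sum(p * u(x + g - fee) for p, g in zip(probs, prizes)) - u(x)

def max_fee(x, prizes, probs):
    """Bisect for the fee F_max that sets <delta u> to zero."""
    lo, hi = 0.0, x + min(prizes)  # just below hi, u(x + min(G) - F) diverges
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if expected_utility_change(x, mid, prizes, probs) > 0:
            lo = mid  # still attractive: the fee can be raised
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical lottery: win 10 or 20 with equal probability, at wealth 100.
F_max = max_fee(100.0, prizes=[10.0, 20.0], probs=[0.5, 0.5])
print(F_max)  # about 14.875: below the expected prize of 15 (risk aversion)
```

Bisection works here because the expected utility change decreases monotonically in the fee, so there is exactly one root.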

Bernoulli’s error

Bernoulli makes an error when he talks about this fee. The key figure is on p.26, and I’ve scribbled in red a translation of Bernoulli’s notation into the symbols used in Peters and Gell-Mann (2016).


Let me talk you through the figure. The horizontal axis represents wealth x, and the vertical axis represents utility, so that the solid curve is the utility function u(x).

Let’s focus on the solid horizontal line first. The point B is the initial wealth x. The point p, to the left of B, represents the wealth x-F, that is, the initial wealth minus the ticket fee. The points C, D, E, F are wealth levels arrived at by adding to the initial wealth one of the possible prizes G(1), G(2), G(3), G(4).

You may already suspect the problem: let’s imagine outcome 1 occurs in the lottery, meaning we receive the prize G(1). In that case, our wealth will change to x+G(1)-F, and not x+G(1) (which is the position of point C). We have to subtract the fee! The points C, D, E, F marked by Bernoulli have no relevance to the problem (unless the fee is zero).

Now let’s walk along the utility curve. The point o marks the utility of wealth x-F, so that the dashed line po is the drop in utility associated with a drop in wealth from x to x-F. The points G, H, L, M (to the right of B along the utility curve) represent the utilities u(x+G(1)), u(x+G(2)), u(x+G(3)), u(x+G(4)) of the irrelevant wealth levels. The point O represents the expected utility assuming prizes are received with their respective probabilities but no fee is paid, p.26


In Peters and Gell-Mann (2016) we call the corresponding expected utility change \langle \delta u^+ \rangle := \sum_n p(n) u(x+G(n)) - u(x), for want of a better symbol.

Now comes Bernoulli’s error. He claims that the lottery is to be valued by comparing \langle \delta u^+ \rangle to the loss in utility one would suffer if one were to buy the lottery ticket for its fee but not receive any prize. The key passage is on p.26–27



This amounts to a decision criterion, different from (Eq.5) and inconsistent with EUT. We denote with the symbol \delta u^- what Bernoulli calls “the disutility to be suffered by losing” (Bp in his notation). Now we can write Bernoulli’s decision criterion in symbols. Bernoulli tells us to buy a ticket if the following quantity is positive, and not to buy it if it’s negative

(Eq.6)   \langle \delta u^+\rangle - \delta u^- = \underbrace{\langle u(x+G) \rangle - u(x)}_{ \langle \delta u^+\rangle} - \underbrace{[u(x) - u(x-F)]}_{\delta u^-}.

Error or different model?

In principle, one could now say that Bernoulli just had a different model of people’s behavior than does modern economics. That would be bad enough because modern economics claims that it has the same model as Bernoulli. It’s more likely that Bernoulli got confused specifically when he tried to find the maximum fee. Otherwise he seems consistent with EUT, for instance on p.24, where he describes the certain profit (“the value of the risk in question”) that corresponds to an uncertain profit


Let’s really kill this. Why is (Eq.6) not a good decision criterion?

  1. It is easy to construct an example where the fee F is smaller than the smallest possible prize in the lottery, but the criterion still tells me to refrain from buying a ticket. That makes no sense: such a lottery has no downside risk. I’m guaranteed a positive net profit; the only uncertain element is how much better off I will be after the game. Intriguingly (have another look at the figure), the maximum fee calculated by Bernoulli is smaller than the smallest possible prize, assuming that one of the prizes will be won (the distance between B and p is smaller than the distance between B and C).
  2. Another problem is this: how much should I be willing to pay for one dollar, according to Bernoulli? What? … for one dollar? One dollar, of course! Yes, but not according to Bernoulli. Try it out: \langle \delta u^+\rangle < \delta u^- whenever I pay any non-zero amount of dollars to receive that same amount for certain, due to the concavity of the utility function. Therefore, Bernoulli’s criterion tells me that no amount of money is worth that amount of money. Come on! That’s nonsense!
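To see this in numbers, here is a minimal sketch of (Eq.6) with u(x)=\ln x and a made-up initial wealth of 100: buying a certain dollar for a dollar comes out negative.

```python
import math

u = math.log  # Bernoulli's logarithmic utility

def bernoulli_criterion(x, fee, prizes, probs):
    """(Eq.6): <delta u+> - delta u-. Bernoulli: buy iff this is positive."""
    du_plus = sum(p * u(x + g) for p, g in zip(probs, prizes)) - u(x)
    du_minus = u(x) - u(x - fee)
    return du_plus - du_minus

# Pay a fee of 1 dollar to receive a certain prize of 1 dollar, at wealth 100.
value = bernoulli_criterion(100.0, fee=1.0, prizes=[1.0], probs=[1.0])
print(value)  # negative: the criterion rejects buying a dollar for a dollar
```

The result equals \ln(101 \cdot 99 / 100^2) < 0: concavity alone makes the certain gain look smaller than the identical certain loss.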

Bernoulli in practice

Later, on p.33, Bernoulli works through the specific case of the St. Petersburg lottery. This lottery is defined by p(n)=\left(\frac{1}{2}\right)^n and G(n)=2^{n-1}, with n any positive integer. Here he does something really curious. He wants to evaluate his criterion (Eq.6) but realizes that that’s cumbersome. He then says that if wealth is very large and the utility function can be considered linear, his criterion can be approximated by (Eq.5), meaning actual EUT. So he uses actual EUT as an approximation:


Bernoulli’s notation is different from ours (\alpha is initial wealth (whereas we use x) and x is the fee (whereas we use F)). Let’s put this back into our notation and show that he really is using criterion (Eq.5) and not his own criterion (Eq.6).

We have

(Eq.7)   \sqrt[2]{x+G(1)-F}\sqrt[4]{x+G(2)-F}\cdots=x

…take the logarithm

(Eq.8)   \frac{1}{2} \ln(x+G(1)-F) +\frac{1}{4} \ln(x+G(2)-F) + \cdots=\ln(x)

…insert p(n)=\left(\frac{1}{2}\right)^n and G(n)=2^{n-1} and write as a sum

(Eq.9)   \sum_n \left(\frac{1}{2}\right)^n \ln(x+2^{n-1}-F)=\ln(x)

…simplify even more and subtract \ln(x)

(Eq.10)   \langle \ln(x+G-F)\rangle -\ln(x) = \langle \delta u \rangle = 0.

This sets the expected change in logarithmic utility to zero, to determine the maximum fee I should pay, as required by EUT. Bernoulli does not use his own criterion (Eq.6) but the generally accepted criterion (Eq.5).

Depending on which part of Bernoulli we read, and how carefully we read it, we will come away with a different impression of what utility means and what EUT is. Not surprisingly, the economics literature is littered with arguments and disagreements and invalid studies that seem to arise from a confused use of EUT.

Serious consequences

This is a blog post, so let me speculate about the sort of trouble Bernoulli has caused over the centuries. There are special conditions under which criteria (Eq.5) and (Eq.6) are identical. Assuming that they are identical in general (which is often — wrongly — done), one would implicitly assume such conditions. I’ll tell you about two cases with troubling consequences (there are more).

  1. The fee is zero, F=0. Check this for yourself: set F=0 in (Eq.5) and (Eq.6). They really are the same then. But that’s not a very interesting case: of course I should “buy” the ticket if it doesn’t cost anything. Why not? The hidden assumption of zero fee utterly confused Karl Menger, and he concluded wrongly that utility functions have to be bounded to treat St. Petersburg-like lotteries. Samuelson (1977) was so convinced by Menger’s incorrect argument that he wrote “Menger 1934 is a modern classic that stands above all criticism.” Menger’s study (in German and behind a paywall) is here. An English translation (careful, it has a few typos) is in this book:
    Menger (1967) The role of uncertainty in economics. English translation by W. Schoellkopf and W. G. Mellon. In: M. Shubik (ed) Essays in mathematical economics in honor of Oskar Morgenstern, Princeton University Press, Chap. 16, pp. 211–231.

    I have discussed the problem here.

  2. The utility function is linear. Again, check for yourself and set u(x)=x in (Eq.5) and (Eq.6). Again, they are the same. Especially in the context of prospect theory one often finds statements criticizing EUT that are puzzling. Like this one from the 2002 Nobel Prize Ceremony Speech:

    “A key element in prospect theory is that individuals compare uncertain outcomes with a reference level which depends on the decision situation, instead of evaluating the outcome according to an absolute scale.”

    EUT already includes such a reference level: initial wealth. Only under linear utility does the reference level cancel out. A researcher who assumes that Bernoulli and EUT are identical is likely to assume implicitly that utility is linear, in which case the reference level cancels out and would have to be re-introduced. Kahneman refers directly to Bernoulli in his Nobel lecture, aware that something is not working in Bernoulli’s theory, but apparently unaware that Bernoulli’s theory is not the same as EUT.
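Both special cases are easy to check numerically. A minimal sketch (the wealth, prizes, and probabilities are made up for illustration):

```python
import math

def eut(x, fee, prizes, probs, u):
    """(Eq.5): sum_n p(n) u(x + G(n) - F) - u(x)."""
    return sum(p * u(x + g - fee) for p, g in zip(probs, prizes)) - u(x)

def bernoulli(x, fee, prizes, probs, u):
    """(Eq.6): [<u(x+G)> - u(x)] - [u(x) - u(x-F)]."""
    du_plus = sum(p * u(x + g) for p, g in zip(probs, prizes)) - u(x)
    return du_plus - (u(x) - u(x - fee))

x, prizes, probs = 100.0, [5.0, 20.0], [0.4, 0.6]
linear = lambda w: w

# Case 1: zero fee -- the two criteria agree for any utility function.
assert math.isclose(eut(x, 0.0, prizes, probs, math.log),
                    bernoulli(x, 0.0, prizes, probs, math.log))

# Case 2: linear utility -- the two criteria agree for any fee.
assert math.isclose(eut(x, 7.0, prizes, probs, linear),
                    bernoulli(x, 7.0, prizes, probs, linear))

# In general (log utility, non-zero fee) they differ:
print(eut(x, 7.0, prizes, probs, math.log))        # about 0.0652
print(bernoulli(x, 7.0, prizes, probs, math.log))  # about 0.0563
```

With F=0 the disutility term vanishes; with linear utility both expressions reduce to \langle G \rangle - F. Outside those two special cases the criteria disagree.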

Let’s start considering time and ergodicity

Finally: a plea for treating the problem with the modern mathematical concepts we now have. By this, I mean: start worrying about time, and compute time-average growth rates. The “expected change of logarithmic utility” is nothing but the time-average growth rate of wealth, under multiplicative dynamics. You can learn more about this in our lecture notes. Once this concept has sunk in — that gambles are evaluated according to the growth rates they generate for those who engage in them — the type of confusion that surrounds EUT becomes almost impossible.

EUT is the foundation of modern economics. Despite this, I have yet to find a practitioner who uses it. Of course I may be exposed to an unusual sample, but in my experience investors, bankers, risk managers, gamblers — no one uses EUT. Shouldn’t that give us pause? Economics is devoted to the quantitative evaluation of risky prospects, but the people who quantitatively evaluate risky prospects for a living make no use of the techniques it has developed.

The fundamental and fatal flaw is conceptual: parallel universes are used where there should be time and dynamics. Because of this flaw, there’s nothing to check against; the theory is not falsifiable because it depends on unobservable states of happiness or discomfort. Technical errors and inconsistencies can be argued away. It’s what Pauli called “not even wrong,” and the result of this murkiness is the coexistence of mutually exclusive, contradictory theories. Nothing is wrong and nothing is right. Different “schools of thought” have emerged. Let’s acknowledge that and ask what it means. This happens in science, but it’s always a sign that a deep flaw has to be corrected, that the appropriate language has not been found yet.

We believe we now know that appropriate language. It’s the language of time and dynamics.

39 thoughts on “The trouble with Bernoulli 1738”

  1. Hi, I understand that this mainly uproots the foundations of microeconomics, and not macroeconomics; is there any implication for the latter?


    1. Fantastic! Thank you. I looked for the original paper years ago and someone has uploaded it now.

      Interestingly, no figure in this scan. I wonder if it’s present anywhere in the real original paper, maybe in an appendix.

      The figure is present in this earlier German translation

      The translator for Econometrica was Louise Sommer, and she acknowledges Karl Menger for his help. Menger was considered an expert on the problem, but he certainly was confused about expected utility theory. That’s evident from his 1934 “proof” that utility functions must be bounded, which is based on forgetting that a fee must be paid in a lottery, or that in general one can win or lose money by gambling (not only win).


  2. Btw: the translator from Latin into German (as can be seen from the link to the German version of Bernoulli’s text), Alfred Pringsheim, was not only a well-known Munich mathematician at the time, but Thomas Mann became his son-in-law when he married Katharina Pringsheim. Katia, in turn, was one of the first female students of mathematics and the natural sciences in Munich, although she never finished or earned a degree because of the family duties that followed 😉

    Apart from that, thanks Ole for laying bare the difference between Bernoulli’s work and EUT, but also for translating it into the modern ergodicity econ lingo you developed.

    Liked by 1 person

  3. While currently re-reading the German and English editions, I stumbled upon this funny opening sentence in the preface by Ludwig Fick of the German edition:

    “In den letzten Jahrzehnten ist eine der Hauptlehren der theoretischen Nationalökonomie, die Lehre vom Wert, wiederholt von verschiedenen Seiten einer gründlichen Revision unterzogen worden.” (“In recent decades, one of the principal doctrines of theoretical economics, the theory of value, has repeatedly been subjected to thorough revision from various sides.”)

    Roughly 120 years later, still revisions under way.

    Liked by 1 person

  4. The conclusion is absolutely bang on. One often hears that EUT is a “good enough” model for a wide variety of applications, but this is precisely the problem: EUT is basically devoid of empirical content, so it can be applied to almost all economic phenomena.

    Hallmarks of a good theory are that it garners more empirical support and more detail over time, resulting in more precise predictions and more widespread use. Something of the opposite has happened with EUT:
    1) The body of evidence that contradicts EUT grows every day.
    2) Many competing “flavors” of the theory exist. Authors often diverge wildly in their assumptions about the unobservable utility functions, i.e. their shapes (and thus risk attitudes), what exactly they depend on (e.g. consumption, wealth, reputation, social status, etc.), and (recently popular) the method by which future utility is discounted. Often these assumptions are in direct conflict with each other.
    3) As a result, the theory is not used in practice at all. By anyone. Even by economists who use the theory in their academic work. (One could argue that it is not, in fact, usable.)

    But if EUT is such a bad theory, why does it stick around? I think the reason is that it is absolutely perfect for justifying absolutely any economic mechanism. Lots of quantities follow known (boring) rules, but utility is magic: with a bit of creativity those quantities can be made completely irrelevant or extremely important, thus resulting in wildly different behaviors.

    Liked by 1 person

    1. It may be easier, in this market-obsessed age, to think in terms of returns. Obviously you cannot arithmetically add returns (or average them) across time. A hedge fund that yielded a 150% return in year 1 followed by a minus 100% return in year 2 obviously does not result in a 50% overall return over the two years (or an average return of 25%/yr)!
      It results in the manager taking a nice six-month leave at some island resort, so he can start his next, even larger fund, nice and fresh!
      As for academic Economics (Paul Samuelson variety), I would suspect that it is a mentally debilitating experience (perhaps similar to studying orthodox Statistics, Sir Ronald Fisher variety). It ain’t Physics! But of course no one undertakes a study of Economics to “decipher the Truth” – typically it is like religious studies – not to be taken literally – merely to catapult oneself into the religious order, or in this case, take a place in the crony capitalist system (where loyalty is rewarded and lack of critical thinking is a virtue). As for the priests (the tenured professors in Economics) – having demonstrated their loyalty to the faith – they are well looked after by the High Priests (AKA the Federal Reserve). It all works out quite well – for them.


        1. Your idea to think in terms of returns leads directly to a very deep insight into probability theory. You’re proposing to compute what happens over time in a system, whereas the expectation values of wealth that were the motivating puzzle for utility theory compute what happens across an ensemble of parallel universes. So the confusion runs incredibly deep and is known as the ergodicity problem. It starts with the beginning of probability theory in the 1650s, and is not limited to economics.

        One myth we must dispel is that there’s nothing we can do about the problems of economic theory. I think everyone agrees that something is seriously wrong with the formalism, and the practice of economic theory. The practice has been criticized a lot — there are complaints about cronyism, closed-mindedness, and structures that keep new ideas out and hinder critique. But it’s hard to distinguish genuine grievances from normal frustration — we can’t make every applicant a professor at a university, not even every qualified applicant.

        Let’s approach the problem differently, namely just scientifically. Step one would be to look at the existing formalism, and with a cool head point out where the logical, conceptual, and methodological errors are. Logical ones are the best because they tend to be clearest. That’s why this post is important: there really is something wrong, even right in the foundations. Many of the problems further down the formalism are related to this original sin.

        Step two is to develop a logically consistent, conceptually sound, methodologically careful approach to economics. Many are saying that’s impossible, but even if there’s only a minuscule chance of it working, it’s worth letting some people try. So we’re trying, and surprisingly it’s looking very promising. Our approach starts exactly with an alternative to expected utility theory, and exactly with an answer to the ergodicity problem. Utility theory is the basis of economics, and we know it’s wrong conceptually and logically, so we have to correct it and see what happens.


  5. Economics is not like Physics, or Computer Science or Mathematics. See, theories in Physics have serious consequences – measurable consequences. Your rover reaches Mars or goes floating off into space etc.
    Theories in Economics? No one making the big economic decisions actually pays much attention to that stuff. It is not of consequence. Take the most recent Big Economic decisions made in 2008 – TARP etc. It was a back-of-the-envelope affair: let’s ask Congress for $700 Bn – sounds like it is big enough, a Trillion sounds scary – they probably won’t go for it – so yeah, let’s go for $700 Bn. Oh, and we won’t say exactly how we’ll use it – we’ll play it by ear. That’s how Big economic decisions get made.
    Economic “theory” is basically an inconsequential, impotent subject, with third-rate hypotheses that have repeatedly been shown to be wrong (but are mostly “not even wrong”) – yet persist in the textbooks. It fails to explain or predict anything of consequence in the real world. The “Nobel” prize is not even a real Nobel prize.
    Actual economic decision making is more about politics, geostrategy, the petro-dollar, currency manipulation, trade and military dominance.
    Real-world economic decisions don’t rely on academic theories any more than real-world political decisions rely on game theory.
    Economic theory is used more as a smoke screen to justify decisions (after the fact), using pseudo-technical-mathematical-sounding jargon so regular folks don’t get all upset.
    If you want to understand actual economics, the best bet is to study history and learn about how our financial plumbing (banking etc.) is actually put together. Forget the theory stuff.
    So finding gaping holes/errors in this theory is actually a battle of wits against the unarmed. Not much sport in it.


  6. Ole:
    You are definitely on to something. I like your single-minded pursuit of this over several years! (Nice to see someone as obsessive as I am!)
    For starters it might be good to start looking at a series of annual returns as Mu – (Sigma^2)/2, rather than the mean Mu.
    Ultimately, I think it is not worth it to point out flaws to the current Econ PhD types – it only leads to frustration. And the subject is too mired in the muck to be rescued.
    Better to start a whole new thing, from scratch. Have a few axioms on how to evaluate risky bets, and what constitutes “rational behavior” (e.g. people want to “make money over several years”, given what they believe about how the system works), then use machine-learning-type tools to examine actual empirical data (which is sadly lacking in the orthodoxy). Real-world economics is too complex to attempt simple closed-form solutions.


    1. oops: that should read Mu – (Sigma^2)/2 rather than Mean(Mu). Here is something you could ask someone who uses “mean-variance” style risk models: given a bet with a 50-50 chance of +60% or -50% returns annually – so an average return of 5% – with the only alternative being cash yielding zero, how would you allocate $100 of capital?
      You would be shocked at the answer you get (I asked a senior risk manager at a leading WS firm this and he, without any sense of irony, replied “of course I would put it all into the risky alternative with 5% expected value – why are you asking?”!).
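The numbers in this bet illustrate the gap between the two averages directly. A quick check (all-in allocation, as in the anecdote):

```python
import math

# 50-50 chance of +60% or -50% per year, fully invested.
up, down = 1.60, 0.50

expected_return = 0.5 * 0.60 + 0.5 * (-0.50)                 # +5% per year
time_avg_growth = 0.5 * math.log(up) + 0.5 * math.log(down)  # about -0.112

print(expected_return)   # about 0.05: attractive in expectation
print(time_avg_growth)   # negative: an all-in bettor loses money over time
```

The expected return is positive, but the time-average growth rate of an all-in bettor is negative, so repeating the bet year after year almost surely destroys the capital.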


    2. That’s exactly what we’re doing. You can find out all about it in the lecture notes, where we develop that whole new thing from scratch, precisely along the lines you’re suggesting.

      It’s not terribly interesting to find 300 year-old errors. It’s a lot more fun to correct them (or just not make them) and build something better instead. But when you build a logically and conceptually sound formalism you notice the places where the original went off the rails. It would be wrong not to point those out.


  7. Notice how the Ivy League Econ PhD types immediately resort to name-calling and off-the-cuff dismissal? You won’t get a serious substantive response. That’s because you are questioning a religious orthodoxy – whose adherents are in business mainly because they are fierce defenders of the faith.


  8. When the Federal Reserve has your back, there is no Absorbing Boundary! Capital is not constrained, it can be printed at will – for friends, and true-believers !


  9. I think that expected utility theory is just a formal theory about the representation of preferences under uncertainty, and it does not require the introduction of time (that is a more sophisticated issue). There are many axiomatic structures to justify the expected utility representation, and they have no flaws. The original paper by Bernoulli is not taken as a rigorous foundation for this representation by any contemporary economist; Bernoulli’s paper has historical relevance, not theoretical relevance. Anyway, the theory is not taken very seriously in mainstream economics, because the axioms that support it do not hold in reality, as Daniel Kahneman and Amos Tversky showed in the late 1970s and the 1980s. There exists a wide range of alternative models for risk preferences, many of which are regularly used in finance.


  10. Some issues are still unclear to me in your approach, as laid out in the paper with Gell-Mann and more recent writings.
    [1] Do you reject the notion of rational agents maximizing expected utility? Do you reject the framework and theorems of von Neumann and Morgenstern?

    [2] You seem to argue that log utility is somehow “formally banned”,
    “Logarithmic utility must not be banned formally because it is mathematically equivalent to the modern method of defining an ergodic observable for multiplicative dynamics. This point of view provides a firm basis on which to erect a scientific formalism.”
    – but I don’t see that in recent literature…? (apart from Menger)

    [3] Do you contend that the classic solution to the SPB Paradox is wrong, and should not have utilized utility functions?
    Or do you claim that you can “solve” the paradox without any extraneous assumptions of utility, by framing it as some time evolving nonergodic process (whereas it is really an instantaneous bet – a bet on a skewed distribution with infinite support)?

    [4] Did you take into account that E[log W] is really only “special” in describing an asymptotic growth rate, but loses its standing in a finite-horizon setting? In such a finite setting it merely amounts to a choice of a utility function, doesn’t it?
    The core problem as I see it is finite vs. infinite horizons. In an infinite horizon I can see some merits in your approach, esp. as it derives the “utility” from the specific nature of the stochastic process (non-ergodic dynamics) without requiring a-priori postulation. But in finite time one cannot rely on the asymptotic growth rate, and instead must assume some other optimization framework, such as expected utility maximization, or choosing a payoff function in a game-theoretic setting (Bell & Cover), or some other criterion, etc.

    [5] Lastly, what is your mathematical definition of “time average growth rate” for dynamics which are not multiplicative? And do you prove that maximizing such a growth rate entity has merits (e.g., for multiplicative processes like stock portfolios, it can be shown that maximizing the asymptotic growth rate results in the ‘best’ asymptotic wealth — see Cover & Thomas Theorem 16.3.1 “Asymptotic optimality of the log-optimal portfolio”).


    1. Thanks for your questions, Omri.

      [1] We point out that there is an interpretation of “utility functions” that leads to deeper insights than simply thinking of them as a representation of idiosyncratic risk preferences. In the first instance we prefer to call these functions “ergodicity mappings” rather than utility functions, reflecting our different approach.

      [2] Menger’s work bans any unbounded utility function, including the linear and logarithmic functions. By doing so, he inadvertently banned the perspective we’re taking, as do those who endorse his work (including Arrow, Samuelson, and recently Campbell (in his 2018 book)). Samuelson in particular was a big fan of Menger’s 1934 paper, which he called “a modern classic that stands above all criticism,” and he (Samuelson) helped establish Menger’s unfortunate (wrong) result as part of the canon.

      [3] There are various problems with the original solution. One is that it contradicts modern utility theory, see
      Again: the notion of utility as a mathematization of idiosyncratic behavior is not helpful. Bernoulli couldn’t do much else, and in the historical context his solution is understandable. From a modern perspective it should be replaced.
      “Right” and “wrong” are perhaps not the best categories. “Fruitful” and “barren” may be better. A (mathematically acceptable) solution can be an opening to deeper understanding, or it can be a dead end.

      By viewing the St. Petersburg gamble as a single step in a trajectory of many subsequent decisions whose outcomes obey a sensible dynamic, one can solve the paradox without utility.

      Whichever perspective is taken, information is missing from the original problem: either the utility function is missing, or the dynamic is missing. If we know the dynamic we don’t need a utility function. If no one tells us about the dynamic, assuming a utility function formally solves the problem [though conceptually this solution is fairly circular: I have to know the risk preferences (the utility function) in order to determine the risk preferences (whether I want to play the gamble)].

      [4] I wouldn’t write E[log W] because that’s not dimensionally sensible. But apart from that, the time-average growth rate is a limit (of infinite time). That’s often a good heuristic. An expectation value, in the sense of expected utility theory (where the utility function need not correspond to the prevailing dynamics), is also a limit (of infinitely many parallel universes). That’s rarely a good heuristic for an individual.

      How the ensemble size N and the time horizon T come to dominate in one setting or another can be computed using work from the 1980s — the free energy in Derrida’s random energy model can be mapped to this problem, see

      [5] This problem (of time-average growth rates for processes other than additive or multiplicative dynamics) is discussed here:
      See also our lecture notes, Ch.2.7, p.48ff.
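      For the multiplicative case, the distinction between the two limits can be sketched numerically (all parameter values below are illustrative assumptions, not taken from the notes): a single trajectory of geometric Brownian motion grows at the time-average rate mu − sigma²/2, while the expectation value grows at mu.

```python
import math
import random

# Sketch with assumed parameters: simulate one long GBM trajectory using the
# exact log-increment discretization. The expectation value E[x] grows at
# rate mu, but any single trajectory grows at mu - sigma**2/2.
mu, sigma, dt, n_steps = 0.05, 0.3, 0.01, 1_000_000

random.seed(42)
log_x = 0.0
for _ in range(n_steps):
    log_x += (mu - sigma ** 2 / 2) * dt + sigma * math.sqrt(dt) * random.gauss(0.0, 1.0)

time_avg = log_x / (n_steps * dt)  # growth rate experienced along the trajectory
print(time_avg, mu - sigma ** 2 / 2, mu)
```

      Over T = 10,000 time units the estimate lands near mu − sigma²/2 = 0.005, far from the ensemble rate mu = 0.05.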


      1. Thanks for the further thoughts.
        One more point: in your Ergodicity Economics text you claim that “Boundedness makes utility functions non-invertible”:
        “Translating into utility language, every invertible utility function is actually
        an encoding of a unique wealth dynamic which arises as utility performs a
        Brownian motion. Curiously, a celebrated but erroneous paper by Karl Menger
        [45] “proved” that all utility functions must be bounded (the proof is simply
        wrong). Boundedness makes utility functions non-invertible and precludes the
        developments we present here.”
        Could you clarify why? Surely there are bounded (from above) utility functions that are invertible (in their domain)?


  11. When we consider a non-ergodic system, we are saying that data may not be independent of each other, and therefore these data can be non-meaningful from a statistical point of view. To face the problem under the condition of non-ergodicity, we need to shift our attention away from the results and towards our ability to operate in a cognitive and non-random way. The method I prefer is based on the axiom of disorder (von Mises); applied to financial markets, this mathematical axiom can be stated as follows:
    “Whenever we understand any kind of deterministic market process, the probability of our financial operation being successful increases by more than 50%” (von Mises’ axiom of disorder from the early 1920s).
    As a consequence of this axiom, any correct market analysis will tend to increase the probability of our prediction beyond the 50% mean, with a consequent decrease in the probability of obtaining the same result randomly. It follows that the parameter to link to the validity of a financial strategy under the condition of non-ergodicity is not its performance but its statistical property of generating results that cannot be reproduced randomly.
    In conclusion, the problem with utility functions is that the method keeps the attention on the results. This is correct when the system is ergodic, but no longer correct when the system is not stationary.


  12. Sorry, but why devote more than 2000 words to such a triviality! Of course the G(i) in Bernoulli’s diagram are not the prizes themselves, but the prizes net of fees, i.e. G(i) – F. This resolves all apparent ‘contradictions’.
    Also, Menger does not “ban” log utility, but correctly shows that for every unbounded utility function there is a gamble (random variable) that leads to infinite expected utility.
    Finally, why deliberately ignore all the developments and extensions of EUT of the last 50 years? Prospect theory, optimal growth portfolio theory, etc.?


  13. Unfortunately, no, they’re not prizes net of fees. Have another look at the figure: if what you say were correct, then all outcomes would lead to net winnings, irrespective of the fee.

    Menger (confused by this) wrongly proved that utility functions must be bounded. The logarithm is unbounded and therefore banned. But Menger’s argument is invalid, so this is nothing to worry about. With log utility there is always a finite fee that prevents the expected change in utility from diverging positively.

    One thing at a time. Before we start discussing what happened downstream from utility theory, we should fix what’s wrong with utility theory itself. Later developments then appear in a different light — and, of course, we find out what utility theory actually is. It’s ergodic theory in disguise, namely done before Boltzmann introduced the ergodicity concept — hence the errors and confusion.


  14. I agree that the figure is wrong, since it compares U(x-F) to E[U(x+G)], while it should compare U(x) to E[U(x-F+G)]. (Is the figure by Bernoulli or does it only appear in a translation?) However, you wrongly insinuate that EUT is based on this wrong figure, while it is not! Utility indifference pricing compares U(x) to E[U(x-F+G)], and no economist is “confused” about this.

    Menger showed that for every unbounded utility function, there exists a random variable with unbounded expectation. How is that conclusion invalid? How does this “ban” unbounded utility functions (from what and for whom?) Why do you insinuate that Menger was confused when he gives a valid logical argument?


  15. There were formatting mistakes in my reply. The first sentence should read: “I agree that the Figure is wrong, since it compares U(x-F) to E[U(X+G)], while it should compare U(x) to E[U(x-F+G)]”. The last sentence should read: “Utility indifference pricing compares U(x) to E[U(x-F+G)]…”. Here E[] is the expectation operator.


    1. The figure was added in an 1896 German translation, as far as we know (do you know of an earlier source?). It is an accurate visual representation of Bernoulli’s writing. Bernoulli uses a geometric notation of the 1730s that is hard to read, so the figure is very useful.

      The 1896 figure is a little more sensible than the 1954 figure, which is displayed above. The 1954 figure was produced by Menger, at least he helped with the translation and was presumably responsible for the mathematical content. If you look carefully, you’ll see that it’s complete nonsense. If you subtract from each of the possible winnings what Bernoulli says should be the maximum fee one should pay, you end up with a “lottery” where only net gains are possible: meaning the fee is smaller than the smallest amount you will receive from the lottery.

      So, we’ve established:
      1. the figure is nonsense.
      2. the figure is consistent with Bernoulli’s writing (please check the original or translation very carefully to convince yourself).

      Now what about Menger?
      Menger showed something that has nothing to do with his conclusion. His conclusion is that utility functions must be bounded in order to prevent a resurrection of the St. Petersburg paradox. He shows that for any unbounded utility function U there exists a random variable G such that E[U(G)] diverges. That’s basically just a restatement that U is unbounded; it is a trivially true statement. However, this statement does not imply that there exists a random variable G such that E[U(x-F+G)]-U(x) diverges positively for any ticket price F. This latter statement is his claim, namely that a “super-Petersburg” paradox (as Samuelson called it) can be created. In order to prevent this from happening, I just have to choose logarithmic utility and F=x+g_1, where g_1 is the smallest possible payout. Now I have a negative infinity as the first term of the expectation-value sum, and any positive divergence is weaker.
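      This point is easy to check numerically. In the sketch below (the wealth x = 100 and the fee values are illustrative assumptions), the expected net change in log-utility for Bernoulli’s St. Petersburg lottery, with payout 2^(k−1) occurring with probability 2^−k, converges for any finite fee, because the terms decay like k/2^k.

```python
import math

# Assumed setup for illustration: St. Petersburg lottery paying g_k = 2**(k-1)
# with probability 2**-k.  With log utility, the expected NET change in
# utility, E[log(x - F + G)] - log(x), is a convergent sum for any finite
# fee F < x + 1: the k-th term behaves like k * log(2) / 2**k.

def expected_log_utility_change(x, F, n_terms=200):
    """Truncated E[log(x - F + G)] - log(x) for the St. Petersburg lottery."""
    total = 0.0
    for k in range(1, n_terms + 1):
        p = 2.0 ** -k        # probability of stopping at toss k
        g = 2.0 ** (k - 1)   # payout in that case
        total += p * (math.log(x - F + g) - math.log(x))
    return total

x = 100.0  # illustrative wealth
for F in (2.0, 4.0, 8.0, 16.0):
    print(F, expected_log_utility_change(x, F))
```

      The sum is positive for small fees and negative for large ones, but always finite: with net changes there is no super-Petersburg divergence.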

      I don’t insinuate that Menger was confused, I claim he was confused. The alternative is that he wasn’t confused and deliberately misled people, but I don’t believe that.

      How did Menger end up so confused? Well, he read Bernoulli in the original, and was confused exactly by Bernoulli’s splitting of payouts U(x+g) and fee U(x-F). This split is only valid for linear utility functions, which of course don’t solve the problem Bernoulli set out to solve. So: Bernoulli got the computation of the expected net change in utility wrong. Laplace corrected him, but Menger didn’t know that, or ignored it. von Neumann and Morgenstern also came to Laplace’s conclusion that it’s net changes in utility that count, not some weird combination of gains and fees and utilities.

      In what ways did economics get confused by this? I can’t list them all, but for starters, people write that terminal utility is maximized — that’s sort of operationally true but dimensionally nonsense — what’s the logarithm of a dollar? In a careful phrasing, it’s only ever changes in utility that are computed (as von Neumann and Morgenstern correctly point out). This sloppiness causes endless confusion. People say utility when they mean expected utility, or expected utility change — who knows! It’s so confused that researchers see problems with expected utility theory that aren’t there, for instance the lack of reference-point dependence, in response to which models like prospect theory are introduced. As if utility theory wasn’t confused enough we now have to deal with even more arbitrariness and more parameters.

      Hence our approach: throw away utility theory, find out how far you get by eliminating the fiction of expectation values, and re-develop economic theory with a clear understanding of the ergodicity problem: use expectation values only where they belong, and use time averages where they are appropriate. The original problem utility theory was invented for is thereby resolved, and we can develop a clean scientific formalism.


  16. Ole,
    Just wanted to respond to your tweets… (it’s hard to capture “tone” in tweets)

    Hopefully I didn’t misrepresent your position or the underlying theories you present
    (I was trying to promote them and not misrepresent them)
    I have the utmost respect for your work… and it is very possible I am a dummy and just don’t understand, economics is not my native tongue.

    I linked to your wonderful article within the context of Jonathan Blow’s tweet
    about the “Sunk Cost Fallacy”, in which he questioned the “monocle-wearing smart people” and was looking for a paper arguing that “humans incorrectly calculate EV and thus are irrational”.

    Anyways, I would have responded to your tweet, but I couldn’t tell if you were aggravated with my ideas or thought them silly, and didn’t want to add more fuel to the fire if that was the case, and the tweet chain has gotten so long.

    Perhaps I made more a mess than anything else, but my (limited) understanding is:
    We try to create theories / heuristics to help us in decision making
    We sometimes use the wrong heuristics in the wrong circumstances (because we like them/understand them and they are expedient)
    We also sometimes apply very clean and sterile theories and heuristics to other people, calling them “irrational” (for example, thinking “only an irrational person wouldn’t take the 2-to-1 payoff for a coin flip”).

    …But the world is complex and interconnected, and maybe the “irrational fear” a person feels before taking an “in their favor” wager is rather a deep understanding of nature and not some irrational logical mistake… The aphorism “if it sounds too good to be true, it probably is” exists for a reason and is reinforced by our experiences in the world (nature doesn’t offer an immediate 2-for-1 return without some other external investment).

    So happy new year and good luck getting the word out.


    1. Thank you, Eric, very kind. Apologies if I sounded exhausted.

      Here’s the twitter link and here’s what we were discussing.

      Taking an expectation value always has the following meaning:
      * create in your imagination a collection of systems that are identical except for the state they’re in.
      * give each state a weight.
      * use this construct to compute the weighted average of some quantity that depends on the state.

      A priori, this is a completely abstract procedure. Two physical interpretations are common:
      1) there really is a collection of systems, e.g. an insurance firm insuring 1,000,000 drivers. Or 10^23 molecules in a balloon.
      2) the system has the very unusual special property of ergodicity, and the expectation value we’ve computed happens to coincide with the time-average.
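      A toy multiplicative gamble makes the difference between these two interpretations concrete (the numbers below are an assumption for illustration, the classic 1.5×/0.6× coin toss):

```python
import math

# Illustrative coin toss: each round multiplies wealth by 1.5 (heads)
# or 0.6 (tails), each with probability 1/2.
up, down, p = 1.5, 0.6, 0.5

# Interpretation 1 (ensemble): average over many parallel players.
ensemble_factor = p * up + (1 - p) * down  # 1.05: the expectation grows 5%/round

# Interpretation 2 (time): the factor a single player experiences per round,
# i.e. the geometric mean of the outcomes.
time_factor = math.exp(p * math.log(up) + (1 - p) * math.log(down))  # < 1

print(ensemble_factor, time_factor)
```

      The expectation value grows 5% per round, yet a single trajectory shrinks by about 5% per round (sqrt(1.5 × 0.6) ≈ 0.95): the gamble is not ergodic, so the two interpretations give opposite advice.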

      For an individual (which is of course not a collection of individuals) making decisions about, say, investments, the expectation value of his or her wealth is a priori totally irrelevant. Whether in real life or in a game, there is no reason to make decisions that optimize expected wealth (obviously, unless you can prove that there actually is such a reason because of ergodicity or whatever it is).

      In real life people will not optimize expected wealth (I think we agreed there). You then mentioned that in a game that has no further consequences people may optimize expected wealth because the reason they don’t in real life is consequences.

      My point was that this doesn’t do much to rescue the model of expected-wealth optimizers. You’re right that in a game without consequences, there’s no cost (nor benefit) for people to behave as expectation optimizers. But that’s only because their behavior has no consequences at all. They would be just as well/ill/irrelevantly advised to minimize their expected monopoly-money wealth.

      There is a serious part to this discussion. Protocols that maximize expected wealth under multiplicative dynamics often converge to protocols that maximize time-average growth in the limit of small wager. So here’s something funny: expected-wealth maximization is sensible if the decision is of no consequence (vanishing wager). This implies, for instance, that willingness to bet — or rather odds I’m willing to offer — do not directly translate into my beliefs about probabilities. If the wager is significant, so that I’m forced to think about the problem, then all sorts of non-linear effects kick in. If it vanishes, so that I can disregard non-linearities, then — well — it’s just not significant so I can do whatever I like.
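      That small-wager limit can be sketched as follows (the coin odds and probability are assumed for illustration): betting a fraction f of wealth, the time-average growth rate agrees with the expected gain to first order in f, so the two criteria only coincide when the bet doesn’t matter.

```python
import math

# Assumed gamble: stake a fraction f of wealth; win b per unit staked with
# probability p, lose the stake otherwise.
p, b = 0.55, 1.0

def time_avg_growth(f):
    """Per-round time-average (log) growth rate when betting fraction f."""
    return p * math.log(1 + f * b) + (1 - p) * math.log(1 - f)

def expected_gain(f):
    """Expected fractional change in wealth per round (ensemble view)."""
    return p * f * b - (1 - p) * f

for f in (0.2, 0.01, 0.001):
    print(f, time_avg_growth(f), expected_gain(f))
```

      At f = 0.2 the two numbers differ noticeably; as f → 0 they converge (the difference shrinks like f²/2), which is the sense in which expected-wealth maximization becomes harmless for vanishing wagers.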


  17. This is a misguided critique that completely misunderstands the St. Petersburg paradox and actually tries to distract the reader from the key issues.

    So according to your view, what is the rational price that should be paid to play the original game as described in the 1738 paper?

    Unless you are willing to take a stand and give your answer, none of this is worth my attention. Instead, I can only conclude that you avoid answering that question because you are afraid of all the criticism that this answer, like all other proposed “solutions” to the paradox, would end up receiving.

    Sorry to be harsh, but this is intellectual arrogance at its worst. If your answer actually ended up solving a 300 year problem that had befuddled Bernoulli, Euler, Samuelson and Arrow, then your paper on this subject which was published almost 8 years ago would have gotten more than the 9 citations that Google Scholar says you have gotten.

    Instead it has been pretty much ignored, and rightly so.


      1. That seems like a weasel response. You haven’t answered Eric Estovales’s question. So what is the no arbitrage price for playing (or offering) this game???


      2. I thought this comment was either a joke or online abuse. But I can answer your question, although “arbitrage” is not the right concept here.
        The maximum price, according to the Ergodicity Economics model, that a player is willing to pay is the price where his/her time-average growth rate changes from positive to negative. The dollar value of that price depends, of course, on the player’s wealth and the dynamics acting on said wealth.
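        For multiplicative wealth dynamics this break-even price can be found numerically. A sketch (the wealth x = 100 is an illustrative assumption; payouts are Bernoulli’s 1, 2, 4, … ducats):

```python
import math

def growth_rate(x, F, n_terms=200):
    """Per-round time-average growth rate E[log((x - F + G)/x)] of repeatedly
    playing the St. Petersburg lottery (payout 2**(k-1) w.p. 2**-k)."""
    return sum(2.0 ** -k * math.log((x - F + 2.0 ** (k - 1)) / x)
               for k in range(1, n_terms + 1))

def max_fee(x, tol=1e-9):
    """Bisect for the fee at which the growth rate changes sign."""
    lo, hi = 0.0, x  # growth is positive at F = 0, negative at F = x
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if growth_rate(x, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

fee = max_fee(100.0)
print(fee)  # a few ducats for x = 100; the break-even fee grows with wealth
```

        Above this fee the player’s time-average growth rate is negative, below it positive, matching the decision rule described in the reply.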


  18. In the light of ergodic theory, how should one interpret the yield curve shape? Actually, does the yield curve say anything about future risk pricing at all? Going even deeper: does a single futures contract carry some information about the asset in the future or its risk?


  19. Not sure that makes much sense. There are two sides to every transaction.

    A lottery operator can design a lottery in such a way that the expected loss to the lottery operator is either greater than or less than the price at which the time-average growth rate changes from positive to negative for all the players. As long as there is a differential between these two values, you create an arbitrageable situation.


  20. I really enjoyed reading this Ole. So thanks for your work. I enjoyed it as I enjoy most things I come across about the intersection between mathematical theory and human behaviour. I’m not an economist but I do enjoy reading about it. I have studied maths and physics to a level where I (mostly) get your point, and I particularly enjoy writing that has this level of accessibility, where I can work out for myself what sense it makes to me.
    What troubles me about all this is that the idea of a mathematical function encapsulating expected utility, as a means of weighing behaviour against its cost, largely ignores most people’s fundamental inability to calculate or even understand such things. We don’t understand risk; hence survivorship bias, for example. And even when the risk/reward is clear, we still ignore that knowledge in favour of other modes of decision making.
    Some thoughts:
    The fact that actual lotteries make a profit and support charities is simple evidence that the expected return on buying a ticket is less than zero. This is obviously true and yet lotteries thrive.
    Because an individual can’t (or likely won’t) buy a statistically significant number of tickets over a lifetime, the real reason people play is the value of the longshot, not its probability or the expected value.
    Another reason people buy lottery tickets is because everyone else does – if you’re not in you can’t win.
    A gambler, making many many bets, would be well advised to understand the expected outcome of their ‘system’ of betting if they are to be successful, and equally well-advised not to make any bets if the system doesn’t have a positive expectation value.
    But the deeper problem with the lottery analogy for economics is this. If economics is a lottery, you don’t have a choice about buying a ticket, but you may have a choice about which ticket to buy or which lottery to participate in. And usually the real goal is more about surviving long enough to ‘buy another ticket’ than it is about increasing expected utility with today’s ticket.
    What I’m getting at is that what I learned in second-level economics is that the foundation stone of an economic system is excess production (beyond what’s needed for subsistence). And the basic unit of that (my lottery ticket) is a day’s productive capacity (minus the subsistence component). This is what I mean by not having a choice about buying a ticket. Today I either produce or I lose today’s opportunity to produce. There is no not-buying-a-ticket option. But I can choose what I try to produce. My point is that a mathematical model built on which ticket to buy, not on whether to buy a ticket, would be more realistic. My other point is that the decisions I make about how I spend today’s productive capacity are often more focused on staying in the game than on winning today (a simple example being going into debt to gain an education while not working).
    While this analysis looks at how one person spends one day, it seems just as applicable to buying a share, or to deciding to produce an item for sale, not least because the expected time value of money, like personal output, is negative if the money or the personal output is not put at risk.

    Thanks again. And it’s great to see you replying to comments.

    Shane Holohan


  21. Though I was quite hesitant to write here, given that some of the comments above are really irrationally ignorant, I would like to ask all of you: why are you assuming that the price of a ticket, or of any other gadget like a microchip, or even of a membership in some nudist beach, should reflect all the information needed? The issue here, among other things, is that many theories within economics assume that prices are the ultimate source of truth as to why we took this or that decision. In engineering this is equivalent to “an inverse problem”: price and/or the associated quantity can’t be used in isolation to understand human economic behavior. It is much more complicated than that; in a way, prices are used as a voting tool as well as a measuring tool. Secondly, we can’t use a “Schrödinger-type multiverse time average” to measure someone’s change in wealth. In fact, the very existence of insurance markets should indicate that we need a more consistent way to measure risk and return together, and I can’t see any other consistent way to arrive at that than by looking at the formula above, i.e. the geometric average (BEV) or CAGR, as it is the one formula in which both risk and return can be represented and measured, relative to, say, how risk is measured in finance textbooks, “mu − sigma²/2”, which is a very simplistic way to measure risk that doesn’t reflect the actual payoff associated with each economic or financial decision.
    In fact, if you write a program in Python to find the weights (say, as one does for financial portfolios) that maximize your chosen target, these weights will differ hugely from one program to another, depending on the metric(s) you want to optimize. Not only that: it also depends on the motivations and incentives of the person designing the program. Whether he is an agent or the owner of the money makes a huge difference, but not any more if one uses the right metric and communicates it transparently to the owner, as then there is no need to get into the conflicts of interest that plague the hedge-fund industry. I really recommend you all have a read of “Safe Haven: Investing for Financial Storms”, as it may help you see the problem from the right perspective.

