Reproducibility and weak ergodicity breaking

14 February 2017

The term “weak ergodicity breaking” helps categorize things by their ergodic properties:

ergodic
strongly non-ergodic
weakly non-ergodic

This list is chronological, in that there used to be an implicit belief that everything was ergodic (from 1654), then the realization that some things are strongly non-ergodic (1850s — 1870s), and then the discovery that things can be a lot more subtle (with Jean-Philippe Bouchaud coining the term “weak ergodicity breaking” in 1992).

I will give one example for each category.

1. ergodic: a stochastic process whose value is a new instance of a random variable at each moment in time. A physical process that would be modeled this way is tossing a coin repeatedly and recording +1 for heads, -1 for tails. The time series will be well modeled as instances of a Bernoulli variable (+1 or -1, each with 50% probability).
2. strongly non-ergodic: a stochastic process that is confined to some sub-space of its actual range. A physical process would be a coin tossed once, with the resulting time series being either +1 or -1 for all eternity: it will only be one of those possibilities, so the time series will be flat, not jagged.
3. weakly non-ergodic: a stochastic process whose distribution becomes either infinitely broad or infinitely narrow as time goes by. Physically: a coin tossed repeatedly, as in 1., but with the resulting points added together over time. The resulting time series will be a random walk.
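
The three examples can be sketched in a few lines of Python. This is only an illustration, assuming NumPy is available; the function name `simulate_processes` and the `seed` parameter are hypothetical choices made for this sketch and are not taken from any existing code.

```python
import numpy as np

def simulate_processes(n_steps, seed=0):
    """Build the three example time series from one sequence of fair coin tosses."""
    rng = np.random.default_rng(seed)
    # Process 1 (ergodic): a fresh coin toss at every time step, +1 or -1.
    steps = rng.choice([-1, 1], size=n_steps)
    # Process 2 (strongly non-ergodic): one toss, frozen for all later times.
    frozen = np.full(n_steps, steps[0])
    # Process 3 (weakly non-ergodic): the tosses added up over time, a random walk.
    walk = np.cumsum(steps)
    return steps, frozen, walk

steps, frozen, walk = simulate_processes(n_steps=1000)
print("jagged:", steps[:8])
print("flat:  ", frozen[:8])
print("walk:  ", walk[:8])
```

All three series come from the same coin, which is why the observable has to be named before asking whether it is ergodic.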

The models we’ve found most useful in our economics research are random-walk like, i.e. they display weak ergodicity breaking.

All three processes have something to do with tossing a coin. You could say that the beating heart of all of them is the coin toss. Clearly then, the question whether a coin toss is ergodic is too imprecise — it generates all kinds of processes. And this explains our choice of nomenclature: it’s best in these discussions to be very precise — we specify the observable we’re talking about, and we specify the ergodic property of that observable. For example: the steps in a random walk have the ergodic property that their time average equals their ensemble average.

Reproducibility

Why this pedantic nomenclature?

Science is in a “reproducibility crisis”, often connected with the term “p-hacking”. These problems have to do with the availability of more data. More data implies more structure in data — in the sense of a statistical truism, not necessarily in the sense of scientific progress.

The study of ergodicity breaking (weak or strong) has something to say about reproducibility. If I follow a time series of an ergodic observable, I just have to average for long enough to get close to the value towards which its time average converges (see process 1 above). But what if I average a non-ergodic observable over time? Well, that’s the thing with non-ergodic observables: their time averages don’t necessarily converge to anything meaningful or useful.

Take the three examples:

1. The time average of the repeated coin toss will converge to zero, which is also the expectation value. Every time I run the experiment, I will find the same result (within statistical uncertainties, of course).
2. The time average of the single coin toss will either be +1 or -1, but it will not be reproducible across different runs of the experiment.
3. The time average of the random walk is a random number whose variance increases with time. The accumulated sum of the walk, that is, the time average before dividing by the elapsed time, is called the “random acceleration process” and has been studied a little bit in statistical mechanics. The time average is certainly not reproducible: doing the experiment again will generate a different time average each time.
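
The reproducibility of the three time averages can be checked numerically by repeating each experiment many times and looking at the spread of results across runs. The following is a minimal sketch, assuming NumPy; `time_averages` and its parameters are hypothetical names introduced for this illustration.

```python
import numpy as np

def time_averages(n_steps, n_runs, seed=0):
    """Return the time average of each example process for n_runs independent runs."""
    rng = np.random.default_rng(seed)
    steps = rng.choice([-1, 1], size=(n_runs, n_steps))
    # 1. Repeated coin toss: average the tosses of each run over time.
    avg_steps = steps.mean(axis=1)
    # 2. Single coin toss: the series is constant, so its time average is the first toss.
    avg_frozen = steps[:, 0].astype(float)
    # 3. Random walk: average the cumulative sum of each run over time.
    avg_walk = np.cumsum(steps, axis=1).mean(axis=1)
    return avg_steps, avg_frozen, avg_walk

avg_steps, avg_frozen, avg_walk = time_averages(n_steps=2000, n_runs=300)
print("spread across runs, repeated toss:", avg_steps.std())
print("spread across runs, single toss:  ", avg_frozen.std())
print("spread across runs, random walk:  ", avg_walk.std())
```

Only the first spread shrinks as `n_steps` is increased; the second stays near 1 and the third grows.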

Case 3. is really nasty: this time average (yellow line in the figure) becomes smoother and smoother as time goes by, and the statistical scientist (one who looks at data without a mechanistic model) can be misled into thinking that it has stabilized to its “equilibrium value” — but it has no equilibrium value!

If you like teasing people, a great way to cheer yourself up is this: make someone guess what the infinite-time average of a random walk is. “If you wait for long enough, what value does it converge to? I’ll give you a hint: it’s not biased, up is as likely as down, and the random walk is known to be recurrent — it always returns to zero, you just have to wait.” Come on! Intuition says the time average is zero. But it’s wrong — the correct answer is that it does not exist.

So this makes me wonder: could this be part of the reproducibility crisis? It seems quite easy to cook up a scheme that would fool people into believing — wrongly — that the time average of a random walk has stabilized. A random walk is a fabulously general model, so if it can happen here, then it can happen in many actual real-world experiments.

Again, the nomenclature: the random walk (process 3) is not ergodic, but its steps (process 1) are ergodic. Because its steps are ergodic, so is, for instance, its square deviation over some fixed interval. Finding meaningful ergodic observables for non-ergodic processes is a hugely important part of doing ergodicity economics, and of doing science in general.
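
As an illustration of such an ergodic observable, the sketch below compares two time averages over several independent random walks: the time average of the position, and the time average of the squared deviation over a fixed interval. It assumes NumPy, and the names `time_averaged_squared_deviation`, `compare_observables` and `lag` are hypothetical, chosen for this example only.

```python
import numpy as np

def time_averaged_squared_deviation(walk, lag):
    """Time average of (x(t + lag) - x(t))**2 along a single trajectory."""
    increments = walk[lag:] - walk[:-lag]
    return np.mean(increments.astype(float) ** 2)

def compare_observables(n_steps=200_000, lag=10, n_runs=5, seed=0):
    """Return (time-averaged position, time-averaged squared deviation) per run."""
    rng = np.random.default_rng(seed)
    results = []
    for _ in range(n_runs):
        walk = np.cumsum(rng.choice([-1, 1], size=n_steps))
        results.append((walk.mean(), time_averaged_squared_deviation(walk, lag)))
    return results

for mean_position, squared_deviation in compare_observables():
    print(f"time-averaged position: {mean_position:10.2f}   "
          f"time-averaged squared deviation: {squared_deviation:6.3f}")
```

For steps of size 1 the ensemble average of the squared deviation over `lag` steps equals `lag`, and each run's time average should land close to that value, while the time-averaged position differs widely from run to run.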

p.s. We have now written up a little model of measuring a non-ergodic observable, available here: arXiv:1706.07773. We calculate what happens if we average a Brownian motion over time. The time average becomes smoother, i.e. we will eventually see no more changes in it, which may lead us to believe that it has converged to some fundamentally true value. But repeating the experiment will yield a different value. Even though the time average becomes smoother, its distribution across the ensemble becomes broader.
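
The two effects described here, smoothing within one run and broadening across runs, can be seen in a small simulation. This is a sketch under stated assumptions, not the model from the paper: it uses a discrete coin-toss random walk in place of Brownian motion, assumes NumPy, and the function name `running_time_average` is hypothetical. For Brownian motion with unit diffusion the variance of the time average over a window of length t is t/3, and the discrete walk should behave similarly for large t.

```python
import numpy as np

def running_time_average(n_steps, n_runs, seed=0):
    """Running time average of a random walk, one row per independent run."""
    rng = np.random.default_rng(seed)
    steps = rng.choice([-1, 1], size=(n_runs, n_steps))
    walks = np.cumsum(steps, axis=1)
    elapsed = np.arange(1, n_steps + 1)
    # Entry [r, t-1] is the average of run r's walk over its first t steps.
    return np.cumsum(walks, axis=1) / elapsed

averages = running_time_average(n_steps=4000, n_runs=500)
for t in (100, 1000, 4000):
    # Broadening: variance of the time average across the ensemble of runs.
    ensemble_variance = averages[:, t - 1].var()
    # Smoothing: typical size of the latest update to the time average within a run.
    last_change = np.abs(averages[:, t - 1] - averages[:, t - 2]).mean()
    print(f"t={t:5d}  ensemble variance={ensemble_variance:9.2f}  "
          f"mean |change per step|={last_change:.4f}")
```

The per-step change shrinks while the ensemble variance grows, which is the combination that can mislead an observer who sees only a single run.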
