Numerical stability of taking the `mean` of outputs from the simulation of a discrete stochastic dynamical system

Question

I am writing a simulation for a discrete stochastic dynamical system. Since the simulation is stochastic, I need to run the simulation multiple times and then average the values of each timestep. I have some simple julia code below to demonstrate the process.
My question was really about numerical stability when averaging data from say 100, 1000, 10000 simulations. The simulation is discrete and not continuous, so I don't need to worry as much about the fine grained discretization, where error accumulation really matters. But, I was not sure if there are issues with round-off error accumulating over time even for something as simple as a mean?
In the code below, I keep a running mean of some simulation output v. So for each run through the simulation, I just take the average of the running mean and the new value--in this case a random number rand(). Is this a numerically stable way to do this, or do I need to allocate and save each simulation to some large array or file, and then average them at the end?
n2 = 5
v = 0.0
for i in 1:n2
    v = mean([v, rand()])
end

Any suggestions would really be appreciated. I am trying to find the right balance between numerical stability versus limiting unnecessary allocations of memory. Thanks.

Wolfgang Bangerth · Accepted Answer

It turns out that stable summation of numbers is a topic that is still being researched today -- but you can get the basics by looking up "Kahan's summation algorithm".
That said, there really is only a stability issue if you have numbers of widely varying size. In that case, you need to do the summation in a particular order -- intuitively from small to large -- to avoid catastrophic cancellation when you add a small number to an already large partial sum. But it seems like that will not actually be a problem for you: it may be if you add up random numbers, but one might suspect that the numbers you get out of your simulations are all of roughly the same order of magnitude, and in that case you can just add them up one after the other. That has the advantage that you don't have to store any numbers but just the current running sum or the current running average.

Numerical stability of taking the `mean` of outputs from the simulation of a discrete stochastic dynamical system

One Answer

Add your own answers!

Ask a Question