TransWikia.com

Hypothesis testing with LogLikelihood methods in Mathematica

Mathematica Asked on August 31, 2020

I’m trying to understand likelihood methods and hypothesis testing. To this end I am trying to construct small examples that I can play with. Let’s say I have some data which I know (or suspect) follows the function
$$f(x) = (x + x_0)^{2}$$
and I want to find out the value of the parameter $x_{0}$ and the associated error using likelihood methods.

Let us then make some pretend experimental data:

f[x0_, x_] := (x + x0)^2
  
ExperimentData = Table[{x, f[-1.123, x] + RandomVariate[NormalDistribution[0, 0.25]]}, {x, 0, 3, 0.1}];

Then let us construct some test data where I "guess" my parameter $x_{0}$. I replace $x_{0}$ with the parameter $\theta$ to represent my test value:

TestData = 
  Table[
    {\[Theta], Table[{x, f[\[Theta], x]}, {x, 0, 3, 0.1}]},
    {\[Theta], 0.5, 1.6, 0.1}
  ];

How can I use LogLikelihood to make a hypothesis test using my TestData? The motivation is that I may not be able to construct a pure function, for example if I generate my test data by numerical integration.

My approach so far is to maximise the log-likelihood of the "residuals"

X = ExperimentData[[All, 2]];
MLLTest = 
  Table[
    \[Theta] = TestData[[i, 1]];
    F = TestData[[i, 2]][[All, 2]];
    MLL = 
      FindMaximum[
        LogLikelihood[NormalDistribution[\[Mu], \[Sigma]], 
          X - F], {{\[Mu], 0}, {\[Sigma], 0.25}}][[1]];
    {\[Theta], MLL},
    {i, 1, Length[TestData]}
  ];

I then plot the maximum log-likelihood as a function of my guess parameter $\theta$.

However, the result is clearly wrong, so I think I misunderstand something about the log-likelihood in this context.


Small clarification: while the example I have shown can be solved without any test data, I am using it as a toy model for cases where the function $f(x)$ is some integral with no closed-form solution. In those cases I would need to numerically calculate $f(x)$ for a given parameter value and then compare the result against my experimentally measured data.


Second Attempt
It’s possible I am chasing a red herring here, but in an attempt to describe what I want to achieve, here is a second example. First, my "experiment data":

ExperimentData = 
Table[
        {x, f[-0.5, x] +  RandomVariate[NormalDistribution[0, 0.02 ]]},
        {x, 0, 1, 0.025}
    ];

Next, my test data. In practice this wouldn’t come from such a trivial function as defined above, but perhaps from a model that I can only calculate numerically:

TestData = 
Table[
        {
            x0, Table[f[x0, x], {x, 0, 1, 0.025}]
        },
        {x0, -1, 0, 0.1}
    ];

Note that I generate data for different values of $x_0$. Next, my actual evaluation:

X = ExperimentData[[All, 2]];
test = 
  Table[
    x0Test = TestData[[i, 1]];
    F = TestData[[i, 2]];
    R = F - X;
    
    MLL = FindMaximum[{LogLikelihood[NormalDistribution[M, S], R], S > 0}, {M, S}][[1]];
    {x0Test, MLL},
    {i, 1, Length[TestData]}
  ]

If I plot the MLL as a function of test parameter I get:

[Plot of the maximum log-likelihood against the test parameter $x_0$]

Note that the maximum occurs around my true value. Superficially, this is similar to a Chi-Square test.

If my approach is valid, how can I properly extract a parameter estimate and error with this method?
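(For reference, one common way to turn an MLL curve like the one above into an estimate and error is the profile-likelihood method: take the maximizing $x_0$ as the estimate, and the interval where the log-likelihood drops by $1/2$ from its maximum as an approximate 1-sigma interval. A sketch, assuming the `test` table built above:)

```mathematica
(* Sketch: profile-likelihood estimate and ~1-sigma interval from the
   {x0, MLL} pairs in `test` (assumes `test` as built above). *)
mllInterp = Interpolation[test];  (* interpolate MLL as a function of x0 *)
{mllMax, sol} = FindMaximum[mllInterp[x0], {x0, -0.5}];
x0Est = x0 /. sol;
(* ~68% interval: where the log-likelihood drops by 1/2;
   use a drop of 1.92 instead for an approximate 95% interval *)
lo = x0 /. FindRoot[mllInterp[x0] == mllMax - 1/2, {x0, x0Est - 0.1}];
hi = x0 /. FindRoot[mllInterp[x0] == mllMax - 1/2, {x0, x0Est + 0.1}];
{x0Est, {lo, hi}}
```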

2 Answers

There is no need (or reason) to create TestData. The parameter x0 can be estimated directly from ExperimentData. Also, you likely have two parameters to estimate: x0 and the error variance (unless you are able to specify the variance as known, which is rare).

(* Generate data *)
SeedRandom[12345]; ExperimentData = 
 Table[{x, f[-1.123, x] + RandomVariate[NormalDistribution[0, 0.25]]}, {x, 0, 3, 0.1}];

(* Log of likelihood *)
logL = LogLikelihood[NormalDistribution[0, σ], 
   ExperimentData[[All, 2]] - (ExperimentData[[All, 1]] + x0)^2];

(* Maximum likelihood estimates of x0 and σ *)
mle = FindMaximum[{logL, σ > 0}, {x0, σ}]
(* {0.00637381, {x0 -> -1.11687, σ -> 0.241921}} *)
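(An approximate standard error for each parameter can then be read off from the observed information, i.e. the negative Hessian of the log-likelihood at the MLE. A sketch, assuming `logL` and `mle` from the code above:)

```mathematica
(* Sketch: asymptotic standard errors from the observed information.
   Assumes logL and mle as computed above. *)
params = {x0, σ};
hess = D[logL, {params, 2}] /. Last[mle];  (* Hessian at the optimum *)
cov = Inverse[-hess];                      (* asymptotic covariance matrix *)
Sqrt[Diagonal[cov]]                        (* standard errors for x0 and σ *)
```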

What you've described (in your simplified example) is a nonlinear regression which can be performed with NonlinearModelFit if the error structure has a common (meaning identical variance for all observations) univariate normal distribution. I would imagine you're contemplating other distributions.

Answered by JimB on August 31, 2020

Work with something like this:

EstimatedDistribution[ExperimentData[[All, 2]], 
 NormalDistribution[\[Alpha], \[Beta]], 
 ParameterEstimator -> "MethodOfMoments"]

(NormalDistribution[0.084346, 0.0787433])

EstimatedDistribution warns that "Method-of-moment-based estimators may not satisfy all restrictions on parameters," but if the restrictions are satisfied the result is fine.

EstimatedDistribution[ExperimentData[[All, 2]], 
 NormalDistribution[\[Alpha], \[Beta]], 
 ParameterEstimator -> "MaximumLikelihood"]

This returns the closely related result (NormalDistribution[0.084346, 0.0787433]), the same as with the desired "MaximumLikelihood" setting, which maximizes the log-likelihood function directly.
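(Equivalently, the "MaximumLikelihood" estimate can be reproduced by maximizing LogLikelihood by hand; a sketch, assuming `ExperimentData` as generated above:)

```mathematica
(* Sketch: maximizing the log-likelihood directly reproduces the
   ParameterEstimator -> "MaximumLikelihood" result. *)
FindMaximum[
 {LogLikelihood[NormalDistribution[\[Alpha], \[Beta]], 
   ExperimentData[[All, 2]]], \[Beta] > 0},
 {{\[Alpha], 0}, {\[Beta], 0.1}}]
```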

Answered by Steffen Jaeschke on August 31, 2020
