Calculus of variations: meaning of infinitesimal variation $\delta$ and action minimum

Question

I am studying classical mechanics through the MIT 8.223 notes and have reached the derivation of the Euler-Lagrange equation. There is a part I don't quite understand, which concerns the actual meaning of the $\delta$ symbol. We define the action $S[q(t)]$ as the integral from $t_1$ to $t_2$ of $L(q,\dot q,t)$:
$$S[q(t)] = \int_{t_1}^{t_2}L(q,\dot q,t)\, dt.$$
We also define a slightly perturbed function $q(t) + \delta q(t)$, and the variation of the action $\delta S$ as the difference between the action evaluated at the perturbed function and at the original one (the Lagrangian is the same for both):
$$\delta S = S[q+\delta q]-S[q] = \int_{t_1}^{t_2}L(q + \delta q,\dot q + \delta \dot q,t)\, dt - \int_{t_1}^{t_2}L(q,\dot q,t)\, dt. $$
It is then said that:
$$ \delta S = \delta \int_{t_1}^{t_2}L(q,\dot q,t)\, dt  = \int_{t_1}^{t_2} \delta L(q,\dot q,t)\, dt. $$
Then, by using the chain rule:
$$\int_{t_1}^{t_2} \delta L(q,\dot q,t)\, dt = \int_{t_1}^{t_2} \left(\frac{\partial L}{\partial q} \delta q +  \frac{\partial L}{\partial \dot q} \delta \dot q\right) dt.$$
The derivation goes on, but this is enough to state my question. I follow everything up to the definition of $\delta S$. Up to that point $\delta$ appears in just two places: in $\delta q$, which is a slight perturbation of the original function (still a function of $t$, so we can even take derivatives of it), and in $\delta S$, which has the straightforward definition given above: the difference of the functional at the perturbed and original functions.

What I don't get is the use of $\delta$ afterwards: it is brought inside the integral as if it were a new kind of derivative, and it even acts on $L$. However, this use of $\delta$ hasn't been defined. So what is this "operator" exactly, and why can it act both to define the perturbations of the action and of the generalized coordinate, and to operate on functions?

Another, shorter question: why is $\delta S = 0$? It might seem a weird thing to ask, but if we are looking for a minimum, it seems to me that $\delta S$ should be greater than zero. We said it is the difference between the action evaluated at the perturbed and at the unperturbed function, and the action at the original function is a minimum, so the action at any other function is greater than that value. Shouldn't that make $\delta S$ greater than zero?

Vicky · Accepted Answer

Regarding your question about $\delta$ and the $t$-dependence of $q$: first of all, $\delta$ denotes a variation, which is different from a derivative. In other words,
$$
\delta L(\{x_i\}) = \sum_j \frac{\partial L}{\partial x_j}\delta x_j 
$$
where $\delta x_j$ is a variation of $x_j$, not in time but a change of its form. E.g., if $x_j^{(1)} = x_j(t = 0) + 5t$ and $x_j^{(0)} = x_j(0) + 5(1 - 0.00001)t$, then $\delta x_j$ could be $\delta x_j = x_j^{(1)} - x_j^{(0)} = 0.00005t$. We have not changed $t$ but the function that $x_j$ can be (its form): the thing you've been calling a trajectory since high school.
Now you can understand that $\delta L \neq \frac{dL}{dx}$ or equivalent things. $\delta$ is defined as the change of $S$ or $L$ when you change the trajectory your body is following, not when you change the time.
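A small numerical sketch may make this concrete. It assumes a hypothetical example that is not in the question: a unit-mass particle in uniform gravity with $L = \dot q^2/2 - gq$, an arbitrary trial path $q(t) = \sin 2t$, and a variation $\delta q = \varepsilon \sin(\pi t)$ that vanishes at both endpoints. It compares the exact change $S[q+\delta q]-S[q]$ with the first-order expression $\int (\partial L/\partial q\,\delta q + \partial L/\partial \dot q\,\delta \dot q)\,dt$; for this Lagrangian the leftover should be of order $\varepsilon^2$.

```python
import numpy as np

def trapezoid(y, t):
    """Trapezoidal rule for samples y on the grid t."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

# Assumed example: unit mass in uniform gravity, L = q_dot**2 / 2 - g * q
g = 9.8
t = np.linspace(0.0, 1.0, 2001)

def lagrangian(q, q_dot):
    return 0.5 * q_dot**2 - g * q

def action(q, q_dot):
    return trapezoid(lagrangian(q, q_dot), t)

# An arbitrary trial path (not the classical one) and its time derivative
q = np.sin(2.0 * t)
q_dot = 2.0 * np.cos(2.0 * t)

# The variation: t is untouched, only the function changes; zero at both ends
eps = 1e-4
dq = eps * np.sin(np.pi * t)
dq_dot = eps * np.pi * np.cos(np.pi * t)

# Exact change of the action under q -> q + dq
exact_change = action(q + dq, q_dot + dq_dot) - action(q, q_dot)

# First-order variation: dL/dq = -g and dL/dq_dot = q_dot for this Lagrangian
first_order = trapezoid(-g * dq + q_dot * dq_dot, t)

# What is left over is second order in eps
remainder = exact_change - first_order
print(exact_change, first_order, remainder)
```

Here `first_order` is not zero because the trial path was not chosen to satisfy the equations of motion; the point is only that the exact difference and the chain-rule expression agree up to terms of order $\varepsilon^2$.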
Secondly, $\delta S = 0$ is not imposed to get a minimum but to get a stationary point (i.e. a maximum, minimum or saddle point), because all first-order changes vanish there. You set it equal to zero because we have known, since Euler and Lagrange, that the Euler-Lagrange equations give you the classical trajectory of the body under study. As far as I know (but I could be wrong), it wasn't until Feynman that we knew that classically $\delta S = 0$ implies a minimum. But that comes from the path-integral formulation of quantum mechanics, which is a thing for another question. Nevertheless, for completeness I'll give you a little insight. In quantum mechanics, the probability $P$ of a process comes as
$$
P \sim e^{-S/\hbar}
$$
So only the smallest actions will give you relevant contributions to $P$ (in QM more than one trajectory counts, so your classical approximation, your classical trajectory, will be the one at the minimum: the smallest of the smallest, having the highest $P$).

d_b · Answer

I address question 1 only.
The standard notation is indeed unfortunate. First of all, let's dispense with the "$\delta x$" notation. The $\delta$ in $\delta S$ and the one in "$\delta x$" mean completely different things. As I'll explain shortly, we can think of the $\delta$ in $\delta S$ as an operation applied to the action $S$, but "$\delta x$" is one inseparable symbol meant to stand for an infinitesimal variation in the path. It is not $\delta$ applied to $x$. So let's instead write this infinitesimal variation as $\epsilon$.
Now, given an action functional $S(x)$, $\delta S$ stands for the derivative of $S$ with respect to variations in the path $x$. Specifically,
\begin{align}
S(x+\epsilon) - S(x) = \delta S + R,
\end{align}
where $\delta S$ is a linear function of $\epsilon$, and $R$ is $O(\epsilon^2)$.
Computing this following the usual steps, we find (assuming we choose $\epsilon(t_i) = \epsilon(t_f) = 0$)
\begin{equation}
\delta S = \int_{t_i}^{t_f}dt \left(\frac{\partial L}{\partial x} -\frac{d}{dt}\frac{\partial L}{\partial\dot{x}}\right) \epsilon
\end{equation}
Then a further unfortunate choice is often made, namely to denote the integrand in this expression as "$\delta L$", so that "$\delta S = \int \delta L\, dt$". Again, this is a definition of the inseparable symbol "$\delta L$", and not an operation applied to the Lagrangian.
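For reference, the "usual steps" can be sketched as follows. This is only an outline, assuming $L$ is smooth enough to Taylor-expand in its first two arguments and that the variation vanishes at the endpoints, $\epsilon(t_i) = \epsilon(t_f) = 0$:
\begin{align}
S(x+\epsilon) - S(x) &= \int_{t_i}^{t_f}dt \left[L(x+\epsilon,\dot{x}+\dot{\epsilon},t) - L(x,\dot{x},t)\right] \\
&= \int_{t_i}^{t_f}dt \left(\frac{\partial L}{\partial x}\epsilon + \frac{\partial L}{\partial\dot{x}}\dot{\epsilon}\right) + O(\epsilon^2) \\
&= \left[\frac{\partial L}{\partial\dot{x}}\epsilon\right]_{t_i}^{t_f} + \int_{t_i}^{t_f}dt \left(\frac{\partial L}{\partial x} -\frac{d}{dt}\frac{\partial L}{\partial\dot{x}}\right)\epsilon + O(\epsilon^2).
\end{align}
The second line is the first-order Taylor expansion of $L$, and the third is an integration by parts of the $\dot{\epsilon}$ term. The boundary term drops out because of the endpoint condition, which leaves the part linear in $\epsilon$ as $\delta S$ and the rest as $R$.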
References: Arnold, Mathematical Methods of Classical Mechanics, Section 12; José and Saletan, Classical Dynamics, Section 3.1

Owen · Answer

To understand the derivation, you shouldn't seek a mathematically precise definition of the $\delta$ as an operator. Throughout the derivation it has different mathematical meanings, but the physical meaning is consistent: that of a small change.
We make a small change to $q(t)$ and call that $\delta q(t)$. Then we look at how everything else changes to first order, and denote that small change by a $\delta$. So we have $\delta S$, $\delta L$, $\delta \dot{q}$, etc.
The only new operator here is really the $\delta$ on the $S$, which is something like the $\nabla$ operator but applied to functionals. Everywhere else that the $\delta$ appears it is more like the typical $d$ of usual calculus.
And the fact that $\delta \leftrightarrow \nabla$ on $S$ answers your second question. To find a minimum for a function on vectors we would solve $\nabla f = 0$. On functionals we solve $\delta S = 0$. Yes, this doesn't mean that the point actually is a minimum: it could be a maximum, or saddle point. That is just an unfortunate mis-naming of the 'Principle of Least Action'; it should really be called the 'Principle of Stationary Action'.
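The $\delta \leftrightarrow \nabla$ analogy can be tried out numerically by chopping the path into finitely many points, so that the action becomes an ordinary function of a vector. The sketch below assumes a hypothetical example, a harmonic oscillator with unit mass and unit frequency ($L = \dot q^2/2 - q^2/2$), with 49 interior points and endpoints $q=0$ and $q=1$; all function names are made up for illustration. It checks that the ordinary gradient vanishes on the discrete stationary path, and counts negative eigenvalues of the matrix of second derivatives to see whether that path is a minimum or a saddle.

```python
import numpy as np

def discrete_action(interior, q_start, q_end, total_time):
    """Discretised action for L = q_dot**2/2 - q**2/2 on a uniform time grid."""
    q = np.concatenate(([q_start], interior, [q_end]))
    dt = total_time / (len(q) - 1)
    kinetic = np.sum(np.diff(q) ** 2) / (2.0 * dt)
    # Trapezoidal average of q**2/2 over each time step
    potential = dt * np.sum(0.25 * (q[1:] ** 2 + q[:-1] ** 2))
    return kinetic - potential

def gradient(interior, q_start, q_end, total_time, h=1e-6):
    """Central-difference gradient of the action w.r.t. the interior points."""
    grad = np.zeros_like(interior)
    for k in range(len(interior)):
        step = np.zeros_like(interior)
        step[k] = h
        up = discrete_action(interior + step, q_start, q_end, total_time)
        down = discrete_action(interior - step, q_start, q_end, total_time)
        grad[k] = (up - down) / (2.0 * h)
    return grad

def hessian(n_interior, total_time):
    """Second derivatives of the discretised action (tridiagonal, constant)."""
    dt = total_time / (n_interior + 1)
    main = np.full(n_interior, 2.0 / dt - dt)
    off = np.full(n_interior - 1, -1.0 / dt)
    return np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

def stationary_path(n_interior, q_start, q_end, total_time):
    """Solve grad S = 0, which is a linear system for this quadratic action."""
    dt = total_time / (n_interior + 1)
    rhs = np.zeros(n_interior)
    rhs[0] = q_start / dt
    rhs[-1] = q_end / dt
    return np.linalg.solve(hessian(n_interior, total_time), rhs)

for total_time in (1.0, 4.0):
    path = stationary_path(49, 0.0, 1.0, total_time)
    grad = gradient(path, 0.0, 1.0, total_time)
    eigenvalues = np.linalg.eigvalsh(hessian(49, total_time))
    negative = int(np.sum(eigenvalues < 0))
    print(total_time, np.max(np.abs(grad)), negative)
```

For the short time interval the second-derivative matrix should have no negative eigenvalues, so the stationary path is a genuine minimum of the discretised action. For the longer interval (longer than half an oscillation period) one eigenvalue turns negative, so the path is still stationary but is a saddle, which is the situation the name 'Principle of Stationary Action' allows for.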

Cleonis · Answer

To discuss derivation of the Euler-Lagrange equation I must first discuss the following lemma:
(To my knowledge this lemma doesn't have a name of its own, possibly it is regarded as trivially evident. In another physics.stackexchange answer I have proposed the name Jacob's lemma, after Jacob Bernoulli.)
To present this lemma let me go back to the problem that inspired the development of calculus of variations: the brachistochrone.
The solution of the brachistochrone problem is a function that minimizes the time to travel from start to end. Take the solution of the problem, and divide it in two sections. Each subsection of the solution has the same property as the global solution: it is minimal. You can continue subdividing indefinitely; the property of being minimal carries over at every step, so it extends to infinitesimally short subdivisions. This connects variational and differential calculus.
The above reasoning is a proof of existence:
If you can state a problem in a variational form (fixed start and end points, varying in between), and the solution is an extremum (minimum or maximum), then the solution of that problem can also be found with a differential equation.
I have used the brachistochrone problem as an example, this reasoning generalizes to all cases; the extremum can be either a maximum or a minimum.

The Euler-Lagrange equation
With the above in place I can turn to the Euler-Lagrange equation. The Euler-Lagrange equation (a differential equation) accepts any problem stated in variational form, and transforms it to a problem stated in terms of differential calculus.
I recommend the derivation of the Euler-Lagrange equation by Preetum Nakkiran. Preetum Nakkiran points out that since the equation expresses a local condition it should be possible to derive it using local reasoning only.
This derivation with local reasoning only has the following advantage: all of the steps have an intuitive meaning.
The derivation that you encountered in your learning material, with global variation of the trial trajectory, is unnecessarily elaborate.

Classical mechanics
In terms of Lagrangian mechanics the true trajectory is the one trajectory that among the range of all trial trajectories has an extremum of the action.
The diagram below shows a sequence of 7 frames, each shown 3 seconds (animated GIF)
The sequence demonstrates the case of uniform acceleration.
Black curve: the trial trajectory
Red curve: kinetic energy
Green curve: minus potential energy
Note that in order to demonstrate the concept of Action the curve for the potential energy is upside down; it's the minus potential energy.
As the trial trajectory is varied: when the trial trajectory hits the true trajectory the red curve and the green curve are parallel everywhere.
That is, this method uses the work-energy theorem to identify the true trajectory.
The lower-right quadrant shows the two integrals that together make up the action of classical Lagrangian mechanics
